frn.sh

How does the kernel handle executables?

When you type ./foo at a shell, the kernel needs to figure out what kind of file it’s dealing with and how to execute it. The process differs significantly between binaries and scripts.

execve()

Both ELF binaries and scripts start the same way. The execve() system call lands in the kernel and eventually calls search_binary_handler() (here is an interesting article about it). This function iterates through registered binary format handlers until it finds one that can handle the file.

Let’s see what handlers are available:

➜ ~ ls /proc/sys/fs/binfmt_misc/
register  status

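The kernel tries each handler in sequence, passing along the first bytes of the file (BINPRM_BUF_SIZE: 128 bytes on older kernels, 256 since Linux 5.1) to help with format detection. The directory above lists only handlers registered through binfmt_misc, and it contains nothing but the register and status control files, so in this case only the built-in handlers (such as binfmt_elf and binfmt_script) are active. The binfmt_misc system allows registering additional handlers at runtime.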

Detection: the magic bytes

Both formats are identified by magic bytes at the start of the file, but the two signatures tell the kernel very different things.

For ELF binaries:

➜ ~ hexdump -C /bin/ls | head -1
00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|

The 0x7f ELF signature tells the kernel: “I’m a native binary format that you can execute directly.”

A simple hello program written in C starts with the same signature once compiled:

#include <unistd.h>

int main(void) {
        /* fd 1 is stdout; the trailing newline matches the traced output shown later */
        write(1, "hello\n", 6);
        return 0;
}

After compiling it:

➜ ~ hexdump -C hello | head -1
00000000  7f 45 4c 46 02 01 01 03  00 00 00 00 00 00 00 00  |.ELF............|

For scripts:

➜ ~ head -1 ./hello.sh
#!/bin/bash

The #! signature tells the kernel: “I’m not native code, and you need an interpreter to run me.”
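
To make the handler loop concrete, here is a minimal user-space sketch in C of how a chain of format handlers might claim a file based on its leading bytes. The names fmt_handler, find_handler, load_elf and load_script are illustrative, not the kernel's own definitions, and the real handlers do far more than compare magic bytes.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* A handler inspects the first bytes of a file and either claims it
 * (returns 0) or declines with -ENOEXEC. */
typedef int (*load_fn)(const unsigned char *buf, size_t len);

struct fmt_handler {
    const char *name;
    load_fn load;
};

/* Claims files that start with 0x7f 'E' 'L' 'F'. */
static int load_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (len < sizeof magic || memcmp(buf, magic, sizeof magic) != 0)
        return -ENOEXEC;
    return 0;
}

/* Claims files that start with "#!". */
static int load_script(const unsigned char *buf, size_t len)
{
    if (len < 2 || buf[0] != '#' || buf[1] != '!')
        return -ENOEXEC;
    return 0;
}

/* Hypothetical stand-in for the list of registered formats.
 * The order used here is arbitrary. */
static const struct fmt_handler handlers[] = {
    { "binfmt_script", load_script },
    { "binfmt_elf",    load_elf },
};

/* Try each handler in turn; return the name of the first one that
 * accepts the buffer, or NULL if none does. */
const char *find_handler(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < sizeof handlers / sizeof handlers[0]; i++) {
        if (handlers[i].load(buf, len) == 0)
            return handlers[i].name;
    }
    return NULL;
}
```

Calling find_handler() on the first bytes of /bin/ls would select the ELF handler in this sketch, while the first line of hello.sh would select the script handler. A buffer matching neither signature yields NULL, which loosely mirrors the "Exec format error" a shell reports for such a file.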

ELF files are executed directly

When binfmt_elf recognizes an ELF file, it performs direct execution:

  1. Parse the ELF headers to understand the binary structure
  2. Set up memory mappings for code, data, and stack segments
  3. Load the program into memory at the correct virtual addresses
  4. Configure CPU registers with the entry point and stack pointer
  5. Jump to the program’s entry point

The kernel speaks ELF natively. It knows exactly how to transform the file into a running process.
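
The header-parsing step can be approximated from user space. The sketch below assumes a Linux system that provides <elf.h>, a 64-bit ELF file, and a host with the same byte order as the file. It extracts two fields a loader needs early on: the entry point and the number of program headers (the segments to map). The names elf_summary and elf64_summary are hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The few header fields this sketch cares about. */
struct elf_summary {
    uint64_t entry;   /* virtual address where execution starts */
    uint16_t phnum;   /* number of program headers (segments)   */
};

/* Fill *out from the start of a 64-bit ELF file held in buf.
 * Returns 0 on success, -1 if buf is not a 64-bit ELF header.
 * Assumes the file's byte order matches the host's. */
int elf64_summary(const unsigned char *buf, size_t len,
                  struct elf_summary *out)
{
    Elf64_Ehdr hdr;

    if (len < sizeof hdr)
        return -1;
    memcpy(&hdr, buf, sizeof hdr);   /* copy to avoid unaligned access */

    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                   /* missing 0x7f 'E' 'L' 'F' */
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                   /* only 64-bit handled here */

    out->entry = hdr.e_entry;
    out->phnum = hdr.e_phnum;
    return 0;
}
```

To try it on a real binary, read the first sizeof(Elf64_Ehdr) bytes of a file such as ./hello with fread() and pass them in. The reported entry point should agree with what readelf -h prints for the same file.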

We can observe this process:

➜ ~ strace -e execve ./hello
execve("./hello", ["./hello"], 0x7fff8b5c7870 /* 67 vars */) = 0
hello
+++ exited with 0 +++

One execve() call, direct execution.

Scripts are executed recursively

Scripts follow a more complex path. When binfmt_script detects a shebang, it doesn’t execute the script directly. Instead, it:

  1. Parses the shebang line to extract the interpreter path
  2. Modifies the argument list: removes the original argv[0] and inserts, in order:
    • The interpreter program name
    • The optional argument from the shebang line, if there is one
    • The script filename
    • Original arguments (shifted down)
  3. Updates the binary descriptor to point to the interpreter instead of the script
  4. Calls search_binary_handler() recursively to handle the interpreter
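
The first two steps can be sketched in user-space C. This is an illustration under assumptions rather than the kernel's code: the struct shebang type and the parse_shebang and rewrite_argv functions are hypothetical names, and the buffer sizes are arbitrary. It follows the Linux convention that everything after the interpreter path is passed as a single optional argument.

```c
#include <stddef.h>
#include <string.h>

struct shebang {
    char interp[128];
    char arg[128];    /* empty string when no optional argument */
};

static int is_blank(char c) { return c == ' ' || c == '\t'; }

/* Parse "#!<interp> [arg]" from the first line of a script.
 * Returns 0 on success, -1 if the line is not a usable shebang. */
int parse_shebang(const char *line, struct shebang *out)
{
    if (strncmp(line, "#!", 2) != 0)
        return -1;

    const char *p = line + 2;
    size_t end = strcspn(p, "\n");          /* stop at end of first line */

    while (end > 0 && is_blank(p[end - 1])) /* trim trailing blanks */
        end--;
    while (end > 0 && is_blank(*p)) {       /* skip blanks after "#!" */
        p++;
        end--;
    }

    size_t n = 0;                           /* length of interpreter path */
    while (n < end && !is_blank(p[n]))
        n++;
    if (n == 0 || n >= sizeof out->interp)
        return -1;
    memcpy(out->interp, p, n);
    out->interp[n] = '\0';

    p += n;
    end -= n;
    while (end > 0 && is_blank(*p)) {       /* skip blanks before the arg */
        p++;
        end--;
    }
    if (end >= sizeof out->arg)
        return -1;
    memcpy(out->arg, p, end);               /* the rest is ONE argument */
    out->arg[end] = '\0';
    return 0;
}

/* Build the interpreter's argv: interp, optional arg, script path, then
 * the caller's argv[1..]. Returns the new argc, or -1 if it does not
 * fit into new_argv (cap entries, including the terminating NULL). */
int rewrite_argv(const struct shebang *sb, const char *script,
                 int argc, char *const argv[],
                 const char *new_argv[], int cap)
{
    int extra = sb->arg[0] != '\0';
    int tail = argc > 0 ? argc - 1 : 0;
    int n = 0;

    if (2 + extra + tail + 1 > cap)
        return -1;
    new_argv[n++] = sb->interp;
    if (extra)
        new_argv[n++] = sb->arg;
    new_argv[n++] = script;
    for (int i = 1; i < argc; i++)
        new_argv[n++] = argv[i];
    new_argv[n] = NULL;
    return n;
}
```

For a script ./hello.py starting with #!/usr/bin/env python3 and invoked as ./hello.py --verbose, this sketch produces the argument list /usr/bin/env, python3, ./hello.py, --verbose.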

Let’s trace this:

➜ ~ strace -e execve ./hello.sh
execve("./hello.sh", ["./hello.sh"], 0x7fff8b5c7870 /* 67 vars */) = 0
hello
+++ exited with 0 +++

Still one execve() from user space’s perspective, but the kernel did extra work internally.

How does the recursion work?

Here’s what happens inside the kernel when you run a Python script:

[Diagram: recursive handler lookup inside the kernel for a Python script]
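
The recursion itself can be modelled with a small user-space sketch in C. Everything here is hypothetical scaffolding: the struct file table stands in for the filesystem, resolve() stands in for the repeated handler search, and MAX_DEPTH is an illustrative value (the kernel enforces its own small limit and fails with ELOOP when it is exceeded).

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum kind { KIND_ELF, KIND_SCRIPT };

/* Hypothetical in-memory "filesystem": each entry is either an ELF
 * binary or a script whose shebang names another entry. */
struct file {
    const char *path;
    enum kind kind;
    const char *interp;   /* only meaningful for KIND_SCRIPT */
};

#define MAX_DEPTH 5       /* illustrative limit, not the kernel's constant */

static const struct file *lookup(const struct file *fs, size_t n,
                                 const char *path)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(fs[i].path, path) == 0)
            return &fs[i];
    }
    return NULL;
}

/* Follow shebangs until an ELF binary is reached. On success, returns
 * the number of script hops taken and stores the final binary's path
 * in *final. Returns -ENOENT for a missing file and -ELOOP when the
 * chain of interpreters is too long. Call with depth = 0. */
int resolve(const struct file *fs, size_t n, const char *path,
            int depth, const char **final)
{
    if (depth > MAX_DEPTH)
        return -ELOOP;

    const struct file *f = lookup(fs, n, path);
    if (f == NULL)
        return -ENOENT;

    if (f->kind == KIND_ELF) {
        *final = f->path;         /* a native binary ends the search */
        return depth;
    }

    /* Script: restart the search with the interpreter as the target. */
    return resolve(fs, n, f->interp, depth + 1, final);
}
```

With a table containing ./hello.py (a script whose interpreter is /usr/bin/env) and /usr/bin/env (an ELF binary), resolve() stops after one hop at /usr/bin/env. In the real system, env then issues its own execve() for python3 from user space, which is a separate pass through the same machinery.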

Why this design?

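The kernel could theoretically implement interpreters for every scripting language directly. Instead, it uses composition: it only needs to know how to load a few native formats, and every other format is delegated to an interpreter that is itself a native binary.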

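This keeps the kernel simple while leaving the set of runnable formats open-ended. Want to run Java bytecode? Write an ELF binary (the JVM) that interprets it. Want to run WebAssembly? Write an ELF binary (a WASM runtime) that handles it. Formats like these have no shebang line, so the runtime is either invoked explicitly or hooked in through binfmt_misc.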