How does the Kernel handle executables?
When you type ./foo at a shell, the kernel needs to figure out what kind of file it’s dealing with and how to execute it. The process differs significantly between binaries and scripts.
execve()
Both ELF binaries and scripts start the same way. The execve() system call lands in the kernel and eventually calls search_binary_handler() (here is an interesting article about it). This function iterates through registered binary format handlers until it finds one that can handle the file.
Let’s see which extra handlers are registered through binfmt_misc:
➜ ~ ls /proc/sys/fs/binfmt_misc/
register status
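The kernel tries each handler in sequence, passing it the first bytes of the file (128 bytes historically, 256 in newer kernels) to help with format detection. The directory above holds only the register and status control files, so no binfmt_misc entries are registered and only the built-in handlers (like binfmt_elf and binfmt_script) are active. The binfmt_misc system allows registering additional handlers at runtime.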
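To make runtime registration a little more concrete, here is a minimal sketch in C that builds the rule string binfmt_misc expects to be written to its register file. The field layout (:name:type:offset:magic:mask:interpreter:flags) follows the kernel's binfmt_misc documentation. The function name build_binfmt_rule is hypothetical, and leaving offset, mask, and flags empty is a simplifying assumption. The sketch only formats the string and never touches /proc.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Build a binfmt_misc rule of the form
 *   :name:type:offset:magic:mask:interpreter:flags
 * using type 'M' (match on magic bytes) and leaving offset, mask,
 * and flags empty. build_binfmt_rule is a hypothetical helper name.
 *
 * Returns the length of the rule, or -1 if it does not fit in `out`.
 */
int build_binfmt_rule(char *out, size_t outlen, const char *name,
                      const char *magic, const char *interpreter)
{
    int n = snprintf(out, outlen, ":%s:M::%s::%s:", name, magic, interpreter);

    if (n < 0 || (size_t)n >= outlen)
        return -1;      /* formatting error or truncated output */
    return n;
}
```

Called with the name Java, the magic \xca\xfe\xba\xbe (the first four bytes of a Java class file), and a placeholder wrapper path, this yields a rule that a privileged user could write to /proc/sys/fs/binfmt_misc/register.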
Detection: the magic bytes
Both formats use magic bytes for identification, but they serve very different purposes.
For ELF binaries:
➜ ~ hexdump -C /bin/ls | head -1
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
The signature, the byte 0x7f followed by the ASCII letters ELF, tells the kernel: “I’m a native binary format that you can execute directly.”
A simple hello program written in C carries the same signature:
#include <unistd.h>

int main(void) {
    /* Write the 5 bytes of "hello" to stdout (file descriptor 1). */
    write(1, "hello", 5);
    return 0;
}
After compiling it:
➜ ~ hexdump -C hello | head -1
00000000 7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00 |.ELF............|
For scripts:
➜ ~ head -1 ./hello.sh
#!/bin/bash
The #! signature tells the kernel: “I’m not native code, and you need an interpreter to run me.”
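The two checks above, combined with the "try each handler in turn" loop, can be sketched in user space as follows. This is an illustration and not the kernel's code: struct fmt_handler, find_handler, and the order of the table are assumptions made for the example, and real handlers do far more than compare a few bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for a kernel binary format handler. */
struct fmt_handler {
    const char *name;
    /* Returns 1 if the handler recognizes the leading bytes, 0 otherwise. */
    int (*matches)(const unsigned char *buf, size_t len);
};

/* ELF files start with the byte 0x7f followed by 'E', 'L', 'F'. */
static int is_elf(const unsigned char *buf, size_t len)
{
    return len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0;
}

/* Scripts start with the two characters "#!". */
static int is_script(const unsigned char *buf, size_t len)
{
    return len >= 2 && buf[0] == '#' && buf[1] == '!';
}

/* The order here is arbitrary for the sketch. */
static const struct fmt_handler handlers[] = {
    { "binfmt_script", is_script },
    { "binfmt_elf",    is_elf    },
};

/* Try each handler in sequence; return the first match's name, or NULL. */
const char *find_handler(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < sizeof handlers / sizeof handlers[0]; i++) {
        if (handlers[i].matches(buf, len))
            return handlers[i].name;
    }
    return NULL;
}
```

Feeding find_handler the first bytes of /bin/ls would select the ELF entry, while the first line of hello.sh would select the script entry. Anything else returns NULL, which corresponds to the kernel refusing to execute the file.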
ELF files are executed directly
When binfmt_elf recognizes an ELF file, it performs direct execution:
- Parse the ELF headers to understand the binary structure
- Set up memory mappings for code, data, and stack segments
- Load the program into memory at the correct virtual addresses
- Configure CPU registers with the entry point and stack pointer
- Jump to the program’s entry point
The kernel speaks ELF natively. It knows exactly how to transform the file into a running process.
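The first of those steps, parsing the ELF header, can be approximated in user space with the definitions from Linux's <elf.h>. The sketch below is only a rough illustration: struct elf_summary and parse_elf64_header are hypothetical names, it handles 64-bit files only, and it assumes the file uses the same byte order as the machine running the code. The kernel's loader validates and uses many more fields.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The two header fields this sketch extracts (hypothetical struct). */
struct elf_summary {
    uint64_t entry;   /* virtual address of the entry point */
    uint16_t phnum;   /* number of program headers, i.e. segments to map */
};

/*
 * Parse a 64-bit ELF header from the first `len` bytes of a file.
 * Returns 0 on success, -1 if the buffer is too short, is not ELF,
 * or is not a 64-bit ELF file.
 */
int parse_elf64_header(const unsigned char *buf, size_t len,
                       struct elf_summary *out)
{
    Elf64_Ehdr hdr;

    if (len < sizeof hdr)
        return -1;
    memcpy(&hdr, buf, sizeof hdr);            /* avoids unaligned access */

    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                            /* magic bytes do not match */
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                            /* sketch handles 64-bit only */

    out->entry = hdr.e_entry;
    out->phnum = hdr.e_phnum;
    return 0;
}
```

The entry field is the address the last step in the list refers to for a statically linked program, and phnum says how many program headers describe the segments to be mapped.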
We can observe this process:
➜ ~ strace -e execve ./hello
execve("./hello", ["./hello"], 0x7fff8b5c7870 /* 67 vars */) = 0
hello
+++ exited with 0 +++
One execve() call, direct execution.
Scripts are executed recursively
Scripts follow a more complex path. When binfmt_script detects a shebang, it doesn’t execute the script directly. Instead, it:
- Parses the shebang line to extract the interpreter path
- Modifies the argument list: removes the original argv[0] and inserts:
  - The interpreter program name
  - The optional argument from the shebang line, if there is one
  - The script filename
  - Original arguments (shifted down)
- Updates the binary descriptor to point to the interpreter instead of the script
- Calls search_binary_handler() recursively to handle the interpreter
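The parsing and argument rewriting steps above can be sketched in user space like this. The helper names parse_shebang and rewrite_argv are hypothetical, and the sketch assumes Linux's behaviour of treating everything after the interpreter path as a single optional argument. It is not how binfmt_script is written, only a model of what it produces.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split a shebang line in place into the interpreter path and an
 * optional single argument. Returns 0 on success, -1 if `line` does
 * not start with "#!" or names no interpreter.
 */
int parse_shebang(char *line, char **interp, char **arg)
{
    *interp = NULL;
    *arg = NULL;
    if (strncmp(line, "#!", 2) != 0)
        return -1;

    line[strcspn(line, "\n")] = '\0';     /* keep only the first line */

    char *p = line + 2;
    p += strspn(p, " \t");                /* skip blanks after "#!" */
    if (*p == '\0')
        return -1;
    *interp = p;

    p += strcspn(p, " \t");               /* find end of interpreter path */
    if (*p == '\0')
        return 0;                         /* no argument present */
    *p++ = '\0';
    p += strspn(p, " \t");
    if (*p != '\0') {
        size_t n = strlen(p);
        while (n > 0 && (p[n - 1] == ' ' || p[n - 1] == '\t'))
            p[--n] = '\0';                /* trim trailing blanks */
        *arg = p;
    }
    return 0;
}

/*
 * Build the new argument vector: interpreter, optional argument,
 * script path, then the original arguments after argv[0].
 * `out` must have room for argc + 3 pointers. Returns the new count.
 */
int rewrite_argv(const char *interp, const char *arg, const char *script,
                 int argc, char *const argv[], const char *out[])
{
    int n = 0;

    out[n++] = interp;
    if (arg)
        out[n++] = arg;
    out[n++] = script;
    for (int i = 1; i < argc; i++)        /* original argv[0] is dropped */
        out[n++] = argv[i];
    out[n] = NULL;
    return n;
}
```

For a script whose first line is #!/usr/bin/env python3 and which was started as ./hello.py --verbose, the rewritten vector is /usr/bin/env, python3, ./hello.py, --verbose.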
Let’s trace this:
➜ ~ strace -e execve ./hello.sh
execve("./hello.sh", ["./hello.sh"], 0x7fff8b5c7870 /* 67 vars */) = 0
hello
+++ exited with 0 +++
Still one execve() from user space’s perspective, but the kernel did extra work internally.
How does the recursion work?
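Here’s what happens inside the kernel when you run a Python script: binfmt_script matches the #! line and swaps in the interpreter named there, then search_binary_handler() runs again. On this second pass, binfmt_elf recognizes the interpreter binary and loads it, with the script’s path as one of its arguments.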
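The recursion can be modelled with a tiny in-memory "filesystem". Everything in this sketch is hypothetical: struct fake_file, resolve, and the MAX_DEPTH value are made up for illustration, and each fake file's head is assumed to contain either the ELF magic or a bare #!path with no argument. The kernel applies its own nesting limit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory file: a path and the file's leading bytes. */
struct fake_file {
    const char *path;
    const char *head;
};

#define MAX_DEPTH 4   /* assumed limit for this sketch only */

static const struct fake_file *lookup(const struct fake_file *fs, size_t n,
                                      const char *path)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(fs[i].path, path) == 0)
            return &fs[i];
    return NULL;
}

/*
 * Follow shebang lines until an ELF file is reached. Returns the path
 * of the ELF binary that would finally be loaded, or NULL if a file is
 * missing, has an unknown format, or the chain is nested too deeply.
 */
const char *resolve(const struct fake_file *fs, size_t n,
                    const char *path, int depth)
{
    if (depth > MAX_DEPTH)
        return NULL;

    const struct fake_file *f = lookup(fs, n, path);
    if (!f)
        return NULL;

    if (strncmp(f->head, "\x7f" "ELF", 4) == 0)
        return f->path;                               /* ELF: load it */
    if (strncmp(f->head, "#!", 2) == 0)
        return resolve(fs, n, f->head + 2, depth + 1); /* script: recurse */
    return NULL;
}
```

With a table containing ./hello.py (head #!/usr/bin/python3) and /usr/bin/python3 (head with the ELF magic), resolving ./hello.py takes one recursive step and ends at /usr/bin/python3. A script that names itself as its interpreter stops at the depth limit instead of looping forever.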
Why this design?
The kernel could theoretically implement interpreters for every scripting language directly. Instead, it uses composition:
- ELF binaries: The kernel executes them directly because it understands the format
- Scripts and miscellaneous formats: Delegated to interpreters, themselves ELF binaries, that understand the specific format
This keeps the kernel simple while providing unlimited extensibility. Want to run Java bytecode? Write an ELF binary (JVM) that interprets it. Want to run WebAssembly? Write an ELF binary (WASM runtime) that handles it.