System Calls
A system call is the controlled gateway between user code and kernel code. It’s the only legitimate way a process can ask the OS to do something privileged.
What Is a System Call?
System calls are the interface between user programs and the kernel. When your program
calls read(), write(), fork(), or mmap(), it’s ultimately issuing a system call —
a request to the kernel to perform a privileged operation on its behalf.
The Mechanism on x86-64
On modern x86-64 Linux, system calls use the syscall instruction:
; Example: write(1, "hi\n", 3)
mov rax, 1 ; syscall number (SYS_write)
mov rdi, 1 ; fd = stdout
mov rsi, msg ; pointer to buffer
mov rdx, 3 ; length
syscall ; trap to kernel
; rax = return value (bytes written, or -errno)
The syscall instruction switches the CPU to ring 0, saves user state (RIP, RFLAGS, RSP),
and jumps to the kernel’s syscall entry point (stored in the LSTAR MSR).
The Linux ABI
The x86-64 Linux ABI specifies:
- Syscall number in
rax - Arguments in
rdi,rsi,rdx,r10,r8,r9(in order) - Return value in
rax(negative errno on error)
The syscall instruction itself clobbers rcx and r11 (used internally for RIP and RFLAGS).
What the Kernel Does
On entry to the kernel’s syscall handler, it validates the syscall number,
looks up the handler in sys_call_table[], and dispatches to it.
The handler runs in kernel context with full privileges.
The Cost
A bare syscall round-trip costs roughly 100–300 ns on modern hardware —
much more than a function call. This is why vDSO exists: for read-only operations
like gettimeofday(), Linux maps kernel data directly into user space to
avoid the trap entirely.
Open Questions
- How does
seccompfilter syscalls, and what’s its overhead? - What changed in the syscall path after Spectre/Meltdown (KPTI)?