This section first covers the mechanisms provided by the 386 for handling system calls, and then shows how Linux uses those mechanisms. This is not a reference to the individual system calls: there are very many of them, new ones are added occasionally, and they are documented in man pages that should be on your Linux system.
The 386 recognizes two event classes: exceptions and interrupts. Both cause a forced context switch to a new procedure or task. Interrupts can occur at unexpected times during the execution of a program and are used to respond to signals from hardware. Exceptions are caused by the execution of instructions.
Two sources of interrupts are recognized by the 386: maskable interrupts and nonmaskable interrupts. Two sources of exceptions are recognized by the 386: processor-detected exceptions and programmed exceptions.
Each interrupt or exception has a number, which is referred to by the 386 literature as the vector. The NMI interrupt and the processor-detected exceptions have been assigned vectors in the range 0 through 31, inclusive. The vectors for maskable interrupts are determined by the hardware. External interrupt controllers put the vector on the bus during the interrupt-acknowledge cycle. Any vector in the range 32 through 255, inclusive, can be used for maskable interrupts or programmed exceptions. Here is a listing of all the possible interrupts and exceptions:
Vector | Description |
---|---|
0 | divide error |
1 | debug exception |
2 | NMI interrupt |
3 | breakpoint |
4 | INTO-detected overflow |
5 | BOUND range exceeded |
6 | invalid opcode |
7 | coprocessor not available |
8 | double fault |
9 | coprocessor segment overrun |
10 | invalid task state segment |
11 | segment not present |
12 | stack fault |
13 | general protection |
14 | page fault |
15 | reserved |
16 | coprocessor error |
17-31 | reserved |
32-255 | maskable interrupts |
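
The vector ranges in the table can be sketched as a small lookup in C. This is only an illustration of the numbering scheme; `classify_vector()` and the `vector_class` names are hypothetical and do not appear in the 386 literature or in the kernel.

```c
#include <assert.h>

/* Coarse classes of 386 vectors, following the table above. */
enum vector_class {
    VEC_EXCEPTION,              /* 0-1, 3-14, 16: exceptions with fixed vectors */
    VEC_NMI,                    /* 2: nonmaskable interrupt */
    VEC_RESERVED,               /* 15, 17-31 */
    VEC_MASKABLE_OR_PROGRAMMED  /* 32-255 */
};

/* Hypothetical helper: map a vector number to its class. */
static enum vector_class classify_vector(int vector)
{
    assert(vector >= 0 && vector <= 255);  /* the 386 has 256 vectors */
    if (vector >= 32)
        return VEC_MASKABLE_OR_PROGRAMMED;
    if (vector == 2)
        return VEC_NMI;
    if (vector == 15 || vector >= 17)
        return VEC_RESERVED;
    return VEC_EXCEPTION;
}
```

With this sketch, vector 0x80 falls in the 32 through 255 range, the range available for maskable interrupts and programmed exceptions.
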
The priority of simultaneous interrupts and exceptions is:
Priority | Event class |
---|---|
HIGHEST | Faults except debug faults |
. | Trap instructions INTO, INT n, INT 3 |
. | Debug traps for this instruction |
. | Debug traps for next instruction |
. | NMI interrupt |
LOWEST | INTR interrupt |
Under Linux the execution of a system call is invoked by a maskable interrupt or exception class transfer, caused by the instruction int 0x80. We use vector 0x80 to transfer control to the kernel. This interrupt vector is initialized during system startup, along with other important vectors like the system clock vector.
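
From user space, the same idea (pick a system call by number, trap into the kernel, get a value back) can be tried through the C library's `syscall()` function. The sketch below is an illustration, not the mechanism described in this section: `raw_getpid()` is a hypothetical helper, and which trap instruction the library issues depends on the architecture and library version. On the 386 it has historically been int 0x80.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel for the process ID by system
 * call number instead of going through the getpid() wrapper.
 */
static long raw_getpid(void)
{
    long pid = syscall(SYS_getpid);  /* the library performs the trap */
    assert(pid > 0);                 /* getpid cannot fail */
    return pid;
}
```

Calling `raw_getpid()` should give the same value as `getpid()`, since both end up in the same kernel routine.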
iBCS2 requires an lcall 0,7 instruction, which Linux can send to the iBCS2 compatibility module as appropriate if an iBCS2-compliant binary is being executed. In fact, Linux will assume that an iBCS2-compliant binary is being executed if an lcall 0,7 call is executed, and will automatically switch modes.
As of version 0.99.2 of Linux, there are 116 system calls. Documentation for these can be found in the man (2) pages. When a user invokes a system call, execution flow is as follows:
For example, the setuid system call is coded as
_syscall1(int,setuid,uid_t,uid);
which will expand to:

    _setuid:
        subl $4,%esp
        pushl %ebx
        movzwl 12(%esp),%eax
        movl %eax,4(%esp)
        movl $23,%eax
        movl 4(%esp),%ebx
        int $0x80
        movl %eax,%edx
        testl %edx,%edx
        jge L2
        negl %edx
        movl %edx,_errno
        movl $-1,%eax
        popl %ebx
        addl $4,%esp
        ret
    L2:
        movl %edx,%eax
        popl %ebx
        addl $4,%esp
        ret

The macro definition for the syscallX() macros can be found in /usr/include/linux/unistd.h, and the user-space system call library code can be found in /usr/src/libc/syscall/.
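
The tail of the `_setuid` expansion encodes the return convention: a negative value coming back from the kernel is a negated error code, which is stored in errno while the caller sees -1. A minimal C sketch of just that step follows; `finish_syscall()` is a hypothetical name, not something defined by the macros.

```c
#include <errno.h>

/*
 * Hypothetical helper mirroring the end of the _setuid expansion:
 * raw is the value the kernel left in %eax after int $0x80.
 */
static long finish_syscall(long raw)
{
    if (raw >= 0)          /* jge L2: success, pass the value through */
        return raw;
    errno = (int)-raw;     /* negl %edx; movl %edx,_errno */
    return -1;             /* movl $-1,%eax */
}
```

For example, if the kernel returned `-EPERM`, the caller would see -1 and find `EPERM` in errno.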
Actual code for the system_call entry point can be found in /usr/src/linux/kernel/sys_call.S. Actual code for many of the system calls can be found in /usr/src/linux/kernel/sys.c, and the rest are found elsewhere. find is your friend.
The startup_32() code found in /usr/src/linux/boot/head.S starts everything off by calling setup_idt(). This routine sets up an IDT (Interrupt Descriptor Table) with 256 entries. No interrupt entry points are actually loaded by this routine, as that is done only after paging has been enabled and the kernel has been moved to 0xC0000000. An IDT has 256 entries, each 8 bytes long, for a total of 2048 bytes.
When start_kernel() (found in /usr/src/linux/init/main.c) is called it invokes trap_init() (found in /usr/src/linux/kernel/traps.c). trap_init() sets up the IDT via the macro set_trap_gate() (found in /usr/include/asm/system.h). trap_init() initializes the interrupt descriptor table as shown here:
Vector | Handler |
---|---|
0 | divide_error |
1 | debug |
2 | nmi |
3 | int3 |
4 | overflow |
5 | bounds |
6 | invalid_op |
7 | device_not_available |
8 | double_fault |
9 | coprocessor_segment_overrun |
10 | invalid_TSS |
11 | segment_not_present |
12 | stack_segment |
13 | general_protection |
14 | page_fault |
15 | reserved |
16 | coprocessor_error |
17 | alignment_check |
18-48 | reserved |
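
To make the table entries concrete, here is a rough C sketch of how one IDT gate could be packed, assuming the gate descriptor layout defined by the 386 architecture (8 bytes per gate: handler offset split across two halves, a code segment selector, and a present/DPL/type field). The names `gate_desc`, `make_gate()` and `idt_sketch`, and the selector value 0x08, are assumptions made for illustration; the kernel does this work in the set_trap_gate() macro.

```c
#include <assert.h>
#include <stdint.h>

#define IDT_ENTRIES 256
#define KERNEL_CS   0x08  /* assumed kernel code segment selector */
#define GATE_INTR   14    /* 386 32-bit interrupt gate type */
#define GATE_TRAP   15    /* 386 32-bit trap gate type */

/* One 8-byte gate, viewed as two 32-bit words. */
struct gate_desc {
    uint32_t low;   /* selector (bits 31-16), handler offset bits 15-0 */
    uint32_t high;  /* handler offset bits 31-16, present bit, DPL, type */
};

/* Hypothetical helper: build a gate for a handler address. */
static struct gate_desc make_gate(uint32_t handler, unsigned type, unsigned dpl)
{
    struct gate_desc g;

    assert(type <= 15 && dpl <= 3);
    g.low  = ((uint32_t)KERNEL_CS << 16) | (handler & 0xffffu);
    g.high = (handler & 0xffff0000u)
           | 0x8000u       /* present bit */
           | (dpl << 13)   /* privilege level allowed to use the gate */
           | (type << 8);  /* gate type */
    return g;
}

/* 256 gates of 8 bytes each: 2048 bytes in total. */
static struct gate_desc idt_sketch[IDT_ENTRIES];
```

A gate that user programs may enter through an int instruction, such as the one for vector 0x80, would need a DPL of 3, while the exception gates in the table can keep a DPL of 0.
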
Copyright (C) 1993, 1996 Michael K. Johnson, johnsonm@redhat.com.
Copyright (C) 1993 Stanley Scalsky