Thursday, March 4, 2010

LINUX KERNEL BOOTUP PROCESS:

Have you ever wondered what actually happens between the moment we power on a PC and the moment a login prompt is displayed? This interval is the so-called “booting process” or “bootstrapping”. You might have heard the term before, but here we are going to see in detail what actually happens during bootstrapping.

Before we start with the booting process, it is necessary to understand a few terms. The first of them is the Real and Protected modes of the CPU. Early computers (Intel 8086) with 20-bit addresses were able to access only 1MB (2^20 bytes) of physical RAM. They used Intel's segmented memory addressing with 16-bit registers, where each register can by itself refer to just 64KB of memory. A 20-bit memory address can be formed only by using a segment register and an offset register together: the segment value is multiplied by 16 and the offset is added to it. This is called the real mode segmented memory model, with only 1MB of addressable memory.
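
The segment:offset arithmetic of real mode can be sketched in a few lines of Python. This is only an illustration: the function name real_mode_address is made up for this example, and the 20-bit mask models the 20 address lines of the 8086.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address formed from a 16-bit segment and a 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # The segment is shifted left by 4 bits (multiplied by 16) and the
    # offset is added. The 8086 has 20 address lines, so the sum wraps at 1MB.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_address(0x07C0, 0x0000)))  # 0x7c00
print(hex(real_mode_address(0xF000, 0xFFF0)))  # 0xffff0
```

Note that many different segment:offset pairs name the same physical byte; for example 0x07C0:0x0000 and 0x0000:0x7C00 both give 0x7c00.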
With 32-bit CPUs (starting with Intel's 80386), memory addresses became 32 bits long (a total of 4GB of memory) and the registers inside the CPU also became 32-bit, so that each register can by itself hold any 32-bit memory address. To enforce memory protection, however, a program is not allowed to reach arbitrary memory through them: every access goes through a segment described in a descriptor table, and the CPU checks the segment's limit and privilege level before allowing the access. When the segments are set up to span the whole address space, this is called the protected mode flat model of memory, with 4GB of addressable memory.
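
In protected mode a segment is described by an 8-byte descriptor holding its base, limit and access rights. As a rough sketch of what a "flat" 4GB segment looks like, the snippet below packs such a descriptor following the bit layout in the Intel manuals. The helper names make_descriptor and segment_size are hypothetical, and the access byte 0x9A and flags 0xC are the conventional values for a flat 32-bit code segment, not values copied from the Linux source.

```python
def make_descriptor(base: int, limit: int, access: int, flags: int) -> int:
    """Pack an 8-byte x86 segment descriptor into a 64-bit integer."""
    descriptor = limit & 0xFFFF                 # limit bits 0-15
    descriptor |= (base & 0xFFFFFF) << 16       # base bits 0-23
    descriptor |= (access & 0xFF) << 40         # type, privilege level, present bit
    descriptor |= ((limit >> 16) & 0xF) << 48   # limit bits 16-19
    descriptor |= (flags & 0xF) << 52           # granularity and operand-size flags
    descriptor |= ((base >> 24) & 0xFF) << 56   # base bits 24-31
    return descriptor


def segment_size(limit: int, flags: int) -> int:
    """Bytes covered by a segment; the G flag (0x8) means the limit counts 4KB pages."""
    unit = 4096 if flags & 0x8 else 1
    return (limit + 1) * unit


# Assumed example: base 0, limit 0xFFFFF pages, i.e. a segment spanning all 4GB.
flat_code = make_descriptor(base=0, limit=0xFFFFF, access=0x9A, flags=0xC)
print(hex(flat_code))                              # 0xcf9a000000ffff
print(segment_size(0xFFFFF, 0xC) == 4 * 1024**3)   # True
```

With base 0 and a 4GB limit, the offset a program uses is numerically the same as the linear address, which is why the model is called flat.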

Bootstrapping includes loading the kernel image into RAM, initializing the kernel data structures, and passing control to the kernel.
For the explanation here we use an IBM PC with an Intel 80386 CPU and boot a Linux 2.2 kernel from a hard drive.

When we press the power button, a hardware circuit asserts the RESET pin of the CPU, which causes predefined values to be put in the CS and EIP registers, so that the CPU starts executing the code present at 0xfffffff0. This address is mapped by the hardware to a ROM containing the BIOS (Basic Input/Output System), which contains low-level procedures to interact with various devices such as the hard disk and the display. The BIOS runs in real mode, where each memory address is a pair represented as segment:offset.
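
To see where 0xfffffff0 comes from, here is a small sketch of the arithmetic. It assumes the values the 80386 documentation gives for the state after reset (a hidden CS base of 0xFFFF0000 and EIP set to 0x0000FFF0); the constant and function names are made up for this example.

```python
CS_BASE_AT_RESET = 0xFFFF0000   # assumed: hidden base of CS right after reset
EIP_AT_RESET = 0x0000FFF0       # assumed: instruction pointer right after reset


def linear_address(segment_base: int, offset: int) -> int:
    """Address of the next instruction: segment base plus offset, in 32 bits."""
    return (segment_base + offset) & 0xFFFFFFFF


reset_vector = linear_address(CS_BASE_AT_RESET, EIP_AT_RESET)
bytes_below_4gb = 0x1_0000_0000 - reset_vector

print(hex(reset_vector))   # 0xfffffff0
print(bytes_below_4gb)     # 16
```

Only 16 bytes remain between this address and the top of the 4GB address space, so the ROM can hold little more than a jump to the real BIOS code there.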
The BIOS first performs a number of tests (called POST, Power-On Self-Test) to probe which devices are present in the system. After POST it initializes the hardware devices thus found. Then it searches for an operating system to boot; depending on its settings, it tries the floppy, CD-ROM, or hard disk first.
Each hard drive has, in its first sector, called the master boot record (MBR), a partition table and a small program that can load the first sector of the partition containing the operating system to be booted.
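
The layout of that first sector can be explored with a short Python sketch. It assumes the classic PC MBR layout (446 bytes of boot code, four 16-byte partition entries, and the 0x55 0xAA signature in the last two bytes); the function names and the sample partition values are invented for the example.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446      # the boot code occupies bytes 0-445
PARTITION_ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"      # last two bytes of a valid MBR


def parse_mbr(sector: bytes) -> list:
    """Return the non-empty partition entries found in a 512-byte MBR."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly one 512-byte sector")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("missing 0x55 0xAA boot signature")
    partitions = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        entry = sector[start:start + PARTITION_ENTRY_SIZE]
        # status byte, 3 CHS bytes (skipped), type byte, 3 CHS bytes (skipped),
        # first sector (LBA) and sector count as little-endian 32-bit values
        status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", entry)
        if ptype == 0:            # type 0 marks an unused slot
            continue
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return partitions


def build_example_mbr() -> bytes:
    """Build a synthetic MBR with one bootable Linux (type 0x83) partition."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack("<B3xB3xII", 0x80, 0x83, 63, 204800)
    sector[PARTITION_TABLE_OFFSET:PARTITION_TABLE_OFFSET + 16] = entry
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


print(parse_mbr(build_example_mbr()))
```

On a real machine the same function could be fed the first 512 bytes read from the disk device, which normally requires root privileges.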

A boot loader is a program that can load the kernel image of the OS into RAM (from a hard disk or CD-ROM, for example) and prepare it for execution. For Linux on Intel systems the best known is LILO (LInux LOader); others include GRUB (GRand Unified Bootloader). The LILO boot loader is a two-part program, with the first part replacing the small program in the MBR. The BIOS loads the small program of the MBR (here, the first part of LILO) at 0x00007c00 and jumps to it. This program moves itself to 0x0009a000, sets up the real mode stack, and loads the second part of LILO starting at 0x0009b000.
The second part reads a map of the operating systems available on the disk and lets the user choose which one to boot.
Assuming Linux is to be booted, LILO displays the “Loading...” message using BIOS routines and calls a BIOS routine to load the setup() code from the kernel image at 0x00090200. It then invokes BIOS routines to load the rest of the kernel, starting at either the low address 0x00010000 for a small kernel image (zImage) or the high address 0x00100000 for a big kernel image (bzImage). Then it jumps to the setup() code.
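
The load addresses above also explain why a "small" image has to be small. The sketch below is only a back-of-the-envelope calculation: it assumes that the 512 bytes just below setup() (at 0x00090000) hold the kernel's boot sector, so a low-loaded image must fit between 0x00010000 and 0x00090000. The function names are made up for the example.

```python
SETUP_ADDR = 0x00090200                # where setup() is loaded (from the text)
BOOT_SECTOR_ADDR = SETUP_ADDR - 512    # assumption: boot sector sits just below setup()
LOW_LOAD_ADDR = 0x00010000             # small kernel image
HIGH_LOAD_ADDR = 0x00100000            # big kernel image, at the 1MB mark


def kernel_load_address(big_kernel: bool) -> int:
    """Address at which the boot loader places the compressed kernel."""
    return HIGH_LOAD_ADDR if big_kernel else LOW_LOAD_ADDR


def low_window_size() -> int:
    """Room available for a low-loaded image before it would hit the boot sector."""
    return BOOT_SECTOR_ADDR - LOW_LOAD_ADDR


print(hex(kernel_load_address(big_kernel=False)))  # 0x10000
print(hex(kernel_load_address(big_kernel=True)))   # 0x100000
print(low_window_size() // 1024, "KB")             # 512 KB
```

A big kernel image avoids this ceiling by being loaded above the first megabyte.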

The setup() function reinitializes all devices in the system, even though they were previously initialized by the BIOS, because Linux does not rely on the BIOS initialization. setup() finds out the amount of RAM available (using a BIOS function), sets the keyboard repeat delay and rate, and initializes the video adapter. If the kernel image was loaded low in RAM, it moves it to 0x00001000. It sets up a provisional Global Descriptor Table (GDT), used for protected mode memory addressing, and a provisional Interrupt Descriptor Table (IDT).
Finally, it switches the CPU from real mode to protected mode by setting the PE bit in the cr0 control register, and jumps to the assembly language function startup_32(), found in the source tree in the file arch/i386/boot/compressed/head.S.
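
The mode switch itself comes down to one bit. The real kernel does this in assembly; the Python below is just a model of the bit manipulation, with PE taken to be bit 0 of cr0 as documented by Intel. The function names and the starting register value are arbitrary choices for the example.

```python
CR0_PE = 1 << 0   # Protection Enable: bit 0 of the cr0 control register


def enable_protected_mode(cr0: int) -> int:
    """Return the cr0 value with the PE bit set, leaving other bits untouched."""
    return cr0 | CR0_PE


def in_protected_mode(cr0: int) -> bool:
    """True if the PE bit is set in the given cr0 value."""
    return bool(cr0 & CR0_PE)


cr0 = 0x00000010                   # arbitrary example value with PE clear
print(in_protected_mode(cr0))      # False
cr0 = enable_protected_mode(cr0)
print(hex(cr0))                    # 0x11
print(in_protected_mode(cr0))      # True
```

The descriptor tables have to be in place before this bit is set, because from the very next memory access the CPU interprets segment registers as selectors into them.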

The startup_32() function initializes the segmentation registers and the uninitialized data area (bss) of the kernel executable. It then calls decompress_kernel(), which prints the message “Uncompressing Linux...” and places the decompressed kernel at the address 0x00100000. It then jumps to that address.
At the address 0x00100000 is the code of another startup_32() function, found in the file arch/i386/kernel/head.S. This startup_32() function sets up the kernel mode stack and loads the gdtr and idtr registers with the addresses of the GDT and IDT. It then jumps to the start_kernel() function.

Finally, all kernel components are initialized by the start_kernel() routine. It initializes the page tables (by calling paging_init()) and the IDT (by calling trap_init() and init_IRQ()), and creates the kernel thread for process 1 by calling kernel_thread(); that thread in turn executes the /sbin/init program. At the end the login prompt is displayed, and by that time the Linux kernel is up and running. Here the “bootstrapping” process is complete.
Now go play with your PC!

Mohan Gupta
CSE Final Year
NIT Jalandhar

1 comment:

  1. It's a great post... especially describing how Linux does not use BIOS routines and that is why it completely writes its own IDT...
    But could you kindly explain this to me..
    When an interrupt is needed, like from a device, it makes the IRQ line high and the CPU looks up and executes the BIOS routine (in the case of DOS, which relies on BIOS for some interrupts). The IDT is present in the BIOS area so the CPU knows where to go..
    But in Linux, how does the kernel intercept an IRQ and make the processor look up its IDT, when the processor already has its own... I hope my question is clear...
