      FreeBSD Boot and Initialization Sequence


           Bobby Bhattacharjee
            David Hovemeyer
             Rob Sherwood
   http://www.cs.umd.edu/projects/shrug
                      Outline
/   Background
/   Boot code
/   Low level initialization - locore.s
/   Kernel initialization - init_main.c
/   Conclusion
                     Notes
/   Some of this information might be wrong
/   Please correct me if you spot an error
/   Please ask questions
/   Information pertains to FreeBSD 3.4-RELEASE
    unless otherwise noted
                Background
/   x86 registers and assembly
/   x86 real mode
/   x86 protected mode
/   ELF
        x86 registers and assembly
/   General registers
    /   eax - used for return value
    /   ebx - "base register"
    /   ecx - used for count in string and loop instructions
    /   edx - "data register"; holds I/O port numbers and the
        high half of multiply/divide results
/   Each may be accessed as an 8 or 16 bit register
    /   E.g.: al, ah, ax
        x86 registers and assembly
/   General registers (continued)
    /   esi - source index for string instructions
    /   edi - destination index for string instructions
    /   ebp - base pointer (i.e., frame pointer)
    /   esp - stack pointer
        x86 registers and assembly
/   Segment registers
    /   cs - code segment
    /   ds - data segment
    /   es, fs, gs - extra data segments
    /   ss - stack segment
/   Other registers
    /   eflags - flags register
    /   eip - instruction pointer
        x86 registers and assembly
/   System registers
    /   cr0 - system flags register (enable paging, cache
        control)
    /   cr1 - reserved (on 486 anyway)
    /   cr2 - page fault linear address
    /   cr3 - page directory base address
    /   cr4 - virtual memory extensions
        x86 registers and assembly
/   System registers (continued)
    /   Descriptor tables
         /   gdtr - address and limit of GDT
         /   idtr - address and limit of IDT
         /   ldtr - index in GDT of LDT descriptor
         /   tr - index in GDT of current task descriptor
    /   These registers only used in protected mode; will
        discuss later
        x86 registers and assembly
/   Instruction formats
    /   Most instructions modify one of the operands
    /   Most instructions support
         /   register/register
         /   register/memory
         /   register/immediate
         /   memory/immediate
    /   Lots of complex addressing modes
        x86 registers and assembly
/   Assembly syntax
    /   FreeBSD uses the GNU assembler
    /   Mnemonics are slightly different than those
        defined by Intel
         /   Use of suffix to designate operand size: addb, addw,
             addl
    /   Destination operand is last
    /   Registers preceded by `%'
    /   Constants preceded by `$'
        x86 registers and assembly
/   Assembly syntax (continued)
    /   Indirection through register specified by parens
    /   Indirection is assumed for symbols (must use `$'
        prefix to get an address)
     x86 registers and assembly
/   Examples:
addl $1, %eax     # add 1 to contents of %eax
addl $1, (%eax)   # add 1 to dword addressed by %eax
addl (%eax), %ebx # add contents of dword addressed
          #     by %eax to %ebx
addl $1, 10(%eax) # add 1 to dword at address %eax+10
addl var, %eax    # add contents of var to %eax
movl $var, %eax    # store address of var in %eax
        x86 registers and assembly
/   Special instructions
    /   cli - disable interrupts
    /   sti - enable interrupts
    /   cld - clear direction flag
         /   direction flag used in string instructions
    /   bswap - byte swapping (endian conversion)
    /   cmpxchg - atomic compare and exchange
           x86 registers and assembly
/    Test and conditional instructions
       /   Tests modify the flags register as a side-effect
            /   e.g., carry bit, zero bit
       /   Conditional branches use a flag bit as input
       /   Example:
    cmpl %eax, var       # compare variable with %eax (computes var - %eax)
    jl label             # branch to label if var < %eax (signed)
                  x86 real mode
/   Real mode refers to compatibility with 16 bit
    Intel CPUs (8086, 80286)
/   All x86 CPUs start in real mode
    /   The system BIOS only works in real mode
    /   So, the boot code has to work in real mode
                 x86 real mode
/   Segmented memory
    /   Memory addresses formed by 16 bit segment and
        offset: address = (segment<<4) + offset
    /   Effectively gives a 20 bit address space
    /   Awkward to work with objects larger than a
        segment (64KB)
    /   No paging or memory protection
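The address formula above can be sketched in C. The function name is hypothetical and chosen only for illustration. Note that the largest segment:offset pair lands slightly above 1MB; how the hardware treats that overflow is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Physical address formed from a real-mode segment:offset pair. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    uint32_t address = ((uint32_t)segment << 4) + offset;

    /* The largest possible result is 0xFFFF0 + 0xFFFF = 0x10FFEF. */
    assert(address <= 0x10FFEFu);
    return address;
}
```

Different pairs can name the same byte: 0x07C0:0x0000 and 0x0000:0x7C00 both yield 0x7C00.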
                  x86 real mode
/   Segment registers
    /   cs - code segment
    /   ds - data segment (default segment for data
        accesses)
    /   es, fs, gs - "extra" data segments, may be
        accessed through a "segment override"
    /   ss - stack segment
                   x86 real mode
/   The GNU assembler has very limited support
    for targeting real mode
    /   Can do so to a limited extent by putting a prefix
        code in front of every instruction
    /   FreeBSD boot code uses special hacks to work
        around this limitation (m4 macros to "hand-
        assemble" some instructions)
             x86 protected mode
/   The 386 and higher CPUs have a "protected
    mode"
    /   32 bit memory addresses
    /   VM paging
    /   memory protection
/   The 286 has a 16 bit protected mode, but
    nobody cares
              x86 protected mode
/   Protected mode changes the meaning of the
    segment registers
    /   Instead of a segment address, a segment
        register holds (essentially) an index into a
        descriptor table (GDT or LDT)
    /   Contents of segment registers referred to as
        "selectors"
                x86 protected mode
/   Descriptor tables
       /   GDT - Global Descriptor Table (required)
       /   LDT - Local Descriptor Table (optional)
/   The GDT and LDT can contain several types
    of descriptors
       /   memory segments (code, data, or stack)
            /   may define base address and limit
            /   most OSes just set base=0 and limit=4GB (flat addressing)
       /   task segments (a.k.a. TSS)
       /   task gates, call gates
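A memory segment descriptor is 8 bytes, with the base and limit scattered across several bit fields. The sketch below packs one into a 64-bit value. The function name is hypothetical; the access byte and flag values in the usage note are the commonly used ones for flat ring 0 segments, not values taken from the FreeBSD source.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a segment descriptor from its logical fields. */
uint64_t make_segment_descriptor(uint32_t base, uint32_t limit,
                                 uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    assert(limit <= 0xFFFFFu);  /* the limit field is 20 bits wide */
    assert(flags <= 0xFu);      /* the flags field is 4 bits wide */

    d |= (uint64_t)(limit & 0xFFFFu);             /* limit bits 15..0   */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base bits 23..0    */
    d |= (uint64_t)access << 40;                  /* present, DPL, type */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit bits 19..16  */
    d |= (uint64_t)flags << 52;                   /* granularity, size  */
    d |= (uint64_t)(base >> 24) << 56;            /* base bits 31..24   */
    return d;
}
```

With base 0, limit 0xFFFFF, and flags 0xC (4KB granularity, 32 bit), the limit counts pages, so the segment spans 4GB. Access byte 0x9A gives a ring 0 code segment and 0x92 a ring 0 data segment.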
                x86 protected mode
/   Privileges
    /   There is a fancy privilege model that determines
        which segments may be accessed
    /   Current privilege level (CPL) == privilege level of
        current code segment
         /   0 is most privileged, 3 is least
         /   Most OSes use 0 for kernel code, 3 for user code
         /   Privilege levels 1 and 2 could be used for more fine-
             grained protection (e.g., device drivers)
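A selector carries more than a table index: the low two bits hold a requested privilege level and the next bit chooses between the GDT and the LDT. A minimal sketch of the decoding follows; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct selector_fields {
    unsigned index;    /* descriptor table index (bits 15..3) */
    unsigned use_ldt;  /* table indicator (bit 2): 0 = GDT, 1 = LDT */
    unsigned rpl;      /* requested privilege level (bits 1..0) */
};

/* Split a 16 bit protected-mode selector into its three fields. */
struct selector_fields decode_selector(uint16_t selector)
{
    struct selector_fields f;

    f.index = selector >> 3;
    f.use_ldt = (selector >> 2) & 1u;
    f.rpl = selector & 3u;

    assert(f.index < 8192u);  /* 13 index bits give at most 8192 entries */
    return f;
}
```

For example, selector 0x08 decodes to GDT entry 1 at privilege level 0, and 0x1B decodes to GDT entry 3 at privilege level 3.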
                 x86 protected mode
/   Interrupt handling in protected mode
    /   Interrupts can result from:
         /   hardware devices
         /   software interrupts (i.e., the int instruction)
         /   processor exceptions and faults
    /   In protected mode, the IDTR stores the address
        and size of the interrupt table
    /   Each entry in the table contains a selector and 32
        bit offset
                x86 protected mode
/   Interrupt handling in protected mode
    (continued)
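    /   Each IDT entry is a gate descriptor: an interrupt
        gate, trap gate, or task gate
    /   The selector in the gate refers to an entry in the
        GDT (or LDT)
         /   For interrupt and trap gates, it is a code segment
             (with the offset specifying the ISR)
         /   For a task gate, it is a TSS, and the offset is unused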
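To make the "selector and 32 bit offset" layout concrete, here is a sketch that packs a 32 bit interrupt gate into the 8 byte form stored in the interrupt table. The function name is hypothetical; the type value 0x8E (present, 32 bit interrupt gate) is the standard encoding, not something read from the FreeBSD source.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a present 32 bit interrupt gate for the given handler address. */
uint64_t make_interrupt_gate(uint16_t selector, uint32_t offset, uint8_t dpl)
{
    uint64_t g = 0;

    assert(dpl <= 3);  /* privilege levels run from 0 to 3 */

    g |= (uint64_t)(offset & 0xFFFFu);                    /* offset 15..0  */
    g |= (uint64_t)selector << 16;                        /* code selector */
    g |= (uint64_t)(0x8Eu | ((unsigned)dpl << 5)) << 40;  /* P, DPL, type  */
    g |= (uint64_t)(offset >> 16) << 48;                  /* offset 31..16 */
    return g;
}
```

A gate with DPL 3 can be invoked by the int instruction from user code; a gate with DPL 0 cannot.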
                x86 protected mode
/   Entering protected mode
    /   Construct a valid GDT and IDT, load GDTR and
        IDTR
         /   GDT must define code, data, and stack segments
    /   Set PE bit in CR0 register
    /   Far jump to a valid code address in a code
        segment (this loads cs with a valid selector)
                x86 protected mode
/   Paging
    /   386 and higher CPUs support paging
    /   Enable by setting PG bit in CR0 register
    /   Two-level translation through three structures
         /   page directory: physical addresses of page tables
             (its own physical address is held in the CR3 register)
         /   page table: physical addresses of pages
         /   pages
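The walk through the page directory and page table is driven by fields of the linear address itself. The sketch below splits an address into those fields, assuming plain 4KB pages (no large pages or other extensions). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct linear_address_parts {
    uint32_t dir_index;    /* index into the page directory (bits 31..22) */
    uint32_t table_index;  /* index into the page table (bits 21..12) */
    uint32_t page_offset;  /* byte offset within the 4KB page (bits 11..0) */
};

/* Split a 32 bit linear address into its paging fields. */
struct linear_address_parts split_linear_address(uint32_t linear)
{
    struct linear_address_parts p;

    p.dir_index = linear >> 22;
    p.table_index = (linear >> 12) & 0x3FFu;
    p.page_offset = linear & 0xFFFu;

    /* The three fields together must reproduce the original address. */
    assert(((p.dir_index << 22) | (p.table_index << 12) | p.page_offset)
           == linear);
    return p;
}
```

Each directory entry therefore covers 4MB of address space, and each page table holds 1024 entries.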
                                ELF
/   ELF - Executable and Linking Format
    /   Defines format for object files, shared libraries,
        and executables
    /   The default format for FreeBSD (3.0 and up?)
    /   ELF files consist of sections and segments
         /   Sections are parts of the file with a distinct purpose
             (code, data, read-only data, init code, etc.)
         /   Segments describe how the sections are to be loaded
             into memory by the program loader
                                ELF
/   Why ELF?
    /   FreeBSD (along with Linux and most other Unix
        variants) started out with a.out
    /   a.out has some shortcomings
         /   no standard way of defining initialization and cleanup
             code (e.g., C++ static constructors and destructors)
         /   no standard shared library mechanism
    /   ELF fixes these problems
    /   ELF is well-supported by the GNU compiler tools
                          ELF
/   Structure of an ELF file
    /   ELF header
    /   Program header table (optional)
    /   Sections
    /   Section header table (optional)
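The ELF header begins with a fixed identification block, which is how a loader recognizes the format before reading anything else. A minimal sketch of that check follows. The function name is hypothetical, and only the magic bytes and the class byte are examined.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 32 or 64 for a recognizable ELF image, 0 otherwise. */
int elf_class_bits(const unsigned char *image, size_t length)
{
    assert(image != NULL);

    if (length < 5)
        return 0;  /* too short to hold the magic and class byte */
    if (image[0] != 0x7F || image[1] != 'E' ||
        image[2] != 'L' || image[3] != 'F')
        return 0;  /* magic bytes do not match */
    if (image[4] == 1)
        return 32; /* class byte 1: 32 bit objects */
    if (image[4] == 2)
        return 64; /* class byte 2: 64 bit objects */
    return 0;
}
```

A real loader would go on to validate the byte order, machine type, and header sizes before trusting any offsets in the file.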
                            ELF
/   Kinds of sections
    /   .text - program code
    /   .data - read/write initialized data
    /   .rodata - read-only initialized data
    /   .bss - uninitialized (zero-filled) data
    /   .init, .fini - initialization and cleanup code
    /   also symbol tables and relocation sections
    /   the format allows arbitrary sections
                                ELF
/   Segments
    /   Provide a concise way to specify how an
        executable file should be loaded into memory
    /   Defined by the ELF program header
    /   Typically allows an executable to be loaded and
        mapped in two operations
         /   Text and read-only data
         /   Read/write data and uninitialized data
                Boot code
/   Issues
/   FreeBSD boot loader
    /   boot0
    /   boot1
    /   boot2
    /   boot3
                      Boot code
/   Tasks of the boot loader
    /   load the kernel image into memory
    /   perform enough hardware initialization to get it
        running
/   Issues
    /   In real mode, only lowest 1MB memory can be
        addressed
    /   BIOS can only be used in real mode
    /   kernel images often larger than 1MB
                     Boot code
/   FreeBSD boot process
    /   Four stages
    /   Each stage successively more complex and
        powerful
    /   Represents a very general solution to the
        problem of OS loading
                         Boot code
/   boot0
    /   1 sector in size (512 bytes)
    /   real mode assembly code
    /   installed in master boot record
         /   Presents a menu, reads and executes boot sector of
             chosen partition
    /   sys/boot/i386/boot0/boot0.s
                     Boot code
/   boot1
    /   1 sector in size (512 bytes)
    /   real mode assembly code
    /   Loads second stage boot loader (boot2) and
        executes it. boot2 assumed to reside in
        consecutive sectors.
    /   sys/boot/i386/boot2/boot1.s
                           Boot code
/   boot2
    /   16 sectors, 8KB in size
    /   Located in the partition's disklabel
    /   32 bit protected mode code
         /   Built on top of BTX, the "Boot Time eXecutive" - a mini
             operating system
         /   Allows switches to real mode to access BIOS (VM86)
    /   Can load files directly from the filesystem
         /   Can load kernel directly, or (typically) the third stage
             loader
    /   sys/boot/i386/boot2/boot2.c (and other files)
                           Boot code
/   boot3 (a.k.a. "/boot/loader")
    /   Can be any executable in the filesystem
         /   no size restriction
    /   32 bit code, written in C
         /   Like boot2, implemented on top of BTX
    /   Implemented as a Forth interpreter
         /   the loader is really Forth code making calls down to C
             and BTX
         /   Can load kernel image and kernel modules from the
             filesystem
         /   sys/boot/i386/loader/main.c (and other files)
Low level initialization - locore.s
/   locore.s is the first code executed that is
    actually part of the kernel
    /   sys/i386/i386/locore.s
/   Preconditions:
    /   The boot code has loaded the kernel image into
        memory, and entered protected mode
    /   reasonable GDT and IDT exist
         /   code, data, and stack segments set up for 32 bit flat
             addressing (physical memory - no VM paging)
    /   interrupts are disabled
Low level initialization - locore.s
/   What is locore.s trying to do?
    /   Perform low-level hardware initialization
         /   initialize virtual memory / page tables
    /   Set up hardware context for process 0
         /   the proc structure
         /   the user structure
Low level initialization - locore.s
/   Memory issues:
    /   The boot code loads the kernel into physical
        memory at address 1MB, but the kernel is linked
        at a high address
    /   So, locore.s must translate kernel addresses into
        physical addresses until VM can be initialized (to
        map the kernel at the expected address)
    /   The purpose of the R() macro (line 161) is to
        perform this translation
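The translation amounts to subtracting the kernel's link base from a linked address. The C sketch below illustrates the arithmetic only. The name KERNBASE_SKETCH and its value 0xF0000000 are assumptions chosen for illustration; the real base comes from the kernel's headers and configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed link base, for illustration only. */
#define KERNBASE_SKETCH 0xF0000000u

/* Convert a linked (virtual) kernel address to where it sits physically. */
uint32_t kernel_virtual_to_physical(uint32_t vaddr)
{
    /* Only addresses inside the kernel's linked range make sense here. */
    assert(vaddr >= KERNBASE_SKETCH);
    return vaddr - KERNBASE_SKETCH;
}
```

Under these assumptions, a symbol linked at 0xF0100000 is found at physical address 0x00100000 (1MB) until paging maps the kernel at its expected address.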
Low level initialization - locore.s
/   Overview of locore.s
    /   Put sane values in registers (line 239)
    /   Identifies the CPU (line 285)
    /   Creates page tables (for transition to VM paging,
        line 321), enables paging (line 343)
         /   Now kernel is running at the expected (virtual) address
Low level initialization - locore.s
/   Overview of locore.s (continued)
/   Creates process 0 user data structure (line
    365)
/   call init386() (line 376)
    /   Defined in machdep.c
    /   Builds GDT, IDT, and LDT
         /   Nice description of the GDT at machdep.c, line 956
    /   Probes physical memory
    /   Creates process 0's proc struct
Low level initialization - locore.s
/   Creates a "fork template" stack frame (line
    381)
    /   This is used to fork all other processes
    /   The frame pointer is passed to the main() routine
    /   details?
/   Postconditions of locore.s:
    /   VM and hardware state initialized
    /   some of process 0 state initialized
    /   Now we can call main()! (line 390)
Kernel initialization - init_main.c
/   init_main.c overview
    /   Gets the kernel running by initializing all
        subsystems
    /   Creates process 0 (scheduler)
    /   Mounts the root filesystem
    /   Creates process 1 (init)
Kernel initialization - init_main.c
/   Sysinits
    /   A FreeBSD (and other BSD?) abstraction for
        system initialization tasks
    /   Three kinds:
         /   functions
         /   kernel threads
         /   processes
Kernel initialization - init_main.c
/   Sysinits (continued)
    /   Defined by the SYSINIT(), SYSINIT_KT(), and
        SYSINIT_KP( ) macros (in sys/sys/kernel.h)
    /   The linker collects all of the sysinits into a "linker
        set" (a big array of ptrs to structures)
    /   The main() routine sorts the sysinits and executes
        them
         /   Sorting ensures that init tasks happen in the correct
             order
    /   sysinit_sub_id enumeration (kernel.h, line 104)
         /   Gives an overview of the order of initialization tasks
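The sort-then-execute idea can be sketched in user-space C. Everything here is a simplified stand-in: struct init_task, run_init_tasks(), and the logging callback are hypothetical names, not the kernel's actual definitions, and the standard qsort() is used in place of whatever sort main() performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified record; not the kernel's struct. */
struct init_task {
    unsigned subsystem;    /* major ordering key */
    unsigned order;        /* minor ordering key within a subsystem */
    void (*func)(void *);  /* initialization function */
    void *arg;             /* argument passed to func */
};

/* Demo callback: appends the first character of arg to a log. */
char init_log[16];
void log_label(void *arg)
{
    size_t n = strlen(init_log);

    if (n + 1 < sizeof init_log) {
        init_log[n] = *(const char *)arg;
        init_log[n + 1] = '\0';
    }
}

/* Order by subsystem first, then by order within the subsystem. */
static int compare_tasks(const void *a, const void *b)
{
    const struct init_task *x = *(const struct init_task *const *)a;
    const struct init_task *y = *(const struct init_task *const *)b;

    if (x->subsystem != y->subsystem)
        return x->subsystem < y->subsystem ? -1 : 1;
    if (x->order != y->order)
        return x->order < y->order ? -1 : 1;
    return 0;
}

/* Sort the collected tasks, then run each one in order. */
void run_init_tasks(struct init_task **tasks, size_t count)
{
    size_t i;

    if (count == 0)
        return;
    assert(tasks != NULL);
    qsort(tasks, count, sizeof tasks[0], compare_tasks);
    for (i = 0; i < count; i++)
        tasks[i]->func(tasks[i]->arg);
}
```

Because the array is sorted before anything runs, the order in which the linker happened to collect the entries does not matter.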
Kernel initialization - init_main.c
/   Process 0
    /   Must be created before any other processes can
        be forked
    /   Becomes the scheduler process
    /   proc0_init() (line 352), proc0_post() (line 488)
Kernel initialization - init_main.c
/   Process 1 - init
    /   This is the first user mode process
    /   Parent of all user mode processes
    /   Typically created from "/sbin/init" in the root
        filesystem
    /   start_init() (line 608)
Kernel initialization - init_main.c
/   main() routine postconditions
    /   All kernel initialization has taken place
    /   All that remains is to start the scheduler (which
        can then start executing user processes)
    /   I'm not sure exactly how and where this happens
                Conclusions
/   This has been a high level overview of x86
    architecture, FreeBSD booting and
    initialization
                   Conclusions
/   Ideas for future SHRUG talks (volunteers?):
    /   the PC BIOS
    /   PC interrupts and DMA
    /   MMU, VM, paging
    /   scheduling, task switch mechanics
    /   Linux booting and initialization
                   Conclusions
/   References:
    /   McKusick et al., The Design and Implementation
        of the 4.4BSD Operating System
    /   Tom Shanley, Protected Mode Software
        Architecture
    /   Intel, Intel Architecture Software Developer's
        Manual
    /   John Levine, Linkers and Loaders
    /   The FreeBSD source code
    /   http://www.cs.umd.edu/projects/shrug

								