CS 498 Lecture 4
An Overview of Linux Kernel Structure
Jennifer Hou
Department of Computer Science
University of Illinois at Urbana-Champaign

Reading: Chapters 1-2, The Linux Networking Architecture: Design
and Implementation of Network Protocols in the Linux Kernel
Outline
Overview of the Kernel Structure
Activities in the Linux Kernel
Locking
Kernel Modules
/proc File System
Memory Management
Timing
Structure of Linux Kernel
[Figure: layered block diagram of the kernel, from user space down to hardware]
User space:        Applications and tools
                   System calls
Component:         Process management | Memory management | File systems | Device drivers | Network
Functionality:     Multitasking | Virtual memory | Files, directories | Device access | Network functionality
Software support:  Scheduler, architecture-specific code | Memory manager | File system types | Character devices | Network protocols
Hardware support:  Block devices | Network drivers
Hardware:          CPU | RAM | Hard disk, CD, floppy | Terminals | Network adapter
Overview of the Kernel Structure
Process management
   The scheduler handles all the active, waiting, and
    blocked processes.
Memory management
   Is responsible for allocating memory to each
    process and for protecting allocated memory
    against access by other processes.
File system
   In UNIX, almost everything is handled over the file
    system interface.
   Device drivers can be addressed as files
   /proc file system allows us to access data and
    parameters in the kernel
Overview of the Kernel Structure
Device drivers
   Abstract from the underlying hardware and allow
    us to access the hardware with well-defined APIs
   The use of kernel modules allows device drivers to
    be dynamically loaded/unloaded
Networks
   Incoming packets are asynchronous events and
    have to be collected and identified, before a
    process can handle them.
   Most network operations cannot be allocated to a
    specific process. Instead, interrupts and timers
    are used extensively.
          Features of Linux Kernel
Is a Monolithic kernel
   The entire functionality is contained in one kernel.
   In contrast, in microkernels (e.g., Mach kernel and
    Windows NT), only memory management, IPC, and
    other hardware-related functions are contained in the
    kernel. The remaining functionality is moved to
    independent processes/threads running outside the kernel.
   + Resources are accessed directly from within the kernel,
    avoiding expensive system calls and context switches.
   - OS becomes quite complex.
   - The development of new drivers is difficult because of
    the lack of appropriate interface definitions.
          Features of Linux Kernel
A cure is the use of kernel modules
   Linux allows kernel modules to be dynamically
    loaded into (removed from) the kernel at run time.
   This is achieved with the use of well-defined
    interfaces, e.g., register_netdev(), register_chrdev(),
    register_blkdev().
   The components shown on dark backgrounds
    provide interfaces for dynamically registering new
    functionality.
   The run-time performance is guaranteed by
    having modules run in protected kernel mode.
Activities in the Linux Kernel
Activities – Processes and System Calls
 Processes operate exclusively in the user address
 space, and can only access the memory allocated to
 them.
    Violation leads to exceptions.
 When a process wants to access devices or use
 functionality in the kernel, it issues a system call.
    The control is transferred to the kernel, which executes the
     system call on behalf of the user process.
 Processes can be interrupted voluntarily (wait on
 semaphore or sleep) or involuntarily (interrupt).
  Other Forms of Activities
Hardware interrupts (hardware IRQs)
Software interrupts (software IRQs)
Tasklets
  Can a single activity, or multiple instances of an activity, be
  executed on multiple processors at the same time?

                       Same activity        Different activities
  HW IRQ               No                   Yes
  Soft IRQ             Yes                  Yes
  Tasklet              No                   Yes
Can an Activity be Interrupted by Other Activities?
                 HW-IRQ   Soft-IRQ   Tasklet
  Hardware IRQ +/-        -          -
  Software IRQ   +        -          -
  Tasklet        +        -          -
  System call    +        +          +
  Process        +        +          +
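The table above can be encoded as a small lookup table. The following is a minimal user-space sketch, not kernel code. It assumes that rows name the activity that is currently running and columns name the activity that tries to interrupt it, and it maps the "+/-" cell to a separate "depends on the handler" result. All identifiers (enum activity, can_interrupt, may_interrupt) are hypothetical names chosen for illustration.

```c
#include <assert.h>

/* Kinds of activity, in the row order of the table (assumed: row = running activity). */
enum activity { HW_IRQ, SOFT_IRQ, TASKLET, SYSTEM_CALL, PROCESS, N_ACTIVITIES };

/* Only the first three kinds appear as columns (assumed: column = interrupter). */
enum { N_INTERRUPTERS = 3 };

/* '+' = can be interrupted, '-' = cannot, '?' = the "+/-" cell of the table. */
static const char can_interrupt[N_ACTIVITIES][N_INTERRUPTERS] = {
    /*                 HW-IRQ  Soft-IRQ  Tasklet */
    [HW_IRQ]      = {  '?',    '-',      '-' },
    [SOFT_IRQ]    = {  '+',    '-',      '-' },
    [TASKLET]     = {  '+',    '-',      '-' },
    [SYSTEM_CALL] = {  '+',    '+',      '+' },
    [PROCESS]     = {  '+',    '+',      '+' },
};

/* Returns 1 if `by` may interrupt `running`, 0 if not, -1 if it depends. */
int may_interrupt(enum activity running, enum activity by)
{
    assert((int)running < N_ACTIVITIES && (int)by < N_INTERRUPTERS);
    switch (can_interrupt[running][by]) {
    case '+': return 1;
    case '-': return 0;
    default:  return -1;
    }
}
```

For example, may_interrupt(PROCESS, TASKLET) yields 1, while may_interrupt(SOFT_IRQ, SOFT_IRQ) yields 0.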
Interrupts – Hardware IRQs
Peripherals use hardware interrupts to inform the OS of events
(e.g., a packet has arrived at the network adapter). In response, an
interrupt handling routine is called.
The handling routine for a specific interrupt can be registered
(de-registered) by request_irq() (free_irq()).
Fast interrupts
    Have a very short handling routine (that cannot be interrupted).
    Are specified by the flag SA_INTERRUPT in request_irq().
Slow interrupts
    Have a longer handling routine and can be interrupted by other
     interrupts during their execution.
in_irq() (include/asm/hardirq.h) can be used to check whether or
not the current activity is an interrupt-handling routine.
          Software Interrupts
Not every operation that needs to be executed in an
interrupt can be completed in a few instructions (e.g.,
a packet that arrives at a network adapter).
To keep interrupt handling short, the routine is
usually divided into two parts:
   Top-half: handles the most important tasks (e.g., copying the
    arrived packet to a kernel buffer queue waiting for detailed handling
    later)
   Bottom-half: handles non-time-critical operations. It is
    scheduled for execution right after the top half has finished (e.g.,
    when a packet arrives, the bottom half is run as a software interrupt
    NET_RX_SOFTIRQ).
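The split between the two halves can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it is not kernel code, packets are reduced to their lengths, and every name (rx_queue, top_half, bottom_half, softirq_pending, bytes_processed) is hypothetical. The top half does the minimum (enqueue and mark work as pending); the bottom half later drains the queue.

```c
#include <stddef.h>

#define QUEUE_LEN 8

static int rx_queue[QUEUE_LEN];   /* stands in for the kernel buffer queue */
static size_t rx_head, rx_tail;   /* head: next slot to read, tail: next slot to write */
static int softirq_pending;       /* set by the top half, cleared by the bottom half */
static long bytes_processed;      /* result of the "detailed handling" */

/* Top half: store the packet and request deferred work.
 * Returns 0 on success, -1 if the queue is full (packet dropped). */
int top_half(int packet_len)
{
    size_t next = (rx_tail + 1) % QUEUE_LEN;

    if (next == rx_head)
        return -1;
    rx_queue[rx_tail] = packet_len;
    rx_tail = next;
    softirq_pending = 1;
    return 0;
}

/* Bottom half: runs later and does the non-time-critical part.
 * Returns the number of packets handled. */
int bottom_half(void)
{
    int handled = 0;

    if (!softirq_pending)
        return 0;
    softirq_pending = 0;
    while (rx_head != rx_tail) {
        bytes_processed += rx_queue[rx_head];
        rx_head = (rx_head + 1) % QUEUE_LEN;
        handled++;
    }
    return handled;
}
```

In this model, two calls to top_half() followed by one call to bottom_half() process both packets in a single pass, which is the reason for deferring the work in the first place.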
          Software Interrupts
When a system call or a hardware interrupt terminates, the
scheduler calls do_softirq().
do_softirq() schedules software interrupts for execution.
A maximum of 32 software interrupts can be defined in Linux.
  NET_RX_SOFTIRQ and NET_TX_SOFTIRQ are two
    software interrupts.
Multiple software interrupts can run concurrently, and hence
need to be reentrant. If critical sections exist in a software
interrupt, they have to be protected by locks.
in_softirq() (include/asm/softirq.h) can be used to check whether
or not the current activity is a software interrupt.
                         Tasklets
A more formal mechanism of scheduling software
interrupts (and other tasks).
  The macro DECLARE_TASKLET(name, func,data)
        name: a name for the tasklet_struct data structure
        func: the tasklet’s handling routine.
        data: a pointer to private data to be passed to func().
   tasklet_schedule(&tasklet_struct) schedules a
    tasklet for execution.
   tasklet_disable() stops a tasklet from running, even
    if it has been scheduled for execution.
   tasklet_enable() reactivates a deactivated tasklet.
                     Tasklet Example
#include <linux/interrupt.h>
#include <linux/kernel.h>   /* printk() and KERN_DEBUG */

/* Handling routine of new tasklet */
void test_func(unsigned long);

/* Data of new tasklet */
char test_data[] = "Hello, I am a test tasklet";

DECLARE_TASKLET(test_tasklet, test_func, (unsigned long) &test_data);

void test_func(unsigned long data)
{
         /* The log level is a string prefix, so no comma follows KERN_DEBUG */
         printk(KERN_DEBUG "%s\n", (char *) data);
}
...
tasklet_schedule(&test_tasklet);
Locking
            Bit Operations
test_and_set_bit(nr, void *addr) sets the bit in
position nr in the unsigned long variable pointed to by
addr. The previous value of the bit is returned.
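test_and_clear_bit(nr, void *addr) clears the bit in
position nr in the variable pointed to by addr and
returns its previous value.
test_and_change_bit(nr, void *addr) inverts the bit and
returns its previous value.
set_bit(nr, void *addr), clear_bit(nr, void *addr), and
change_bit(nr, void *addr) set, clear, and invert the bit
without returning its previous value.
test_bit(nr, void *addr) returns the current value of the bit.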
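The semantics of the test_and_* operations can be sketched in user space with C11 atomics. This is an illustration only, not the kernel's implementation (which is architecture-specific). The sketch_ prefix marks every function as a hypothetical stand-in, and the atomic_ulong parameter type is an assumption made so that the example compiles outside the kernel.

```c
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set bit nr; return its previous value (0 or 1). */
int sketch_test_and_set_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    unsigned long old = atomic_fetch_or(addr + nr / BITS_PER_WORD, mask);

    return (old & mask) != 0;
}

/* Atomically clear bit nr; return its previous value. */
int sketch_test_and_clear_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    unsigned long old = atomic_fetch_and(addr + nr / BITS_PER_WORD, ~mask);

    return (old & mask) != 0;
}

/* Atomically invert bit nr; return its previous value. */
int sketch_test_and_change_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    unsigned long old = atomic_fetch_xor(addr + nr / BITS_PER_WORD, mask);

    return (old & mask) != 0;
}

/* Return the current value of bit nr without modifying it. */
int sketch_test_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);

    return (atomic_load(addr + nr / BITS_PER_WORD) & mask) != 0;
}
```

Because the read and the update happen in one atomic step, two activities that both call sketch_test_and_set_bit() on the same bit cannot both see the previous value 0. This property is what makes the operation usable as a building block for locks.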
                Locking -- spinlock
A mechanism for busy wait locks.
   spin_lock_init(&my_spinlock)
   spin_lock (spinlock_t *my_spinlock)
        Tries to set the spinlock my_spinlock. If it is not free, then
         wait or test until the lock is released.
   spin_unlock(spinlock_t *my_spinlock)
        Releases a lock.
   spin_is_locked(spinlock_t *my_lock) returns the
    current value of the lock (a non-zero value means the lock is
    set).
   spin_trylock(spinlock_t *my_lock) sets the
    spinlock if it is currently unlocked and returns a non-zero
    value; otherwise, it returns zero immediately without waiting.
                  Spinlock Example
#include <linux/spinlock.h>

spinlock_t my_spinlock;         // Protects the shared data

spin_lock_init(&my_spinlock);   // Once, before the lock is first used

// One thread
spin_lock(&my_spinlock);
// Critical section
spin_unlock(&my_spinlock);
...
// Another thread
spin_lock(&my_spinlock);
// Critical section
spin_unlock(&my_spinlock);
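How a busy-wait lock works internally can be sketched in user space with a C11 atomic_flag. This is a simplified model, not the kernel's spinlock_t (which also deals with interrupts and architecture details). The sketch_ names are hypothetical, and the return convention of sketch_spin_trylock() (non-zero when the lock was acquired) is an assumption modelled on the kernel's spin_trylock().

```c
#include <stdatomic.h>

typedef struct {
    atomic_flag locked;   /* set while some thread holds the lock */
} sketch_spinlock_t;

void sketch_spin_lock_init(sketch_spinlock_t *l)
{
    atomic_flag_clear(&l->locked);
}

/* Busy-wait until the flag was previously clear, i.e. until we set it ourselves. */
void sketch_spin_lock(sketch_spinlock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;   /* spin */
}

/* Try once; return 1 if the lock was acquired, 0 if it was already held. */
int sketch_spin_trylock(sketch_spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

void sketch_spin_unlock(sketch_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The waiting thread keeps the CPU busy while it spins, so this kind of lock only pays off when critical sections are short.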
          Read-Write Spinlocks
Some data structures, such as the list of registered
network devices (dev_base), do not change
frequently, but are subject to many read accesses.
Read-write spinlocks are used for them to improve run-time
performance.
read_lock(): if there is no lock or only read locks, then
the critical section can be accessed immediately. If
there is a write lock, then we have to wait.
read_unlock(): A read activity leaves the critical
section. If a write activity is waiting and there exists
no other read activity, it gains access.
write_lock(): if there is a (read/write) lock, we have to
wait; otherwise, we take an exclusive lock.
write_unlock(): releases the exclusive lock.
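The rules above (many readers or one writer) can be modelled with a single atomic counter. The following user-space sketch is an assumption-laden illustration, not the kernel's rwlock_t: the state encoding (0 = free, n > 0 = n readers, -1 = one writer) and all sketch_ names are hypothetical, and a waiting writer gets no priority over newly arriving readers.

```c
#include <stdatomic.h>

typedef struct {
    atomic_int state;   /* 0: free, n > 0: n readers inside, -1: one writer inside */
} sketch_rwlock_t;

/* Enter as a reader unless a writer holds the lock. Returns 1 on success. */
int sketch_read_trylock(sketch_rwlock_t *l)
{
    int s = atomic_load(&l->state);

    while (s >= 0) {
        /* On failure, s is reloaded with the current state and re-checked. */
        if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
            return 1;
    }
    return 0;
}

void sketch_read_lock(sketch_rwlock_t *l)
{
    while (!sketch_read_trylock(l))
        ;   /* spin until no writer is inside */
}

void sketch_read_unlock(sketch_rwlock_t *l)
{
    atomic_fetch_sub(&l->state, 1);
}

/* Enter as the writer only if nobody is inside. Returns 1 on success. */
int sketch_write_trylock(sketch_rwlock_t *l)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

void sketch_write_lock(sketch_rwlock_t *l)
{
    while (!sketch_write_trylock(l))
        ;   /* spin until all readers and writers have left */
}

void sketch_write_unlock(sketch_rwlock_t *l)
{
    atomic_store(&l->state, 0);
}
```

Readers only increment a counter, so any number of them can be inside at once. That is where the gain for read-mostly data such as a device list comes from.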
Kernel Modules
   Each kernel module implements init_module() and
    cleanup_module().
   To load a kernel module into the kernel space
    manually, use insmod modulename.o [arguments].
    In turn, the following system calls are invoked:
        sys_create_module() allocates memory to
         accommodate the module in the kernel space.
        sys_get_kernel_syms() returns the kernel’s symbol
         table to resolve the missing references within the module
         to kernel symbols.
        sys_init_module() copies the module’s object code into
         the kernel address space and calls the module’s
         init_module().
    Example: insmod wvlan_cs eth=1 network_name="mywavelan"
                Kernel Modules
rmmod modulename
   Removes the specified module from the kernel
    address space. In turn, the system call
    sys_delete_module() is called, which in turn calls
    cleanup_module().
lsmod lists all currently loaded modules and
their dependencies and reference counts.
modinfo gives the information about a
module. The information is set by the macros
MODULE_DESCRIPTION,
MODULE_AUTHOR in the module’s source.
                  #include
#include <linux/module.h> // Needed by all modules
#include <linux/kernel.h> // Needed for KERN_ALERT
#include <linux/init.h> // Needed for the macros
init_module and cleanup_module
 init_module(): runs all initialization tasks such
 as reserving memory, creating entries in the
 /proc directory, initializing data structures,
 and registering the module’s functionality.
 cleanup_module() cleans up the work
 environment of the module (unregisters the
 module’s functionality, frees the memory it
 allocated, and removes the dependencies
 between the module and other parts of the
 kernel).
    Need to ensure the reference count for the module
     is zero.
       Module Usage Count
Linux keeps a usage count for every module
in order to determine whether the module can
be safely removed.
Three macros are defined in
<linux/module.h>
   MOD_INC_USE_COUNT: increment the count for
    the current module
   MOD_DEC_USE_COUNT: decrement the count
   MOD_IN_USE: evaluates to be true if the count is
    not zero.
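The role of the usage count can be shown with a tiny user-space model. This is only a sketch: struct sketch_module and the sketch_ functions are hypothetical stand-ins for the macros above, and sketch_rmmod() merely imitates the check made before a module is removed.

```c
struct sketch_module {
    int loaded;      /* 1 while the module is in the kernel */
    int use_count;   /* number of current users */
};

/* Counterpart of MOD_INC_USE_COUNT. */
void sketch_mod_inc_use_count(struct sketch_module *m)
{
    m->use_count++;
}

/* Counterpart of MOD_DEC_USE_COUNT; never drops below zero in this model. */
void sketch_mod_dec_use_count(struct sketch_module *m)
{
    if (m->use_count > 0)
        m->use_count--;
}

/* Counterpart of MOD_IN_USE. */
int sketch_mod_in_use(const struct sketch_module *m)
{
    return m->use_count != 0;
}

/* Returns 0 if the module was removed, -1 if removal was refused. */
int sketch_rmmod(struct sketch_module *m)
{
    if (!m->loaded || sketch_mod_in_use(m))
        return -1;
    m->loaded = 0;
    return 0;
}
```

A driver would typically increment the count when a device is opened and decrement it when the device is closed, so removal is refused while the device is in use.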
Passing Module Parameters
MODULE_PARM(var, type) designates the variable
var as a parameter of the module, and a value can be
assigned to this parameter during loading. Possible
types are:
  b: byte; h: short (two bytes);
   i: integer; l: long; s: string.
MODULE_PARM_DESC(var, desc) adds a
description (desc) for the parameter var.
MODULE_DESCRIPTION(desc) contains a
description of the module.
EXPORT_SYMBOL(name) exports and adds a
function or variable of the kernel to the symbol table.
  Makefile for Kernel Modules
TARGET := mymodule
WARN := -W -Wall
INCLUDE := -isystem /lib/modules/`uname -r`/build/include
CFLAGS := -O -DMODULE -D__KERNEL__ ${WARN} ${INCLUDE}
CC := gcc-3.0
${TARGET}.o: ${TARGET}.c
.PHONY: clean
clean:
	rm -rf ${TARGET}.o


Check out the Linux Kernel Module Programming Guide
  for details
  http://www.tldp.org/LDP/lkmpg/2.4/html/index.html
                              printk
printk(KERN_INFO "I am in trouble, giving up on %p\n", ptr);
The log level is a string prefix that is concatenated with the
format string, so no comma follows it.
There are 8 possible log levels:
    KERN_EMERG: Used for emergency messages.
    KERN_ALERT: A situation requiring immediate action.
    KERN_CRIT: Critical conditions related to serious hardware/software
     failure.
    KERN_ERR: Used to report error conditions.
    KERN_WARNING: Warnings about problematic situations that do not
     create serious problems.
    KERN_NOTICE: Situations that are normal, but still worthy of note.
    KERN_INFO: Informational messages.
    KERN_DEBUG: Used for debugging messages.
If the priority value of a message is less than the integer variable
console_loglevel, the message is displayed on the console.
If both klogd and syslogd are running, kernel messages are
appended to /var/log/messages, independent of
console_loglevel.
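The console filtering rule can be sketched in user space. The sketch assumes the convention of kernels from the era of this lecture, in which KERN_EMERG through KERN_DEBUG expand to the string prefixes "<0>" through "<7>", and it assumes that a message without a prefix is treated as KERN_WARNING (level 4). The sketch_ names are hypothetical.

```c
#define SKETCH_DEFAULT_MESSAGE_LEVEL 4   /* assumed default: KERN_WARNING */

/* Return the level encoded in a leading "<n>" prefix, or the default if none. */
int sketch_message_level(const char *msg)
{
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
        return msg[1] - '0';
    return SKETCH_DEFAULT_MESSAGE_LEVEL;
}

/* A message reaches the console only if its level is below console_loglevel. */
int sketch_shown_on_console(const char *msg, int console_loglevel)
{
    return sketch_message_level(msg) < console_loglevel;
}
```

Lower numbers mean higher priority, so with console_loglevel set to 7 a "<6>" (KERN_INFO) message is shown while a "<7>" (KERN_DEBUG) message is not.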
Memory Management
Reserving/Releasing Memory In the Kernel

   kmalloc(size, priority): attempts to reserve
   consecutive memory space with a size of size
   bytes in the kernel memory.
       GFP_KERNEL: is used when the requesting
        activity can be interrupted (may sleep) during the reservation.
       GFP_ATOMIC: is used when the memory request
        must be atomic, i.e., the caller must not sleep
        (e.g., in an interrupt handler).
   kfree(objp): releases the memory space
   reserved at address objp.
Reserving/Releasing Memory In the Kernel
copy_from_user(to, from, count) copies count
bytes from the address from in the user
address space to the address to in the kernel
address space.
copy_to_user(to,from,count) copies count
bytes from the address from in the kernel
address space to the address to in the user
address space.
access_ok() checks whether a given range of
user-space addresses may be accessed at all. It does
not guarantee that the corresponding pages currently
reside in physical memory.
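The calling convention of the copy functions can be illustrated with a user-space model. This is a sketch under several assumptions: the "user address space" is reduced to one buffer described by the hypothetical struct sketch_user_space, the range check stands in for access_ok(), and, as with the kernel functions, the return value is the number of bytes that could not be copied (0 on success). Unlike the kernel functions, this model never copies partially.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the memory a process is allowed to touch. */
struct sketch_user_space {
    char *base;
    size_t len;
};

/* Returns 1 if [addr, addr + count) lies entirely inside the user region. */
int sketch_access_ok(const struct sketch_user_space *us, const void *addr, size_t count)
{
    uintptr_t start = (uintptr_t)us->base;
    uintptr_t a = (uintptr_t)addr;
    size_t off;

    if (a < start)
        return 0;
    off = (size_t)(a - start);
    return off <= us->len && count <= us->len - off;
}

/* Copy from the user region into a kernel-side buffer. */
size_t sketch_copy_from_user(const struct sketch_user_space *us,
                             void *to, const void *from, size_t count)
{
    if (!sketch_access_ok(us, from, count))
        return count;   /* nothing copied */
    memcpy(to, from, count);
    return 0;
}

/* Copy from a kernel-side buffer into the user region. */
size_t sketch_copy_to_user(const struct sketch_user_space *us,
                           void *to, const void *from, size_t count)
{
    if (!sketch_access_ok(us, to, count))
        return count;   /* nothing copied */
    memcpy(to, from, count);
    return 0;
}
```

The point of the indirection is that the kernel never dereferences a pointer supplied by a process without first checking that it refers to memory the process may legitimately use.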
                Memory Caches
Linux allows us to create a cache with memory
spaces of specific sizes. Such caches are called slab caches.
   kmem_cache_create(name, size, offset, flags, ctor,
    dtor) creates a slab cache of memory spaces with
    sizes in size bytes.
       name points to a string containing the name of the slab
        cache; offset is usually set to 0.
       flags specifies additional options, e.g.,
        SLAB_HWCACHE_ALIGN (aligns to the size of the first
        level cache in the CPU)
       ctor, dtor: specify a constructor and a destructor for the
        memory spaces, used to initialize or clean up the reserved
        memory spaces.
   Example: skbuff_head_cache = kmem_cache_create
    ("skbuff_head_cache", sizeof(struct sk_buff), 0,
    SLAB_HWCACHE_ALIGN, skb_headerinit, NULL).
          Memory Caches
kmem_cache_destroy(cachep): releases the slab
cache cachep.
kmem_cache_shrink(cachep): is called by the kernel
when the kernel itself requires memory space and
has to reduce the cache.
kmem_cache_alloc(cachep,flags): is used to request
a memory space from the slab cache, cachep. If the
slab cache is empty, then kmalloc() is used to
reserve new memory space.
kmem_cache_free(cachep, ptr): frees the memory
space that begins at ptr, and gives it back to the
cache, cachep.
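The idea behind an object cache can be sketched in user space with a free list. This is a deliberately simplified model, not the kernel's slab allocator: there are no slabs, constructors, or alignment options, malloc() stands in for the fallback allocation, and all sketch_ names and the fresh_allocs counter are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* A freed object is reused as a list node, so it needs room for one pointer. */
struct sketch_free_obj {
    struct sketch_free_obj *next;
};

struct sketch_cache {
    size_t obj_size;                     /* size of every object in this cache */
    struct sketch_free_obj *free_list;   /* objects handed back by sketch_cache_free() */
    unsigned long fresh_allocs;          /* how often the cache was empty */
};

struct sketch_cache *sketch_cache_create(size_t size)
{
    struct sketch_cache *c = malloc(sizeof *c);

    if (!c)
        return NULL;
    c->obj_size = size < sizeof(struct sketch_free_obj)
                      ? sizeof(struct sketch_free_obj) : size;
    c->free_list = NULL;
    c->fresh_allocs = 0;
    return c;
}

/* Reuse a cached object if one is available, otherwise allocate a new one. */
void *sketch_cache_alloc(struct sketch_cache *c)
{
    if (c->free_list) {
        struct sketch_free_obj *o = c->free_list;

        c->free_list = o->next;
        return o;
    }
    c->fresh_allocs++;
    return malloc(c->obj_size);
}

/* Give an object back to the cache instead of releasing it. */
void sketch_cache_free(struct sketch_cache *c, void *ptr)
{
    struct sketch_free_obj *o = ptr;

    o->next = c->free_list;
    c->free_list = o;
}

/* Release all cached objects and the cache itself. */
void sketch_cache_destroy(struct sketch_cache *c)
{
    while (c->free_list) {
        struct sketch_free_obj *o = c->free_list;

        c->free_list = o->next;
        free(o);
    }
    free(c);
}
```

Objects of one frequently used type, such as packet headers, are recycled without going through the general-purpose allocator each time, which is the benefit the slab caches above aim for.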

						