CS 498 Lecture 4
An Overview of Linux Kernel Structure
Jennifer Hou
Department of Computer Science
University of Illinois at Urbana-Champaign
Reading: Chapter 1-2, The Linux Networking Architecture: Design
and Implementation of Network Protocols in the Linux Kernel
Outline
Overview of the Kernel Structure
Activities in the Linux Kernel
Locking
Kernel Modules
/proc File System
Memory Management
Timing
Structure of Linux Kernel
[Figure: layered structure of the Linux kernel, from user space down to hardware]
User space: Applications and tools
System calls
Kernel components: Process management; Memory management; File systems; Device drivers; Network
Functionality: Multitasking; Virtual memory; Files, directories; Device access; Network functionality
Software support: Scheduler; Memory manager; File system types; Character devices; Network protocols
Hardware support: Architecture-specific code; Memory manager; Block devices; Character devices; Network drivers
Hardware: CPU; RAM; Hard disk, CD, floppy; Terminals; Network adapter
Overview of the Kernel Structure
Process management
The scheduler handles all the active, waiting, and
blocked processes.
Memory management
Is responsible for allocating memory to each
process and for protecting allocated memory
against access by other processes.
File system
In UNIX, almost everything is handled over the file
system interface.
Device drivers can be addressed as files
/proc file system allows us to access data and
parameters in the kernel
Overview of the Kernel Structure
Device drivers
Abstract from the underlying hardware and allow
us to access the hardware with well-defined APIs
The use of kernel modules allows device drivers to
be dynamically loaded/unloaded
Networks
Incoming packets are asynchronous events and
have to be collected and identified, before a
process can handle them.
Most network operations cannot be allocated to a
specific process. Instead, interrupts and timers
are used extensively.
Features of Linux Kernel
Is a Monolithic kernel
The entire functionality is contained in one kernel.
In contrast, in microkernels (e.g., Mach kernel and
Windows NT), only memory management, IPC, and
other hardware-related functions are contained in the
kernel. The remaining functionality is moved to
independent processes/threads running outside the kernel.
+ accessing resources directly from within the kernel,
avoiding expensive system calls and context switches.
- OS becomes quite complex.
- The development of new drivers is difficult because of
the lack of appropriate interface definitions.
Features of Linux Kernel
A cure is the use of kernel modules
Linux allows kernel modules to be dynamically
loaded into (removed from) the kernel at run time.
This is achieved with the use of well-defined
interfaces, e.g., register_netdev(), register_chrdev(),
register_blkdev().
The components shown on dark backgrounds
provide interfaces for dynamically registering new
functionality.
The run-time performance is guaranteed by
having modules run in protected kernel mode.
Activities in the Linux Kernel
Activities – Processes and System Calls
Processes operate exclusively in the user address
space, and can only access the memory allocated to
them.
Violation leads to exceptions.
When a process wants to access devices or use a
functionality in the kernel, it issues a system call.
Control is transferred to the kernel, which executes the
system call on behalf of the user process.
Processes can be interrupted voluntarily (wait on
semaphore or sleep) or involuntarily (interrupt).
Other Forms of Activities
Hardware interrupts (hardware IRQs)
Software interrupts (software IRQs)
Tasklets
Can a single activity, or multiple instances of an activity, be
executed on multiple processors?
           Same Activity   Different Activities
HW IRQ     No              Yes
Soft IRQ   Yes             Yes
Tasklet    No              Yes
Can an Activity be Interrupted by Other Activities?
               HW-IRQ   Soft-IRQ   Tasklet
Hardware IRQ   +/-      -          -
Software IRQ   +        -          -
Tasklet        +        -          -
System call    +        +          +
Process        +        +          +
Interrupts – Hardware IRQs
Peripherals use hardware interrupts to inform the OS of events
(e.g., a packet has arrived at the network adapter); in response, an
interrupt handling routine is called.
The handling routine for a specific interrupt can be registered
(de-registered) by request_irq() (free_irq()).
Fast interrupts
Have a very short handling routine (that cannot be interrupted).
Are specified by the flag SA_INTERRUPT in request_irq().
Slow interrupts
Have a longer handling routine and can be interrupted by other
interrupts during their execution.
in_irq() (include/asm/hardirq.h) can be used to check whether or
not the current activity is an interrupt-handling routine.
Software Interrupts
Not every operation that needs to be executed in an
interrupt can be completed in a few instructions (e.g.,
a packet that arrives at a network adapter).
To keep interrupt handling short, the routine is
usually divided into two parts:
Top-half: handles the most important tasks (e.g., copying the
arrived packet to a kernel buffer queue waiting for detailed handling
later)
Bottom-half: handles non-time-critical operations. It is
scheduled for execution right after the top half is executed (e.g.,
when a packet arrives, the bottom half is run as a software interrupt
NET_RX_SOFTIRQ).
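The top-half/bottom-half split can be illustrated with a small user-space sketch. This is not kernel code: the type sketch_rx_state and the functions sketch_top_half() and sketch_bottom_half() are hypothetical names introduced only for illustration, and "processing" a packet is reduced to summing integers. The top half only queues the packet and marks deferred work as pending; the bottom half later drains the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QUEUE_LEN 8

/* Hypothetical receive state: a ring buffer plus a "work pending" flag. */
typedef struct {
    int    packets[SKETCH_QUEUE_LEN];
    size_t head, tail, count;
    bool   softirq_pending;  /* stands in for a raised software interrupt */
    long   processed_sum;    /* stands in for the real packet handling */
} sketch_rx_state;

/* Top half: do the minimum (queue the packet) and defer the rest.
 * Returns false if the queue is full and the packet is dropped. */
bool sketch_top_half(sketch_rx_state *s, int packet)
{
    assert(s != NULL);
    if (s->count == SKETCH_QUEUE_LEN)
        return false;
    s->packets[s->tail] = packet;
    s->tail = (s->tail + 1) % SKETCH_QUEUE_LEN;
    s->count++;
    s->softirq_pending = true;
    return true;
}

/* Bottom half: runs later and handles every queued packet.
 * Returns the number of packets handled. */
size_t sketch_bottom_half(sketch_rx_state *s)
{
    size_t handled = 0;

    assert(s != NULL);
    if (!s->softirq_pending)
        return 0;
    s->softirq_pending = false;
    while (s->count > 0) {
        s->processed_sum += s->packets[s->head];
        s->head = (s->head + 1) % SKETCH_QUEUE_LEN;
        s->count--;
        handled++;
    }
    return handled;
}
```

In the real kernel the two halves run in different contexts and the queue must be protected against concurrent access; the sketch leaves that out to keep the division of labor visible.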
Software Interrupts
When a system call or a hardware interrupt terminates, the
scheduler calls do_softirq().
do_softirq() schedules software interrupts for execution.
A maximum of 32 software interrupts can be defined in Linux.
NET_RX_SOFTIRQ and NET_TX_SOFTIRQ are two
software interrupts.
Multiple software interrupts can run concurrently, and hence
need to be reentrant. If critical sections exist in a software
interrupt, they have to be protected by locks.
in_softirq() (include/asm/softirq.h) can be used to check whether
or not the current activity is a software interrupt.
Tasklets
A more formal mechanism of scheduling software
interrupts (and other tasks).
The macro DECLARE_TASKLET(name, func, data) declares a tasklet:
name: a name for the tasklet_struct data structure
func: the tasklet’s handling routine.
data: a pointer to private data to be passed to func().
tasklet_schedule(&tasklet_struct) schedules a
tasklet for execution.
tasklet_disable() stops a tasklet from running, even
if it has been scheduled for execution.
tasklet_enable() reactivates a deactivated tasklet.
Tasklet Example
#include <linux/interrupt.h>
#include <linux/kernel.h> /* printk() */
/* Handling routine of new tasklet */
void test_func(unsigned long);
/* Data of new tasklet */
char test_data[] = "Hello, I am a test tasklet";
DECLARE_TASKLET(test_tasklet, test_func, (unsigned long) test_data);
void test_func(unsigned long data)
{
    /* The log level is a string prefix, so no comma follows KERN_DEBUG */
    printk(KERN_DEBUG "%s\n", (char *) data);
}
/* ... */
tasklet_schedule(&test_tasklet);
Locking
Bit Operations
test_and_set_bit(nr, void *addr) sets the bit in
position nr in the unsigned long variable pointed to by
addr. The previous value of the bit is returned.
test_and_clear_bit(nr, void *addr) clears the bit in
position nr in the variable pointed to by addr.
test_and_change_bit(nr, void *addr)
set_bit(nr, void *addr)
clear_bit(nr, void *addr)
change_bit(nr, void *addr)
test_bit(nr, void *addr)
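The "test and modify" semantics of these operations can be tried out in user space. The sketch below is not the kernel implementation: it assumes C11 atomics are available, and the sketch_ names are hypothetical stand-ins for the kernel functions of similar name. Each function atomically changes one bit and reports the bit's previous value.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set bit nr of *word; return the previous value of that bit. */
bool sketch_test_and_set_bit(unsigned nr, atomic_ulong *word)
{
    assert(nr < SKETCH_BITS_PER_WORD);
    unsigned long mask = 1UL << nr;
    return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Atomically clear bit nr of *word; return the previous value of that bit. */
bool sketch_test_and_clear_bit(unsigned nr, atomic_ulong *word)
{
    assert(nr < SKETCH_BITS_PER_WORD);
    unsigned long mask = 1UL << nr;
    return (atomic_fetch_and(word, ~mask) & mask) != 0;
}

/* Read bit nr of *word without modifying it. */
bool sketch_test_bit(unsigned nr, atomic_ulong *word)
{
    assert(nr < SKETCH_BITS_PER_WORD);
    return (atomic_load(word) & (1UL << nr)) != 0;
}
```

Because the old value is returned by the same atomic step that sets the bit, a caller can use sketch_test_and_set_bit() as a simple "did I get there first?" check, which is the basis of many lock implementations.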
Locking -- spinlock
A mechanism for busy wait locks.
spin_lock_init(&my_spinlock)
spin_lock (spinlock_t *my_spinlock)
Tries to set the spinlock my_spinlock. If it is not free, then
wait or test until the lock is released.
spin_unlock(spinlock_t *my_spinlock)
Releases a lock.
spin_is_locked(spinlock_t *my_lock) returns the
current value of the lock (a non-zero value means the lock is
set)
spin_trylock(spinlock_t *my_lock) sets the
spinlock if it is currently unlocked and returns a non-zero
value; otherwise, it returns zero immediately without waiting.
Spinlock Example
#include <linux/spinlock.h>
spinlock_t my_spinlock;
spin_lock_init(&my_spinlock);
// One thread
spin_lock(&my_spinlock);
// Critical section
spin_unlock(&my_spinlock);
/* ... */
// Another thread
spin_lock(&my_spinlock);
// Critical section
spin_unlock(&my_spinlock);
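To show what "busy wait" means, here is a minimal user-space sketch of a spinlock built on a C11 atomic_flag. It is an illustration under stated assumptions, not the kernel's implementation: the type sketch_spinlock_t and the sketch_ functions are hypothetical names, and the sketch ignores interrupt handling and fairness.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock type: the flag is set while the lock is held. */
typedef struct {
    atomic_flag locked;
} sketch_spinlock_t;

void sketch_spin_lock_init(sketch_spinlock_t *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->locked);
}

/* Try once; return true if the lock was acquired, false if it was busy. */
bool sketch_spin_trylock(sketch_spinlock_t *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Busy wait: keep testing until the current holder releases the lock. */
void sketch_spin_lock(sketch_spinlock_t *l)
{
    while (!sketch_spin_trylock(l))
        ; /* spin */
}

void sketch_spin_unlock(sketch_spinlock_t *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The waiting activity never sleeps; it burns CPU cycles until the flag clears, which is why spinlocks are only appropriate for short critical sections.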
Read-Write Spinlocks
Some data structures, such as the list of registered
network devices (dev_base), do not change
frequently, but are subject to many read accesses.
Read-write spinlocks improve run-time
performance in such cases.
read_lock(): if there is no lock or only read lock, then
the critical section can be immediately accessed. If
there is a write lock, then we have to wait.
read_unlock(): A read activity leaves the critical
section. If a write activity is waiting and there exists
no other read activity, it gains access.
write_lock(): if there is a (read/write) lock, we have to
wait; otherwise, we put an exclusive lock.
write_unlock()
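The admission rules for readers and writers can be sketched in user space with a single atomic counter. This is an illustrative sketch, not the kernel's read-write spinlock: the type sketch_rwlock_t and all sketch_ functions are hypothetical, and the encoding (0 = free, n > 0 = n readers, -1 = one writer) is an assumption chosen for clarity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_WRITER (-1)

/* state: 0 = free, n > 0 = n active readers, -1 = one active writer */
typedef struct {
    atomic_int state;
} sketch_rwlock_t;

void sketch_rwlock_init(sketch_rwlock_t *l)
{
    assert(l != NULL);
    atomic_init(&l->state, 0);
}

/* A reader may enter unless a writer holds the lock. */
bool sketch_read_trylock(sketch_rwlock_t *l)
{
    int s = atomic_load(&l->state);
    while (s != SKETCH_WRITER) {
        /* On failure the CAS reloads s, and the loop re-checks it. */
        if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
            return true;
    }
    return false;
}

void sketch_read_unlock(sketch_rwlock_t *l)
{
    int prev = atomic_fetch_sub(&l->state, 1);
    assert(prev > 0);
    (void)prev;
}

/* A writer may enter only if there is no reader and no writer. */
bool sketch_write_trylock(sketch_rwlock_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->state, &expected,
                                          SKETCH_WRITER);
}

void sketch_write_unlock(sketch_rwlock_t *l)
{
    int expected = SKETCH_WRITER;
    bool ok = atomic_compare_exchange_strong(&l->state, &expected, 0);
    assert(ok);
    (void)ok;
}

/* Blocking variants simply spin on the try functions. */
void sketch_read_lock(sketch_rwlock_t *l)
{
    while (!sketch_read_trylock(l))
        ; /* spin */
}

void sketch_write_lock(sketch_rwlock_t *l)
{
    while (!sketch_write_trylock(l))
        ; /* spin */
}
```

Several readers can hold the lock at once, which is where the performance gain for read-mostly data comes from; a writer must wait for all of them to leave.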
Kernel Modules
Kernel Modules
Each kernel module implements init_module() and
cleanup_module().
To load a kernel module into the kernel space
manually, use insmod modulename.o [argument].
In turn, the following system calls are called:
sys_create_module() allocates memory to
accommodate the module in the kernel space.
sys_get_kernel_syms() returns the kernel’s symbol
table to resolve the missing references within the module
to kernel symbols.
sys_init_module() copies the module’s object code into
the kernel address space and calls the module’s
init_module().
Example: insmod wvlan_cs eth=1 network_name="mywavelan"
Kernel Modules
rmmod modulename
Removes the specified module from the kernel
address space. In turn, the system call
sys_delete_module() is called, which in turn calls
cleanup_module().
lsmod lists all currently loaded modules and
their dependencies and reference counts.
modinfo gives the information about a
module. The information is set by the macros
MODULE_DESCRIPTION,
MODULE_AUTHOR in the module’s source.
#include <linux/module.h> // Needed by all modules
#include <linux/kernel.h> // Needed for KERN_ALERT
#include <linux/init.h> // Needed for the macros
init_module and cleanup_module
init_module(): runs all initialization tasks such
as reserving memory, creating entries in the
/proc directory, initializing data structures,
and registering the module's functionality.
cleanup_module() cleans up the work
environment of the module (unregisters the
module's functionality, frees the memory it
allocated, and removes the dependencies
between the module and other parts of the
kernel).
Need to ensure the reference count for the module
is zero.
Module Usage Count
Linux keeps a usage count for every module
in order to determine whether the module can
be safely removed.
Three macros are defined in
<linux/module.h>
MOD_INC_USE_COUNT: increment the count for
the current module
MOD_DEC_USE_COUNT: decrement the count
MOD_IN_USE: evaluates to be true if the count is
not zero.
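The role of the usage count in deciding whether a module may be removed can be sketched as follows. This is a user-space illustration only: the struct sketch_module and the sketch_ functions are hypothetical, and returning -EBUSY for a module that is still in use is an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical module record with a usage counter. */
typedef struct {
    const char *name;
    int         use_count;
    bool        loaded;
} sketch_module;

/* Counterpart of MOD_INC_USE_COUNT for the sketch. */
void sketch_mod_inc_use_count(sketch_module *m)
{
    assert(m != NULL);
    m->use_count++;
}

/* Counterpart of MOD_DEC_USE_COUNT for the sketch. */
void sketch_mod_dec_use_count(sketch_module *m)
{
    assert(m != NULL && m->use_count > 0);
    m->use_count--;
}

/* Counterpart of MOD_IN_USE: true while the count is not zero. */
bool sketch_mod_in_use(const sketch_module *m)
{
    assert(m != NULL);
    return m->use_count != 0;
}

/* Removal is refused while the module is in use.
 * Returns 0 on success, -EBUSY otherwise (assumed error code). */
int sketch_rmmod(sketch_module *m)
{
    if (sketch_mod_in_use(m))
        return -EBUSY;
    m->loaded = false;
    return 0;
}
```

A driver would typically increment the count when a device is opened and decrement it on close, so that the module cannot disappear while someone is using it.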
Passing Module Parameters
MODULE_PARM(var, type) designates the variable
var as a parameter of the module, and a value can be
assigned to this parameter during loading. Possible
types are:
b: byte; h: short (two bytes);
i: integer; l: long; s: string.
MODULE_PARM_DESC(var, desc) adds a
description (desc) for the parameter var.
MODULE_DESCRIPTION(desc) contains a
description of the module.
EXPORT_SYMBOL(name) exports and adds a
function or variable of the kernel to the symbol table.
Makefile for Kernel Modules
TARGET := mymodule
WARN := -W -Wall
INCLUDE := -isystem /lib/modules/`uname -r`/build/include
CFLAGS := -O -DMODULE -D__KERNEL__ ${WARN} ${INCLUDE}
CC := gcc-3.0
${TARGET}.o: ${TARGET}.c
.PHONY: clean
clean:
	rm -rf ${TARGET}.o
Check out the Linux Kernel Module Programming Guide
for details
http://www.tldp.org/LDP/lkmpg/2.4/html/index.html
printk
printk(KERN_INFO "I am in trouble, giving up on %p\n", ptr);
There are 8 possible log levels:
KERN_EMERG: Used for emergency messages.
KERN_ALERT: A situation requiring immediate action.
KERN_CRIT: Critical conditions related to serious hardware/software
failure.
KERN_ERR: Used to report error conditions.
KERN_WARNING: Warnings about problematic situations that do not
create serious problems.
KERN_NOTICE: Situations that are normal, but still worthy of note.
KERN_INFO: Informational messages.
KERN_DEBUG: Used for debugging messages.
If the priority is less than the integer variable, console_loglevel,
the message is displayed on the console.
If both klogd and syslogd are running, kernel messages are
appended to /var/log/messages, independent of
console_loglevel.
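The console filtering rule can be sketched in user space. The sketch assumes the classic encoding in which each log level macro is a string prefix of the form "<n>", with 0 for the most urgent level (KERN_EMERG) and 7 for KERN_DEBUG; the SKETCH_ macros and sketch_ functions are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed prefix encoding: lower numbers are more urgent. */
#define SKETCH_KERN_EMERG   "<0>"
#define SKETCH_KERN_ERR     "<3>"
#define SKETCH_KERN_WARNING "<4>"
#define SKETCH_KERN_INFO    "<6>"
#define SKETCH_KERN_DEBUG   "<7>"

/* Extract the numeric level from a "<n>" prefix, or -1 if none. */
int sketch_parse_level(const char *msg)
{
    assert(msg != NULL);
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
        return msg[1] - '0';
    return -1;
}

/* A message reaches the console only if its priority value is
 * numerically less than console_loglevel. */
bool sketch_shown_on_console(const char *msg, int console_loglevel)
{
    int level = sketch_parse_level(msg);
    return level >= 0 && level < console_loglevel;
}
```

Because the level is just a string literal placed in front of the format string, the compiler concatenates the two, which is why printk takes no comma between them.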
Memory Management
Reserving/Releasing Memory In the Kernel
kmalloc(size,priority): attempts to reserve
consecutive memory space with a size of size
bytes in the kernel memory.
GFP_KERNEL: is used when the requesting
activity can be interrupted (may sleep) during the reservation.
GFP_ATOMIC: is used when the memory request
should be atomic (e.g., in an interrupt handler, where sleeping is not allowed).
kfree(objp): releases the memory space
reserved at address objp
Reserving/Releasing Memory In the Kernel
copy_from_user(to, from, count) copies count
bytes from the address from in the user
address space to the address to in the kernel
address space.
copy_to_user(to,from,count) copies count
bytes from the address from in the kernel
address space to the address to in the user
address space.
access_ok() checks whether a user-space address
range is valid to access (i.e., lies within the user
address space). It does not guarantee that the pages
are actually residing in the physical memory.
Memory Caches
Linux allows us to create a cache with memory
spaces of specific sizes (slab caches).
kmem_cache_create(name, size, offset, flags, ctor,
dtor) creates a slab cache of memory spaces with
sizes in size bytes.
name points to a string containing the name of the slab
cache; offset is usually set to 0.
flags specifies additional options, e.g.,
SLAB_HWCACHE_ALIGN (aligns objects to the cache line size of the first
level cache in the CPU)
ctor, dtor: specifies a constructor and a destructor for the
memory spaces used to initialize or clean up the reserved
memory spaces.
Example: skbuff_head_cache = kmem_cache_create
("skbuff_head_cache", sizeof(struct sk_buff), 0,
SLAB_HWCACHE_ALIGN, skb_headerinit, NULL);
Memory Caches
kmem_cache_destroy(cachep): releases the slab
cache cachep.
kmem_cache_shrink(cachep): is called by the kernel
when the kernel itself requires memory space and
has to reduce the cache.
kmem_cache_alloc(cachep,flags): is used to request
a memory space from the slab cache, cachep. If the
slab cache is empty, then kmalloc() is used to
reserve new memory space.
kmem_cache_free(cachep, ptr): frees the memory
space that begins at ptr, and gives it back to the
cache, cachep.
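The idea behind a cache of equally sized memory spaces can be sketched in user space with a free list. This is a simplified illustration, not the kernel's slab allocator: the sketch_ names are hypothetical, malloc() stands in for the underlying page allocation, and the constructor is run on every allocation for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A freed object is reused to hold the link to the next free object. */
typedef struct sketch_free_obj {
    struct sketch_free_obj *next;
} sketch_free_obj;

typedef struct {
    const char      *name;
    size_t           obj_size;
    void           (*ctor)(void *);  /* optional constructor */
    sketch_free_obj *free_list;      /* objects waiting for reuse */
} sketch_cache;

sketch_cache *sketch_cache_create(const char *name, size_t size,
                                  void (*ctor)(void *))
{
    sketch_cache *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->name = name;
    /* Every object must be large enough to hold the free-list link. */
    c->obj_size = size < sizeof(sketch_free_obj) ? sizeof(sketch_free_obj)
                                                 : size;
    c->ctor = ctor;
    c->free_list = NULL;
    return c;
}

/* Reuse a cached object if one exists; otherwise get fresh memory. */
void *sketch_cache_alloc(sketch_cache *c)
{
    void *obj;

    assert(c != NULL);
    if (c->free_list != NULL) {
        obj = c->free_list;
        c->free_list = c->free_list->next;
    } else {
        obj = malloc(c->obj_size);
        if (obj == NULL)
            return NULL;
    }
    if (c->ctor != NULL)
        c->ctor(obj);
    return obj;
}

/* Give the object back to the cache instead of freeing it. */
void sketch_cache_free(sketch_cache *c, void *ptr)
{
    sketch_free_obj *f = ptr;

    assert(c != NULL && ptr != NULL);
    f->next = c->free_list;
    c->free_list = f;
}

/* Release all cached objects and the cache itself. */
void sketch_cache_destroy(sketch_cache *c)
{
    assert(c != NULL);
    while (c->free_list != NULL) {
        sketch_free_obj *next = c->free_list->next;
        free(c->free_list);
        c->free_list = next;
    }
    free(c);
}
```

Freed objects go back on the list rather than to the general allocator, so frequently allocated structures (such as packet headers) can be handed out again quickly.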