Device Drivers

					                             Device Driver Programming
This unit introduces device drivers and the issues involved in programming them.

You will learn:

       Device Drivers
       Device Classes
       The /dev directory
       Major/Minor Numbers
       Registering a device
       Basic Device driver
       Fops tables
       Data Read/Write
       Transferring data
       Critical Areas
       Blocking I/O

Imagine you have purchased a card that performs I/O to peripheral devices such as printers,
plotters, or analog devices. Normally you would write C or assembly code that programs the
registers of this card and link it into your program. But then every program that uses the card
has to be linked with that code. The most elegant way to avoid this is to extract the
hardware-specific code from your program and make it part of the operating system. The specific
code is then called through well-defined interface routines that look very similar for every
device, and it stays invisible to the user program.

Most operating systems use this concept for their system calls: the arguments of a call are
put on the stack and a “trap” routine is called. The operating system then jumps to the so-called
“trap handler”, which takes these arguments from the stack and acts on them.

Device drivers take on a special role in the kernel. They are distinct “black boxes” that make a
particular piece of hardware respond to a well-defined internal programming interface; they
completely hide the details of how the device works. User activities are performed by means of a
set of standardized calls that are independent of the specific driver; mapping those calls to
device-specific operations that act on real hardware is the role of the device driver. This
programming interface is such that drivers can be built separately from the rest of the kernel
and “plugged in” at runtime when needed.

Linux uses the file system for this purpose. The driver looks like a normal file that you can open,
read from and write to. The kernel sees these operations as special requests and maps them to the
appropriate calls in the driver code.

Splitting the Kernel

The Linux kernel's role is split into various tasks. The following diagram shows the position of
the device driver. For each type of device, the device driver interacts directly with the kernel.
A driver usually implements at least the open, close, read and write system calls.

                         A split view of the kernel

                         The System Call Interface

Kernel subsystem      Features implemented         Kernel components            Hardware
Process management    Concurrency, multitasking                                 CPU
Memory management     Virtual memory                                            Memory
File systems          Files and dirs: the VFS      File system types,           Disks & CDs
                                                   block devices
Device control        Ttys & device access         Character devices            Consoles
Networking            Connectivity                 Network subsystem,           Network interfaces
                                                   IF drivers

Classes of Devices

There are three main classes of devices:

Character Devices: A character device is one that can be accessed as a stream of bytes (like a
file); a char driver is in charge of implementing this behavior. The console, serial ports, and
parallel ports are examples of char devices.

Block Devices: Block devices are accessed through filesystem nodes. A block device is something
that can host a file system, such as a disk. Hard disks, floppy disks, and CD-ROM drives are
examples of block devices.

Network Devices: A network transaction is made through an interface, that is, a device that is
able to exchange data with other hosts. A Network Interface Card (NIC) is an example of such a
hardware device.

The Driver and the file system

From the user's point of view the driver looks like an ordinary file. If you operate on this file
via open, close, read or write requests, the kernel looks up the appropriate functions in your
driver. All drivers provide a set of routines. Each device has a struct char_fops that holds the
pointers to these routines. At init time of the driver this struct is hooked into another table
where the kernel can find it. The index that is used to look up this set of routines is called the
MAJOR number and is unique, so that each driver can be identified unambiguously. The special inode
file for the driver gets its MAJOR number at creation time via the mknod command. The driver code
gets its MAJOR number through the register_chrdev kernel routine.
Every time a file routine is called on a driver inode, the kernel looks up the appropriate routine
in the char_fops struct by its MAJOR number and calls it. The inode and the file struct are passed
to the called fops routines and can be used, for example, to get specific information about the
calling process. The MINOR number, which is also set by the mknod command, can be used to select a
special behaviour of the driver (e.g. rewind or no-rewind on tapes) or to distinguish sub-devices
(as is done by the tty driver).

The general look of a Driver

The structure of a driver is similar for each peripheral device:

•  You have an init routine that initializes your hardware, perhaps obtains memory from the
   kernel, and hooks your driver routines into the kernel.
•  You have a char_fops struct that is initialized with those routines that you provide for
   your device. This struct is the key to the kernel; it is “registered” by the register_chrdev
   routine.
•  Usually you have open and release routines that are called whenever you perform an open or
   close on your special inode.
•  You can have routines for reading and writing data from or to your driver, and an ioctl routine
   that can send special commands to your driver, such as configuration requests or options.
•  You can read out the kernel environment string to configure your driver via
•  An interrupt routine can be registered if your hardware supports this.

Compile your Driver into kernel code

Most drivers in Linux are linked to the kernel at compile time. That means that if you want to
add a driver, you have to put your .c and .h files somewhere in the kernel source tree and
rebuild the kernel.

Dynamically loaded drivers

A kernel module can be loaded into the kernel at runtime, that is, loaded and removed at any time
after the boot process. The only differences between a loadable module and a driver linked into
the kernel are a special init routine that is called when the module is loaded into the kernel
and a cleanup routine that is called when the module is removed.
The typical purpose of these two routines is to register the device and IRQ or get memory
(init_module) and to release them again when the module is removed from memory (cleanup_module).
The programs for loading and removing modules are provided by the modutils package.

Character Device Drivers
A character device is one that can be accessed as a stream of bytes (like a file); a char driver
is in charge of implementing this behavior. Such a driver usually implements at least the open,
read, and write system calls. The console, serial ports, and parallel ports are examples of char
devices.

Driver Registration

int register_chrdev(unsigned int major, const char *name,
                    struct file_operations *fops);

int unregister_chrdev(unsigned int major, const char *name);

Important Files: /proc/devices

Major & Minor Numbers

Char devices are accessed through names in the file system called special files, device files,
or nodes.

They are conventionally located in the /dev directory.

They are identified by a “c” in the first column of the output of ls -l.

Two comma-separated numbers in the file size field denote the major and minor numbers.

The major number identifies the driver associated with the device.

The minor number is used only by the driver, to differentiate between the several devices that
it controls.

Excerpt from the LINUX major numbers list

Major   Character devices          Block devices
0       Unnamed (for NFS, network and so on)
1       Memory devices (mem)       RAM disk
2                                  Floppy disks (fd*)
3                                  IDE hard disks
4       Terminals
5       Terminals & AUX
6       Parallel interfaces
7       Virtual consoles (vcs*)
8                                  SCSI hard disks (sd*)
9       SCSI tapes
10      Bus mice (bm, psaux)
11                                 SCSI CD-ROM
12      QIC02 tape
13      PC speaker driver          XT 8-bit hard disks (xd*)
14      Sound cards                BIOS hard disk support
15      Joystick                   Cdu31a/33a CD-ROM
16-18   Not used
19      Cyclades drivers           Double compressing driver
20      Cyclades drivers
21      SCSI generic
22                                 2nd IDE interface driver
23                                 Mitsumi CD-ROM (mcd*)
24                                 Sony 535 CD-ROM
25                                 Matsushita CD-ROM 1
26                                 Matsushita CD-ROM 2
27      QIC117 tape                Matsushita CD-ROM 3
28                                 Matsushita CD-ROM 4
29      Frame buffer drivers       Other CD-ROMs
30      iCBS2                      Philips LMS-205 CD-ROM

mknod creates device nodes in the file system tree.

Syntax: mknod PATH/DEVICE TYPE MAJOR MINOR

Major numbers can be allocated dynamically or statically.


The capabilities of a device (as supported by its driver) are made known by filling in a
“file_operations” struct and passing it to “register_chrdev”.

This struct is a table of function pointers.

The kernel uses this struct to access the driver's functions.

The members of “struct file_operations” include:

loff_t  (*llseek)  (struct file *, loff_t, int);
ssize_t (*read)    (struct file *, char *, size_t, loff_t *);
ssize_t (*write)   (struct file *, const char *, size_t, loff_t *);
int     (*ioctl)   (struct inode *, struct file *, unsigned int, unsigned long);
        - handles device-specific commands.
int     (*open)    (struct inode *, struct file *);
int     (*release) (struct inode *, struct file *);
        - after fork or dup, release is invoked only after all copies of the fd are closed.

struct module *owner;

... and more

Use tagged structure initialization, e.g.:

struct file_operations my_fops = {
        .llseek  = my_llseek,
        .read    = my_read,
        .write   = my_write,
        .ioctl   = my_control,
        .open    = my_open,
        .release = my_close,
        .owner   = THIS_MODULE,
};

The File Structure

The C library FILE appears only in user-space programs.

struct file is a kernel structure that never appears in user space.

Every open file in the system has an associated “struct file” in kernel space. It is created by
the kernel on open and is passed to any function or method that operates on the file.

The most important fields are:
      mode_t                  f_mode;
      loff_t                  f_pos;
      unsigned int            f_flags;
      struct file_operations *f_op;
      void                   *private_data;
      struct dentry          *f_dentry;

How To Get Device Number
The combined device number resides in the field i_rdev of the inode structure.

MAJOR(kdev_t dev);       Extracts the major number from a kdev_t structure.
MINOR(kdev_t dev);       Extracts the minor number.
MKDEV(int ma, int mi);   Creates a kdev_t from a major and a minor number.

Data exchange between Application & Driver

The application runs in user space, whereas the driver runs in kernel space.

User-space addresses cannot be used directly in kernel space.

When the kernel accesses a user-space pointer, the associated page may not be present in
memory (it may be swapped out).

To deal with such conditions, the following functions are provided:
   #include <asm/uaccess.h>
   unsigned long copy_from_user(void *to, const void *from, unsigned long count);
   unsigned long copy_to_user(void *to, const void *from, unsigned long count);

Both return the number of bytes that could not be copied (0 on success).

                    The arguments to read

 ssize_t dev_read(struct file *file, char *buf, size_t count, loff_t *ppos);

 [Figure: file points to the kernel's struct file (f_count, f_flags, f_mode, ...). copy_to_user()
 moves data from the buffer in the driver (kernel space, nonswappable) to buf, the buffer in the
 application or libc (user space, swappable).]

The Open Method

In most drivers, open should perform the following tasks:

• Increment the usage count
• Check for device-specific errors
• Initialize the device, if it is being opened for the first time
• Identify the minor number and update the f_op pointer
• Allocate and fill any data structure to be put in filp->private_data

The Release Method

The release method should perform the following tasks:

• Deallocate anything that open allocated in filp->private_data
• Shut down the device on last close
• Decrement the usage count

The Read Method

The return value of read is interpreted by the calling application as follows:

• If the value equals the count argument passed to the read system call, the requested number
  of bytes has been transferred.
• If the value is positive but smaller than count, only part of the data has been transferred.
• If the value is 0, end of file has been reached.
• A negative value means there was an error. These errors look like -EINTR (interrupted system
  call) or -EFAULT (bad address).

The Write Method

The return value of write is interpreted by the calling application as follows:

• If the value equals the count argument passed to the write system call, the requested number
  of bytes has been transferred.
• If the value is positive but smaller than count, only part of the data has been transferred.
• If the value is 0, nothing was written.
• A negative value means there was an error. These errors are defined in <linux/errno.h>.

The Ioctl Method

•  Performs various types of hardware control via the device driver.
•  Is the device-specific entry point through which the driver handles “commands”.
•  Gives access to features unique to the hardware, such as configuring the device or entering
   and exiting operating modes.
•  ioctl system call:
            int ioctl(int fd, int cmd, char *argp);
•  Driver method:
            int (*ioctl)(struct inode *inode, struct file *filp,
                         unsigned int cmd, unsigned long arg);
•  Choose a unique number for each ioctl command.
•  Check include/asm/ioctl.h and Documentation/ioctl-number.txt for numbers already in use.

Hardware Basics

Before writing a driver, you need to know the following about the hardware:

•    How to use the device's control and status registers

•    What causes the device to generate interrupt

•    How the device transfers data

•    Whether the device uses any dedicated memory

•    Whether the device can be autoconfigured

Device Registers

•    Device Registers: command, status and data buffer

•    Accessing device registers: you need (i) the address of the device's first register and
     (ii) the address space in which these registers live

•    I/O Space Registers: (i) inb() Read a single value from I/O Port (ii) outb() Write a single
     value to I/O Port

•    Memory-mapped registers: (i) readb() Read a single value from I/O register (ii) writeb()
     Write a single value to I/O register.

 [Figure: Memory-mapped device registers and I/O space ports. Registers in the memory address
 space are accessed with LOAD/STORE instructions; registers in I/O space are accessed with IN/OUT
 instructions.]

Device Interrupts

•    Device interrupts: a device interrupts when (i) it has completed an I/O operation, (ii) a
     buffer or FIFO is full, or (iii) it encounters some kind of error

•    Interrupt priorities

•    Interrupt vectors

•    Signalling mechanism: edge-triggered (or latched) interrupts and level-sensitive (or
     level-triggered) interrupts

•    Processor affinity: multiprocessor platform contains special interrupt routing hardware

Data transfer mechanisms

•   Three basic options: programmed I/O (PIO), direct memory access (DMA), and shared buffers

•   PIO - needs the CPU's help for every byte transferred

•   DMA - uses special hardware, a DMA controller; either system DMA or bus-master DMA

•   Device-dedicated memory - such as on a video adapter board

 [Figure: Paths followed by data in DMA and programmed I/O transfers, showing the I/O buffer, the
 device data register, and the DMA controller with its count and address registers.]


Auto configuration

•   Ports, IRQs and DMA channel assignment

•   Device resource lists: Manufacturer id, device type id, I/O space, interrupt requirement,
    DMA channels, device memory

•   No jumpers or switches - change resource dynamically

•   Change Notification

I/O Buses

•   Bus is a collection of data, address and control lines that allows a peripheral device to
    communicate with memory and CPU

•   ISA - Industry Standard Architecture

•   MCA - Micro Channel Architecture

•   EISA - Extended Industry Standard Architecture

•   PCI - Peripheral Component Interconnect


 [Figure: Layout of an ISA system. Memory, CPU, PIC, and DMAC sit on the local bus; the ISA cards
 are attached to the ISA bus.]

The ISA bus clock rate is 8.33 MHz; the maximum transfer rate is about 8 MB/sec.

The I/O address range is 0x0000 to 0x03FF.

The ISA bus supports 16 interrupts. Multiple cards cannot share an interrupt.

Up to 8 DMA channels are available; they can access only the first 16 MB of memory.

Any device-dedicated memory must live within that 16 MB space.

There is no auto-configuration facility; resources are set by DIP switches and jumpers.

The main difference between the ISA bus and PCI is the complete separation of the bus system
from the memory subsystem. The CPU communicates with the PCI subsystem through a special
chipset known as the PCI bridge.

 [Figure: Layout of a PCI bus system. The CPU and memory connect through the cache controller &
 PCI bridge to the PCI bus. The PCI bus carries the PCI devices, each with functions 0 to 7, and
 a bridge to an EISA bus with its slots (EISA slot 0, ...).]

All PCI devices known to Linux are listed in /proc/pci. Try getting the PCI bus configuration
from the kernel with cat /proc/pci. You will see the PCI configuration information, starting
with the message “PCI devices found:”

PCI Addressing

Each PCI peripheral is identified by a bus number, a device number, and a function number.

The PCI specification permits a system to host up to 256 buses; each bus hosts up to 32 devices,
and each device can be a multifunction board with a maximum of eight functions.

Thus each PCI function can be identified by a 16-bit (8 + 5 + 3) address.

PCI Configuration space

The PCI configuration space consists of 256 bytes for each device function.

The first 64 bytes are standardized and device independent.

Multi-byte fields are always little-endian.

Accessing PCI configuration space

int pci_read_config_byte(struct pci_dev *dev, int where, u8 *ptr);
int pci_read_config_word(struct pci_dev *dev, int where, u16 *ptr);
int pci_read_config_dword(struct pci_dev *dev, int where, u32 *ptr);
int pci_write_config_byte(struct pci_dev *dev, int where, u8 val);
int pci_write_config_word(struct pci_dev *dev, int where, u16 val);
int pci_write_config_dword(struct pci_dev *dev, int where, u32 val);
struct pci_dev *pci_find_device(unsigned int vendor, unsigned int device,
                                const struct pci_dev *from);
struct pci_dev *pci_find_class(unsigned int class, const struct pci_dev *from);

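    The standardized configuration registers

    Offset  Field
    0x00    Vendor ID
    0x02    Device ID
    0x04    Command Reg
    0x06    Status Reg
    0x08    Rev ID
    0x09    Class Code
    0x0C    Cache Line
    0x0D    Latency Timer
    0x0E    Header Type
    0x0F    BIST
    0x10    Base Address 0
    0x14    Base Address 1
    0x18    Base Address 2
    0x1C    Base Address 3
    0x20    Base Address 4
    0x24    Base Address 5
    0x28    CardBus CIS pointer
    0x2C    Subsystem Vendor ID
    0x2E    Subsystem Device ID
    0x30    Expansion ROM Base Address
    0x34    Reserved
    0x3C    IRQ Line
    0x3D    IRQ Pin
    0x3E    Min Gnt
    0x3F    Max Lat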

I/O Ports and I/O Memory

Peripheral devices are controlled by writing and reading their registers.

The registers are accessed at consecutive addresses, either in the memory address space or in
the I/O address space.

Some CPU manufacturers implement a single address space in their chips.

Some processors (x86 and others) have separate read and write electrical lines for I/O ports and
special CPU instructions to access ports.

Linux implements the concept of I/O ports on all computer platforms it runs on.

Even if the processor or peripheral bus has a separate address space for I/O ports, not all
devices map their registers to I/O ports.

Use of I/O ports is common for ISA peripheral boards.

Most PCI devices map their registers into a memory address region.

Architectures that support only memory-mapped I/O registers fake port I/O by remapping port
addresses to memory addresses, and the kernel hides the details from the driver.

Allocation of I/O ports
I/O ports must be allocated before being used by the driver.

Linux provides the following functions for this purpose:
    #include <linux/ioport.h>
    int check_region(unsigned long start, unsigned long len);
    struct resource *request_region(unsigned long start,
                    unsigned long len, char *name);
    void release_region(unsigned long start, unsigned long len);

IMPORTANT FILE: /proc/ioports

Reading / Writing I/O Ports

Reading:  unsigned inb(unsigned port);
          unsigned inw(unsigned port);
          unsigned inl(unsigned port);
Writing:  void outb(unsigned char byte, unsigned port);
          void outw(unsigned short word, unsigned port);
          void outl(unsigned long word, unsigned port);

Most hardware differentiates between 8-bit, 16-bit and 32-bit ports. Do not mix the access sizes
as you would with system memory accesses. Even on 64-bit architectures, port accesses are at most
32 bits wide.

These functions can also be used from user space (at least on PC-class machines), provided that
ioperm or iopl is used to obtain permission.

I/O registers and conventional memory

Be careful not to be tricked by CPU and/or compiler optimizations.

I/O operations have side effects, while memory operations have none.

The only effect of a memory write is storing a value to a location, and a memory read returns the
last value written there.

Because of this, values are sometimes cached and read/write instructions are reordered.

The compiler can cache data values in CPU registers without writing them to memory. Read and
write operations can operate on cache memory without ever reaching physical RAM.

Reordering can happen both at the compiler level and at the hardware level.

A driver must therefore ensure that no caching is performed and no read or write reordering takes
place when accessing registers.

Hardware caching:
The underlying hardware is already configured (either automatically or by Linux initialization
code) to disable any hardware cache when accessing I/O regions (memory or port).

Compiler optimization and hardware reordering:
Place a “memory barrier” between operations that must be visible to the hardware in a particular
order.

Linux provides four macros for this purpose.

   #include <linux/kernel.h>
   void barrier(void);     /* compiler barrier, no effect on hardware */

   #include <asm/system.h>
   void rmb(void);         /* hardware memory barriers */
   void wmb(void);
   void mb(void);

barrier: The compiled code stores to memory all values that are currently modified and resident
in CPU registers, and rereads them later when they are needed.

rmb: Any reads appearing before the barrier are completed prior to the execution of any
subsequent read.
wmb: Any writes appearing before the barrier are completed prior to the execution of any
subsequent write.
mb: Does both.

CAUTION: Memory barriers affect performance; they should be used only where really needed.

String operations for Port I/O

These transfer a sequence of bytes, words, or longs to or from a single I/O port of the same size.

Linux provides the following macros:
    void insb(unsigned port, void *addr, unsigned long count);
    void insw(unsigned port, void *addr, unsigned long count);
    void insl(unsigned port, void *addr, unsigned long count);
    void outsb(unsigned port, void *addr, unsigned long count);
    void outsw(unsigned port, void *addr, unsigned long count);
    void outsl(unsigned port, void *addr, unsigned long count);

Pausing I/O

If a device misses some data, or if you fear it might miss some, pausing functions can be used.
The pausing functions are exactly like those listed previously, but their names end in _p (for
example, inb_p and outb_p).

Platform dependencies in I/O Ports

Arch     Support   Port address      Mapping          String functions
IA-32    All       unsigned short
IA-64    All       unsigned long     memory mapped    in C
Alpha    All       unsigned long     memory mapped    in C
ARM      All       unsigned int      memory mapped    in C
M68K     Byte      unsigned char     memory mapped    no

I/O Memory

I/O memory is the main mechanism used to communicate with devices, through memory-mapped
registers and device memory.

It is simply a region of RAM-like locations that the device makes available to the processor over
the bus.

Depending on the computer platform and bus being used, I/O memory may or may not be accessed
through page tables.

If no page tables are needed, then I/O memory locations look pretty much like I/O ports.
Direct use of pointers to I/O memory is not good practice.

Allocation of I/O memory
I/O memory must be allocated before being used by the driver. Linux provides the following
functions for this purpose:

int check_mem_region(unsigned long start, unsigned long len);
void request_mem_region(unsigned long start, unsigned long len,
                        char *name);
void release_mem_region(unsigned long start, unsigned long len);

IMPORTANT FILE: /proc/iomem

Directly Mapped I/O Memory

Some computer platforms reserve a part of their memory address space for I/O locations and
automatically disable memory management for any (virtual) address in that memory area.

Some platforms also bypass caching for these regions.

Software Mapped I/O Memory

Devices live at well-known physical addresses, but the CPU has no predefined virtual address to
access them.

For software to access I/O memory, there must be a way to assign a virtual address to the
device.

Linux provides the following functions:
#include <asm/io.h>
void *ioremap(unsigned long phys_addr, unsigned long size);
void *ioremap_nocache(unsigned long phys_addr, unsigned long size);
void iounmap(void *addr);

Reading / Writing I/O Memory

Reading:   unsigned readb(address);
           unsigned readw(address);
           unsigned readl(address);

Writing:   void writeb(unsigned value, address);
           void writew(unsigned value, address);
           void writel(unsigned value, address);

Neither the reading nor the writing functions check the validity of address. Some platforms also
provide reading and writing functions for 64-bit values.

Interrupt Handling
An interrupt is simply a signal that the hardware can send when it wants the processor's
attention. A module is expected to request an interrupt channel before using it, and to release it
when it is done.

Linux provides the following functions:
#include <linux/sched.h>

int request_irq(unsigned int irq,
                void (*handler)(int, void *, struct pt_regs *),
                unsigned long flags, const char *dev_name, void *dev_id);

void free_irq(unsigned int irq, void *dev_id);

irq:      Interrupt number.
handler:  Pointer to the interrupt handler function.
flags:
       SA_INTERRUPT - “fast” interrupt handler
       SA_SHIRQ     - interrupt can be shared
dev_id:   A unique identifier that is used when the interrupt line is freed and that may also
          be used by the driver. When no sharing is in force, dev_id can be set to NULL.

Interrupt handler can be installed either at driver initialization or when the device is first opened.

IMPORTANT FILE: /proc/interrupts

Control of Interrupts

The functions cli and sti have traditionally been used to disable and enable interrupts.
To disable interrupts, it is better to use the following calls:

unsigned long flags;

save_flags(flags);
cli();
  /* this code runs with interrupts disabled */
restore_flags(flags);

The macros save_flags() and restore_flags() must be called from the same function.

Fast & Slow Handlers
Fast interrupts are those that can be handled very quickly, whereas handling slow interrupts
takes significantly longer.

A Fast handler runs with interrupt reporting disabled in the microprocessor, and the interrupt
being serviced is disabled in the interrupt controller.

A slow handler runs with interrupt reporting enabled in the processor, and the interrupt being
serviced is disabled in the interrupt controller.

Fast handlers should generally not be used.

Implementing a Handler
A handler is ordinary C code with some restrictions:
   It cannot transfer data to or from user space, because it does not execute in the context of a process.
   It cannot do anything that would sleep.
   It cannot allocate memory with anything other than GFP_ATOMIC.
   It cannot lock a semaphore.
   It cannot call schedule.
In addition, the handler should give feedback to the device about interrupt reception (by clearing
the "Interrupt Pending" bit or byte on the interface board).
Write a routine that executes in a minimum of time. If a long computation needs to be
performed, use a tasklet.

Concurrency and Race conditions

Kernel code does not run in a simple environment like application software and must be written
with the idea that many things can be happening at once.

Multiple processes can be trying to use the driver at the same time that the driver is trying to
do something else.

In addition, several other software abstractions (such as kernel timers) run asynchronously.

Last but not least, Linux runs on SMP systems, with the result that your driver could be executing
concurrently on more than one CPU.

So Linux kernel code, including driver code, must be reentrant: it must be capable of running in
more than one context at the same time.
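
A small user-space illustration of what reentrancy means in practice: a function that keeps its working state in a static variable gives wrong results when two contexts interleave, while one that keeps its state in caller-supplied storage does not. The function names and the hand-interleaved "contexts" A and B are made up for this sketch.

```c
/* Non-reentrant: the running total lives in one shared static variable. */
static long shared_total;

void sum_start_shared(void)    { shared_total = 0; }
void sum_add_shared(int value) { shared_total += value; }
long sum_result_shared(void)   { return shared_total; }

/* Reentrant: each caller brings its own state. */
struct sum_state { long total; };

void sum_start(struct sum_state *s)          { s->total = 0; }
void sum_add(struct sum_state *s, int value) { s->total += value; }
long sum_result(const struct sum_state *s)   { return s->total; }

/* Context A sums 1 + 2 while context B, interleaved by hand, sums 10 + 20. */
long interleaved_shared_result_for_a(void)
{
    sum_start_shared();          /* A */
    sum_add_shared(1);           /* A */
    sum_start_shared();          /* B cuts in and resets the total */
    sum_add_shared(10);          /* B */
    sum_add_shared(2);           /* A resumes */
    sum_add_shared(20);          /* B */
    return sum_result_shared();  /* A reads 32 instead of 3 */
}

long interleaved_reentrant_result_for_a(void)
{
    struct sum_state a, b;

    sum_start(&a);
    sum_add(&a, 1);
    sum_start(&b);               /* B cuts in, but has its own state */
    sum_add(&b, 10);
    sum_add(&a, 2);
    sum_add(&b, 20);
    return sum_result(&a);       /* A reads 3 regardless of B */
}
```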

Bottom Half Processing

An interrupt handler should execute in a minimum of time and not keep interrupts blocked for
long. Often a substantial amount of work must be done in response to a device interrupt. This can
be accomplished by splitting the interrupt handler into two halves.
Top Half: the routine that actually responds to the interrupt.
Bottom Half: a routine that is scheduled by the top half to be executed later, at a safer time.
Difference: All interrupts are enabled during execution of the bottom half - that's why it runs at a
"safer time".
Restrictions: All of the restrictions that apply to interrupt handlers also apply to bottom halves.
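
The split can be pictured with a user-space model in which the top half only stores the incoming byte and marks work as pending, and the bottom half does the slower processing later. Everything here (the names, the fixed-size queue, the "processing" being a running sum) is invented for the sketch; it only shows the division of labour.

```c
#include <stdbool.h>
#include <stddef.h>

#define PENDING_MAX 8                 /* hypothetical queue depth */

static unsigned char pending_data[PENDING_MAX];
static size_t pending_len;
static bool bottom_half_pending;
static unsigned long processed_sum;   /* result of the "slow" work */

/* Top half: called at interrupt time, does the bare minimum. */
void model_top_half(unsigned char byte_from_device)
{
    if (pending_len < PENDING_MAX)
        pending_data[pending_len++] = byte_from_device;
    bottom_half_pending = true;       /* ask for deferred processing */
}

/* Bottom half: the lengthy part, run later. */
void model_bottom_half(void)
{
    for (size_t i = 0; i < pending_len; i++)
        processed_sum += pending_data[i];
    pending_len = 0;
}

/* Called at a "safer time"; runs the bottom half only if it was requested.
 * Returns true if the bottom half ran. */
bool model_run_pending(void)
{
    if (!bottom_half_pending)
        return false;
    bottom_half_pending = false;
    model_bottom_half();
    return true;
}
```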

There are two different mechanisms that may be used to implement bottom-half processing.

Tasklets - The preferred way, introduced late in the 2.3 development series.

BH - The older one (in present kernels, BH is implemented using tasklets).


A tasklet is a special function that may be scheduled to run, in interrupt context, at a
system-defined safe time.

A tasklet may be scheduled multiple times before it runs, but it will only run once.

No tasklet will ever run in parallel with itself, since it only runs once.

Tasklets can run in parallel with other tasklets on SMP systems.

A tasklet is guaranteed to run on the CPU that first schedules it.

DECLARE_TASKLET ( name , function , data ) ;
DECLARE_TASKLET_DISABLED ( name, function, data);

A tasklet declared with DECLARE_TASKLET_DISABLED can be scheduled, but will not be executed
until enabled at some future time.

tasklet_schedule ( &name ) ;

void tasklet_disable ( struct tasklet_struct *t ) ;

void tasklet_enable ( struct tasklet_struct *t ) ;

void tasklet_kill ( struct tasklet_struct *t ) ;
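
The "scheduled many times, run once" behaviour and the effect of disabling can be modelled in user space with a pending flag and a disable count. This is a sketch under those assumptions, not the kernel's tasklet code: struct model_tasklet and the model_ functions are hypothetical, and only the bookkeeping is shown.

```c
#include <stdbool.h>

struct model_tasklet {
    void (*func)(unsigned long);
    unsigned long data;
    bool scheduled;      /* set by schedule, cleared when the tasklet runs */
    int disable_count;   /* greater than 0 means "do not run yet" */
};

void model_tasklet_schedule(struct model_tasklet *t) { t->scheduled = true; }
void model_tasklet_disable(struct model_tasklet *t)  { t->disable_count++; }
void model_tasklet_enable(struct model_tasklet *t)   { t->disable_count--; }

/* Called at a "safe time". Returns true if the function actually ran. */
bool model_tasklet_run(struct model_tasklet *t)
{
    if (!t->scheduled || t->disable_count > 0)
        return false;            /* nothing pending, or still disabled */
    t->scheduled = false;        /* several schedules collapse into one run */
    t->func(t->data);
    return true;
}

/* Example tasklet body: adds its data argument to a run counter. */
static int model_runs;
void model_count_runs(unsigned long data) { model_runs += (int)data; }
```

A tasklet created with disable_count set to 1 plays the role of one declared with DECLARE_TASKLET_DISABLED: it stays pending until model_tasklet_enable is called.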

The BH Mechanism

All BH bottom halves are predefined in the kernel, and there can be a maximum of 32 of them.
Whenever some code wants to schedule a bottom half for running, it calls mark_bh. In the older
BH implementation, mark_bh would set a bit in a bit mask, allowing the corresponding bottom
half handler to be found quickly at run time. In modern kernels, it just calls tasklet_schedule to
schedule the bottom half routine for execution.
Several BH bottom halves declared by the kernel are:
IMMEDIATE_BH – The function being scheduled runs (with run_task_queue) the tq_immediate
task queue.
TQUEUE_BH - This BH is activated at each timer tick if a task is registered in tq_timer.
TIMER_BH – This BH is marked by do_timer.
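
The older bit-mask scheme can be sketched in a few lines of user-space C. The names model_init_bh, model_mark_bh and model_do_bottom_half are hypothetical, and the model ignores everything about interrupt state; it only shows how a 32-bit mask lets the pending handlers be found quickly.

```c
#include <stddef.h>

#define MODEL_NR_BH 32                     /* at most 32 bottom halves */

static void (*model_bh_table[MODEL_NR_BH])(void);
static unsigned long model_bh_active;      /* bit n set = BH n is pending */

void model_init_bh(int nr, void (*routine)(void)) { model_bh_table[nr] = routine; }
void model_mark_bh(int nr)                        { model_bh_active |= 1UL << nr; }

/* Runs every pending bottom half once, lowest number first.
 * Returns how many handlers were run. */
int model_do_bottom_half(void)
{
    unsigned long active = model_bh_active;
    int ran = 0;

    model_bh_active = 0;                   /* later marks wait for the next pass */
    for (int nr = 0; active != 0; nr++, active >>= 1) {
        if ((active & 1UL) && model_bh_table[nr] != NULL) {
            model_bh_table[nr]();
            ran++;
        }
    }
    return ran;
}

/* Example handler used to observe the behaviour. */
static int model_timer_ticks;
void model_timer_bh(void) { model_timer_ticks++; }
```

Marking the same bottom half twice before it runs only sets the same bit twice, so the handler still runs once.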

Dealing with Race conditions / protecting data from concurrent access

Using a circular buffer and avoiding shared variables.
Using spinlocks to enforce mutual exclusion.
Using lock variables that are atomically incremented and decremented.

Circular Buffers

Very similar to the 'producer & consumer' algorithm.

Works fine for one producer & one consumer.

Two pointers are used to address a CB: head & tail.

head is the point at which data is being written; it is updated only by the producer.

Data is read from tail, which is updated only by the consumer.

A CB runs smoothly, except when it fills up.
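
A minimal single-producer, single-consumer circular buffer might look like this in plain C. The buffer size, the names and the choice to keep one slot empty (so that head == tail always means "empty") are assumptions of this sketch. On a multiprocessor system the index updates would also need memory barriers, which are left out here.

```c
#include <stdbool.h>
#include <stddef.h>

#define CB_SIZE 8                 /* hypothetical size: holds CB_SIZE - 1 items */

struct circ_buf_model {
    unsigned char data[CB_SIZE];
    size_t head;                  /* next slot to write; producer only */
    size_t tail;                  /* next slot to read; consumer only */
};

bool cb_is_empty(const struct circ_buf_model *cb)
{
    return cb->head == cb->tail;
}

bool cb_is_full(const struct circ_buf_model *cb)
{
    return (cb->head + 1) % CB_SIZE == cb->tail;
}

/* Producer side: returns false, storing nothing, when the buffer is full. */
bool cb_put(struct circ_buf_model *cb, unsigned char byte)
{
    if (cb_is_full(cb))
        return false;
    cb->data[cb->head] = byte;
    cb->head = (cb->head + 1) % CB_SIZE;   /* advance only after the data is stored */
    return true;
}

/* Consumer side: returns false when there is nothing to read. */
bool cb_get(struct circ_buf_model *cb, unsigned char *byte)
{
    if (cb_is_empty(cb))
        return false;
    *byte = cb->data[cb->tail];
    cb->tail = (cb->tail + 1) % CB_SIZE;
    return true;
}
```

Since only the producer writes head and only the consumer writes tail, neither index is a shared variable in the sense of having two writers. The full-buffer case is reported to the caller, who must decide whether to drop the data or wait.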

Spin locks

Spinlocks work through a shared variable. A function may acquire the lock by setting the
variable to a specific value. Any other function needing the lock will query it and, seeing that it is
not available, will "spin" in a busy-wait loop until it is available.

#include <asm/spinlock.h>
spinlock_t my_lock = SPIN_LOCK_UNLOCKED ;
If it is necessary to initialize a spinlock at runtime, use spin_lock_init ( &my_lock ) ;

spin_lock ( spinlock_t *lock ) ;
spin_lock_irqsave ( spinlock_t *lock, unsigned long flags);
spin_lock_irq ( spinlock_t *lock ) ;          /* disables interrupts on the local processor */
spin_lock_bh ( spinlock_t *lock ) ;
spin_unlock ( spinlock_t *lock ) ;
spin_unlock_irqrestore ( spinlock_t *lock , unsigned long flags ) ;
spin_unlock_irq ( spinlock_t *lock ) ;
spin_unlock_bh ( spinlock_t *lock ) ;
spin_is_locked ( spinlock_t *lock ) ;         /* returns nonzero if the lock is busy */
spin_trylock ( spinlock_t *lock ) ;
spin_unlock_wait ( spinlock_t *lock ) ;
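
The "set a shared variable, otherwise spin" idea can be tried out in user space with a C11 atomic_flag. This is an analogy, not the kernel's spinlock_t: the model_spin_ names are invented, nothing here touches interrupts, and model_spin_trylock is assumed to return nonzero when the lock was taken.

```c
#include <stdatomic.h>

typedef struct { atomic_flag busy; } model_spinlock_t;

#define MODEL_SPIN_LOCK_UNLOCKED { ATOMIC_FLAG_INIT }

/* Takes the lock if it is free; returns 1 on success, 0 if it was busy. */
int model_spin_trylock(model_spinlock_t *lock)
{
    /* test_and_set returns the previous value: false means we got the lock. */
    return !atomic_flag_test_and_set_explicit(&lock->busy, memory_order_acquire);
}

/* Busy-waits until the lock becomes available. */
void model_spin_lock(model_spinlock_t *lock)
{
    while (!model_spin_trylock(lock))
        ;   /* spin */
}

void model_spin_unlock(model_spinlock_t *lock)
{
    atomic_flag_clear_explicit(&lock->busy, memory_order_release);
}

/* Example: a counter that is only touched with the lock held. */
static model_spinlock_t counter_lock = MODEL_SPIN_LOCK_UNLOCKED;
static long protected_counter;

long counter_increment(void)
{
    long value;

    model_spin_lock(&counter_lock);
    value = ++protected_counter;
    model_spin_unlock(&counter_lock);
    return value;
}
```

The critical section is kept as short as possible, because every other contender burns CPU time while it waits.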

Linux also provides reader-writer spinlocks. These locks have a type of rwlock_t and should
be initialized to RW_LOCK_UNLOCKED. Any number of threads can hold the lock for reading
at the same time. When a writer comes along, it waits until it can get exclusive access.
The functions for working with reader-writer locks are as follows:
read_lock(rwlock_t *lock);
read_lock_irqsave ( rwlock_t *lock, unsigned long flags);
read_lock_irq ( rwlock_t *lock ) ;
read_lock_bh ( rwlock_t *lock ) ;
read_unlock ( rwlock_t *lock ) ;
read_unlock_irqrestore ( rwlock_t *lock , unsigned long flags ) ;
read_unlock_irq ( rwlock_t *lock ) ;
read_unlock_bh ( rwlock_t *lock ) ;

write_lock(rwlock_t *lock);
write_lock_irqsave ( rwlock_t *lock, unsigned long flags);
write_lock_irq ( rwlock_t *lock ) ;
write_lock_bh ( rwlock_t *lock ) ;
write_unlock ( rwlock_t *lock ) ;
write_unlock_irqrestore ( rwlock_t *lock , unsigned long flags ) ;
write_unlock_irq ( rwlock_t *lock ) ;
write_unlock_bh ( rwlock_t *lock ) ;
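
The reader-writer rule (many readers or one writer) can be modelled with a single counter: a positive value counts active readers and -1 means a writer holds the lock. This is a user-space sketch with C11 atomics and try-style functions that return instead of spinning; the names are hypothetical and this is not how rwlock_t is implemented.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_int state; } model_rwlock_t;  /* >0 readers, -1 writer */

/* A reader gets in unless a writer holds the lock. */
bool model_read_trylock(model_rwlock_t *lock)
{
    int seen = atomic_load(&lock->state);

    while (seen >= 0) {
        /* On failure 'seen' is refreshed and the test is repeated. */
        if (atomic_compare_exchange_weak(&lock->state, &seen, seen + 1))
            return true;
    }
    return false;
}

void model_read_unlock(model_rwlock_t *lock)
{
    atomic_fetch_sub(&lock->state, 1);
}

/* A writer needs exclusive access: no readers and no other writer. */
bool model_write_trylock(model_rwlock_t *lock)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&lock->state, &expected, -1);
}

void model_write_unlock(model_rwlock_t *lock)
{
    atomic_store(&lock->state, 0);
}
```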

Using Lock Variables

The kernel provides a set of functions that may be used to provide atomic (noninterruptible)
access to variables. Use of these functions can occasionally eliminate the need for a more
complicated locking scheme, when the operations to be performed are very simple. The Linux
kernel exports two sets of functions to deal with locks: bit operations and access to the “atomic”
data type.

Bit Operations: It's quite common to have single-bit lock variables or to update device status
flags at interrupt time, while a process may be accessing them. The kernel offers a set of
functions that modify or test single bits atomically. The functions are declared in <asm/bitops.h>.

void set_bit(nr, void *addr);
void clear_bit(nr, void *addr);
void change_bit(nr, void *addr);
int test_bit(nr, void *addr);
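
A user-space analogue of these calls can be built on C11 atomics, which shows what "atomic" buys: each update is a single read-modify-write on the word, so an update made at "interrupt time" cannot be lost. The model_ functions below are illustrative stand-ins, restricted to one unsigned long and to bit numbers smaller than its width; the status flag names are made up.

```c
#include <stdatomic.h>

void model_set_bit(int nr, atomic_ulong *addr)    { atomic_fetch_or(addr, 1UL << nr); }
void model_clear_bit(int nr, atomic_ulong *addr)  { atomic_fetch_and(addr, ~(1UL << nr)); }
void model_change_bit(int nr, atomic_ulong *addr) { atomic_fetch_xor(addr, 1UL << nr); }

/* Returns 1 if the bit is set, 0 otherwise. */
int model_test_bit(int nr, atomic_ulong *addr)
{
    return (int)((atomic_load(addr) >> nr) & 1UL);
}

/* Hypothetical device status flags, given as bit numbers. */
enum { STATUS_RX_READY = 0, STATUS_TX_DONE = 1 };
```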

Atomic integer operations: Kernel programmers often need to share an integer variable between
an interrupt handler and other functions. A separate set of functions has been provided to
facilitate this sort of sharing; they are defined in <asm/atomic.h>.

void atomic_set(atomic_t *v, int i);
int atomic_read(atomic_t *v);
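
As a user-space analogue, the same pattern can be written with a C11 atomic_int. The wrapper names are hypothetical; model_atomic_inc and model_atomic_dec_and_test go beyond the two calls listed above in order to show a lock variable that is atomically incremented and decremented, here as a simple count of how many users have a device open.

```c
#include <stdatomic.h>

typedef struct { atomic_int counter; } model_atomic_t;

void model_atomic_set(model_atomic_t *v, int i) { atomic_store(&v->counter, i); }
int  model_atomic_read(model_atomic_t *v)       { return atomic_load(&v->counter); }
void model_atomic_inc(model_atomic_t *v)        { atomic_fetch_add(&v->counter, 1); }

/* Decrements and returns 1 if the result is zero, 0 otherwise. */
int model_atomic_dec_and_test(model_atomic_t *v)
{
    /* fetch_sub returns the value before the subtraction. */
    return atomic_fetch_sub(&v->counter, 1) == 1;
}

/* Example: count how many users currently have the device open. */
static model_atomic_t open_count;

void model_device_open(void) { model_atomic_inc(&open_count); }

/* Returns 1 when the last user has gone and the device may be shut down. */
int model_device_release(void) { return model_atomic_dec_and_test(&open_count); }
```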


Shared By: Chandra Sekhar