Bus Arbitration, DMA, and
Jason Bennett, Tom DiLello,
CMSC 415 Computer Architecture
What is Bus Arbitration,
Bus Mastering and DMA?
• Bus Arbitration – an elaborate system for resolving bus control
conflicts and assigning priorities to the requests for control of the bus.
• Bus Mastering – a method of enabling different device controllers on
the bus to ‘talk’ to one another, without having to go through the CPU.
• DMA (Direct Memory Access) – a method of transferring data from a
device such as a hard disk to main memory without having to go through the CPU.
Bus Arbitration Methods
Centralized bus arbitration requires hardware (an arbiter) that will grant
the bus to one of the requesting devices. This hardware can be part of
the CPU or it can be a separate device on the motherboard.
In decentralized arbitration there is no arbiter, so the devices have to
decide among themselves who goes next. This makes the devices more
complicated, but saves the expense of having an arbiter.
Centralized Bus Arbitration
• Centralized One Level Bus Arbiter
– This method of arbitration uses one centralized bus controller that
all devices can query.
• Centralized Two Level Bus Arbiter
– Uses a Bus Request Line and Bus Grant Line for each Level
Centralized One Level Bus Arbiter
This method of arbitration uses one centralized bus controller that all devices
can query. There are 2 lines that are used:
1. Bus Request Line – A wired ‘OR’ line: the controller knows that a
request was made, but does not know which device made the request.
2. Bus Grant Line – The grant signal is propagated from device to device.
The Bus Grant Line is asserted to the first device in the chain. If that
device made the request, it takes hold of the bus and leaves the Bus Grant
Line negated for the next device in the chain. If that device didn’t make
the request, then the Bus Grant Line is asserted for the next device in the
chain.
If two devices make a request for the bus at the same time, then the device
closer to the controller gets the bus. This is called daisy chaining.
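The daisy chain can be sketched in a few lines of Python. This is only an illustrative model, not a description of any particular hardware: the function name `daisy_chain_grant` and the convention that index 0 is the device closest to the controller are assumptions made for the example.

```python
def daisy_chain_grant(requests):
    """Return the position of the device that captures the bus grant.

    requests[i] is True if device i wants the bus; index 0 is assumed
    to be the device closest to the controller. Returns None if no
    device is requesting.
    """
    # The wired-OR request line only tells the controller that
    # *someone* asked, not who.
    if not any(requests):
        return None
    # The grant travels down the chain; the first requester absorbs it
    # and does not pass it on.
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            return position
    return None


# Devices 1 and 2 both request; device 1 is closer, so it wins.
print(daisy_chain_grant([False, True, True]))
```

The model makes the weakness visible: a device far down the chain only wins when every device ahead of it is idle.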
Centralized Two Level Bus Arbiter
• Centralized Two Level Bus Arbiter
– Bus Request line: one for each level
– Bus Grant line: one for each level
This helps to alleviate the problem of the closest device to the
controller always getting control of the bus. If more than one request
comes in at one time, control is granted based on priority. One
major advantage to this is that when a lower priority device has control
of the bus, a higher priority device cannot ‘steal’ the bus from it.
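A rough sketch of the two-level scheme, under stated assumptions: each level is modeled as its own daisy chain, level 0 is taken to be the highest priority, and a busy bus is never preempted. The function name `two_level_grant` and its arguments are hypothetical.

```python
def two_level_grant(levels, bus_busy=False):
    """Pick the next bus owner in a two-level (or n-level) arbiter.

    levels[k] is the list of request flags for priority level k, with
    level 0 assumed to be the highest priority and index 0 within a
    level the device closest to the arbiter.
    Returns (level, position), or None if nothing can be granted.
    """
    # No preemption: while a device holds the bus, nobody is granted it.
    if bus_busy:
        return None
    # Serve the highest-priority level that has a request pending,
    # then fall back to daisy-chain order inside that level.
    for level, requests in enumerate(levels):
        for position, wants_bus in enumerate(requests):
            if wants_bus:
                return level, position
    return None


# A level-1 device close to the arbiter loses to a level-0 device
# further away, because the level is considered first.
print(two_level_grant([[False, True], [True, False]]))
```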
Micro Channel Bus - Centralized
Two Level Bus Arbiter
• An MCA (micro channel architecture) bus is an example of a
centralized two level bus arbiter
• Has control built-in to make sure that no properly designed device can
be unwillingly locked out of bus access.
• This flexibility of the Micro Channel arbitration process is a result of
making several divergent functions work together. These include
preemption, fairness, linear priority, bus time-out, latency, and system
• Implements a priority system in which the Micro Channel adds several
new lines to the PC bus. Four of these, lines 0 – 3, are added to yield
16 different priorities.
• In addition, two additional levels of priority are used by the devices on
the system board of the PS/2 and do not appear on the Micro Channel.
These special internal levels are used to assign the absolute top priority
to memory refreshing and nonmaskable interrupt.
Def’n: Decentralized Bus arbitration does not require an arbiter so the
devices have to decide who gets control of the bus. The devices
therefore have to be more complicated, but this saves the expense
of having an arbiter.
• VAX SBI Bus
The VAX by DEC has 16 separate request lines. All devices monitor all of the lines.
If they want to send data, they determine whether another device with a higher priority is using the bus.
• Multibus
Uses three lines: an Arbitration line, a Busy line, and a Bus Request line.
Decentralized Bus Arbitration – VAX SBI Bus
• All devices monitor the bus. When a device wishes to use the bus, it
makes sure that no higher priority device is using the bus. If none is,
it begins its transmission; otherwise it waits until that device is done
transferring before beginning its own transfer.
• Q: When does a device negate its request line?
• A: When its request is completed
• Q: How does a device determine whether or not the bus is in use?
• A: By seeing if another higher priority device has requested it. When
all is clear, that device will negate its request line.
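The decision each SBI device makes on its own can be sketched as follows. This is a simplified model, assuming one request line per priority and that a lower line number means a higher priority; both the numbering convention and the name `may_transmit` are assumptions for illustration.

```python
NUM_REQUEST_LINES = 16  # the SBI has 16 separate request lines


def may_transmit(my_priority, request_lines):
    """Decide locally whether a device may take the bus.

    request_lines[p] is True if the device at priority p is requesting.
    Lower index is assumed to mean higher priority. There is no arbiter:
    every device runs this same check on the lines it monitors.
    """
    if not request_lines[my_priority]:
        return False  # this device is not asking for the bus
    # Yield if any higher-priority line is asserted.
    return not any(request_lines[:my_priority])


lines = [False] * NUM_REQUEST_LINES
lines[3] = True
lines[7] = True
print(may_transmit(3, lines))  # priority 3 sees nothing above it
print(may_transmit(7, lines))  # priority 7 must wait for priority 3
```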
Decentralized Bus Arbitration – Multibus
• Arbitration line
This line can also be dubbed the ‘IN’ line. When this line is asserted, the
device knows it has been granted the bus. If this line is negated, then
permission has not been granted to the device. When no device is using the
bus, all devices see an asserted ‘IN’ line, meaning anyone can use the bus.
When a device attempts to grab the bus, it asserts its ‘OUT’ or Bus Request line.
• Busy line
Once a device has determined that no other device, including any device
with a higher priority, is using the bus, it asserts this line, deemed the ‘BUSY’
line. This lets all other devices know that this device is using the bus. Once the
‘BUSY’ line is asserted, the ‘OUT’ line is asserted.
• Bus Request
This line indicates whether another device has made a request. If Busy
is negated, then the device negates OUT and waits an undetermined
amount of time to see if its IN will be negated.
Def’n: Bus mastering refers to a feature supported by some bus
architectures that enables a controller connected to the bus to
communicate directly with other devices on the bus without going
through the CPU. Normally, the processor is required to control the
transfer of this information. In essence, the processor is a
"middleman", and it is far more efficient to "cut out" the middleman
and perform the transfer directly. This is done by having capable
devices take control of the bus and do the work themselves. In theory
this frees up the processor to do other work simultaneously.
• ISA (Industry Standard Architecture)
• MCA (Micro Channel Architecture)
• EISA (Extended Industry Standard Architecture)
• VLB (VESA Local Bus)
• PCI (Peripheral Component Interconnect)
ISA (Industry Standard Architecture)
• Bus mastering really hasn’t been successful with the ISA bus. Any ISA device
can take control of the bus, but it must be done with caution. There are no
safety mechanisms involved, so if a device incorrectly takes control of the bus,
it may crash the system. For example, we all know the DRAM needs to be
refreshed periodically. If the ISA bus master doesn’t relinquish control of the
bus every 15 µs, or generate its own DRAM refresh, the DRAM will become
corrupted.
• To take control of the bus, the device first asserts its DRQ line. The DMAC
sends a hold request to the CPU, and when the DMAC receives a hold
acknowledge, it asserts the appropriate DAK line corresponding to the DRQ
line asserted. The device is now the bus master. AEN is asserted, so if the
device wishes to access I/O devices, it must assert MASTER16 to release
AEN. Control of the bus is returned to the system board by releasing DRQ.
MCA (Micro Channel Architecture)
• The MCA bus was IBM's attempt to replace the ISA bus with something
"bigger and better". When the 80386DX was introduced in the mid-80s with
its 32-bit data bus, IBM decided to create a bus to match this width. MCA is
32 bits wide, and offers several significant improvements over ISA. (One of
MCA's disadvantages was rather poor DMA controller circuitry.)
• The main idea behind the MCA bus was, instead of constraining the computer
to working on one problem at a time, multiple problems can be approached
simultaneously. It allows the bus to be used by two or more bus masters at the
same time, by setting up a control system for coordinating their operations.
• The way this is accomplished is by using a master/slave concept when two or
more devices are vying for the bus. The bus slave functions just as an ordinary
expansion device in a non-mastering PC. The difference between the bus
master and the bus slave is entirely functional. In the MCA scheme one bus
master can take control over another, making the second its slave. Later the
two devices can reverse their roles when another situation demands it.
MCA (Micro Channel Architecture, continued)
• The MCA bus had some pretty impressive features:
– 32-bit-wide bus: impressive considering it was introduced in 1987. Had far
greater throughput than the ISA bus.
– PnP (Plug and Play): MCA automatically configured adapter cards, so
there was no need to fiddle with jumpers. This was eight years before
Windows 95 brought PnP into the mainstream.
• MCA had a great deal of potential. Unfortunately, IBM made two decisions
that would doom MCA to utter failure in the marketplace. First, they made
MCA incompatible with ISA; this means ISA cards will not work at all in an
MCA system. The PC market is very sensitive to backwards-compatibility
issues, as indicated by the number of older standards that persist to this day.
Second, IBM decided to make the MCA bus proprietary. It in fact did this with
ISA as well; however in 1981 IBM could afford to flex its muscles in this
manner, while by this time the clone makers were starting to come into their
own and weren't interested in bending to IBM's wishes.
EISA (Extended Industry Standard Architecture)
• Introduced in 1988, the EISA bus was AST Research’s, Compaq’s, Epson’s,
Hewlett Packard’s, NEC’s, Olivetti’s, Tandy’s, WYSE’s, and Zenith Data
Systems’ answer to IBM’s MCA bus. The EISA bus provided 32-bit slots
at an 8.33 MHz cycle rate for use with 386DX or higher processors.
• Some of the Key features of this bus are:
– ISA Compatibility: ISA cards would work in EISA slots.
– 32-bit bus: Like MCA, this bus was expanded to 32 bits, giving it a
throughput of 31.8 MB/s.
– Bus Mastering: supported bus mastering cards for greater performance,
including bus arbitration
– PnP: EISA automatically configures adapter cards, similar to the Plug and
Play standards of modern systems.
• There were two main reasons for the downfall of the EISA bus architecture:
• 1: EISA based systems tended to be more expensive
• 2: There just weren’t many EISA-based cards available
VLB “VESA Local Bus” (Video
Electronics Standards Association)
• VESA (the Video Electronics Standards Association), a nonprofit organization founded
by NEC, released the VLB, or VESA Local Bus, in 1992. The VLB is a 32-bit bus that
gave direct access to the system memory at the speed of the processor, commonly the
486 CPU. Unfortunately, because the VLB relied heavily on the 486 processor, it lost
ground when the Pentium processor arrived in the marketplace.
• The VLB is in a way a direct extension of the 486 processor/memory bus. A VLB slot is
a 16-bit ISA slot with third and fourth slot connectors added on the end. The VLB
normally runs at 33 MHz with a total bandwidth of 127.2 MB/s, although higher speeds
are possible on some systems. Since it is an extension of the ISA bus, an ISA card can
be used in a VLB slot, although it makes sense to use the regular ISA slots first and
leave the (small number of) VLB slots open for VLB cards, which won't work in an ISA
slot of course. Use of a VLB video card and I/O controller greatly increases system
performance over an ISA-only system.
• Four reasons for its downfall were:
– 1: The bus was heavily based on Intel’s 486, so adapting it to the Pentium was difficult
– 2: Tricky electronics. Only one or two cards could be supported on the bus, and
even when more than one expansion card was used, there were timing problems
– 3: No bus arbitration scheme
– 4: No PnP support
PCI (Peripheral Component Interconnect)
• Introduced by Intel in 1992, revised in 1993 to version 2.0, and later revised in 1995 to PCI 2.1. It’s a
32-bit bus that is also available as a 64-bit bus today. It can run at 33 MHz as a 32-bit bus, with a
bandwidth of 127.2 MB/s, or at 66 MHz as a 64-bit bus, with a bandwidth of 508.6 MB/s.
• The key to PCI's advantages over its predecessor, the VESA local bus, lies in the chipset that controls
it. The PCI bus is controlled by special circuitry in the chipset that is designed to handle it, where the
VLB was basically just an extension of the 486 processor bus. PCI is not married to the 486 in this
manner, and its chipset provides proper bus arbitration and control facilities, to enable PCI to do
much more than VLB ever could. PCI is also used outside the PC platform, providing a degree of
flexibility and allowing manufacturers to save on design costs.
• The PCI bus also allows you to set up compatible IDE/ATA hard disk drives to be bus masters. To
get this to work 4 things must be present:
– 1: Bus Mastering Capable System Hardware: This includes the motherboard, chipset, bus
and BIOS. Most newer motherboards using the Intel 430 Pentium chipset family (FX, HX, VX,
TX) or the Intel 440FX Pentium Pro chipset, will support bus mastering IDE.
– 2: Bus Mastering Hard Disk: Normally this means that the drive must be capable of at least
multiword DMA mode 2 transfers. All Ultra ATA hard disks support bus mastering.
– 3: A 32-Bit Multitasking Operating System: Windows NT, Windows 95, Linux, or similar
– 4: Bus Mastering Drivers: A special driver must be provided to the operating system to enable
bus mastering to work.
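The bandwidth figures quoted for EISA, VLB, and PCI are consistent with a simple peak-rate calculation: clock rate times bus width in bytes, expressed in units of 2^20 bytes per second. The sketch below reproduces them under that assumption; the helper name `peak_bandwidth` is hypothetical, and the nominal clocks are taken as 8.33, 33.33, and 66.67 MHz. Note that the 508.6 figure corresponds to a 64-bit bus at the 66 MHz clock.

```python
def peak_bandwidth(clock_hz, width_bits):
    """Peak transfer rate in units of 2**20 bytes per second,
    assuming one full-width transfer per clock cycle."""
    return clock_hz * (width_bits // 8) / 2**20


buses = {
    "EISA (8.33 MHz, 32-bit)": (25e6 / 3, 32),
    "VLB / PCI (33.33 MHz, 32-bit)": (100e6 / 3, 32),
    "PCI (66.67 MHz, 64-bit)": (200e6 / 3, 64),
}
for name, (clock, width) in buses.items():
    print(f"{name}: {peak_bandwidth(clock, width):.1f}")
```

These are theoretical peaks; sustained throughput on a real system is lower because of arbitration, wait states, and protocol overhead.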
DMA (Direct Memory Access)
• DMA (Direct Memory Access): DMA is a feature supported by some
bus architectures that allows data to be transferred to and from RAM
without burdening the CPU. This is accomplished by a DMA controller
chip. In addition, some add-on cards need to transfer data to the
system’s memory through a DMA channel. Each expansion card which
supports DMA uses at least one DMA channel. The PC supports up to
7 DMA channels (though some of these are not compatible with all
expansion cards). No two expansion cards can be using the same DMA
channel at the same time (and only a few cards support sharing of a
DMA channel when they are not using it). You must select a DMA
channel which is not used by another card installed in the computer
(sound cards frequently use one or even two DMA channels), and
configure the card for that DMA channel. If you accidentally choose a DMA
channel which another card is using, the symptom is usually that no
DMA transfers take place. No data is acquired or, if the conflict is with a
sound card, sounds may not play.
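Checking a planned configuration for DMA channel conflicts is a simple bookkeeping exercise. The sketch below is illustrative only: the card names and channel numbers are made up, and `find_dma_conflicts` is a hypothetical helper, not part of any operating system or BIOS interface.

```python
from collections import defaultdict


def find_dma_conflicts(assignments):
    """Return {channel: [cards]} for every DMA channel claimed by
    more than one card.

    assignments maps a card name to the list of DMA channels it is
    configured to use.
    """
    users = defaultdict(list)
    for card, channels in assignments.items():
        for channel in channels:
            users[channel].append(card)
    return {ch: cards for ch, cards in users.items() if len(cards) > 1}


# Hypothetical configuration: the sound card uses two channels and the
# data-acquisition card was accidentally set to one of them.
config = {"sound card": [1, 5], "data acquisition card": [5]}
print(find_dma_conflicts(config))
```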
• How it works: The peripheral (a LAN adapter, for example) writes from its
memory directly to the PC's memory in one bus cycle (reducing the load on
the bus), rather than the two-step process of the CPU's DMA controller first
reading the data (from the adapter) and then writing it to the PC's memory in a
second bus cycle.
• Often, the adapter will do its transfer as the data are received from the LAN, so
no, or little, on-board LAN adapter memory is required (this saves money).
• Uses much less CPU time than other methods. For example, programmed
input-output (PIO) requires the CPU to first check for the availability of the
data, then read the data, and then write the data. This requires bus and CPU
time for both fetching the CPU's instructions and for reading and writing the
data. Also, bus master DMA is faster than standard DMA, since the CPU does
not even need to load the DMA registers (for example, with the source and
destination addresses) to set up each transfer.
IBM 370 Channel
• Channels are an extension of the Direct Memory Access (DMA) function.
• A channel directly executes I/O instructions; this gives it complete control over
I/O operations.
• The main system processor is not used during the execution of the instruction,
but rather instructions are stored in main memory, where they wait to be
executed by the channel’s own processor.
• The CPU initiates the instruction by telling the channel to execute a program.
• The two most common types of channel architectures are the selector and the
multiplexor.
– Selectors can control more than one device, but can only talk with one at
a time. Each device has a corresponding controller that is managed by the
channel, rather than the CPU.
– Multiplexors can also control more than one device. There are a couple of
types of multiplexor channels. A byte multiplexor accepts or transmits
characters as fast as possible to multiple devices. A block multiplexor,
used with high-speed devices, alternates blocks of data from several
devices.
370 Channel Control
• The CPU can control one or more channels, which can be byte or block
multiplexors.
• Each channel contains one or more controllers, called control units.
• A control unit is usually in charge of a set of similar or identical devices. For
example - a disk controller could control several disk drives.
• It is possible for one control unit to be connected to several channels and for
one device to be connected to several control units.
• This allows more than one physical path between the CPU and a device.
• This is important: if one pathway is busy or disabled, an alternate route may be
used.
370 I/O Addressing
• The IBM 370 architecture uses an isolated I/O addressing scheme to reference
devices.
• Addresses on the 370 are 24 bits and device addresses are only 16 bits.
• The leftmost eight bits are set to zero.
• Eight bits are used to designate the channel allowing up to 256 channels.
• The next four bits indicate the control unit in the channel.
• The last four bits designate the device within the control unit.
• When more than one route is available, a device will have a different
address for each path.
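The address layout just described can be unpacked with a few shifts and masks. This is a minimal sketch that assumes the fields appear in the order listed (eight zero bits, then channel, control unit, and device, most significant first); the function name `decode_io_address` is hypothetical.

```python
def decode_io_address(address):
    """Split a 24-bit I/O address into (channel, control_unit, device).

    Assumed layout, most significant bits first:
      8 bits zero | 8 bits channel | 4 bits control unit | 4 bits device
    """
    if address < 0 or address >> 16:
        raise ValueError("the leftmost eight bits of the address must be zero")
    channel = (address >> 8) & 0xFF      # up to 256 channels
    control_unit = (address >> 4) & 0xF  # up to 16 control units per channel
    device = address & 0xF               # up to 16 devices per control unit
    return channel, control_unit, device


# Channel 1, control unit 10, device 3.
print(decode_io_address(0x01A3))
```

A device reachable over two paths would simply decode to two different (channel, control unit) pairs, one per address.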
Channel Instruction Execution
• When an I/O instruction is executed, the CPU sends a command to the channel,
which contains three things:
– Channel
– Control Unit
– Device Address
• When an I/O operation is started, the channel reads the Channel Address Word
(CAW) from location 72 of main memory.
• Memory address 72 is written by the program that called for the execution of
the I/O operation.
• The Channel Command Word (CCW) is fetched and decoded by the channel.
• The I/O program can be made up of one or more CCWs, all of which must
already be stored in main memory.
• A CAW is made up of the following three fields.
– 1: Key: A 4-bit access key is associated with every 2K or 4K-byte block
of memory. The key in the CAW is used by the channel whenever a
reference is made to a main memory location during the I/O process.
– 2: S bit: When set indicates that it is possible for the CPU to suspend and
later resume the I/O process.
– 3: CCW Address: The location of the first CCW to be used for this
I/O operation.
370 Control Command
• Channel programs that are made up of one or more CCWs are held together
using branching and chaining.
• They can be chained together using either data or command chaining.
• In both cases when the command has finished executing the channel will get
the next CCW in the sequence.
• A CCW is made up of the following four fields.
– 1: Command Code: Essentially the opcode. It tells the channel what type
of operation to perform. The Command Codes include modifier bits that
are device specific.
– 2: Data Address: Specifies the starting location in memory for a data
transfer.
– 3: Flags: Specify additional information about the operation to be
performed.
– 4: Count: Specifies the number of bytes to be transferred in this
operation.
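The four fields and the chaining behaviour can be sketched as a small Python model. Treat the details as assumptions rather than a statement of the architecture: the packing used here (8-bit command, 24-bit data address, 8-bit flags, 8 unused bits, 16-bit count) and the flag values `CD` and `CC` follow the commonly described 8-byte CCW layout, the command codes in the example are only illustrative, and `run_channel_program` does nothing but walk the chain.

```python
import struct
from dataclasses import dataclass

CD = 0x80  # chain-data flag (assumed bit position)
CC = 0x40  # chain-command flag (assumed bit position)


@dataclass
class CCW:
    command: int  # 8-bit command code (the "opcode")
    address: int  # 24-bit starting data address
    flags: int    # 8-bit flag field
    count: int    # 16-bit byte count

    def pack(self) -> bytes:
        word0 = (self.command << 24) | (self.address & 0xFFFFFF)
        word1 = (self.flags << 24) | (self.count & 0xFFFF)
        return struct.pack(">II", word0, word1)

    @classmethod
    def unpack(cls, raw: bytes) -> "CCW":
        word0, word1 = struct.unpack(">II", raw)
        return cls(word0 >> 24, word0 & 0xFFFFFF, word1 >> 24, word1 & 0xFFFF)


def run_channel_program(memory: bytes, first_ccw: int) -> list:
    """Fetch CCWs starting at first_ccw (the address the CAW would
    supply) and keep going while a chaining flag is set."""
    executed = []
    addr = first_ccw
    while True:
        ccw = CCW.unpack(memory[addr:addr + 8])
        executed.append(ccw)
        if not ccw.flags & (CD | CC):
            return executed
        addr += 8  # next CCW in sequence


# Two chained commands stored at address 16 of a tiny "main memory".
mem = bytearray(32)
mem[16:24] = CCW(0x02, 0x001000, CC, 80).pack()
mem[24:32] = CCW(0x01, 0x002000, 0, 40).pack()
for ccw in run_channel_program(bytes(mem), 16):
    print(ccw)
```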
IBM 370 I/O Channel
• Start I/O (SIO): Used for initiating an input/output operation that involves
sensing the status of a device, controlling the device, and data transfer between
the device and main storage. This instruction causes an I/O channel to
fetch a Channel Address Word that will begin the I/O operation. The SIO
instruction is initiated if both the subchannel and device are available, the
channel is available or in the interruption-pending state, and errors have not been
detected. The CPU is not released until the above conditions are checked and
the device is selected.
• Start I/O fast release (SIOF): Similar to SIO and used on block multiplexor
channels. The main difference is that with SIOF, the CPU is released as soon
as the Channel Address Word (CAW) is fetched by the I/O processor.
• Test Channel (TCH): Tests whether the channel is operational, whether it is
operating in burst mode, and whether it has any pending interrupt requests. This is
primarily used to monitor performance.
• Test I/O (TIO): Tests the status of not only the channel, but also the
subchannel and the device. This instruction is used to monitor status or to
respond to an interrupt.
• Store Channel ID (STIDC): Used to obtain information about an addressed
channel, such as channel type (selector, byte multiplexor, block multiplexor),
and model number.
IBM 370 I/O Channel
• Halt I/O (HIO): Causes the current I/O operation to be halted. A channel,
subchannel, or device may be specified. This instruction provides the CPU
with a means of terminating an I/O operation before all data have been
transferred. This could be done to free a selector channel for a higher-priority
operation, or to provide real-time control on a multiplexor channel.
• Halt Device (HDV): Similar to HIO. Used primarily on block multiplexor
channels to halt a specific device without interfering with other channel
operations in progress.
• Clear I/O (CLRIO): Serves the same purpose as TEST I/O, and is used
instead of TEST I/O for block multiplexor channels. The Clear I/O instruction
may also cause an I/O operation to be suspended pending interrupt processing.
• Clear Channel (CLRCH): Causes the Channel to conclude operations on all
subchannels. Status information and interruption conditions are reset on all
subchannels, and a reset signal is issued to all assigned I/O devices.
• Resume I/O (RIO): Causes a currently suspended channel-program execution
to be resumed with the device
IBM 370 I/O Interface
• Bus Out (9): Used to transmit addresses, commands, control orders, and data to the
control units. Consists of eight information lines and one parity line.
• Bus In (9): Used to transmit addresses, status, sense information, and data to the
channel. Consists of eight information lines and one parity line.
• Outbound Tags (3): Identify the type of information present on BUS OUT: the
control unit address (ADDRESS OUT), the command (COMMAND OUT),
and data requested by the control unit (SERVICE OUT).
• Inbound Tags (3): Identify the type of information present on BUS IN: the
address of the responding control unit (ADDRESS IN), status information
(STATUS IN), and data associated with the current I/O operation (SERVICE IN).
• Scan Controls (4): Used for polling and selection of control units. SELECT OUT and
SELECT IN form a loop from the channel through each control unit and back. HOLD
OUT provides synchronization, REQUEST IN indicates that the control unit is ready to
present status information or data and is requesting a selection sequence.
• Interlocks (2): Used to ensure that only one control unit is communicating with the
channel at any given time. OPERATIONAL OUT must be up for the other lines to have
significance. When it is down, all lines must drop any operation currently in progress
and must be reset. OPERATIONAL IN signals the channel that the control unit is
selected and communicating.
• Special Controls (4): Used for metering time, suppressing other lines, and other
specialized control purposes.