15-297: Wireless Sensor Network Programming

Gilman Tolle
Summer 2005




1.  What are sensor networks good for?

2.  What hardware is making wireless sensor networks possible?

3.  What principles shape the design of WSN applications?

4.  How does the nesC component model support WSN application design?

5.  How does nesC support event-driven programming?

6.  How does TinyOS provide local operating system services?

7.  How does TinyOS provide local communication services?

8.  How do wireless sensor network nodes communicate?

9.  How do WSN nodes build a reliable neighborhood abstraction over local broadcast?

10. How can data be efficiently disseminated to the nodes in a multi-hop wireless sensor network?

11. How can data be collected from the nodes in a multi-hop wireless sensor network?

12. How can naming and database techniques improve WSN applications?

13. How can a network of sensors localize itself in time?

14. Project 1 - Design a wireless sensor network

15. Project 2 - Data acquisition, local display, and component abstraction

16. Project 3 - Local neighborhood data sharing

1. What are sensor networks good for?

People have always used technology as a means to view the world. The telescope can be compactly described as a

technology intended for viewing far-away things. Similar technologies include the magnifying glass and the microscope

for small things and the supercollider for atomic events. Along the way, the responsibility for analyzing and presenting the

data obtained from these devices has been shared with computers. We use simulations to understand events that would

be too complex to grasp, like the explosion of the atomic bomb or the weather. These simulations are based on the
technology of statistical modeling, used to extract simple principles from complex and incomplete data.


Ecologists, economists, and social scientists, in contrast to the physicists, chemists, and biologists, have not been as

well-aided by technology in viewing the bigger picture, the large-scale system. One of the most used observational

techniques in these fields is the survey. A few data points are collected about a complex system, usually with a lot of

human labor, and then the results are generalized to the system as a whole. But, the survey suffers from a lack of

density in both space and time: it takes too much work to survey every person, count every car, or measure every plant,
and it's impossible to keep monitoring everything continuously.


The technology we're going to study in this course, the wireless sensor network, is a step towards more detailed

observation of the bigger picture. It may be helpful to think of the sensor network as a "macroscope", a device for seeing

large. What does it take to see large? It takes a lot of sensors, well-distributed over the necessary space, measuring the

phenomenon frequently enough to capture it. It takes a technological infrastructure for collecting this data and

transmitting it to a human observer. Finally, it takes an analysis technique that can make this embarrassment of data
more comprehensible, in order to extract meaning.


When studying and applying the macroscope, it helps to keep these questions in mind: what is the needed sampling

density? How frequently does the phenomenon change over time? How can we distribute the samples so as to cover the

range of the phenomenon in space? As you may know from acoustic sampling theory, it's impossible to measure a wave

that changes at a rate greater than half of the sampling rate (the Nyquist frequency). Well, large-scale events have a

spatiotemporal Nyquist frequency of their own, and just figuring out that frequency can go a long way towards
understanding the phenomenon.
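
As a rough sketch of this constraint (the numbers below are illustrative, not taken from any particular study): a sampling rate f_s can only capture temporal frequencies up to f_s / 2, and a sensor spacing of Delta-x can only capture spatial wavelengths down to twice that spacing:

    f_s \ge 2 f_{\max} \quad \text{(temporal)}, \qquad \Delta x \le \frac{\lambda_{\min}}{2} \quad \text{(spatial)}

So a phenomenon that varies on a 10-minute timescale demands a sample at least every 5 minutes, and one that varies over 4 meters of height demands a sensor at least every 2 meters.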


To better understand what the wireless sensor network macroscope can do, let's study an example of what it's already
done.


Consider the coastal redwood tree: the tallest species of tree in the world, it can survive only in a relatively small

portion of the California coastline, shrouded in thick fog that comes and goes unpredictably. A tree is utterly dependent

on the water it pumps from the ground into the air. In fact, over 70% of the water flowing through the water cycle moves

through trees instead of evaporating directly from the ground. But, the ability of a tree to move water is highly dependent

on the local microclimate. Sunlight causes stomata in the leaves to open, and then the difference between water vapor

pressure in the leaf and water vapor pressure in the air causes water and gases to move through the stomata. The vapor



pressure of water is a function of local temperature and relative humidity. So, by monitoring sunlight, temperature, and
humidity, we can start to understand the microclimatic variables that affect the ability of a redwood tree to live.


The local interaction between the leaf and its microclimate is well-understood. But, consider the whole canopy. We often

assume that weather is relatively homogeneous: the humidity a short walk away is pretty similar to the humidity right here,

but that humidity is not what directly affects the tree. To understand the complex interaction between an entire

tree and its local microclimate, we need to measure the microclimate right at the tree. Now take that short walk again,

directly upwards. A mature redwood tree can be 80 meters tall. The humidity at the base has a much lesser influence on

the leaves than the humidity at the canopy. Because a redwood tree takes up so much space, we can't just take one

microclimate reading at a weather station and assume it applies to the whole of the tree. To understand the interaction

between the tree and its microclimate, we need spatially dense sampling. And as we've all experienced, weather

changes quickly. Fog rolls in, dew forms, rain falls, and sunlight dries out the forest. We need temporally dense sampling
too.


How can we gather dense data on the complete microclimate around a tree? We could put a sensor platform on a winch,

and haul it up and down the tree as it takes readings. Researchers have done this, and have gathered some good data,

but we can't observe a whole tree with just one sensor, no matter how close it gets to that tree. So, we need many

sensors. How are we going to get the data from these sensors? We could run a cable from each sensor to the ground,

and collect the readings directly. But, draping an 80 meter tall tree with cables is no easy task, and could actually disrupt

the growth of the tree. So, maybe we could give each sensor the ability to record its data locally, and then send a grad

student up into the tree every week in order to dump each of the sensors. Have you ever climbed a redwood tree? It's

really hard. So, we need wireless networking. We could give each sensor a radio transmitter and let it beam its data back

to an antenna on the ground. But, long-distance radio transmission does take a lot of energy. We stopped the grad

student from collecting data every week, but she still might have to climb up there to replace the batteries. To save

energy, we can make all of our radio transmissions short-range: long enough to reach some of the other sensors, but not

long enough to reach the ground. Then, as long as every sensor is able to receive data from other sensors, store it, and

forward it along, we can collect data from a large area with little energy and minimal labor. What we have now is a

wireless network of sensors, and a network needs processing power. Each simple sensor becomes a networked

computer, with sensing and a radio transceiver, and finally, we have a macroscope that can continually gather dense
data on the complete microclimate around a tree.


Researchers at UC Berkeley did exactly that. With a wireless sensor network, they were able to monitor the microclimate

throughout the height of a 70-meter tall redwood tree for 44 days. These two numbers define the size of the "deployment

envelope" in both space and time. Because this was the first deployment, the researchers only considered the height

dimension, ignoring the distance from the tree trunk and the radial location around the tree. Most of the sensors were
placed close to the trunk, and most of the sensors were placed on the west side.


Having defined the size of the deployment envelope, we can now talk about the sampling density within that envelope.

33 nodes were deployed in an irregular pattern, but on average were spaced every 2 meters. Each node took a

reading every 5 minutes from each of 4 sensors: temperature, relative humidity, photosynthetically active solar radiation coming

in from above, and reflected PAR coming in from below. Each data point was immediately forwarded through the network
to a small PC-class base station suspended from a second tree, and then stored into a local database. Some readings

were retrieved over a cellular modem, but the bulk of the data was retrieved each week by driving out to the site and
hooking a laptop up to the base station. Once collected, the data was visualized with MATLAB.


In this example, we see the main architectural components of a sensor network:


        data acquisition by sensors

        data transport within the network

        data storage at a network gateway

        data transport over a long-distance backhaul to the observers
        data analysis by the observers


Each point in this collected dataset has a location in a 3-dimensional space: the time dimension, the height dimension,

and the space of possible readings from that sensor. We can see the wide range of microclimate variables by projecting

the 3-dimensional data onto the sensor dimension and visualizing it with a histogram. We can study the changes over

time by projecting onto the sensor and time dimensions.


But, the reading-range and time-series views would also be obtainable from a single sensor placed within the tree. The true benefit of the

macroscope becomes visible when we project onto the sensor and height dimensions. Now we can see exactly how

incident light falls off as it moves through the canopy. For the temperature and humidity data, the variability over time has

overwhelmed the differences over height, but we can make the spatial variability "pop out" by subtracting out the mean

reading for each timestep and then re-plotting the data. Now we can see how the temperature is lower under the canopy,
and how the humidity is lower at the top.


Finally, we can consider all three dimensions simultaneously: time, height, and sensor readings. We make the

visualization easier by using our own time dimension: a movie. Now, we can see the exact curvature of the microclimatic
variables at any instant in time, and then see how the shape of that curve changes from minute to minute.


This highly-detailed data can now be used by biologists to validate their own models of redwood tree microclimate. Other

aspects of a tree's behavior can be monitored, like sap flow rate, and then the correlations between microclimate and

sap flow rate can be studied directly. The macroscope doesn't just collect data: it collects data that can be used to test
theories, and data that can be used to make decisions.


The theories testable by macroscope data are not only quantitative and scientific. The second major use of wireless

sensor networks is for event detection. Imagine that you own an oil pipeline in the middle of the desert, and want to make

sure that nobody can get close enough to sabotage it. With sensor network nodes that can detect moving people and

vehicles, you can set up a networked tripwire along the length of the pipeline. Now, the spatial deployment envelope has

2 dimensions -- the length of the pipeline, and a few hundred meters on either side, but the temporal envelope is trickier.

You're not interested in collecting data at a fixed frequency from all sensors, but instead, you only want to collect data

from the sensors that have detected a potential problem. This introduces another element of our sensor network picture:
detection logic in the nodes.


This project is also real, and was demonstrated by a group of researchers from Ohio State. They were able to detect

some moving people and vehicles using a 1000-node network spread over a 1.2 kilometer by 200 meter grid. Each node

used a combination of passive infrared sensing (like a motion detector on a door), acoustic sensing, and magnetometers

to detect moving metal objects. Logic on the node was used to fuse these three sensor readings together, and report a

detection with a certain level of confidence. If the confidence level was over a particular threshold, then the data was

forwarded through the network back to a base station. In fact, because this network was so large, it would have taken too

long for the data to move from the center of the network to the edge using only the other sensor nodes. This project

deployed 100 separate base stations throughout the space, and attached them together using an 802.11 network. This

high-capacity, high-power network formed the second tier of the transport network. Eventually, all the data was routed to
a single PC, which displayed the detected targets in their correct location.


A third class of sensor network lies somewhere in between the environmental monitoring class and the event detection

class. This sort of network monitors specific things, in space. Researchers at Harvard University have a sensor network

called VitalDust. Each sensor contains a pulse oximeter that can be attached to a person's finger, and can monitor basic

vital signs. The network is intended for rapid deployment at a mass disaster to assist rescue workers in determining

which people are in need of the most help. Here, the spatial deployment envelope is a large disaster scene, but the
spatial density must be shaped in a very specific way -- one sensor per person.


Many other projects can be placed into these three classes of sensor network, and other networks certainly fall in
between them.


        monitoring the environment

        monitoring things
        detecting events


The underlying concept remains simple: dense monitoring of large spatial regions as a new way to see the forest, and

see the trees too.




2. What hardware is making wireless sensor networks possible?

Wireless sensor networks can be viewed as the latest example in the trend towards computing miniaturization and

personalization. From mainframes, to PCs, to laptops, to PDAs, to cell phones, the number of people per computer has

been steadily falling, and computers are becoming more distributed, more mobile, and more embedded in everyday life. The

best way to reduce the number of people per computer is to put a given level of functionality into a smaller, cheaper,

lower-power unit. One step smaller, and we can see wireless sensor network nodes, hundreds for every person,
streaming information to and from the physical world.


There are three trends in computer system design that are making wireless sensor networks possible:


        complete systems on a single chip

        integrated low-power radio communications
        integrated low-power transducers


The sensor network platform that we'll be using in this course is the Telos rev. B, developed at UC Berkeley and now

sold by Moteiv Corporation. The Telos platform is the latest result of these three trends, but is preceded by a long history
of wireless sensing and networking platforms.


During the Vietnam war, the Air Force deployed a project called Igloo White into the jungle. Igloo White used the sensor

network principle of dense instrumentation over a large geographic area in order to detect moving convoys of enemy

trucks. A single Igloo White node, called an ADSID, for Air Delivered Seismic Intrusion Detector, used a sensitive

seismometer to detect vibrations from moving people and vehicles. Each ADSID was about 3 feet long, weighed 38

pounds, and was shaped like a spike so that it could penetrate the ground and remain upright and camouflaged among

the foliage. The ADSID contained a 2-watt analog FM transmitter tuned to a unique frequency for sending detection

events back to overflying airplanes. These devices didn't do any collaborative forwarding, so every node needed to reach

the airplane directly. At full power, reporting a reading every 3/4 second, the node would last for 48 hours. Clearly, this

was not intended for long-lived deployment. But, the designers did understand the tradeoff between data quantity and

lifetime, and included a low-power mode that only reported every 10 seconds and extended the lifetime to 45 days. The

data received by the airplanes was relayed back to a control center using a higher-powered radio (a second tier backhaul
network), where it was analyzed by computers and people. The locations of detections were shown on a wall-size display.


The system produced mixed results, being very vulnerable to false alarms. The device could detect a single person's

footsteps 30 meters away, as long as nothing else was going on. It being war, of course something else was going on.

Earth tremors, wind, thunder, aircraft, helicopters, and bombs created enough noise to drastically raise the false alarm

rate for seismic detection. Vietnamese soldiers commonly disabled them, and once it was known that the Air Force would

occasionally bomb regions based only on this sensor data, they spoofed the sensors in order to waste American resources. It's

unclear how much the system actually helped the war effort, but it was incredibly helpful for wireless sensor networking
technology.



Move ahead to 1970, and we find the first use of digital packet radios for wireless networking in the ALOHAnet project.

ALOHAnet nodes were scattered across Hawaii, and were organized into a star topology in which each node transmitted

packets directly to a central receiver. In contrast to the ADSIDs, the ALOHAnet nodes all operated in the same frequency

band. This led to the development of wireless medium access control, which we will discuss later. The nodes could

transmit 88-byte packets at a rate of 9600 bit/sec. However, these nodes were used for interactive computing, not remote
sensing.


In 1979, we see the first use of digital packet radios for mesh networking: the DARPA Packet Radio project. Each PR

node could transmit at up to 400kbit/sec, on the 1.7 GHz frequency band, and was located in a moveable van. To form a

mesh network, each node collected information about the other nodes it could reach with its radio, and built the

necessary routing tables in order to forward packets to any other node. These nodes could be remotely debugged and

reprogrammed over the network, a necessity for a large mesh network. But, the nodes were heavy, expensive, required
a lot of power, and had very low processing capability.


In 1994, the idea of Low Power Wireless Integrated Microsensors was proposed by Pottie and Kaiser at UCLA, and in

1996, the first shoebox-sized devices were produced. By 1998, handheld PC104-sized devices were in production. In

1999, researchers in the Smart Dust project at UC Berkeley made a "COTS mote", using Commercial Off-The-Shelf

technology to build the smallest, cheapest node that actually worked. This mote, called WeC, can be considered the
earliest direct ancestor of the devices that we're using today.


The first major trend, remember, was complete systems on a chip. The WeC was designed around such a chip, a 4MHz

Atmel microcontroller. In addition to a standard processing core, this microcontroller contained 512 bytes of RAM for

program execution and 8KB of reprogrammable flash storage to hold program code. Integrated program storage and

RAM dramatically simplified the design of the mote, but the tiny amounts of both required a different programming
mindset that led to the development of TinyOS, which we'll discuss later.


The microcontroller provided a software-controllable counter that could interrupt the chip at a programmable frequency.

On top of such a counter, it becomes possible to build a timer that could be used to periodically execute code, a

necessity for a responsive interrupt-driven system. For super-low-power operation, the counter can be driven by a slower
off-chip oscillator, instead of the high-frequency on-chip oscillator.


In addition, the microcontroller in WeC included an on-chip analog-to-digital converter with 10-bit resolution. The simplest

sensors, like photoresistors and thermistors, are analog devices with a voltage drop that changes in proportion to the
natural phenomenon. An on-chip ADC makes it much easier to connect these sensors.


In the third major trend, we see integrated low-power transducers. These advanced sensors contain their own analog-to-

digital converters, because it allows a better fit between the characteristics of the ADC and the sensor, and more precise

control over calibration. These digital sensors could communicate with the microcontroller over a digital bus, examples of
which include SPI, I2C, and OneWire.




When you have a small integrated system, of course, you need a way to get data out of it. The most basic form of data

output was the LED. WeC contained 3 directly-controllable LEDs, which turned out to be so useful for debugging that all

subsequent motes still provide them. For something more complex than simple on-off debugging, the WeC

microcontroller contained a UART communications controller which provided a serial interface to a host PC device. The

UART controller could transmit data, one byte at a time, and provided an interrupt to the processor when transmission
was complete or when a byte had been received.


More generally, these microcontrollers contain integrated interrupt controllers to coordinate all aspects of the system.

The timer interrupt, the UART send and receive interrupts, the ADC-complete interrupt, and the digital bus send and
receive interrupts are all handled directly by the chip and provided to the software.


Finally, these integrated microcontrollers must be programmed somehow. The original microcontroller on WeC provided

a direct parallel interface for downloading code. In addition to direct programming, self-reprogramming is an extremely

desirable trait, because it could allow the running program to be changed at runtime. To do this, WeC included a

coprocessor that could read program code from an off-chip flash memory device, and then reprogram the main
microcontroller.


The second trend, just as important as integrated microcontrollers, is integrated low-power radio communications. WeC

used the RFM TR1000, which could transmit about 15 meters on the 900MHz band at 19.2 kbit/sec. Clearly, with such
a short-range radio, mesh networking would be required to cover any large-scale deployment envelope.


The broader trend in mote microcontrollers is increased integration, which saves on both power and complexity. The

original vision for Smart Dust was that sensing, analog-to-digital conversion, processing, communications, and energy

scavenging could all be included on a single piece of silicon. One such chip was created, but the vision may have been a
bit ahead of its time.


Along the way, a general sensor and peripheral connection interface was created, and a number of integrated sensor

boards were built to fit it. Each sensor board was designed around a specific task. The basic sensor board included light

and temperature sensors, which used the on-chip ADC. A more advanced sensor board included a 2-axis accelerometer,

a magnetometer, a microphone, and a matched pair of tone generator and tone detector. A third sensor board included

an ultrasound chirp generator and an ultrasound detector. A fourth board was designed specifically for weather

monitoring, and included highly-calibrated total solar radiation detectors, photosynthetically-active radiation detectors,

relative humidity, barometric pressure, temperature, and acceleration. This was the basis for the board used in the

redwood tree monitoring application we discussed last class. These general-purpose sensor boards were the basis of

many application prototypes, including acoustic ranging, moving object tracking, vibration detection, and environmental
sensing, all of which we'll discuss in detail later.


Fast-forward a few years, and we come to the current generation of hardware, called Telos. The basic design has
remained the same: integrated microcontroller, radio chip, flash storage chip.




The new TI microcontroller includes more program storage space (48 KB), more RAM (10 KB), and runs at about the

same speed. Compared to traditional programming environments, the amount of processing power and memory is quite

small, and is likely to remain so for some time. The exponential expansion described by Moore's Law is constrained here

by the energy needed to drive the extra silicon. So, one design principle of wireless sensor network applications is to be

extremely aware of computational limits. This design principle has been reflected in TinyOS, which we'll discuss in a few
days.


The new Chipcon radio has a longer range (50 meters), transmits at a faster speed (250 kbit/sec), and is a true packet

radio that accepts whole packets and relieves the microcontroller of the burden of sending them byte-by-byte. However,

low-power radio communications remain "messy", and we can't just overcome the messiness by brute-force transmission
of redundant data due to the increased energy cost. We'll discuss this at length in a future lecture.


The biggest difference is a rethink of the interconnection system. All previous motes have used a parallel port for

downloading new programs into the microcontroller, and a serial port for exchanging data with the microcontroller. As we
all know, these ports were going the way of the dinosaur, and so Telos switched to using the USB port for both functions.


The general sensor board connection interface was also abandoned, for several reasons. The designers of the Telos

platform have argued that a customized printed circuit board should be created for any "real" wireless sensor network

deployment, and that the custom board should have integrated sensing. This removes the "weak link" of the connector,

which is more vulnerable to jostling and corrosion. However, for experimentation purposes, the Telos board does include

a simple expansion interface that provides a few ADC connections, UART, I2C digital bus, and basic digital I/O. Whether
a similar number of sensor boards will be created for this interface remains to be seen.


A wireless sensor network node, like any general purpose computer, is composed of a processor, memory, storage,

communications, and I/O devices. But, "wireless" doesn't just apply to the communications -- it applies to the power

source as well. The most important piece of the wireless sensor network node is the energy storage device, usually a

battery. Above all else, it is awareness of this finite energy reserve that drives the design of the remaining system.

Wireless energy scavenging devices can help stretch the energy reserve, but the length of any sensor network

deployment envelope is ultimately shaped by the amount of energy available to the node. When the energy runs out,

either the deployment is over or a person must be sent to the field to replace the node's batteries, and then we might as well be
sending a person out to collect the data also.


Let's start by examining the available energy storage options. The basic AA alkaline battery stores 2850 milliamp-hours

of energy. For comparison, a typical LED draws about 6 milliamps of current. This LED will stay on for about 20 days,

and towards the end of that time period, it will grow dimmer as the battery voltage drops below the rated output of 1.5
volts.
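
The 20-day figure is just battery capacity divided by current draw:

    t = \frac{2850\ \text{mAh}}{6\ \text{mA}} \approx 475\ \text{hours} \approx 20\ \text{days}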


Now let's consider the simplest and most effective form of energy scavenging: solar power. A representative 30 cm^2

solar panel can generate a current of 40 mA at 4.8 volts, about 6 mW/cm^2 in direct sunlight. This is far below the total

amount of available solar energy, approximately 100 mW/cm^2, but the efficiency of solar cells has been slowly



improving. Of course, direct sunlight is never continuously available. Other experimental energy scavenging techniques
include vibration scavenging, which could capture about 2 micro-Watts per gram of vibrating material.
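
To check the quoted power density (assuming the panel delivers its full rated output):

    P = 40\ \text{mA} \times 4.8\ \text{V} = 192\ \text{mW}, \qquad \frac{192\ \text{mW}}{30\ \text{cm}^2} \approx 6.4\ \text{mW/cm}^2

which is roughly 6% of the ~100 mW/cm^2 of incident sunlight.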


So how much energy do these devices need? As microcontroller developers have become more aware of this battery-

powered low-energy design space, the power required to actively run a processor has been dropping. The Telos
MSP430 microcontroller has an active power draw of about 3 mW, enough to run for a month or two on standard AA batteries.


But, this ignores the cost of running the radio. Transmitting a message, the radio draws power at 35 mW, and this cost

has remained roughly constant over the past few years. The energy required to transmit a given distance d is proportional to

d^n, where n varies based on the environment. In free space n is about 2, and near the ground, n is about 4. Indoors, we

see even more complications, due to reflection and scattering. This suggests that several short-range transmissions

would be more energy-efficient than one long-range transmission. Transmitting just requires a lot of energy, but
fortunately, is not likely to be happening constantly.
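
Here is a quick sketch of why several short hops can beat one long transmission under the idealized d^n model (it ignores per-hop receive and processing overhead, which matters in practice):

    E_{\text{one hop}} \propto d^{\,n}, \qquad E_{k\ \text{hops}} \propto k \left(\frac{d}{k}\right)^{n} = \frac{d^{\,n}}{k^{\,n-1}}

For n = 2, four hops of length d/4 cost a quarter of the energy of one hop of length d; near the ground, where n is closer to 4, the savings are even larger.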


However, listening is. In addition, as radios have become more complex, the cost of listening for a message has risen

drastically. The radio used in the first UCB motes drew 9 mW of power just while waiting for messages. The radio in the
newest Telos motes now draws 38 mW of power while listening. The real energy cost of radio usage is not in transmitting
messages, but in waiting to receive them.


We can learn something from considering the relative costs of the processor and the radio. Processing one CPU

instruction at a power consumption rate of 3 mW on a 4MHz clock requires 0.75 nJ. Sending or receiving one bit at a

consumption rate of 35 mW on a 250kbit/sec channel requires 140 nJ, around 200 times the energy. Running the radio

costs so much more energy than running the processor. If you can use processing to save radio transmission time, this is

also a huge energy savings. This suggests that in-network processing, compressing streams of readings, and using on-

mote detection logic instead of sending back raw data are all big energy wins. Of course, these solutions introduce their
own complexities as well.
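
The per-operation energies come from dividing power by rate (assuming one instruction per clock cycle):

    E_{\text{instr}} = \frac{3\ \text{mW}}{4\ \text{MHz}} = 0.75\ \text{nJ}, \qquad E_{\text{bit}} = \frac{35\ \text{mW}}{250\ \text{kbit/s}} = 140\ \text{nJ}, \qquad \frac{140\ \text{nJ}}{0.75\ \text{nJ}} \approx 190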


When we look at the combined cost of running the platform, 3 mW for the processor and 38 mW for the radio, one thing

becomes apparent. The total cost of 41 mW will exhaust a pair of AA batteries in about a week. Clearly, this is far too

short for an effective WSN deployment. Discussions with people interested in using wireless sensor networks suggest

that the minimum cost-effective deployment envelope is a year or two. But, even the older simpler motes would have still
exhausted those batteries in 2 or 3 weeks. How are we going to solve this problem?




3. What principles shape the design of WSN applications?

Let's start with a hardware principle, before we move on to the organizational principles. As we learned last lecture,

wireless sensor network hardware is fundamentally limited by energy. For example, the Telos platform requires 41 mW

of continuous power when actively processing and communicating. Running on 2 typical AA batteries with energy
storage capacity of 2850 mAh, an active Telos mote will die in about a week.
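
The week figure follows directly, assuming the two cells in series supply roughly 3 V:

    t = \frac{2850\ \text{mAh} \times 3\ \text{V}}{41\ \text{mW}} = \frac{8.55\ \text{Wh}}{41\ \text{mW}} \approx 209\ \text{hours} \approx 8.7\ \text{days}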


Because wireless sensors are intended to assist human information-gathering abilities, it's not cost-effective to send a

single human out to interact with a single mote more often than once a year. So, how are we going to get this mote
platform which would die in a week to last for a year?


The solution is to let the mote sleep nearly all of the time.


Any microcontroller being used in a mote must have the ability to enter into a low-power sleep mode. In this sleep mode,

the MSP430 processor consumes 15 micro-watts, 200 times less power than when active. A sleeping processor would

take over 20 years to discharge the batteries. In addition, the radio can be turned off completely by the processor, and

turned on when necessary. In fact, batteries leak charge faster than a sleeping mote draws it, and would discharge
themselves before the sleeping mote could drain them.


There are two kinds of sleeping that contribute to mote longevity: unscheduled sleeping and scheduled sleeping.

Unscheduled sleeping results from careful operating systems design and application design. Here's a design principle:

motes are fundamentally reactive devices. The standard computational model of "program runs to completion" does not

apply to a system embedded in the physical world. Thus, when there's nothing going on in the world, there's no reason

for the mote to remain active. With careful operating system support, the mote processor can be put to sleep whenever

the mote has finished a task, and remain asleep until the next task must be performed. The processor can then be

woken up by an interrupt from a peripheral like the radio or a sensor with internal detection logic, acquire data, process it ,

communicate it, and then return to sleep again. In this model, the radio can be thought of as just another data source,
sensing what the other motes are doing.


Scheduled sleeping, or "duty cycling", has the potential to offer even more energy savings. A duty cycle is built from
answering two questions: how often does the mote have to be awake, and how long does it have to stay awake? By

careful consideration of the environmental phenomenon under study, we can decide how often the mote has to be awake

in order to effectively capture that phenomenon. Thinking back to the microclimate study, each mote only sampled its

environment every 5 minutes. So, the mote only has to be awake once every 5 minutes. Then, we need to think about

the complexity of the task. The mote has to take readings from several sensors, send its data, and participate in the

forwarding process in order to get data from every mote to the base station. When planning the study, the designers

decided that 3 seconds of awake time would be enough to complete these tasks. Thus, we have 3 seconds of awake
time for every 5 minutes of total time, or a duty cycle of 1%.




The 1% duty cycle is commonly considered to be the "right" one for long-term mote operation. With a 1% duty cycle, we

can calculate the effective power consumption of the mote, and then calculate its new lifetime. If a mote consumes 0.015

mW 99% of the time, and 41 mW 1% of the time, it effectively consumes 0.424 mW. This extends the life of the mote
from 1 week to nearly 2 years. Problem solved.
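
The arithmetic behind that claim:

    P_{\text{eff}} = 0.99 \times 0.015\ \text{mW} + 0.01 \times 41\ \text{mW} \approx 0.42\ \text{mW}

a roughly 100-fold reduction, which stretches one week of lifetime to nearly two years.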


But, what do we lose with duty cycling? Radio throughput goes down to 1% as well, and if we haven't accurately

estimated the traffic demands of the network, then we may have to buffer data or lose it. We can only read our sensors

1% as often, and if we haven't accurately estimated the environmental phenomenon, we may not capture it. With these

sorts of long-term duty cycles, we also lose interactivity. If the mote is only awake every 5 minutes, we have to wait 5

minutes for the mote to respond to any of our commands. So, a more advanced solution would be to vary the duty cycle

as necessary. We could leave the motes in a 1% duty cycle as long as they're collecting data, and then switch them to a

100% duty cycle for a short period of time in order to interactively exchange data with them. We'd still have to wait 5
minutes for the first command to go through, but then we'd have full interactive control for as long as we need.


Effective duty cycling requires the ability to quickly wake up the processor, and let it go to sleep just as quickly. The

MSP430 processor can be woken up in 6 microseconds, and the radio can be woken up in a few hundred

microseconds. Now think about waking up your laptop from "suspend" mode. If you're lucky, it may be ready to start work

in 3 seconds. But since we've only allocated 3 seconds for the mote to perform all of its useful work, fast wakeup becomes

critical. No matter how fast the hardware can wake up, if the operating system takes a long time to get ready to run,

we've lost the advantage. So, TinyOS has been explicitly designed for quick startup, as well as the quick sleep required

by unscheduled sleeping. TinyOS wastes no time on context switching, memory copying, or other preparations for
execution, as we'll see later.


So, we've solved the energy problem. There are many other design principles that shape wireless sensor network
system design. Here are a few:


        Multiple spatially-separate nodes must coordinate.
        Each node is limited in computational and communications ability.
        The nodes are too numerous and inaccessible for direct interaction.
        The application drives the design of the network.


Let's consider some of the implications of these design principles.


Multiple nodes must coordinate in order to perform a task. This means that every wireless sensor network application

must be designed to run on a distributed system. Distributed applications are constantly concerned with the problem of

consensus, deriving from the fact that it's physically impossible to transfer information to every node instantly in all but

the simplest systems. Not every node will receive every piece of information, and therefore every node will have a

different view of the world. Therefore, your applications must be designed around approximate consensus. It's okay if all

of the nodes aren't exactly synchronized, as long as they continue to fulfill an abstract upper-level goal. For example,

your routing tree may not be optimal, indeed, cannot be, but as long as it's moving "enough" data, it's fine. But, "enough"

must always be specified in probabilistic terms. How much is enough? 75% of the data, all the time? 100% of the data for

a while, then none of the data for a while? This problem of quantifying the performance of a distributed system will



occupy a lot of your evaluation time. You'll find yourself wanting to trade off between engineering out the complexity and
increasing the probability that useful work gets done in spite of the messiness.


Not only are there multiple nodes that must coordinate, but these nodes are spatially separated. In Internet-scale

distributed systems, we tend to assume that every node can communicate with every other node, but that the

communication will be imperfect. In spatially-distributed systems, each node can communicate with only a portion of

the network. From the viewpoint of a single node, the network is always partitioned. This division between "neighbors"

and "non-neighbors" is fundamental to the design of sensor network applications. But here's the problem: the line is

constantly moving. Because radio signals fade and shift in response to environmental conditions, a node that was my

"neighbor" two seconds ago may not be my "neighbor" now. So, any higher-level application that wishes to provide the
illusion of a connected network must be constantly responding to the shifting disconnections underneath.


The second major design principle is that each node is limited in computation and communication. One natural response

to the challenge of consensus building is to keep sharing information until every node knows about all of the information

in the network. For example, you may want every node to have a list of every other node in the network, and then for

every node, a specific neighbor that provides the best next-hop for message forwarding. This may seem reasonable for

small systems, but you always need to consider the effect of scaling up your solution. Want to keep something like 8

bytes of state for each node in the network? You'll run out of RAM if your network ever grows to more than 1000 nodes,

and probably sooner. Then consider the amount of radio bandwidth it will take to maintain this information. If every node

has to announce its 8 bytes to every other node in the network, that scales by the square of the number of nodes. Try

periodically pushing 8 megs of data through the network, and you'll find yourself rapidly running out of energy. So, after

you get past the prototyping phase of your application, you'll want to try to reduce the amount of state it takes on each
node, and the amount of communication it takes to keep that state updated.
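
To make the scaling concrete, using the Telos figures from the last section:

    8\ \text{B} \times 1000\ \text{nodes} = 8\ \text{KB of state}, \qquad 8\ \text{B} \times 1000^2 = 8\ \text{MB per full exchange}

8 KB of state nearly fills the 10 KB of RAM on a Telos mote, and pushing 8 MB per update round is far beyond what a duty-cycled radio can sustain.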


Of course, we all know that "premature optimization is the root of all evil", and this holds doubly true in sensor networks

thanks to the third design principle. There are a lot of nodes, the nodes are embedded, and you may not be able to reach

them at all. This means they're HARD to debug. Even the ability to get information in and out of the node over a direct

serial connection still relies on having some components of the system functioning. Getting information from a buggy

network in which each node must cooperate is even harder. Even gathering information from a correctly functioning
network in order to evaluate its performance may change the performance of the network.


Therefore, the best way to prototype a sensor network application is to use a simulator. In simulation, we see a tradeoff

between fidelity and scalability. A high-fidelity simulation accurately reflects the real world. A highly-scalable simulation

can simulate large numbers of nodes. At one end of the scale, we see abstractions of the algorithms involved, running in

something like MATLAB, Mathematica, or hand-coded simulations. At the other end, we see direct emulation of the

sensor network hardware, running real binary code, and using a complicated radio model to simulate communications.
Clearly, the amount of computation required to simulate the first is far less than the second.


For this class, we'll be using the TOSSIM simulator. TOSSIM uses real TinyOS code, but instead of compiling it for the

mote hardware, compiles it to be executed as a standard process on a PC. All of the calls to hardware devices like the

radio, output LEDs, or sensors are replaced with simulation functions. The TOSSIM executable can then be given

runtime parameters to set the number of nodes in the simulated network, and can be given a representation of the

connectivity graph for that network. With TOSSIM, you can include printf-style debugging statements, which you would

have a very hard time doing on a small embedded system. You can make the network less well-connected and study the

response of your algorithms. In short, you can convince yourself that your application is likely to work before you ever put

it onto sensor network hardware. I'd highly recommend that you prototype all of your applications using TOSSIM before
installing them.
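
Here's a minimal sketch of what those debugging statements look like in a TinyOS 1.x module; the counter variable and the Timer wiring are assumptions for illustration. Under TOSSIM, dbg output is selected with the DBG environment variable (for example, DBG=usr1) before running the compiled simulation binary:

 File CounterM.nc: (sketch)
  module CounterM {
      uses interface Timer;
  }
  implementation {
      int counter;

      event result_t Timer.fired() {
          counter++;
          // dbg() prints to the console under TOSSIM, and compiles
          // away to nothing when built for real mote hardware.
          dbg(DBG_USR1, "Timer fired, counter = %d\n", counter);
          return SUCCESS;
      }
  }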


Once the network is deployed, the large number of inaccessible nodes presents a different problem: you can't really

control them individually. Each router in a large-scale internetwork still needs some configuration before it can participate

in the adaptive routing structure. In a wireless mesh network, hand-configuring each node before it can join the network

would take a lot of extra work. The configuration process at deployment time should be as automatic as possible. A

single touch of the power switch, if there even is one, should be the Platonic ideal of sensor network configuration.

Practically, reaching this point can be difficult, especially for heterogeneous networks. Then, once a router is a member

of the network, an administrator might use an end-to-end management protocol like SNMP to modify parameters on that

router. In a wireless mesh network containing a lot of nodes with relatively similar functions, an administrator is more

likely to change a parameter on all nodes than on just one. Interaction with the network should support both individual
and aggregate operation, multiplying user commands as necessary.


Finally, we come to the most important design principle: the application drives the design of the network. In contrast to

general-purpose computing, which is structured around interpersonal communication and symbol manipulation, sensor

networks are structured around acquisition of physical information. There are just too many different kinds of information

available in the world to design a general-purpose sensing node, Star Trek tricorder notwithstanding. So, the purpose of
the network shapes the choice of sensors.


In addition, the purpose of the application and the size of the deployment envelope shape the design of the energy

system. The disaster-site triage network probably doesn't have to last more than a few days, while the environmental

monitoring network may have to last for years. The environmental monitoring network probably has access to solar
power, but the indoor air-quality monitoring network in the skyscraper doesn't have that opportunity.




4. How does the nesC component model support WSN application design?

Now that we've discussed the design principles and hardware making wireless sensor networks possible, let's talk about

programming. The early wireless embedded sensing systems ran on PC-class hardware, and mainly ran Linux programs.

When WSN development shifted from microprocessors to microcontrollers, Linux was clearly no longer an option.

Embedded systems applications of the time were commonly developed in procedural C or directly in assembly language.

But, programs in these languages are hard to decompose, not particularly modular, and can easily get out of control as

the complexity of the application increases. In large-scale systems, the scalability problem is often addressed with object-

oriented programming, which makes it much easier to separate complex programs into independent, composable

components. But, object-oriented programming demands dynamic memory allocation and tends to require more
processing resources, which makes it unsuitable for embedded systems.


nesC, developed by UC Berkeley researchers, can in the simplest view be considered an extension to C that enables
component-based programming. In fact, nesC offers three main contributions:


         component-based programming for lightweight event-driven systems

         an expressive concurrency model that requires minimal resources
         static program analysis to improve reliability and reduce code size


nesC has proven to be very effective for wireless sensor network application development, and the entire TinyOS

operating system has been built to depend on it. Research is under way to enable development on top of TinyOS in other
programming languages, but currently, nesC is the only language that you can use to develop TinyOS programs.


Today, we'll talk about the component-based programming abstractions provided by nesC:


         modules

         interfaces
         configurations


Consider the following simple C program:


 File foo.c:

  int foo_alpha() { ... }

  int foo_beta() { ... }


If we were to rewrite this program using Java classes, which you all know, it might look something like this:


 File Foo.java:

  class Foo {

    int alpha() { ... }

    int beta() { ... }

  }

The purpose of the class is to combine related functions together into a unit. A nesC module is roughly equivalent to a

Java class, and performs the same major function: grouping related functions.


Thus, we might transform the program into the following nesC module:


 File FooM.nc:

  module FooM {

  }

  implementation {

      int alpha() { ... }

      int beta() { ... }

  }


But, what is a Foo? It's something that you can alpha() and beta(). So, what happens if you want to write a different class

that can also alpha() and beta()? In OO languages like Java, one option is to inherit. Make a subclass of Foo, and then
override the alpha() and beta() methods with your own. The new class, call it Bar, will act as a Foo, and will typecheck as
one when necessary.


However, in Java, there's another way to do it. You can create an interface that represents what a Foo is supposed to
do:


 File FooInterface.java:

  interface FooInterface {

      int alpha();

      int beta();

  }



 File Foo.java:

  class Foo implements FooInterface {

      int alpha() { ... }

      int beta() { ... }

  }


Then, your Foo can implement the FooInterface, and your Bar can also implement the FooInterface. Any time you only

care whether you can alpha() and beta() an object, you can use a FooInterface object, and both Foo and Bar will fit. The

main point of an interface is to group related function signatures, and provide an amount of enforcement to ensure that a
class implementor has provided a function for everything that the interface advertises.




nesC is built around the idea of interfaces. In fact, with few exceptions, nesC modules can _only_ provide functions

through interfaces. A module is not defined by the set of functions it exports -- it's defined by the set of interfaces it
exports. Let's return to our FooM module:


 File FooM.nc:

  module FooM {

  }

  implementation {

      int alpha() { ... }

      int beta() { ... }

  }


Now, we'll create a nesC interface analogous to the Java interface above:


 File Foo.nc: (not syntactically correct, yet)

  interface Foo {
      int alpha();

      int beta();

  }


Now, let's modify FooM to export the Foo interface:


 File FooM.nc:

  module FooM {

      provides interface Foo;

  }

  implementation {

      int Foo.alpha() { ... }

      int Foo.beta() { ... }

  }

Here, alpha() and beta() are prefixed by "Foo." to show that these implementations belong to the Foo interface.


We can also create a Bar module that exports the Foo interface:


 File BarM.nc:

  module BarM {

      provides interface Foo;

  }

  implementation {

      int Foo.alpha() { ... }

      int Foo.beta() { ... }


  }


Now, BarM and FooM are interchangeable, because they both provide the functionality of a Foo. The M after the module

name is not strictly a requirement, but helps to distinguish modules from interfaces, which don't have any suffix. You
should name your modules SomethingM.


Ok, let's fill in the missing piece so we can produce our first syntactically correct nesC program. The alpha() and beta()

functions are both provided by the implementor of the Foo interface. The name for a nesC function provided by an
interface implementor is a "command". We have to modify our interface and our module:


 File Foo.nc:

  interface Foo {

      command int alpha();

      command int beta();

  }


 File FooM.nc:

  module FooM {

      provides interface Foo;

  }

  implementation {

      command int Foo.alpha() { ... }

      command int Foo.beta() { ... }

  }


So now we have a module FooM, which provides an interface Foo, which contains functions alpha() and beta(). But, you

might say, why should the interface be specified separately from the module if you're sure that no other module will ever

provide these functions? Well, you never know. Wireless sensor network applications change all the time. Who's to say

that someone else, down the line, won't come up with a better way of doing what your FooM does? Here's one of the

main benefits of nesC: components written by different authors who don't even talk to each other still have a good

chance of being interchangeably composable. If your functions don't fit the previously-specified interface, the program

won't compile. If you just can't deal with the interface as written, you can go change it, but then you probably have to talk
with the original author. Trapped into composable systems design!


Now that we have one module, let's talk about interconnections between modules. Here's where nesC, C, and Java start
to part ways.


In C, a function can call another function when you reference it directly in the source code:


 void blah_doit() { int x = foo_alpha(); ... }




Then, when the "blah" functions are linked with the "foo" functions, the call can be made.


In Java, an instantiated class, an object, must acquire some pointers/references to other objects, and then call methods
through these references.


 class Blah {

         Foo foo = new Foo();

         void doit() { int x = foo.alpha(); ... }

 }


In nesC, the linkages between modules are written down prior to compile time. There's no calling through pointers. The

mechanism by which modules are instantiated and connected to others is called the "configuration". Here's a
configuration, called BlahC, that instantiates our FooM module, but doesn't yet make any connections:


 File BlahC.nc:

     configuration BlahC {

     }

     implementation {

             components FooM;

     }


Here's a major difference between Java and nesC: classes can be instantiated many times, but nesC modules can only

be instantiated once. The prototypical example is a module that provides communications functions like send and receive,

over a single radio chip. There's only one radio chip. Why would there be more than one module? Thus, if another

configuration were written and included FooM on the "components" line, it's the same FooM, with the same functions and
private data.


But, this is still pretty useless. What about connections to other modules? Let's say that there's a module called BlahM
that depends on the functionality provided by FooM:


 File BlahM.nc:

         module BlahM {

             uses interface Foo;

         }

         implementation {

             void doit() { int x = call Foo.alpha(); ... }

         }


Note that here, the Foo interface is used instead of provided. Also notice that the call is made through the Foo interface,

and this call is prefixed with the "call" keyword. The BlahM module has no idea that the module on the other side of that

interface is called FooM. The interface decouples the user from the provider, meaning that the provider can be easily


changed. But, if BlahM doesn't know about FooM, and FooM doesn't know about BlahM, how do they connect? The
configuration:


 File BlahC.nc:

  configuration BlahC {

  }

  implementation {

      components BlahM, FooM;



      BlahM.Foo -> FooM.Foo;

  }


Here, BlahC instantiates the BlahM and the FooM modules, and then connects them together through their Foo

interfaces with a line of "wiring". Wiring is the key nesC concept: modules are wired together. The arrow runs from the

user of the interface to the provider of the interface. Now, when BlahM.doit() calls Foo.alpha(), thanks to the

configuration, it's actually going to call FooM.alpha(). In nesC, construction is separated from composition. Modules are
the unit of construction, and configurations are the unit of composition. Interfaces and wiring tie it all together.


But, are we limited to just one configuration? Does the application designer have to write the super-configuration to tie all

of the needed components together, then tie them to their dependencies, then tie those dependencies together, all the

way down to the hardware? No! Configurations can also provide and use interfaces. Let's say that FooM actually

depends on some other module called FooHardwareM. The author of the Foo subsystem can then create a configuration
FooC to tie them together, and re-export the interfaces:


 File FooC.nc:

  configuration FooC {

      provides interface Foo;

  }

  implementation {

      components FooM, FooHardwareM;



      Foo = FooM.Foo;



      FooM.Hardware -> FooHardwareM.Hardware;

  }


Here, we see that the FooC configuration is also providing the Foo interface. The "=" wiring is used to say that the real

provider of the Foo interface is the FooM module. Now, the user of Foo can wire directly to FooC instead of to FooM.


 File BlahC.nc:

  configuration BlahC {

  }

  implementation {

      components BlahM, FooC;



      BlahM.Foo -> FooC.Foo;

  }


The compiler will take care of the rest. BlahM.doit() calls Foo.alpha(), which is connected to FooC.alpha(), which is

connected to FooM.alpha(), which actually does something. Configurations and modules, which can both export

interfaces, are collectively referred to as "components". That's why the instantiation line is called "components". It can
refer to either.


To see how this all shows up in the real world, let's take a look at a real TinyOS application. Here's the StdControl

interface. Every component should provide this interface if it needs basic control functionality. "result_t" is a typedef'd

integer that can take the values SUCCESS (1) and FAIL (0). If your component can't be initialized, started, or stopped, it
would return FAIL. Otherwise, it would return SUCCESS.


 File StdControl.nc:

  interface StdControl {

      command result_t init();

      command result_t start();

      command result_t stop();

  }


So, let's modify our Foo component to provide StdControl:


 File FooC.nc:

  configuration FooC {

      provides interface StdControl;

  }

  implementation {

      components FooM, FooHardwareM;



      StdControl = FooM.StdControl;



      FooM.Hardware -> FooHardwareM.Hardware;

  }



 File FooM.nc:

  module FooM {

      provides interface StdControl;

  }

  implementation {

      command result_t StdControl.init() { return SUCCESS; }

      command result_t StdControl.start() { return SUCCESS; }

      command result_t StdControl.stop() { return SUCCESS; }

  }


Now, our component can be initialized, started, and stopped. But who's going to call these functions? The Main

component. Let's first make a top-level configuration representing our application:


 File App.nc:

  configuration App {

  }

  implementation {

      components Main, FooC;



      Main.StdControl -> FooC;

  }


This configuration doesn't do anything but wire the Main component to our FooC component. The top-level application

configuration must be the only configuration that includes the Main component. Traditionally, it isn't suffixed with a C like
every other configuration is.


Main is the starting component for every TinyOS program. Let's take a look at a slightly simplified version:


 File Main.nc:

  configuration Main {

      uses interface StdControl;

  }

  implementation {

      components MainM;



      StdControl = MainM.StdControl;

  }


Here, we see a configuration that uses the StdControl interface and then uses "=" to pass it through to MainM. "=" also

works for uses, as well as provides. Now, let's look at MainM:


 File MainM.nc:

  module MainM {

      uses interface StdControl;


  }

  implementation

  {

      int main() __attribute__ ((C, spontaneous))

      {

          call StdControl.init();

          call StdControl.start();



          for(;;) {

              TOSH_run_tasks();

                TOSH_sleep();

          }

      }

  }


Remember how we said nesC is an extension to C? Well, here you can see that statement in action. This is the familiar old

C "int main()" function, within a nesC component. This is where execution starts. It doesn't take any arguments, because

it's going to start running when the sensor network node turns on. Where's it going to get arguments from? The first thing

it does is call StdControl.init(), which calls FooM.init(), thanks to the wiring in the App configuration. When this init

function returns, main() calls FooM.start(), which then returns as well. Now, the program has been initialized. The next
line is an infinite loop. main() never returns. We'll discuss the contents of this infinite loop next lecture.


So, we now can see where control flow begins, and how it propagates through the wiring graph to our first component,
FooC.


But, what happens if we want to initialize multiple top-level components? We use more wiring. Let's take a look:


 File App.nc:

  configuration App {

  }

  implementation {

      components Main, FooC, BarC;



      Main.StdControl -> FooC;

      Main.StdControl -> BarC;

  }


Interfaces used by components can be wired into multiple providers. When this program is compiled, the multiple wiring

will be translated into a sequence of function calls, with an undetermined order. In MainM, "call StdControl.init()" will

probably call FooC.StdControl.init() before calling BarC.StdControl.init(), but you can't necessarily count on that order.

What do you think happens to the multiple return values in this situation? They need to be combined into a single return

value. Generally, nesC provides the notion of a "combining function", but practically, there's only one: combine result_t
values by ANDing them. If any call returns FAIL, the caller sees FAIL as the combined return value.
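
To make the combining concrete, here's a plain-C sketch of roughly what the compiler generates for the multiply-wired
call above. The mangled function names are illustrative, not the real generated names; rcombine() is the actual
combining function for result_t, which we'll meet again shortly:

  #include <stdint.h>

  typedef uint8_t result_t;
  enum { FAIL = 0, SUCCESS = 1 };

  /* The combining function for result_t: logical AND of the two results. */
  result_t rcombine(result_t r1, result_t r2) {
      return (r1 == FAIL) ? FAIL : r2;
  }

  /* Main.StdControl -> FooC; Main.StdControl -> BarC; compiles roughly to: */
  result_t Main_StdControl_init(void) {
      result_t result = FooC_StdControl_init();          /* illustrative name */
      result = rcombine(result, BarC_StdControl_init()); /* illustrative name */
      return result;
  }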


Now, what happens if FooC wants to initialize some subcomponents? There are two ways to do it. Here's the first:


 File FooC.nc:

  configuration FooC {

      provides interface StdControl;

  }

  implementation {

      components FooM, FooHardwareM;



      StdControl = FooM.StdControl;

      StdControl = FooHardwareM.StdControl;



      FooM.Hardware -> FooHardwareM.Hardware;

  }


This uses multiple wiring with the "=" sign. Now, when FooC.StdControl.init() is called, FooM.StdControl.init() and

FooHardwareM.StdControl.init() will be called. There's no guarantee that either component will be initialized before any

other, but usually, FooM will be initialized before FooHardwareM. This is often the wrong order -- you want to initialize the

components you depend on before initializing yourself. If you want to guarantee that FooHardwareM will be initialized

before FooM, there's another way to do it. I've found that using multiple "=" signs with the resulting lack of ordering can
cause subtle and truly annoying bugs, so you'll probably want to do it this way:


 File FooC.nc:

  configuration FooC {

      provides interface StdControl;

  }

  implementation {

      components FooM, FooHardwareM;



      StdControl = FooM.StdControl;



      FooM.SubControl -> FooHardwareM.StdControl;

      FooM.Hardware -> FooHardwareM.Hardware;

  }




 File FooM.nc:



  module FooM {

      provides interface StdControl;

      uses interface StdControl as SubControl;

  }

  implementation {

      command result_t StdControl.init() {

          result_t subResult = call SubControl.init();

          result_t myResult = SUCCESS;   // ... initialize myself here ...



          return rcombine(subResult, myResult);

      }



      command result_t StdControl.start() {

          result_t subResult = call SubControl.start();

          result_t myResult = SUCCESS;   // ... start myself here ...



          return rcombine(subResult, myResult);

      }



      command result_t StdControl.stop() { return SUCCESS; }

  }


Here we see an example of interface instance naming. FooM provides the StdControl interface, but it also uses it. When

code in FooM refers to StdControl, or outside code wires to StdControl, the actual interface is ambiguous. So, FooM

gives a local name of SubControl to the StdControl interface. This means that calls in FooM must be made through

SubControl, and wiring connections in FooC must be made through SubControl as well. But, SubControl is still an

instance of StdControl. If you try to wire SubControl to any other interface, the compiler will catch it and stop the program

from compiling. Thus, you may occasionally see components providing or using interfaces with unfamiliar names. Check
out the component source code in order to see the actual type of the interface.


Then, we see FooM.StdControl.init() calling FooM.SubControl.init(), which actually calls FooHardwareM.StdControl.init()

before initializing itself. The rcombine() function is that combining function we talked about earlier, that ANDs together the
result_t's.


Through multiple wiring and sub-initialization, we can see how the init() function which starts in MainM.main() can

propagate through the entire connected graph of nesC components before returning all the way back up to main(). Then,

the start() function will propagate all the way through the component graph before returning all the way to main(). Now,
the system is ready to begin execution, or go to sleep if there's nothing else to do.


So, we've seen modules, which contain functions; interfaces, which contain function signatures; and configurations, which

statically connect modules and configurations together through interfaces. We've compared the nesC examples to more

familiar examples in C and Java. nesC may seem more complex now, but learning to understand the complexity is often
necessary to gain the benefits of structured component-based programming.


We've looked at the TinyOS boot sequence, and then studied more advanced wiring techniques in the context of

recursively initializing and starting an entire application that has been built from many components. This initialization

sequence is often considered the "top half" of a TinyOS program. Ponder this for next time: what about the "bottom half"?

We said that sensor network applications are interrupt-driven. Well, the events caused by these interrupts are going to
have to go somewhere...




5. How does nesC support event-driven programming?

Last time we looked at the "top half" of a nesC/TinyOS application, and explained the component model. Now, we'll look

at the "bottom half", and how nesC supports it.


As we've discussed, wireless sensor networks are fundamentally interrupt-driven systems. An interrupt is essentially an

external event that causes the processor to begin executing a predefined function. The processor holds a table of

function addresses, one for each possible interrupt, and relies on the compiler to insert code at the beginning of the
compiled program that fills this interrupt table appropriately for the program.
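
Conceptually, the interrupt table is just an array of function pointers indexed by interrupt number. Here's a plain-C
sketch; the names and layout are illustrative, since on real hardware the table lives at a fixed address and is filled
in by the compiler and linker:

  /* Illustrative sketch only; not the actual MSP430 vector table layout. */
  typedef void (*handler_t)(void);

  void timer_interrupt_handler(void);   /* handlers defined elsewhere */
  void adc_interrupt_handler(void);
  void radio_interrupt_handler(void);

  handler_t interrupt_table[] = {
      timer_interrupt_handler,   /* runs when the timer interrupt fires */
      adc_interrupt_handler,     /* runs when the ADC finishes a conversion */
      radio_interrupt_handler,   /* runs when the radio has data */
  };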


Let's look at the interrupt handler for the analog-to-digital converter. Why does the ADC use interrupts? It takes a non-

trivial amount of time to convert an analog voltage to a digital reading. When the application wants to take a reading, it

calls a function that starts the ADC, and when the ADC is finished, it sends an interrupt to the processor. This is

necessary to ensure that other processing can continue while the ADC converts the reading. This "other processing", if
there's nothing to do, may actually include the oh-so-vital function of "sleeping".


When the interrupt is fired, the program counter and the registers, which capture the current point of execution, are

automatically copied out and saved by the processor. The processor then begins to execute the ADC interrupt handler,

and when this function returns, the processor copies the saved execution context back into the registers and resumes
execution.


You may wonder if it's possible for an interrupt to interrupt another interrupt. Well, this requires a stack to save the

contexts, which is more complex than a single buffer. The hardware does support it, but we've chosen not to use it

because of the increased potential for race conditions. But, there's still plenty of race condition potential between the

"synchronous" top-half program, and the "asynchronous" bottom-half program. Why? Well, the top-half program may set

a variable and expect it to stay the same until the next time it is accessed. If the top-half is swapped out in between these

points, the interrupt handler has the ability to modify that variable. If the bottom-half changes the variable, the top-half

may exhibit unexpected behavior. This is the essence of race conditions. nesC includes support for detecting potential
race conditions before execution, which we'll talk about later.
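
Here's a contrived plain-C sketch of such a race (the variable and function names are made up):

  #include <stdint.h>

  uint8_t count;               /* shared by top-half and interrupt code */

  void top_half_work(void) {
      uint8_t tmp = count;     /* read...                                   */
      /* if the interrupt fires here, its increment will be overwritten */
      count = tmp + 1;         /* ...modify, write                          */
  }

  void interrupt_handler(void) {
      count = count + 1;
  }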


  module ADCM {

      provides interface ADC;

  }

  implementation {

      TOSH_SIGNAL(ADC_VECTOR) {

          uint16_t iv = ADC12IV;

          signal ADC.dataReady(iv);

      }

  }



This is an extremely simplified version of the ADC interrupt handler. TOSH_SIGNAL() is a macro that informs the

compiler that this function should be added to the ADC_VECTOR slot in the interrupt handler table. ADC12IV is a

register variable. When you read this variable, you're reading a real hardware register, not a memory location. Before

signaling the interrupt, the ADC fills the ADC12IV register with the digital version of the analog voltage. The interrupt is

used to signal the event, and the register is used to transfer the data. This is a common pattern.


The first thing done by the interrupt handler is to copy the digital value out of the register and into an in-memory variable

located on the stack. Are you familiar with the stack? If not, it's a lot to cover now, but you can think of it simply as where
local variables and variables being passed between functions are stored.


What's "signal"? Now we come to the nesC portion of the lesson. We've seen "call", which is used when a c lient

component wants to execute functionality provided by a server component. But, what about the other direction? What

happens when the server wants to execute some functionality in the client, and possibly pass it some data? It could
provide a function like this:


  interface ADC_Polling {
      command result_t getData(uint16_t *data);

      command bool isItReady();

  }

The client calls getData() and provides a pointer to a memory location that will eventually hold the reading. The client is

then responsible for periodically calling isItReady(), and if it ever returns true, then the data in its pointer is now valid. But,

how often should the client call isItReady()? Should the client even care? Won't this constant polling waste energy? Who

knows, no, and yes.


The solution is to enable the server to call functionality directly inside the client. You may be familiar with "function-

pointer" interfaces like this from C:


  void getData(void (*dataHandler)(uint16_t data));

Or, "object-reference" interfaces from Java:

  interface ADC_Ref {

      void getData(DataHandler dh);

  }

In both situations, the client passes an arbitrary pointer to the server, which it can use to make a callback later. But,

arbitrary pointer-passing goes against the primary design maxim of nesC: everything is static. Static means efficient,

lightweight, and predictable. So, we need to include the idea of callback functions in a static way. With nesC, we use

"bidirectional interfaces":

  interface ADC {

      command result_t getData();

      event result_t dataReady(uint16_t data);

  }



The client is responsible for implementing an event handling function called dataReady, so that the server may call it.

The server is statically connected to the client's function, just like the client is statically connected to the server's function.

We use the term "signal" to mean a call in the opposite direction. All nesC interfaces have the option to be bidirectional,

and can contain both "commands", which are downwards calls, and "events", which are placeholders for upward calls, or

simply upcalls. This is what the client might look like:

  configuration ADCClientC {

  }

  implementation {

      components ADCClientM, ADCServerC;



      ADCClientM.ADC -> ADCServerC.ADC;

  }



  module ADCClientM {

      provides interface StdControl;

      uses interface ADC;

  }

  implementation {

      command result_t StdControl.start() {

          call ADC.getData();

          return SUCCESS;

      }



      event result_t ADC.dataReady(uint16_t data) {

          // process that data

          return SUCCESS;

      }

  }
Now, remember the TinyOS main() function:

      int main() __attribute__ ((C, spontaneous))

      {

          call StdControl.init();

          call StdControl.start();



          for(;;) {

              TOSH_run_tasks();

                TOSH_sleep();

          }

      }

Once StdControl.start() returns, the processor can go to sleep. The getData() function has started the conversion

process, which executes outside the main processor core. The interrupt from the ADC will then wake up the processor

and execute the interrupt handler, which will signal the ADC.dataReady() event. The client component can then process

the data. Once the processing is complete, the processor can go back to sleep.

So, what happens if the client needs to do some processing based on the sensor reading, processing that might take a

while? We said earlier that interrupts can't preempt other interrupts. An interrupt handler, and the functions it calls,

effectively seizes control of the processor. While it's executing, no other interrupt handlers will execute and the processor
will not go to sleep.


In traditional operating systems, this problem of the executing code taking over the processor is handled in two ways:

cooperation or preemption. Cooperation is conceptually simpler. Each program has total control of the processor until it's

ready to yield that control, at which point the programmer must actively insert a call to some sort of yield function. But,

the processor needs to handle hardware interrupts that may come in at any time. If they aren't handled when they come

in, the data they represent is lost forever. So, if the executing application hasn't yet yielded control, interrupts may easily

be lost. And, the application developer can't possibly know the exact demands of the underlying layers, making it
impossible to insert yields in all the right places.


Preemption is already taking place once here: the interrupt preempts the non-interrupt code. Allowing interrupts to

preempt other interrupt handlers would mean that every interrupt gets handled, at least a little bit, but would then require
the extra complexity of the interrupt stack.


nesC and TinyOS use a combination of the two. Code that may be executed directly in response to an interrupt is called

"asynchronous" or "async" code. Async code must be carefully written to do a minimal amount of copying and processing,

and then return control to the processor so that other interrupts may be handled. How minimal? Well, that's a matter of

careful profiling and debugging. When interrupts can't be handled fast enough, the code isn't minimal enough. This is the
cooperative section.


Code that runs in response to the node booting, including the init()/start() sequence, is called synchronous code.

Synchronous code may be preempted by asynchronous code. But, the only thing we've seen happening in the

synchronous code is an infinite loop with a sleep. What good is preempting a sleep? When async code has a lot of

processing that needs to be done, it wants to become preemptible. The way to do this is to stop executing a function
asynchronously, and begin executing it synchronously.


The way we transfer control from asynchronous context to synchronous context is by way of function pointers.


Remember that infinite loop in synchronous context?


  for(;;) {

      TOSH_run_next_task();

  }

Well, here's a simplified version of that function:

  void (*nextTask)(void);



  void TOSH_run_next_task() {

      void (*task)(void);


      __nesc_atomic_start();



      task = nextTask;



      if (task == NULL) {

          __nesc_atomic_sleep();   /* atomically re-enables interrupts and sleeps */

          return;

      }



      nextTask = NULL;

      __nesc_atomic_end();



      task();

  }

nextTask is a global variable, not within any component, that holds a function pointer to be executed by the synchronous

code. If that variable is NULL, the processor can go to sleep and wait for an interrupt. When the processor wakes up and

executes some async code that wants to become synchronous, it can "post" a task:

  module ADCClientM {

      provides interface StdControl;

      uses interface ADC;

  }

  implementation {

      uint16_t savedData;



      command result_t StdControl.start() {
          call ADC.getData();

          return SUCCESS;

      }



      task void processData() {

          uint16_t data;



          atomic {

              data = savedData;   // copy the shared global into a local

          }



          // process the local copy here.

      }



      async event result_t ADC.dataReady(uint16_t data) {

          atomic {

              savedData = data;

          }



          post processData();

          return SUCCESS;

      }

  }

"post" is a special nesC keyword. This is basically what it does:

    bool TOS_post(void (*task)()) {

        __nesc_atomic_start();



        if (nextTask == NULL) {

            nextTask = task;

            __nesc_atomic_end();

            return TRUE;

        } else {

            __nesc_atomic_end();

            return FALSE;

        }

    }

Posting a task simply saves the function pointer into a known variable, here called "nextTask", so that it can be picked up

by the infinite loop running in synchronous context. It's just juggling. The ball goes from the lower hand to the upper hand.


When the interrupt handler returns, the processor goes back to executing the synchronous code. The sleep has been

interrupted, so the processor goes once around the loop, re-executes TOSH_run_next_task(), and then sees that there's

a task waiting to be executed. The processor then runs this task until it returns. This task, because it's been defined
inside the ADCClientM module, can access the "savedData" variable, and the rest of the module's variables. It can do

anything the interrupt handler could do, but it's doing it in a fully-preemptible context. Once it returns, the loop is
traversed, and unless this task has posted its own task, the processor can go back to sleep.


So, async code can schedule processing to be executed later. It doesn't know when it'll be executed, only that it will be
executed synchronously. While that task is running to completion, it can be preempted by other interrupt handlers, and
the show can go on.


But, you may have noticed a problem. Let's say that interrupt handler A posts a task A'. The task begins to run. Interrupt

handler B executes, and wants to post its own task B'. What happens? Well, here we're lucky. A' is executing, has been

removed from the "nextTask" variable, and B' can enter. This is fine, as long as we only have one running task and one

waiting task. In practice, every interrupt handler wants to post tasks. What happens if an interrupt handler wants to post a

task, but can't post it? Effectively, the interrupt's data has been lost. The handler has to return, unsatisfied. This is a bad
situation.




For this reason, TinyOS provides a queue for task pointers. Tasks are posted to the tail of the queue and are executed

from the head, one at a time, to completion. Granted, this queue may still fill up, which would mean that the interrupt

handler still can't post the task. This is a real problem, one that we generally solve by making the task queue "very large".

The next major release of the nesC compiler will count the number of potentially executable tasks, and reserve a queue
of exactly that size. But for now, we just try to make it less likely to happen.
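
A minimal plain-C sketch of such a circular queue (the size is arbitrary and the names are illustrative; the real
TinyOS scheduler differs in detail):

  #include <stdint.h>

  enum { TASK_QUEUE_SIZE = 8 };            /* arbitrary for the sketch */

  typedef void (*task_t)(void);

  static task_t  queue[TASK_QUEUE_SIZE];
  static uint8_t head, tail, count;

  bool TOS_post(task_t t) {                /* returns FALSE if the queue is full */
      bool ok = FALSE;
      __nesc_atomic_start();
      if (count < TASK_QUEUE_SIZE) {
          queue[tail] = t;
          tail = (tail + 1) % TASK_QUEUE_SIZE;
          count++;
          ok = TRUE;
      }
      __nesc_atomic_end();
      return ok;
  }

  task_t take_next_task(void) {            /* called from the synchronous loop */
      task_t t = NULL;
      __nesc_atomic_start();
      if (count > 0) {
          t = queue[head];
          head = (head + 1) % TASK_QUEUE_SIZE;
          count--;
      }
      __nesc_atomic_end();
      return t;
  }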


Interrupt handlers aren't the only pieces of code that can post tasks. Tasks can post tasks too. The task queue actually

acts like a scheduler, allowing multiple nesC components to share execution time on the processor. When one

component wants some processing time, it can post a task for later execution. But, the main difference between standard

operating systems and TinyOS is that tasks don't preempt each other. In a preemptive scheduler, a task is given a short

amount of CPU time, then its context is saved, and another task is given some CPU time. In TinyOS, a task runs until its

top-level function returns. Only then can other tasks get a chance to run. So, returning from the task function is a bit like

a yield in a cooperative scheduler, but it can only happen at the logical endpoint of execution instead of in the middle.

This suggests that your tasks should also be reasonably short, in order to give other tasks a chance to run. But, as long

as you avoid things like while loops that may potentially go on forever, you shouldn't be overly concerned with the length
of your tasks.


As for the discussion of race conditions before, nesC provides language support for both detecting and preventing them.

Notice that the ADC.dataReady() event is prefixed with async. The author of an interface can mark functions with the

async keyword to indicate that they may be called within asynchronous context. At compile-time, nesC searches for any

variables that may be accessed by functions marked async and by functions not marked async, which presumably could

be preempted by an async function. These variables are sources of potential data race conditions. The savedData

variable above is one such variable. nesC provides the atomic {} block, which makes a piece of code nonpreemptible. As

long as we always touch that global variable within an atomic block, we can't have a race condition. But, we don't want to

make the atomic block cover the entire task. So, we copy the variable from a global to a local, and then use the local for

the rest of the task. If a global is only accessed by task-context synchronous code, you don't need atomic blocks

because tasks can only run one at a time. But if you happen to be using async functions, like the ADC, you'll have to pay
attention to them.


So, we've seen interrupt handlers, which signal nesC events, which post tasks, which may call commands, signal events,
or post tasks themselves.


Now, consider this situation: there's more than one component wired to an interface that includes an event. When the

event is signaled, all of the event handlers are executed. What if we don't want every handler to execute? For this, nesC
provides "parameterized interfaces".


Consider this example. The single ADC on the microcontroller can actually be attached to many different data sources,

meaning that it can take readings from many different sensors. Let's give each ADC channel a number, and then extend
the ADC interface we saw earlier:


  interface ADC_With_Channels {

      command result_t getData(uint8_t channel);

      event result_t dataReady(uint8_t channel, uint16_t data);

  }

If our ADCClientM only wants to get readings from the second channel, it will have to do something like this:

  module ADCClientM {

      provides interface StdControl;

      uses interface ADC;

  }

  implementation {

      enum { MY_CHANNEL = 2 };



      command result_t StdControl.start() {

          call ADC.getData(MY_CHANNEL);

          return SUCCESS;

      }



      task void processData() {

          // process savedData here.

      }



      event result_t ADC.dataReady(uint8_t channel, uint16_t data) {

          if (channel != MY_CHANNEL) {

              return FAIL;

          }

          // process data

          return SUCCESS;

      }

  }
ADCClientM has to store the channel number that it's reading. Then, ADC.dataReady() will be called whenever a piece

of data is ready, no matter the channel, meaning that ADCClientM has to check to see whether the channel matches and

not handle the event if it doesn't match. Calling a bunch of handler functions will become a sequence of if statements,

only one of which should ever be true.


First, this idea of channel selection is more common than just the ADC. Think about TCP ports. Or, just think about any

resource that there might be many of, even though you only need to use one. All our interfaces would have something
like this "channel" parameter, repeated over and over in an ad-hoc manner.


Second, this checking code is totally mechanical, but if you forget to write it, you may handle events that aren't yours.


Third, a sequence of if statements is much more efficient when written as a switch statement, but we can't do that
because we don't know about the other components.




In nesC, we use parameterized interfaces to push this check up from the module level into the wiring configuration level.

Then, the compiler can generate the switch statements and the checking code, and we can take the channel parameters
out of the interfaces. Here's how it looks:


ADCClient.h:

  enum { MY_CHANNEL = 2 };



ADCClientC.nc:

  includes ADCClient;



  configuration ADCClientC {

  }

  implementation {

      components ADCClientM, ADCServerC;



      ADCClientM.ADC -> ADCServerC.ADC[MY_CHANNEL];

  }



ADCClientM.nc:

  module ADCClientM {

      uses interface ADC;

  }

  ... just like before ...

Here's how things look on the other side:

  module ADCServerM {
      provides interface ADC[uint8_t channel];

  }

  implementation {

      uint8_t currentChannel;



      command result_t ADC.getData[uint8_t channel]() {

          currentChannel = channel;

          // start the process

          return SUCCESS;

      }



      TOSH_SIGNAL(ADC_VECTOR) {

          uint16_t iv = ADC12IV;

          signal ADC.dataReady[currentChannel](iv);

      }

  }

The interface parameter acts like a virtual function argument that's fixed in the client-side configuration but available in

the server module. Then, when signaling an event, the server provides the parameter argument, the compiler generates

a switch statement, and only the handler that was wired with the matching parameter value gets called.
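
Roughly, the compiler turns that signal into a generated dispatch function like this (plain C; the mangled names are
illustrative):

  /* Sketch of what signal ADC.dataReady[currentChannel](iv) compiles into. */
  result_t ADCServerM_ADC_dataReady(uint8_t channel, uint16_t data) {
      switch (channel) {
      case 2:       /* the client wired as ADCServerC.ADC[MY_CHANNEL] */
          return ADCClientM_ADC_dataReady(data);
      default:      /* the "default event", described below */
          return ADCServerM_ADC_dataReady_default(channel, data);
      }
  }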


Interfaces can be parameterized on either end, so both:




  provides interface ADC[uint8_t channel];

and




  uses interface ADC[uint8_t channel];

Wiring a client-side parameterized interface looks like this:




  ADCClientM.ADC[MY_CHANNEL] -> ADCServerC.ADC;

Wiring an interface that's been parameterized on both sides looks like this:

  ADCClientM.ADC[MY_CHANNEL] -> ADCServerC.ADC[MY_CHANNEL];

Most of the time, you'll be using parameterized interfaces in the first way. But, you should be aware that it's more flexible,

and you may see it in the future.


Finally, if you're calling a command in an interface that hasn't had its parameter set explicitly, or signaling an event in

such an interface, you should be aware that someone might not have wired that particular parameter. For example, the

ADC server may get a reading on a channel that doesn't have a client. Practically, this won't happen because ADC

readings are always taken in response to a getData() command. But, this situation certainly exists in places we'll talk about

later. nesC requires you to provide a default command or event to be executed in the event that the parameter value
hasn't been wired. It's the "default" in the switch statement.


  module ADCServerM {

      provides interface ADC[uint8_t channel];

  }

  implementation {

      ...

      default event result_t ADC.dataReady[uint8_t channel](uint16_t data)

      {return SUCCESS;}

  }




6. How does TinyOS provide local operating system services?

What is an operating system? From the 15-412 notes, it's


        a software system

        that manages the hardware and other resources

        to provide an environment for processes that is

            convenient,

            efficient, and

            safe.


We're going to look at a few TinyOS abstractions today, through the lens of real applications. Each abstraction

represents a piece of hardware, and each is associated with a specific nesC interface. Here are the key system

interfaces that provide local services to the node.


        StdControl

        Event (technically MSP430Event)

        Leds

        Timer
        ADC


Using these abstractions, instead of messing with registers and interrupts, provides a more convenient and safer

environment for application designers. Because these interfaces are higher-level, the underlying component that

provides the interfaces can arbitrate between the conflicting demands of different TinyOS components for different

resources, leading to more efficient operation. As you may have noticed, TinyOS does not host processes. But each

component can post tasks, which encapsulate a scheduled thread of control. So, TinyOS is a software system that

provides an environment for components.


We'll start with the simple I/O abstractions with an application called Click.


A microcontroller has a number of pins. Each pin can be read, to determine whether current is flowing. Each pin can be

written, to close or open a circuit. We can interact with the pins by reading or writing bits in a special register. Each LED

is attached to a pin, so it can be individually toggled on and off. Our platform also includes a button switch attached to a

pin, and we can read whether the button is pressed or not. A change in the signal level on any pin can also trigger an
interrupt to the processor, and then the register can be read to determine which pin caused the interrupt.
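
For a flavor of what that register-level access looks like, here's a plain-C sketch; the register address and bit
position are invented for illustration:

  #include <stdint.h>

  /* Invented address and bit; real pin registers come from the MCU headers. */
  #define PORT_OUT     (*(volatile uint8_t *)0x0031)
  #define RED_LED_BIT  (1 << 4)

  void red_led_on(void)  { PORT_OUT &= (uint8_t)~RED_LED_BIT; }  /* assume active-low */
  void red_led_off(void) { PORT_OUT |= RED_LED_BIT; }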


But, reading and writing bits in a magical register is exactly the kind of thing that we're trying to avoid. And only one

handler function can occupy a given interrupt vector, no matter how many components care about pin events. Thus, we need to
write an operating system in order to provide abstractions over the hardware.




The LEDs are attached to pins 54, 55, and 56, but you shouldn't need to know this in order to use them. Here's the
interface you use:


  interface Leds {

      async command result_t init();



      async command result_t redOn();

      async command result_t redOff();

      async command result_t redToggle();



      async command result_t greenOn();

      ...



      async command result_t yellowOn();

      ...



      async command uint8_t get();

      async command result_t set(uint8_t value);

  }

This interface is provided by the LedsC component.


The color commands are straightforward. The get() and set() commands allow you to deal with the leds as a group,

instead of individually. get() returns a byte in which the lower 3 bits represent the state of the LEDs. set() takes a byte,

and sets the LEDs according to the lower 3 bits. For example, you can turn all the LEDs on by calling Leds.set(0x7),
which is 00000111.
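
As another sketch (assuming the conventional bit assignment of red = 0x1, green = 0x2, yellow = 0x4), a component could
invert all three LEDs at once from inside some command or task:

  uint8_t state = call Leds.get();   // lower 3 bits hold the current LED states

  call Leds.set(state ^ 0x7);        // flip red, green, and yellow together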


Similarly, the user button is attached to port 27, and the interrupt is handled by a function something like this:


  TOSH_SIGNAL(PORT2_VECTOR)

  {

      volatile int n = P2IFG & P2IE;



      ...

      if (n & (1 << 7)) { signal Port27.fired(); return; }

      ...

  }

Port27.fired() is handled by a higher-level component called UserButtonC, so you can wire to it directly, through the

interface below:

  interface MSP430Event {

      async event void fired();

  }
The interface is MSP430-specific only because the MSP430-based Telos platform is the only one with a user button right

now.


Note that these events and commands are all marked async. This is a hint to the compiler that they may be called or

signaled from asynchronous, or interrupt context. That suggests that they're still pretty low-level interfaces.


With these interfaces we can write the Click application, like this:


configuration Click {

}

implementation {

    components Main, ClickM, TimerC, UserButtonC, LedsC;



    Main.StdControl -> TimerC.StdControl;

    Main.StdControl -> UserButtonC.StdControl;


    ClickM.UserButton -> UserButtonC.UserButton;

    ClickM.Leds -> LedsC.Leds;

}



module ClickM {

    uses interface MSP430Event as UserButton;

    uses interface Leds;

}

implementation {

    async event void UserButton.fired() {

        call Leds.redToggle();

    }

}

Here we see a simple TinyOS program that waits for a press of the user button, and then toggles the red LED when that

happens. Note that the UserButtonC component must be initialized with StdControl. In addition, this component requires

that the TimerC component be initialized. For some reason, it doesn't initialize it on its own. This highlights a bit about

component-based software design -- sometimes it's unclear whose responsibility it is to initialize things. If things don't

seem to be working, make sure you've wired Main.StdControl to everything you think might need it. There's no harm in

initializing things multiple times.


Now, let's take a look at a more interesting source of interrupts than a button on the circuit board: the Timer. The Timer is

a fundamental hardware resource that is managed by TinyOS to provide a convenient, efficient, and safe environment for
sensor network processes. Getting the timer right means that the rest of the system can run smoothly.




What you want from a timer is the ability to schedule a task to be run at some time from now. The TimerC component
provides this feature, through the Timer interface.


    interface Timer {

         command result_t start(uint8_t type, uint32_t interval);

         command result_t stop();

         event result_t fired();

    }

Here's an application that uses the Timer as a source of interrupts, and the LEDs as a sink for output.

configuration Blink {

}

implementation {

    components Main, BlinkM, TimerC, LedsC;



    Main.StdControl -> TimerC.StdControl;

    Main.StdControl -> BlinkM.StdControl;



    BlinkM.Timer -> TimerC.Timer[unique("Timer")];

    BlinkM.Leds -> LedsC.Leds;

}

Why do we use a parameterized interface here? Because many different tasks need scheduling. Last lecture, we looked

at parameterized interfaces in which we picked a specific parameter. Here, we don't really need to know _which_ virtual

timer we're using -- only that we have our own. nesC provides the "unique" keyword, which is translated into a unique

number at compile time. The string, in this case "Timer", specifies a pool from which the unique numbers will be drawn.

Thus, a single instance of unique("Timer") will be transformed into a different number than every other instance of
unique("Timer"). This may sound like instantiation, like we're getting our own Timer object with the ability to hold its own

unique state. Effectively, it is. The convention for the unique call is to call unique with the name of the interface, in this

case "Timer". If you mistype, you won't be unique from the other Timers, so be aware of this. This is how the Timer

component uses the information from unique("Timer") to allocate the correct amount of state:

module TimerM {}

implementation {



    enum { NUM_TIMERS = uniqueCount("Timer") };

    Timer_t m_timers[NUM_TIMERS];



    /*
         start(interval):
             store (current counter + interval) as this timer's alarm, then call set()

         set() {
             Find the timer that has the least difference between now and its alarm
             Set the hardware timer to fire an interrupt after that difference
         }

         When the interrupt fires, signal the appropriate Timer interface,
         then call set() again for the remaining timers
    */

}

Once you have a unique instance of the multiplexed Timer, you can call it, like this:

module BlinkM {

    provides interface StdControl;



    uses interface Timer;

    uses interface Leds;

}

implementation {



    command result_t StdControl.init() {

         call Leds.init();

         return SUCCESS;

    }



    command result_t StdControl.start() {

         return call Timer.start(TIMER_REPEAT, 1000);
    }



    command result_t StdControl.stop() {

         return call Timer.stop();

    }



    event result_t Timer.fired() {

         call Leds.redToggle();

         return SUCCESS;

    }

}

Here, we're setting the timer to go off in 1000 milliseconds. Our timer takes arguments in milliseconds. The first argument

indicates that the timer should fire repeatedly, until we call Timer.stop(). When the Timer fires, we toggle the red LED.

Between firings, the processor will be sleeping in order to save energy.



Having a repeating timer is quite useful, but sometimes we want an event to only occur once. For this, we can start the

timer with the TIMER_ONE_SHOT argument. If we wanted to keep toggling the LED even while using the one-shot
timer, we could do something like this:


  event result_t Timer.fired() {

      call Leds.redToggle();

      call Timer.start(TIMER_ONE_SHOT, 1000);

      return SUCCESS;

  }

When the timer fires, it'll set the timer to fire again. It's best to use TIMER_REPEAT when your interval doesn't change,

and keep setting TIMER_ONE_SHOT timers when your interval might change in between firings. Why might your interval

change? Maybe you're implementing an exponential backoff algorithm with a progressively longer period, or maybe your

timer period can be changed in response to an external event like a configuration message.
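
Here's a sketch of the backoff case (the starting period and the doubling rule are arbitrary for the example):

  uint32_t interval = 100;   // starting period in milliseconds (arbitrary)

  event result_t Timer.fired() {
      call Leds.redToggle();
      interval *= 2;                                // back off: double the period
      call Timer.start(TIMER_ONE_SHOT, interval);   // re-arm with the new period
      return SUCCESS;
  }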


Here's an important point about the timer: when the timer fires, it posts a task which signals your component. In the

Timer event handler, you're already in task context. You don't really have to worry about taking too long to handle the

event, because if the timer fires again, it will just post another task that will eventually be executed when the first one is
done. You can post your own tasks if you want, but you don't need to.


The actual implementation of the Timer is quite complex, because it must multiplex a simple hardware counter into many
different software timers, and do it all within strict time bounds. Feel free to examine it on your own time.


So, that's it for the Timer. You'll find yourself using it all the time. Click and Blink are roughly the TinyOS
equivalents of "Hello World".


On a sensor network, the most important form of multi-bit I/O is the sensors, and most sensors are accessed through an
analog-to-digital converter. We've already seen the ADC interface in our examples, but let's take a look at it for real:


  interface ADC {

      async command result_t getData();

      async command result_t getContinuousData();
      async event result_t dataReady(uint16_t data);

  }

The ADC interface, unlike Leds and MSP430Event, contains both commands and events. It's a split-phase interface. You

call getData() to start the sampling process, and dataReady() is signaled when the sampling is complete.

getContinuousData() will produce a stream of dataReady() events until the dataReady() handler returns FAIL.


Because there are many ADC channels, the ADCC component provides a parameterized version of the ADC interface.

Specific components are then implemented, representing the sensors available on a platform. We'll be using the

InternalTempC component, representing the MSP430's onboard temperature sensor. InternalTempC provides the ADC
interface as well, which we'll use in the following program: Sense.


configuration Sense {

}

implementation {

    components Main, SenseM, InternalTempC, TimerC, LedsC;



    Main.StdControl -> TimerC.StdControl;

    Main.StdControl -> InternalTempC.StdControl;

    Main.StdControl -> SenseM.StdControl;



    SenseM.ADC -> InternalTempC.InternalTempADC;

    SenseM.Timer -> TimerC.Timer[unique("Timer")];

    SenseM.Leds -> LedsC.Leds;

}

Here, we've wired to the ADC interface, the Timer interface, and the Leds interface.

module SenseM {

    provides interface StdControl;



    uses interface ADC;

    uses interface Timer;

    uses interface Leds;

}

implementation {



    command result_t StdControl.init() {

        call Leds.init();
        return SUCCESS;

    }



    command result_t StdControl.start() {

        return call Timer.start(TIMER_REPEAT, 1000);

    }



    command result_t StdControl.stop() {

        return call Timer.stop();

    }



    event result_t Timer.fired() {

        call ADC.getData();

        return SUCCESS;

    }

    async event result_t ADC.dataReady(uint16_t thisData) {

        call Leds.set(thisData & 0x7);

        return SUCCESS;

    }

}

This simple program calls ADC.getData() every time the Timer fires, and then sets the LEDs according to the lower 3 bits

of the reading returned from the ADC. Admittedly, this isn't the most useful program, and the lower 3 bits are likely to be

just sensor noise, but we can still see the general principle behind periodic sampling.


Here's a simplified version of the ADC module:


module ADCM {}

implementation {

    norace uint8_t owner;

    volatile bool busy;

    bool continuousData;   // set for getContinuousData(), cleared for getData()



    async command result_t ADC.getData[uint8_t port]() {

        bool oldBusy;

        if (port >= TOSH_ADC_PORTMAPSIZE)

            return FAIL;

        atomic {

            oldBusy = busy;

            busy = TRUE;

        }

        if (!oldBusy){

            continuousData = FALSE;

            return triggerConversion(port);

        }

        return FAIL;
    }



    result_t triggerConversion(uint8_t port){

        MSP430ADC12Settings_t settings;



        // fill "settings" appropriately for the hardware



        if (call MSP430ADC12Single.bind(settings) == SUCCESS){

            if (call MSP430ADC12Single.getData() != MSP430ADC12_FAIL) {

              owner = port;

                return SUCCESS;

            }

        }

        atomic busy = FALSE;

        return FAIL;

    }



    async event result_t MSP430ADC12Single.dataReady(uint16_t d)

    {

        return signal ADC.dataReady[owner](d);

    }

}

Most of the hardware details have been glossed over, but the two key calls are bind(), which tells the hardware ADC on

which port to sample, and getData(), which starts the conversion. The owner variable holds the currently active ADC port,

so it can be passed back to the dataReady event when the sampling has been completed.


This should be all you need to do your project. Next lecture, we'll talk about the TinyOS communication abstractions.




7. How does TinyOS provide local communication services?

In this wireless sensor networks course, it's time to talk about wireless networking. Just like Internet-scale networks, the

basic communication unit is the packet. A packet is nothing more than a group of bits, with an agreed-upon start and end.


+-----------+  +-----------+  +-----------+
| Data ...  |  | Data ...  |  | Data ...  |  ...
+-----------+  +-----------+  +-----------+
It doesn't always have to be this way, of course -- the Igloo White WSN nodes sent sensor signals over analog radios.
  (sketch: a continuous analog waveform, in contrast to discrete packets)

Let's start with the specifics. How do you, the TinyOS developer, send and receive packets using the node's radio? As

always, we present the interfaces:

interface SendMsg

{

    command result_t send(uint16_t address, uint8_t length, TOS_MsgPtr msg);

    event result_t sendDone(TOS_MsgPtr msg, result_t success);

}



interface ReceiveMsg

{

    event TOS_MsgPtr receive(TOS_MsgPtr m);

}

Most of this should look familiar. As is the TinyOS tradition, SendMsg is split-phase because actually transferring data

over the radio takes a nontrivial amount of time. TOS_MsgPtr is simply a pointer to the TOS_Msg structure, which

represents storage for a TinyOS data message.
configuration Talker {

}

implementation {

    components Main, TalkerM, GenericComm, LedsC;



    Main.StdControl -> TalkerM;

    Main.StdControl -> GenericComm;



    TalkerM.SendMsg -> GenericComm.SendMsg[AM_TALKERMSG];

    TalkerM.ReceiveMsg -> GenericComm.ReceiveMsg[AM_TALKERMSG];

     TalkerM.Leds -> LedsC;

}



enum {

     AM_TALKERMSG = 3,

};

GenericComm is the TinyOS component that provides SendMsg and ReceiveMsg. When wiring to GenericComm, you

use a parameterized interface. Why? Protocol dispatch. Many different components are likely to be using the radio at the

same time, but each is only interested in its own data. Every TinyOS packet must contain a 1-byte protocol dispatch field,

which is filled in from the interface parameter by GenericComm on behalf of the sending component.
+------+------------------+
| Type | Data ...         |
+------+------------------+
When a packet is received, GenericComm signals ReceiveMsg.receive() with a parameter equal to the contents of the

type field and the nesC parameter switch takes care of the protocol dispatch, statically. You can think of the protocol

dispatch field like you would think of the TCP port number. Only one component can be listening on a given dispatch

number at the same time, just as only one process can be listening on a given port.

module TalkerM {

     provides interface StdControl;

     uses {

         interface SendMsg;

         interface ReceiveMsg;

         interface Leds;

     }

}
implementation {



     TOS_Msg msgBuf;



     command result_t StdControl.start() {

         msgBuf.data[0] = 0x7;

         call SendMsg.send(TOS_BCAST_ADDR, 1, &msgBuf);

         return SUCCESS;

     }



     event result_t SendMsg.sendDone(TOS_MsgPtr buf, result_t success) {

         return SUCCESS;

     }



     event TOS_MsgPtr ReceiveMsg.receive(TOS_MsgPtr buf) {

         call Leds.set(buf->data[0]);

         return buf;

     }

}

Each module that wants to send messages must allocate storage for its own message buffer. It then fills in the data field

of that buffer and passes the whole buffer to send(). The length argument indicates the number of bytes of data within

the buffer. This length is also stored in the packet, so the receiver knows how many bytes to expect. Think of what life

would be like without the length argument. All messages would be exactly the same size, would probably be padded with

a lot of empty bytes, and receivers would _still_ need to figure out how many bytes are non-empty.
+--------+------+------------------+
| Length | Type | Data ...         |
+--------+------+------------------+
Because SendMsg is split-phase, the send() call returns immediately. If the lower layer has space to take ownership of

your packet buffer, it returns SUCCESS, and you are guaranteed to eventually get a sendDone() event containing that

buffer. If the lower layer doesn't have space to store your pointer, it returns FAIL and you'll never get a sendDone().

These are the terms of the contract offered by SendMsg. In return, your component must not modify the contents of the

buffer while the lower layer is holding it.


Many typical networking layers use "copy semantics" -- the data is copied when the lower layer takes hold of it, and then

you are free to modify your own copy. However, copy semantics require dynamic memory allocation for packet buffers,

which gets tricky in memory-constrained environments. This is why TinyOS uses ownership transfer semantics. The

memory allocation is statically done by the top-level components, and ownership of the buffers is transferred around as

they move down through the protocol stack. When you call send(), ownership of that buffer will actually pass through
several different components, making it even more important that the buffer is protected from accidental modification.


Thus, you often see components create locks around their message buffers:


implementation {



    TOS_Msg msgBuf;

    bool msgBufBusy;



    command result_t StdControl.start() {



        if (!msgBufBusy) {

          msgBufBusy = TRUE;



          msgBuf.data[0] = 0x7;



          if (call SendMsg.send(TOS_BCAST_ADDR, 1, &msgBuf) == SUCCESS) {


                // wait for the sendDone

            } else {

                msgBufBusy = FALSE;

            }

        }

        return SUCCESS;

    }



    event result_t SendMsg.sendDone(TOS_MsgPtr buf, result_t success) {

        msgBufBusy = FALSE;

        return SUCCESS;

    }

}

If we assume that StdControl.start() will only be called once, then this lock is unnecessary. But, if some other component happens to initialize your component a second time while the send() call is in progress, the transfer semantics will be broken. Thus, it's good practice, if tedious, to protect each message buffer with a lock. Don't forget to check the return value of send(), either: if you lock the buffer and then send() returns FAIL, you'll never get the sendDone() event that unlocks your buffer, and deadlock will result.


Allocating your own buffers, building locks for them, and transferring them may seem like busy-work when compared to

the simple copy semantics of more dynamic environments. Honestly, it is. There's always more research to be done in
improving the programmer's experience. If you have any ideas, I'd love to hear them.


ReceiveMsg also provides an interface contract. When your component is done processing the data in the received

message buffer, you must return a message buffer to the lower layer. You could return the same message that was just

passed to you, which is the simplest option. But, let's say that your processing might take a while and you would prefer to

do it in another task. However, saving the buffer pointer, posting your task, and returning the buffer pointer could break

the transfer semantics. Two components, yours and the lower layer, could simultaneously modify the buffer. So, you at

least need to copy the data from the message into your own local storage, post your task, and then return the buffer

pointer. However, copying a bunch of bytes does take some time and energy, so TinyOS provides you with a second
option: you can return a different TOS_Msg buffer. This is the buffer swap. You allocate a buffer just so that you can pass

it back when you get the receive() event, while storing a pointer to the buffer you've just received. When you receive() a

second message, you can return the original pointer again. Thus, ownership transfer semantics are maintained without

copies. You'll have to decide whether you'd rather allocate and copy, or allocate and swap. "Best practices" are still
emerging.
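As a concrete illustration of the swap option, here is a minimal sketch (the names SwapM and processMsg are illustrative, not from a real component). It holds onto the received buffer for later processing in a task, and immediately hands back a statically allocated spare:

module SwapM {
    provides interface StdControl;
    uses interface ReceiveMsg;
}
implementation {

    TOS_Msg spareBuf;      // statically allocated second buffer
    TOS_MsgPtr ownedPtr;   // the buffer we can hand back to the radio
    TOS_MsgPtr heldPtr;    // the buffer we are currently processing
    bool busy;

    command result_t StdControl.init() {
        ownedPtr = &spareBuf;
        busy = FALSE;
        return SUCCESS;
    }

    command result_t StdControl.start() { return SUCCESS; }
    command result_t StdControl.stop() { return SUCCESS; }

    task void processMsg() {
        // ... examine heldPtr->data here, at leisure ...
        ownedPtr = heldPtr;   // done: the held buffer becomes the new spare
        busy = FALSE;
    }

    event TOS_MsgPtr ReceiveMsg.receive(TOS_MsgPtr buf) {
        TOS_MsgPtr swap;
        if (busy) {
            return buf;       // still busy: hand the buffer right back, dropping the message
        }
        busy = TRUE;
        heldPtr = buf;        // keep the buffer we just received
        swap = ownedPtr;      // and return the one we own instead
        post processMsg();
        return swap;
    }
}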


Supporting the buffer swap on receive leads to another design decision: in TinyOS, all message buffers are the same

size. When you return a buffer to the lower layer, it will use that buffer to hold data bytes as they come in over the radio.

If the buffer runs out of room, the bytes will be lost. One component returning a short buffer could cause another

component to not get a packet. Really, making all the buffers the same size increases the predictability of the system,


and lessens the chance that one component will cause another to fail unexpectedly. If you think about dynamic allocation, one component may allocate a bunch of memory, causing another component's allocation to fail unpredictably. The lessening of accidental inter-component interaction is another benefit of the static approach. In practice, the actual number of bytes available to you in the data field is given by the TOSH_DATA_LENGTH macro. If you pass a length argument larger than this, the send() call will return FAIL.


In the earlier example, we saw code that edited the data field of the TOS_Msg struct directly. In practice, it's far more

efficient to define a C struct for each type of message you plan to send, cast the data pointer to that struct, and then
manipulate the fields in the structure. Here is a typical TinyOS message definition:


enum {

     AM_TALKERMSG = 3,

};



typedef struct TalkerMsg {

     uint16_t reading;

} TalkerMsg;

Then, you cast the data pointer, as in C:

     TalkerMsg *talkerMsg = (TalkerMsg *)&msgBuf.data[0];

     talkerMsg->reading = 0x7;

In addition to providing convenience on the nesC side, defining your structures in the above manner provides

convenience when you communicate between a PC and a mote. There's a different invocation for the nesC compiler,

called "mig", that takes the name of a structure as input, as well as the nesC files defining it, and then will produce a Jav a

class as output. This java class provides get and set methods for each field in the struct, relieving you of the need to

manipulate the bytes yourself. As long as you define an enum with the prefix AM_, followed by the name of the struct,

followed by the type of the message, the Java message receiving class can automatically instantiate the correct structure

parsing class for messages of that type. There'll be a special lecture devoted to the Java tools later.


Finally, if all of these design decisions on the endpoints seem confusing and/or arbitrary, you should know that

communication abstractions for wireless sensor networking are still an extremely active area of research. The solution

I've just presented is only one of many possible solutions, and while it is certainly possible to make a priori arguments

about minimizing harmful interactions or keeping things static, the proof of the pudding is always in the eating. Many
complex networking applications have been built in TinyOS, or perhaps in spite of it, but they have been built.




8. How do wireless sensor network nodes communicate?

Now that we've seen the practical details of the endpoints, let's take a look at how the data is actually communicated

between the nodes.


The radio link layer is given a message buffer to send, and immediately begins to flick a transmitter on and off in

accordance with the bits in the message. The receiver picks up these flickers, reassembles them into a packet, and
passes it up the stack. Simple, right? Hardly.


First, the airwaves contain electrical signals on many different frequencies all the time: noise. A receiving radio needs to

distinguish a signal from the noise, and to do so, it needs to detect that signal in the first place. In a wireless network, this
is done by means of a preamble. A preamble is a particular pattern of bits that is always sent before the packet:


+----------+--------+------+----------+
| Preamble | Length | Type | Data ... |
+----------+--------+------+----------+
A receiving radio is constantly sampling the radio channel to determine whether the energy level is high enough or the

signal characteristics are right enough to constitute a bit. If it detects one, it begins to sample frequently and at a

particular rate defined by the radio standard. Each time, it looks for the next bit of the preamble. If it gets the right bit, or

possibly absence of a bit, it continues to sample the channel. If the entire preamble has been received successfully, the

radio can be reasonably sure that it is about to receive a packet. If there were any errors in the preamble, the radio might

assume it was being tricked by noise and stop sampling.


Once the preamble has been received, the radio needs to determine how long to keep sampling in order to get the entire

packet. If all packets were the same length, this would be trivial. That's why the first byte is the length byte. Once this

byte is received, it is immediately converted to a number of bits, and after receiving that number of bits, the radio stops
sampling.


We mentioned noise that might cause errors in the preamble, preventing packet reception. The same noise might cause

errors in the packet, which must be corrected before passing up the packet, or simply detected so that the packet can be
discarded. This is done by calculating a CRC code for the packet prior to sending, and appending it to the packet body.


+----------+--------+------+----------+-----+
| Preamble | Length | Type | Data ... | CRC |
+----------+--------+------+----------+-----+
Once the correct number of bits have been received, the same CRC is calculated over the received packet and

compared to the received CRC. If the CRCs match, the message is probably correct. If they don't, it could be because

there are errors in the received message. Or, the message could be fine, but the errors are actually in the received CRC.

The conservative thing to do is throw it out anyway. Compared to the alternative of sending a 1 and getting a 2 on the

other side, the conservative approach sounds pretty reasonable, right?
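As a sketch, the CRC computation and check might look like the following. This is the standard bitwise CRC-16-CCITT, in the style of the crcByte() routine in the TinyOS system code; packetOk() is a hypothetical receive-side helper, not an actual radio stack function:

uint16_t crcByte(uint16_t crc, uint8_t b) {
    uint8_t i;
    crc = crc ^ ((uint16_t)b << 8);
    for (i = 0; i < 8; i++) {
        if (crc & 0x8000)
            crc = (crc << 1) ^ 0x1021;   // the CRC-16-CCITT polynomial
        else
            crc = crc << 1;
    }
    return crc;
}

// Receive side: compute the same CRC over the received bytes
// (length, type, and data) and compare it to the received CRC.
bool packetOk(uint8_t *pkt, uint8_t len, uint16_t receivedCrc) {
    uint16_t crc = 0;
    uint8_t i;
    for (i = 0; i < len; i++)
        crc = crcByte(crc, pkt[i]);
    return crc == receivedCrc;
}

The sender runs the same loop over the outgoing bytes and appends the result to the packet.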


But, what's causing all this noise? The sun? WRCT? Probably. However, the amount of noise caused by these sources

pales in comparison to the noise produced by a second radio transmitting bits at the same time. When two radios

transmit at the same time, any receiver in range of both is likely to receive totally garbled bits. The problem of arbitrating

between multiple nodes as they try to access the same channel is called Media Access Control, or MAC. MAC protocols
can be divided into two types: schedule-based and contention-based.


Some radio protocols solve this problem by giving each node a different frequency to transmit on. This is FDMA,

Frequency Division Multiple Access. It's the approach taken by FM radio stations, for example. But, have you ever tried

getting your car radio to listen to multiple stations at the same time? In a network, listening to several frequencies at once matters, but doing it requires a

separate tuning circuit for each frequency. This is expensive, so it isn't often done. In addition, if you have more nodes

than you have frequencies, you'll have to allocate frequencies to nodes. This is often done by requiring each node to

start transmitting on a particular known frequency, commonly called the "hailing" frequency, and request a frequency

assignment. Some authoritative node must respond on the hailing frequency, and then the sender must switch to the

new frequency. If there aren't enough frequencies, the sender doesn't get a slot. This is basically how those brick
cellphones from the 80's worked. However, on the hailing frequency, nodes might still transmit at the same time!


Other radio protocols solve this problem by giving each node a scheduled time to transmit, such that no two nodes will be

transmitting at the same time. There's still random noise, but most of the packets will get through. This is TDMA, or Time

Division Multiple Access. TDMA is most commonly used in star-topology networks, because a central server can assign

each node a non-conflicting schedule. Establishing a TDMA schedule in a mesh network is a harder problem, because of

the desirability of spatial schedule reuse. If every node in the mesh was assigned a separate time to transmit, the

network bandwidth would be quite underutilized. If two sender-receiver pairs are completely out of radio range from each

other, then it would be wasteful for them _not_ to transmit at the same time. Actually establishing an efficient TDMA

schedule among distributed nodes requires extremely careful coordination, and continual adaptation to the changing

radio environment due to external noise. The scheduling problem is isomorphic to the graph coloring problem, where any

two nodes within radio range are modeled as an edge in the graph. Determining an efficient schedule is equivalent to

giving each node a color such that no two connected nodes share the same color. As you may remember from

algorithms, graph-coloring is NP-hard, and certainly doesn't scale well with increasing numbers of nodes. In addition, it's

most easily done with information about all the nodes -- centralized. There is active work on the distributed version of this
scheduling problem, and if you're interested, I'll put up a link to the paper.


There's a third commonly-used media access control scheme called CDMA, for Code Division Multiple Access, but fully

explaining it will require a bit more theory than I have. To assign a "slot" to a node or set of nodes, the nodes must be

given a unique code. This code is then used to spread the signal over a subset of the frequency range in a unique way.

The code can then be used by the receivers to reassemble the spread signal. You can think of it like FDMA, except with

a bunch of frequencies that add up to the same amount of total bandwidth. In fact, the radio on the Telos motes uses

spread-spectrum, and each mote can choose a code. In practice, we give all motes the same code, to ensure that any

mote can talk to any other at the same time. But you may want to give your motes a different code than your neighbor
does.




Within any one of these systems, however, contention can still be found. Two nodes could be assigned the same

frequency by accident, or the same schedule, or the same code. Two nodes might hail on the hailing frequency or code

at the same time, or request a schedule assignment within the hailing timeslot. Solving these problems requires a contention-based MAC.


In a contention-based MAC, nodes independently try to avoid transmitting at the same time without an external schedule.

The simplest form of contention-based MAC is called CSMA, for Carrier Sense Multiple Access. Before sending the

message preamble, each node listens to the channel for a short time window. This is Carrier Sense. If the node detects

RF energy on the channel, it assumes that another node is currently sending a packet and delays sending its own

message for a time. After that time, it samples the channel again, and may possibly back-off again if it hears more

energy. Note that this will prevent collisions only as long as two senders don't try to send at exactly the same time. If they do, both senders sample the channel, find it clear, and start sending simultaneously. To lessen the chance of this, the initial waiting period is chosen randomly, as is the backoff period.


The exact values for the sampling time, backoff time, and how the backoff time grows are tunable. The time required to

send one bit is 0.004ms, for reference. In our implementation, the time spent listening before sending a message is a random value

chosen uniformly from the range 0.32ms to 5.12ms. Note that approximately 80 bits could be sent during the minimum

backoff period, and 1280 bits could be sent while the radio is waiting for the maximum backoff period. Thus, using an

initial backoff like this will always lower the channel capacity from the theoretical maximum. Then, if a packet is heard

during the initial backoff window, the radio waits for a random interval chosen between 0.32ms and 20.48ms. Once the

backoff period ends, the radio again begins to wait for the initial backoff period, and may repeatedly back off up to 8

times before giving up on the message. The point of these details is this: CSMA works astoundingly well under low

contention, but if many nodes are trying to transmit at the same time, nodes are likely to spend far more time waiting to
transmit than actually transmitting.
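Here is an illustrative sketch of that backoff loop, using the numbers above converted to bit-times (1 bit = 0.004 ms, so 0.32 ms = 80 bits, 5.12 ms = 1280 bits, 20.48 ms = 5120 bits). The helpers randRange(), waitBitTimes(), channelClear(), and transmit() are hypothetical, and the real MAC interleaves this with the rest of the radio stack:

enum {
    INITIAL_MIN  = 80,     /* 0.32 ms  */
    INITIAL_MAX  = 1280,   /* 5.12 ms  */
    CONGEST_MIN  = 80,     /* 0.32 ms  */
    CONGEST_MAX  = 5120,   /* 20.48 ms */
    MAX_ATTEMPTS = 8
};

bool csmaSend(void) {
    uint8_t attempt;
    for (attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
        // initial backoff: wait a random time, then sample the channel
        waitBitTimes(randRange(INITIAL_MIN, INITIAL_MAX));
        if (channelClear()) {
            transmit();
            return TRUE;
        }
        // heard a packet: congestion backoff before trying again
        waitBitTimes(randRange(CONGEST_MIN, CONGEST_MAX));
    }
    return FALSE;   // give up on the message after 8 backoffs
}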


Considering the backoff algorithm described above, your application can't really assume anything about when a packet

given to the radio stack will actually be sent. This places a limitation on the kind of applications that can run over a
wireless sensor network, of course. However, the common applications don't require highly time-bounded data traffic.


Actually determining whether the energy on the channel is a packet or not can be tricky. Sampling the channel and

hearing "packets" when no other node is transmitting will lower the effective channel throughput, as nodes incorrectly

back off. A single sample when there's no packet may actually be quite high or low, due to the variability of noise, but the

true mark of a packet is sustained high energy on the channel. Accordingly, TinyOS nodes take a few samples of the
channel before sending, and if any low-energy outliers exist, the node assumes that it's hearing noise rather than a packet, and treats the channel as clear.


But, there's a big problem with CSMA. Consider 3 nodes, X, Y, and Z. X and Z each wish to transmit a message so that

Y receives it. X and Z are in radio range of Y, and out of range of each other. X samples the channel, hears that it's clear,

and begins to send a packet. Z does the same. Y receives both signals at the same time, and loses both packets. This is

the Hidden Terminal problem. Really, the sender can't actually tell whether the receiver will be able to receive its

message, and uses carrier sensing as an approximation. Carrier sensing is a good approximation, but does not scale



well to high traffic loads. Fortunately, wireless sensor networks aren't generally intended for high-traffic scenarios, due to
energy cost and the low frequency of most environmental phenomena.


Ethernet also uses CSMA on a single shared wire, but with Collision Detection. Because all nodes can hear all other

nodes, it's possible for a sender to actually detect a collision when it happens and retry the send after backing off. In the

general case, the hidden terminal problem makes this impossible in wireless networks. In addition, Ethernet uses an

exponential backoff, while our MAC uses a linear backoff. In binary exponential backoff, the size of the random interval is

doubled each time the message is not sent, while it stays the same size in linear backoff. The effect of exponential

backoff is to quickly reduce the sending rate under congested conditions in order to prevent the network from entering
"congestion collapse".


Key points:


        WSN nodes communicate by exchanging packets.

        Packets contain a length, a dispatch type, and a data payload.

        Preambles are used to detect packet starts, and CRCs detect errors.
        Frequency or Code division is used for coarse separation.

        Carrier sensing is used for fine separation.

        The Hidden Terminal Problem makes CSMA less effective.
        The communication primitive is "broadcast to all nodes in range".




9. How do WSN nodes build a reliable neighborhood abstraction over local

    broadcast?

All shared wireless networks (and Ethernet) share a common characteristic: a message sent by any node will be

received by all other nodes in range. The fundamental communication primitive of wireless networking is _broadcast_.

But, you might want to send packets to a specific node. To make this easier, each node can be given a numeric address,

and then the destination address can be placed into the packet. The receiving radio, in hardware or software, can discard

packets with a different destination address. When a message really should be broadcast to all nodes, a special
broadcast address is put in the packet. By convention, this address is all ones.


+----------+--------+------+--------------+----------+-----+
| Preamble | Length | Type | Dest Address | Data ... | CRC |
+----------+--------+------+--------------+----------+-----+
The first argument to the SendMsg.send() call is the destination address. TOS_BCAST_ADDR represents the special

broadcast address. When you compile a program and install it onto a node, you must give it a link-layer address between

0 and 65534. When installing, you give the address as an argument to the install flag: "make telosb install,1" to give the

node the address of 1. I tend not to use 0, and consider it an invalid address. If you're keeping a table of node addresses,

it helps to have a value that means "this entry is empty".


Once you've placed addresses into the link-layer packets, you can start to use link-layer acknowledgements. After

receiving a data packet, the node can send a special packet that acknowledges the receipt of the original data. If the

sender doesn't receive an ACK within a predetermined timeout, it can assume that its data was lost on the air, and

possibly retry its send. This can greatly increase the reliability of a wireless link. If you want to use ACKs in your own

TinyOS applications, you can turn them on by wiring to the MacControl interface of the CC2420RadioC component, and

then calling MacControl.enableAck() any time after StdControl.init() is finished. In the sendDone event, you can test

whether the packet was correctly acknowledged by the recipient by examining the "ack" field of the TOS_Msg structure.
A 1 indicates that the ACK was successful.
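A sketch of the pattern, assuming a module (call it AppM) wired as AppM.MacControl -> CC2420RadioC; alongside its usual SendMsg wiring:

command result_t StdControl.start() {
    call MacControl.enableAck();   // any time after init() has completed
    return SUCCESS;
}

event result_t SendMsg.sendDone(TOS_MsgPtr buf, result_t success) {
    if (buf->ack == 1) {
        // the recipient acknowledged the packet
    } else {
        // no ACK heard: the packet may have been lost; consider resending
    }
    return SUCCESS;
}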


Some high-bandwidth links like 802.11 use the RTS-CTS protocol to increase the chance of successful packet reception.

RTS-CTS is a bidirectional protocol, requiring interaction between a sender and a particular recipient. Thus, it relies on

an address-based neighborhood abstraction. Before sending, instead of sampling the channel, a node sends a Request

To Send to a destination address. The destination then sends back a Clear To Send with the sender's address. Once the

sender receives the CTS, it sends its data packet, and then waits for the ACK. The RTS packet tells nodes around the

sender to be quiet, just like a regular send does in CSMA. But, the CTS packet tells nodes around the _receiver_ to be

quiet, which CSMA can't do. This helps to lessen the impact of the hidden terminal problem. This extra "be quiet" signaling is called Collision Avoidance, and the combination is CSMA/CA. However, there's a cost: in the RTS-CTS-DATA-ACK exchange, four packets cross the air for every data packet delivered, and more timeouts are incurred waiting for them. The extra packets and timeouts lower

the effective throughput of the channel, and sending them takes lots of extra energy. That's why we tend to see them in

high-bandwidth, high-energy links, but there are also implementations of RTS-CTS link layers for TinyOS. However, we
prefer to use the simpler, more energy and bandwidth efficient CSMA, possibly with ACKs if most of the communication
has a destination address.


Now that our neighboring nodes have addresses, we can begin to consider whether each node should try to maintain a

list of its neighbors, or "neighbor table". Keep in mind that any neighbor table is only an abstraction over a constantly

changing connectivity graph. When you have only a few nodes, and they aren't moving, maintaining a neighbor table is

easy. Allocate a number of slots, say, 4. Every time you hear a new neighbor, enter it into the table. In TinyOS, this

requires static arrays and for loops. However, deciding which neighbor a packet came from requires entering the source

address into the packet. Looking at the Ethernet packet format, every message contains the destination address as well

as the source address. In TinyOS, link-layer packets only contain a destination address. This was chosen as a

compromise between information and efficiency. Applications that need source addressing have to place the address
into the data portion of the packet, and interpret it correctly upon reception.
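Here is a minimal sketch of such a table, with illustrative names. It treats address 0 as the "empty slot" marker, as suggested above, and assumes the source address has been read out of the data portion of the packet:

enum { MAX_NEIGHBORS = 4, EMPTY_ADDR = 0 };

typedef struct NeighborEntry {
    uint16_t addr;      // source address, taken from the packet payload
    uint16_t reading;   // most recent sensor reading from this neighbor
} NeighborEntry;

NeighborEntry table[MAX_NEIGHBORS];

// Insert or update a neighbor; returns FAIL if the table is full.
result_t noteNeighbor(uint16_t addr, uint16_t reading) {
    uint8_t i;
    uint8_t empty = MAX_NEIGHBORS;
    for (i = 0; i < MAX_NEIGHBORS; i++) {
        if (table[i].addr == addr) {
            table[i].reading = reading;      // existing entry: update it
            return SUCCESS;
        }
        if (table[i].addr == EMPTY_ADDR && empty == MAX_NEIGHBORS)
            empty = i;                       // remember the first empty slot
    }
    if (empty == MAX_NEIGHBORS)
        return FAIL;                         // full: the caller must decide whether to evict
    table[empty].addr = addr;
    table[empty].reading = reading;
    return SUCCESS;
}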


One interesting use of a neighbor table is to maintain the most recent sensor reading taken by each one of your

neighbors, so you can compute a function over them. You might want to determine whether you have the largest reading,
or calculate the average reading, for example.


A second interesting use of a neighbor table is to estimate the quality of the links between the node and its neighbors.

How can you estimate link quality? One way to do it is to send packets with a sequence number. If each sent packet

contains a number that increases with every sent packet, then it becomes possible for neighbors to record the most

recent sequence number that has been heard from each neighbor. When a new packet is received, it can be compared to

the last heard sequence number, and if the difference is any larger than 1, packets have obviously not been received.

Thus, it's possible for the node to count how many packets have actually been received, and how many packets should

have been received. Dividing these two numbers will give an estimate of the rate of successful receptions for packets

traveling from the neighbor to the maintainer of the table. Granted, you don't need a neighbor table to perform link quality

estimation as long as you only want to estimate link quality to one neighbor. But, a neighbor table with a size of 1 is still a
neighbor table.
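A sketch of this estimator, with illustrative names (the fields would live inside a neighbor table entry, and lastSeqno is assumed to be initialized from the first packet heard):

typedef struct LinkEst {
    uint8_t  lastSeqno;
    uint16_t received;   // packets actually heard
    uint16_t expected;   // packets the neighbor sent, inferred from seqnos
} LinkEst;

void noteBeacon(LinkEst *e, uint8_t seqno) {
    uint8_t gap = (uint8_t)(seqno - e->lastSeqno);   // wraps naturally mod 256
    e->expected += gap;       // a gap larger than 1 means packets were missed
    e->received += 1;
    e->lastSeqno = seqno;
}

// Reception rate, scaled to 0..255: received / expected.
uint8_t linkQuality(const LinkEst *e) {
    if (e->expected == 0)
        return 0;
    return (uint8_t)(((uint32_t)e->received * 255) / e->expected);
}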


However, as we learned earlier, wireless links are not guaranteed to be symmetrical. The link quality for packets traveling

from the table maintainer to the neighbor must be estimated by the neighbor, and if a node needs a bidirectional link

quality estimate, the neighbors must exchange their inbound link quality estimates with each other. This costs more

energy, and application designers must decide whether the potential asymmetry is important enough to justify
exchanging this information.


The difficulties in neighbor table management start to arrive from two factors: you may have more neighbors than slots,

and some neighbors may leave your neighborhood temporarily or permanently. If you hear from a new neighbor with a

full table, you have the option to ignore the new neighbor. You also have the option to evict an entry in the table. Ignoring

the new neighbor will place an upper limit on the size of your neighborhood, which may limit your ability to process data,

but evicting an entry requires choosing which entry to evict. For those of you who have studied operating systems,
neighbor table management is analogous to the cache management problem.



Evicting a "good" neighbor could be harmful to system performance, so you want to pick a "bad" neighbor, but the

definition of a "bad" neighbor varies based on the application. In the application above, a "bad" neighbor might be a

neighbor that hasn't reported a reading for a long time. Perhaps the node moved or died. In order to determine the age of
each entry, you could consider including a timeout.


+---------+---------+------+
| Address | Timeout | Data |
+---------+---------+------+
| Address | Timeout | Data |
+---------+---------+------+
Each time you hear from a node, you can reset the timeout to zero. Otherwise, you can slowly increase the timeouts, and

choose the maximum timeout to evict. You might also do it in reverse, and automatically evict neighbors that you haven't

heard from within some interval. Upon hearing a message from a neighbor, you'll set the timeout value to maximum.

You'll then find the minimum remaining timeout, and set a timer to fire in that amount of time. When the timer fires, you

can evict a node that has not been reset, or wait longer. Both of these can be considered as variants on Least-Recently-

Used, or in this example, Least-Recently-Heard. As such, the standard "second chance" optimization to LRU can be used also. Whenever a neighbor is heard from, set a bit in its entry to 1. Pick a check interval. Every time the interval elapses, scan the table: evict any entry whose bit is still 0, and clear the remaining bits for the next round. Unfortunately, a single missed message can result in a neighbor eviction, which is not a recipe for a stable table.


What we really want from a neighborhood is a stable list of neighbors who we have heard from recently and frequently.

One recent innovation for maintaining a table like this is the Frequency algorithm. Each table entry is given a counter.

When an existing node is heard from, its counter is increased. When a new node is heard from, it is given a counter

value of 1. A new node can be placed into an empty slot, or if no slots are available, it can replace a node with a counter

value of zero. But, if all the counters keep increasing, how will a node ever get a counter of zero? If a new node is heard

from and no nodes have a counter of zero, then every node's counter is decremented by one, and the new neighbor is

not entered into the table. In the LRU-style algorithms, a new neighbor was always entered into the table. Here,

neighbors that are not in the table can still have an effect on it, by providing pressure that decreases the value of the

current neighbors. If a neighbor not in the table has provided enough pressure to cause an existing neighbor to be driven
down to zero, it can enter the table. Frequency tends to favor a stable neighbor table, because if all neighbors are

equally good, then the initial neighbors are preserved. However, a "malicious" node could easily enter into the Frequency table by sending packets more frequently than the others. Frequency works best when all neighbors agree to use the same beacon period.
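A sketch of the Frequency policy as just described (illustrative names; address 0 again marks an empty slot):

enum { FREQ_SLOTS = 4, FREQ_EMPTY = 0 };

typedef struct FreqEntry {
    uint16_t addr;
    uint8_t count;
} FreqEntry;

FreqEntry freqTable[FREQ_SLOTS];

void frequencyHeard(uint16_t addr) {
    uint8_t i;
    uint8_t victim = FREQ_SLOTS;
    for (i = 0; i < FREQ_SLOTS; i++) {
        if (freqTable[i].addr == addr) {
            freqTable[i].count++;            // existing neighbor: count it
            return;
        }
        if (freqTable[i].addr == FREQ_EMPTY || freqTable[i].count == 0)
            victim = i;                      // an empty or zero-count slot
    }
    if (victim != FREQ_SLOTS) {              // adopt the new neighbor
        freqTable[victim].addr = addr;
        freqTable[victim].count = 1;
    } else {                                 // table full: apply pressure instead
        for (i = 0; i < FREQ_SLOTS; i++)
            freqTable[i].count--;
    }
}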


Other applications might need an eviction policy not based on time or frequency. Some might want to evict the neighbor

with the lowest sensor reading, or the lowest measured link quality. Neighbor table management is a rich topic with much
research, and we've only scratched the surface.


Key points:


   Addresses are an abstraction on top of local broadcast.

   Addresses enable ACKs.

   RTS-CTS can avoid collisions, but costs a lot.

   Addresses enable neighbor tables, which must be actively managed.

   Neighbor tables build an abstraction of point-to-point links.

   Sequence numbers can estimate the quality of such links.

   Least-recently-heard is a simple neighbor table management algorithm.

   Frequency maintains a stable list of the most-heard-from neighbors.
   Neighbor eviction logic can also be application-specific.




10.       How can data be efficiently disseminated to the nodes in a multi-hop

    wireless sensor network?

We've been talking about single-hop communication in sensor networks, but multi-hop communication is required in

order to communicate with a widely-distributed sensor network.


Why is dissemination useful? Without it, you have no control over your wireless sensor network. One common mode of

control is to disseminate commands: small units of data that direct the network to do something useful. A second mode is

parameter dissemination: small units of data that modify how the network functions, perhaps by changing a sample rate

or detection threshold. We might even want to disseminate all of the data in a new binary, so that the nodes can reboot

into a wholly new program. This falls under bulk dissemination. Between the two, we may want to disseminate complex
queries, or small virtual machine programs that take a few packets to complete.


All of these disseminations use the same basic idea. We can use the following framework to think about developing a
dissemination algorithm. When a packet arrives at a node, the node responds by making four decisions:


        when to retransmit the message

        which link to retransmit the message on

        how to modify the message before retransmitting it
        how to modify local state before retransmitting it


Answering "never" to the first question rules out the possibility of moving data over multiple links to a destination.

Answering "sometime" opens up the possibility of multi-hop networking.


Let's take a look at one simple set of answers to these questions:


        Immediately

        All of them, in the broadcast address

        Don't modify it
        Don't modify anything


The resulting protocol is the infinite flood. Upon receiving a message, every node immediately retransmits it. A message

disseminated in this way has the potential to reach every node within a connected network.


However, the infinite flood has one problem that makes it entirely impractical: it never stops. A message sent by node X

may be received by node Y, and then retransmitted to node X, and retransmitted forever. Let's make one modification to

the second question: all of them, except the link the message came in on. This would prevent single-hop loops, but could

still lead to multi-hop loops in networks containing cycles, as most do. So, we need to make the decision based on the
message, not on the link.



Ideally, each node would only retransmit each message once, but doing this requires every node to keep record of the

messages that have been previously retransmitted. Recording every previously transmitted message would require
prohibitive amounts of storage, so we need a way to condense the contents of the message. The answer? Metadata.


One particularly useful instance of metadata is the sequence number, or version number. Each new piece of data is

given a new version number, and as long as the version number is greater than the most recent version number stored

on each node, each node will retransmit it. Otherwise, it will be dropped. Each node then stores the sequence number,
preventing the node from transmitting the same message twice. This results in the classic flood:


        Immediately, as long as the sequence number is newer

        All of them

        Maintain the sequence number
        Record the sequence number


However, version numbers are not without problems of their own. First, an ever-increasing version number will eventually

reach the upper limit of the allocated storage space. Perhaps a different message could be disseminated, with a
command that resets the main version number. But, a more elegant solution is to allow the version number space to

wrap around. By performing a signed subtraction and then checking whether the result is positive, zero, or negative, we

can implement a version number with half-space wraparound. If a new number is less than half of the total space ahead

of the old number, the result will be positive and the new number will be judged as newer. If the new number is more

than half the space ahead, it will be judged as older. However, a very old number, that is more than halfway behind the

current number, will be judged as newer. If a piece of data is injected with such an old number, perhaps because the

network was partitioned for a long time and some node missed a large number of new messages, it could be

disseminated just as though it were new.
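The half-space comparison itself is one line of signed arithmetic. A sketch for an 8-bit sequence number (the same trick works at 16 or 32 bits):

// TRUE when a is ahead of b by less than half the number space.
bool seqnoNewer(uint8_t a, uint8_t b) {
    return (int8_t)(a - b) > 0;
}

For example, seqnoNewer(5, 250) is TRUE, because 5 is 11 steps ahead of 250 with wraparound. But a number more than half the space behind the current one is also judged newer, which is exactly the failure mode just described.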


This is a bigger problem if we are permanently storing the piece of disseminated data than if we are just processing it

and throwing it away. A misbehaving node could easily inject an old piece of data that overwrites a newer one.


Second, disseminating a new piece of data requires choosing a sequence number newer than the number currently

present in the network. If the node that wants to disseminate the data has not been following previous messages, like a

mobile PC-class node wanting to send a message into a sensor network, the sending node must request the version

number from a node that has been following and then increment it before sending the new data. Or, a sending node

could inject the message with a flag that directs the receiving node to increment its stored sequence number before
retransmitting.


In addition to the challenges of managing a version number, the classic flood is particularly inefficient in broadcast-based wireless networks for two reasons:


        missed messages
        sender overlap




In a classic flood, each node only transmits once. If there's enough noise to result in a node missing all of the messages

that its neighbor sends, it will never participate in the dissemination. This is the unreliability problem.


If two senders are both in range of a receiver, and both senders retransmit the message, then the receiver will receive it

twice. This is the redundancy problem. In fact, the redundancy problem can exacerbate the effects of the unreliability
problem.


Both of these problems grow in magnitude as the density of the network increases. What we would like is a density-
independent dissemination protocol.


The first problem can be solved with retransmission. Transmit the message multiple times. But, this drastically increases

the demands on the network, even beyond a complete O(n) flood. In addition, limiting the retransmissions doesn't fix the
problem. Thus, every node should be prepared to retransmit, forever.


The second problem can be solved with suppression. During the random backoff window, if a node overhears another

node transmit the same metadata, the same sequence number, then the node can cancel or delay its own transmission.

This would lessen the chance of sender overlap and lower the load on the network, in a manner that is independent of

density. However, it comes at the cost of lowering the chance that the message would reach the entire network, if the
suppressed node is the only gateway between the covered and uncovered portions of the network.


Fortunately, these two solutions work together. Suppression minimizes the chance of network overload caused by

retransmission. Retransmission minimizes the chance of missed messages caused by suppression. When the two are

combined, we have a reliable dissemination protocol, caused by the retransmissions, and a bandwidth-efficient protocol,
caused by the suppressions. This is the Trickle protocol.


Basic primitive: "every so often, a node transmits ... metadata if it has not heard a few other motes transmit the same
thing". This is called "polite gossip".


Each node maintains a timer that expires in t, a second timer that expires in T, a suppression counter C, and a
suppression threshold k (usually 1).


Here's the protocol:


- Hear a node with older metadata:
  - send the new message immediately.
- Hear a node with the same metadata:
  - increment the suppression counter C.
- Hear a node with newer metadata:
  - set T to T_l.
  - set the "trickle timer".
- t expires:
  - if the suppression counter C is less than the threshold k, retransmit.
  - else, don't retransmit.
- T expires:
  - double T, up to T_h.
  - set the "trickle timer".

- set the "trickle timer":
  - reset C to 0.
  - pick a new random t from the range [T/2, T].
  - set the t timer and the T timer.

This protocol results in a polite dissemination, in which every node only transmits if it has not yet been suppressed by

another equal transmission. This lessens the negative effects of simultaneous retransmission and overlapping senders.


As distinct from the classic flood, this protocol also results in an epidemically reliable dissemination. Because the nodes

continually retransmit the message forever, it's eventually guaranteed to reach every node. In the classic flood, where each node retransmits only once, a node could miss all of the messages.


The suppression will not complete without the infinite retransmissions, and the retransmissions would overload the

network without the suppression. They depend on each other, and together, they produce a polite, reliable dissemination
protocol.
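For concreteness, here is an illustrative Trickle state machine for one disseminated value, following the pseudocode above. The timer IDs, randUint32(), startTimer(), and sendMetadata() are hypothetical helpers, and the interval bounds are example values in milliseconds:

enum { T_L = 1000, T_H = 60000, K = 1 };

uint32_t T;   // current interval size
uint8_t  C;   // suppression counter

void setTrickleTimer(void) {
    uint32_t t;
    C = 0;
    t = T / 2 + randUint32() % (T / 2 + 1);   // uniform in [T/2, T]
    startTimer(LITTLE_T_TIMER, t);
    startTimer(BIG_T_TIMER, T);
}

void heardOlderMetadata(void) { sendMetadata(); }   // send the new data now
void heardSameMetadata(void)  { C++; }
void heardNewerMetadata(void) { T = T_L; setTrickleTimer(); }

void littleTExpired(void) { if (C < K) sendMetadata(); }
void bigTExpired(void)    { T = (T * 2 > T_H) ? T_H : T * 2; setTrickleTimer(); }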


This protocol is implemented as a TinyOS network layer by a component called DripC. It provides a Receive interface,
with an event that is signaled only when a new message arrives from another node:


interface Receive {

    event TOS_MsgPtr receive(TOS_MsgPtr msg, void* payload, uint16_t len);

}

This first interface is the network-layer counterpart to the TinyOS ReceiveMsg interface. All network-layer components

should provide this interface, and should signal Receive.receive() when a new message arrives. DripC will only signal

this when a message with newer metadata is received, so your component should only get the event once per new

message.


You'll notice that the address argument is missing: that's because network-layer addresses are different from link-layer

addresses, and have not necessarily been standardized to the 2-byte values we use elsewhere. In fact, Drip doesn't even use addresses. Every message will propagate to all nodes.


In addition, there are two extra arguments: the payload, which points inside the message to the beginning of the message contained _within_ the network-layer message, and the payload length.


+----------+-------------+----------------+---------+-----+
| Preamble | Link Header | Network Header | Payload | CRC |
+----------+-------------+----------------+---------+-----+
DripC exports this interface with a parameter. Here's how you wire to it:


TestDripM.Receive -> DripC.Receive[AM_TESTDRIPMSG];

DripC sends its messages on a particular Active Message ID, or link-layer dispatch type, and then provides a second

type field just for different messages being disseminated over Drip. Each Drip message gets its own Trickle timer, as

described above, and is flooded through the network independently of the others.


You may wonder where the Send interface, the counterpart of Receive, is. Well, DripC provides a special interface:


interface Drip {

    command result_t init();

    command result_t setSeqno(uint8_t seqno);

    event result_t rebroadcastRequest(TOS_MsgPtr msg, void *payload);

    command result_t rebroadcast(TOS_MsgPtr msg, void *payload, uint8_t len);

    command result_t change();

}

Here's how to wire it:

TestDripM.Drip -> DripC.Drip[AM_TESTDRIPMSG];

DripC.DripState[AM_TESTDRIPMSG] -> DripStateC.DripState[unique("DripState")];

Because every node must cache the latest Trickle message forever so that it can be retransmitted at any time, and there

may be several different kinds of Trickle messages (one for each Drip type), DripC gives the client component the

responsibility of caching the message.


When the Receive.receive() event is signaled, the client component must copy the payload into a local variable. Then,

when the Drip.rebroadcastRequest() event is signaled, the client component must copy the payload back into the given

payload pointer so that it can be retransmitted. After copying the payload, it must call the Drip.rebroadcast() command,
providing the original msg pointer, the original payload pointer, and the length of the payload.


If the client wants to send a new message into the network, it calls the Drip.change() command, which increments the sequence number, resets the Trickle timer, and will eventually cause the Drip.rebroadcastRequest() event to be signaled. Then the client can provide the new data.


The Drip.init() command must be called in StdControl.init(). That's because Drip must be initialized once for each
parameterized channel, not once overall like StdControl would do.
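Putting the contract together, a minimal client sketch looks like this (TestDripM and TestDripMsg are illustrative names, and the wiring is the one shown above):

typedef struct TestDripMsg {
    uint16_t value;
} TestDripMsg;

module TestDripM {
    provides interface StdControl;
    uses {
        interface Receive;
        interface Drip;
    }
}
implementation {

    TestDripMsg cache;   // the client caches the latest payload

    command result_t StdControl.init() {
        call Drip.init();    // once per parameterized channel
        return SUCCESS;
    }

    command result_t StdControl.start() { return SUCCESS; }
    command result_t StdControl.stop() { return SUCCESS; }

    event TOS_MsgPtr Receive.receive(TOS_MsgPtr msg, void* payload, uint16_t len) {
        memcpy(&cache, payload, sizeof(cache));   // copy the new data out
        return msg;
    }

    event result_t Drip.rebroadcastRequest(TOS_MsgPtr msg, void* payload) {
        memcpy(payload, &cache, sizeof(cache));   // copy it back in
        return call Drip.rebroadcast(msg, payload, sizeof(cache));
    }
}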


Because each channel must maintain its own timer and metadata, we need to allocate space in an array based on the

number of active channels. Thus, we need the unique() call. That's what the second line of wiring does: it connects the
Drip component to the DripState component, which holds one timer and metadata for each channel.


I'll be honest, this interface is a bit convoluted. If Drip could cache each message, instead of leaving that to the client,

then a client could call the Send interface when it has a new message to send and then let Drip copy the payload into the
cache and continually retransmit the message when necessary.




interface Send {

    command void* getBuffer(TOS_MsgPtr msg, uint16_t* length);

    command result_t send(TOS_MsgPtr msg, uint16_t length);

}

However, Drip would have to allocate an array of TOS_Msgs, one for each channel. If the size of the payload is less than

the size of a TOS_Msg, this would be wasteful. In addition, the client would have to allocate its own TOS_Msg in order to

call Send. So, we would have two TOS_Msgs, instead of one buffer exactly the size of the payload. If RAM gets more

plentiful, Drip might move to a more programmer-friendly interface.


Just so you know, you'll be using Drip as part of your last project.


So, Drip provides a service for disseminating a single message through the network, using the Trickle "polite gossip"

protocol. But, what happens if we want to disseminate a data object larger than a single message? This is a common

request, especially for the goal of "sensor network retasking". We may want to download an entirely new application

binary, let it spread through the network, save it to local storage, load it into program memory, and then reboot and start
running the new application.


There's a component that provides this service as well: Deluge.


The implementation of Deluge is not a typical network-layer protocol, because it does not provide an upward interface for

sending and receiving these large data objects. There's simply not enough memory to do the right amount of buffering.
Instead, it saves these images directly to local storage.


However, the Deluge protocol is general enough to support network-layer dissemination of large binary objects, so we'll

talk about it. The Deluge protocol is based on an earlier sensor network dissemination protocol called SPIN. It has three
phases, each with its own type of message:


         Advertisement

         Request
         Data


When Deluge is running, nodes use the Trickle algorithm to periodically broadcast a set of Advertisement (ADV)

messages. Each ADV message contains the metadata for one of the binary objects that can be disseminated over

Deluge. If the receiving node is "up to date" on each object, it suppresses its own ADV according to Trickle. This

minimizes the constant advertisement traffic.


Advertisement: source address, adv version, type?, node description, image description, number of images


Image description: guid, version number, image number, total pages, CRC over vnum and numpgs, pages complete


If the receiving node is older, it stores the new metadata. Then, it moves to Request phase. A requesting node asks the
advertising node for the data object by sending a REQ message.

Request: dest address, source address, version number, image number, page number, packet bitmap


The node responds by sending a sequence of DATA messages, each containing a fixed-size portion of the binary object.


Data: version number, image number, page number, packet number, data


When the node has received the entire object, it starts sending its own ADV messages with the metadata for the new
object.


However, this is an extremely simplified description of the Deluge protocol. Deluge has a multitude of features that make

it incredibly efficient at the challenging task of disseminating large objects through a mesh of small unreliable wireless
links.


The large image is divided into pages. Instead of transmitting the entire image before readvertising, nodes readvertise as

soon as they've received a page. This allows simultaneous transmission of page N close to the source while page N-1 is being transmitted further out in the network: spatial multiplexing.


Any node can satisfy a request, minimizing the chance of bad links. Suppression, again, prevents the problem of

overlapping senders. Nodes prefer to respond to the lowest requested page number, ensuring that earlier requests are
satisfied first so that the dissemination can proceed.


Finally, consider a dissemination, small-message or bulk, in which only a subset of the nodes have the right to forward

the message. If the subset is narrow and long, the dissemination could follow a path through the network. This looks an
awful lot like routing, which we'll talk about next time.




11.       How can data be collected from the nodes in a multi-hop wireless

    sensor network?

When the main purpose of a sensor network is to gather data from many points in space for human analysis, you need a

way to get it all out of the network. Perhaps we could use dissemination: let every node send it to every other node. Then,

we could put our bridge node anywhere in the network and access the data. In highly mobile networks, this is actually

close to what happens. In a project called ZebraNet, a mote was attached to each member of a whole zebra herd,

tracking its location and a bit about its environment. Clearly, this is a highly mobile scenario, and there's no way to

predict which zebra will be the "best" at getting data out of the network. However, as we learned before, simple

dissemination with only one piece of metadata doesn't work when two nodes try to send new data at the same time -- the

sequence numbers collide and the dissemination doesn't reach everyone. How could we make this work with

dissemination? Give each sender its own sequence number, and ensure that every other node has enough space to

store the newest data produced by every possible sender. This is a huge amount of state, with a huge amount of
synchronization. But, it works.


At its most basic level, routing is an optimization of dissemination. If we have information about the network that suggests

that some nodes need the data more than others, we can consider giving a subset of the network the responsibility of

forwarding the data towards those nodes. If only a subset has to retransmit, we save energy and state. However, once

we select this subset, we have to have some assurance that the subset we selected will remain the correct set of nodes

for a sufficiently long time. Otherwise, we'll be spending so much messaging energy to maintain the right subset that the

performance of the network would be impeded. So, routing works best in static scenarios, deployments in which most of
the nodes aren't moving at all. We'll start by assuming this stasis, and then look at some options for mobile networks.


The design of the Internet was based on the end-to-end principle: all of the intelligence should be at the endpoints,

especially human intelligence. In a sensor network, there's no human at the endpoints, and in a data collection

application, no intelligence. If all each node can do is collect data and send it, there's really no need for it to receive data

from other nodes. The only node that needs to receive data is the node with intelligence attached to it: a person, or at

least a database. This sort of network results in an extremely common traffic pattern: many-to-one. We don't have to set up routes so that any node can reach any other node (any-to-any), which greatly simplifies the routing problem.


This is "collection routing". There's only one sink, and every other node has only to select one neighboring node that can

assist it by forwarding its data to that sink. In turn, every node must be prepared to act as a forwarder for other nodes.

Constructing this routing structure requires minimal state: a 2-byte integer to store the address of one neighbor. It also

requires a bit of forwarding logic. When receiving a message, check whether it should be processed by collection routing.

This is done by picking a link-layer type for the collection routing protocol. If a collection message is received, and you

have previously stored the address of a closer neighbor, send it to that neighbor. The node might just send it and forget it,

but this is highly vulnerable to packet loss. Instead, once a node has selected a destination neighbor, it becomes

possible to use the link-layer acknowledgement mechanism to increase the reliability of the link. The node may


repeatedly try to send to its neighbor until it gets an acknowledgement. Then, it can discard the message and either start
forwarding the next, or wait for another to arrive.
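A sketch of that forwarding logic, with illustrative names. It reuses the buffer-swap pattern from the local communication discussion, assumes fwdPtr was initialized to a static TOS_Msg in StdControl.init(), and assumes link-layer ACKs have been enabled as described earlier:

uint16_t parentAddr;    // the single 2-byte piece of routing state
TOS_MsgPtr fwdPtr;      // the buffer we currently own
bool fwdBusy;

event TOS_MsgPtr ReceiveMsg.receive(TOS_MsgPtr msg) {
    TOS_MsgPtr swap;
    if (fwdBusy)
        return msg;                  // still forwarding: drop this one
    swap = fwdPtr;                   // buffer swap: keep msg, return our spare
    fwdPtr = msg;
    if (call SendMsg.send(parentAddr, msg->length, fwdPtr) == SUCCESS)
        fwdBusy = TRUE;
    return swap;
}

event result_t SendMsg.sendDone(TOS_MsgPtr buf, result_t success) {
    if (buf == fwdPtr) {
        if (buf->ack == 0) {
            // no acknowledgement: retry (a real forwarder would cap retries)
            call SendMsg.send(parentAddr, buf->length, fwdPtr);
        } else {
            fwdBusy = FALSE;         // delivered: ready for the next message
        }
    }
    return SUCCESS;
}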


Though the mechanisms of the forwarding protocol are quite simple, the difficulty comes in picking the contents of that 2-

byte integer. When we're not sure what to do, brute force it. Before installing each node, assign it the address of its

closest neighboring node by hand. If the nodes aren't moving, heck, this might even work. However, if the radio

environment changes even a bit, this neighbor might not remain the neighbor with the best chance of successfully

forwarding the message. Or, the neighbor itself might run out of batteries, or be stepped on by a zebra. Then, when one

neighbor becomes inaccessible, every other node that was recursively depending on that neighbor will also become
inaccessible.


In a real radio environment with constant changes, the choice of next-hop neighbor must adapt. OK, let's try the

centralized solution. Gather all of the connectivity data, using an application like your Project 3. Then, pick the best-

connected neighbor for each node that's closer to the sink. If some nodes become inaccessible, repeat the process.

However, actually collecting this data is impossible without walking around within the network, and it will quickly become

out of date once it's gathered. What we really need is a distributed solution, in which each node is responsible for picking
its own best next hop.


Here's the simplest distributed solution. The root node broadcasts a message. Every node that hears that message

records the root's address, then retransmits. If a node hears a beacon and has not yet selected a next hop, it selects the

first beacon it hears, and holds onto it forever. This is simple, but not adaptive. It also ignores link qualities. Rather, it

assumes each link is equally good and that it only needs to select the first one. However, we really want to respond to
different link qualities in the network.


There's already an algorithm we can use to do this: the Bellman-Ford shortest path algorithm. Bellman-Ford finds the
shortest path from one source node to every other node in the graph. Here's a refresher:


foreach node:

  if it's the source, node.distance = 0;

  else node.distance = infinity;

  node.next_hop = null;



foreach node:

  foreach edge between nodes u,v:

     if (v.distance > u.distance + uv.weight)

        v.distance = u.distance + uv.weight;

        v.next_hop = u;

The key addition here is the "distance" variable. For the rest of this lecture, we'll call it "cost", so as not to confuse it with

physical distance between nodes. In addition, because the root of the shortest-path tree is intended to collect data, we'll

call it the "sink".



Bellman-Ford is a centralized algorithm, but each iteration only requires local information: u.distance, v.distance, and
uv.weight for a pair of nodes. Thus, it's easily distributed. Here's the distributed version. The initialization is the same.


foreach node forever:

  send a message containing u and u.distance;



message from u received by v:
  if (v.distance > u.distance + uv.weight) {
    v.distance = u.distance + uv.weight;
    v.next_hop = u;
  }

The "foreach node forever" portion will continually "relax" edges, which will eventually converge to a shortes t-path tree.

Once this tree has been constructed, each node can forward messages down to the root.


However, we've neglected one term: uv.weight. How do we calculate it? Link estimation. uv.weight is exactly the output

of a link estimator. Like we discussed earlier, there are lots of ways to do link estimation. When a message is received,

we could look at the radio signal strength or we could measure the bit errors in the preamble. Or, we could maintain a

neighbor table, periodically send beacons, and use sequence numbers to estimate the packet success rate. Or, if we

hear a message from a node, we could just assume the link is perfect. The choice of link estimator, as long as each node

is using the same one, is actually quite separate from the problem of routing graph construction. For now, we'll assume

we can just conjure up a link estimate out of nowhere. The only requirement is that it be structured as a cost function:
lower numbers are more desirable than higher numbers.
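
As one example, here's a C sketch of the beacon-and-sequence-number estimator, structured as a cost function. The
struct layout and the 8-bit fixed-point scaling are assumptions, not part of any particular TinyOS component:

#include <stdint.h>

/* Per-neighbor state for a sequence-number link estimator (layout assumed). */
struct neighbor_est {
    uint16_t last_seq;   /* last sequence number heard */
    uint16_t received;   /* beacons we actually heard */
    uint16_t expected;   /* beacons the neighbor must have sent */
};

/* Call on every beacon heard from this neighbor; gaps in the
   sequence numbers count as losses. */
void on_beacon(struct neighbor_est *n, uint16_t seq) {
    if (n->expected == 0) {
        n->received = 1;
        n->expected = 1;
    } else {
        n->expected += (uint16_t)(seq - n->last_seq);
        n->received += 1;
    }
    n->last_seq = seq;
}

/* Cost function: expected/received in 8-bit fixed point. A perfect link
   costs 256; a lossy link costs more. Lower is better. */
uint16_t link_cost(const struct neighbor_est *n) {
    if (n->received == 0)
        return 0xFFFF;   /* never heard from: effectively infinite cost */
    return (uint16_t)(((uint32_t)n->expected << 8) / n->received);
}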


But, each time we access uv.weight, it might have changed since the last time. If it didn't, we wouldn't need an adaptive

algorithm in the first place. So, as the nodes continually send their beacon messages, the tree might never become
stable. If the tree is not stable, previously good paths may become less good in an unpredictable fashion.


The simple solution to this problem is to limit the adaptation time, by limiting the number of messages each node will

send. Consider the earlier broadcast protocol, in which each node sent exactly one message. Will this construct a tree that covers

the entire network? It depends on when the messages are sent. Let's say that all of the messages are sent at random

times, relative to each other. It's possible for a node with infinite distance to only hear from other nodes with infinite

distance and never select a next hop. In the worst case, all of the nodes will transmit before the sink transmits, and only

the nodes one hop from the sink will select it as a next hop neighbor. But, in the best case, the sink will transmit first.

Then, all of its one-hop neighbors will transmit. Then, its two-hop neighbors, and so on. What does this look like?
Dissemination. A flood.


One simple way to construct a routing tree is to flood a message out from the sink. When a node n hops away receives a

message from a node n-1 hops away, it obtains an instantaneous link estimate between itself and the sender. It then

runs one iteration of Bellman-Ford, either selecting a parent for the first time or selecting a better parent if the new

distance is lower. It waits for all of the other (n-1)-hop nodes to send, and then sends its own beacon message. This

gives a subset of the (n+1)-hop nodes a chance to select it as a neighbor. Under this flooding tree-build algorithm, no



node will transmit a beacon before it has selected a parent, that is, before it has a non-infinite distance. Thus, every message
reception will result in a parent selection, and a tree will be built.


So, we have two traffic patterns that can be used to compute a distributed Bellman-Ford algorithm and build a routing

tree: a random beacon by each node and a flood beaconing from the sink. Both can be made adaptive, by performing

them repeatedly. Flood beaconing will build a complete tree with fewer beacons, but both will eventually build a tree.

Trees built by flooding messages will propagate changes in link quality more quickly to the end of the tree. Trees built by

unsynchronized periodic random transmissions will take longer to propagate changes, and thus will take longer to settle
down.


The question then becomes: how accurate is this tree, and how stable is this tree? If we knew the exact bidirectional

quality of each link at all times, we could compute an optimal shortest-path tree. The trees our distributed protocols compute are
approximations whose quality depends on two factors: the accuracy of the link quality estimator, and the rate of parent change.


Looking at the flooding tree-build protocol with instantaneous link estimation, storing only the current next hop and the

current distance, we see a problem: what happens if we miss a beacon message from our next hop node? Our distance
stays the same, though the fact that we've missed a message suggests that the link quality estimate should be lowered.

We might hear routing messages from every other neighbor, but if the currently-selected next hop was the best, it will

never be changed even if it has ceased to be the best. If we have an idea of the expected beacon rate, we could

notice that a beacon was missed and increase the distance enough that the next time we hear a beacon from a better

next hop, we'll pick it. But how much is enough, especially when the link estimate is based on a physical phenomenon

like signal strength or bit errors? If we were storing a table with potential next hop neighbors, in addition to the one

current next hop neighbor, we could switch to the next best neighbor when we think we've missed enough beacons from

the current best neighbor. But, we don't have such a table. These problems suggest that a better treebuilding protocol

requires two things: a link estimator that operates independently of the routing beacons and can accurately respond to

decreases in link quality as well as increases, and a table of potential next hops that can be used as soon as the current
next hop link quality has decreased.


Whereas before we only stored our own next hop and distance, we now store the distance of each neighbor too.


So, let's enhance our algorithm:


message from u received by v:
  store u.distance in table;
  foreach neighbor n in table:
    nv.weight = evalLinkCost(n,v);
    v.potential_distance = evalCost(n.distance, nv.weight);
    if (v.potential_distance < v.distance) {
      v.distance = v.potential_distance;
      v.next_hop = n;
    }



Now, whenever a beacon message is received, we update one node's distance. We then re-evaluate all neighbors, and

choose the best. We've replaced the instantaneous link estimate with the "evalLinkCost" function, which is independently

updated. Now, we can detect that our current next hop has become worse than a potential next hop, and select it

immediately. In fact, the algorithm can now be further decoupled: there's no reason to run the next hop selection process

whenever a message is received.


Trees in which each node only maintains one next hop neighbor are likely to remain suboptimal for longer, and not

respond as quickly to changing link conditions. Trees in which each node maintains one next hop neighbor and several

potential next hop neighbors can adapt quickly to declining link quality by selecting the next best option. However, one-
neighbor trees require much less state.


When updating a distributed Bellman-Ford tree, a change in the link quality between two nodes has a ripple effect that

changes the costs of all of the nodes further down in the tree. This means that any noise in the link estimator will be

amplified into a cascading series of changes. By limiting the rate of next hop selection, we can limit the instability in the
tree. However, we could always do the same by limiting the rate of flood beaconing.


We've been glossing over the fact that link quality is not symmetrical. An instantaneous link estimator measures inbound

link quality, but the link quality that matters in routing is outbound link quality! We can only justify using an instantaneous

link estimator if we assume that the asymmetry is not too great. With a very asymmetrical link, we could end up sticking

with a high-quality next hop node that we can't ever get any data to. So, our more advanced link estimator must provide
inbound and outbound link estimates.


Our statistical link estimator can also measure link quality as an actual rate of packet success and not as a simple cost.

The simplest thing we can do with a packet success rate is thresholding. Consider all links with both an inbound and

outbound success probability above K to have a link cost of 1, and all others to have a link cost of infinity. This is

shortest-hop-count routing over good symmetrical links. Because a good symmetrical link is most likely to be a link that

covers a short distance, this will tend to increase the number of hops required to cover a given area, which increases
energy cost.


A more nuanced technique would be to use the packet success rates directly. We could consider using multiplication

instead of addition, because multiplying a series of outbound success probabilities will result in an outbound success
probability for the whole path. Then, we can select the neighbor with the highest success probability instead of the

lowest cost. This technique selects for lots of short, reliable links, which may also have a negative effect on energy cost.

And if we're using link-layer retransmissions, it doesn't account for the probability of losing ACKs, which will increase the
retransmission rate.


Instead, we can convert packet success rates to values that indicate the expected number of transmissions required to

forward a message. To transmit a message, we need a successful transmission and a successful ACK. So, we multiply

the inbound and outbound link qualities. Because the inverse of a success probability is the expected number of tries

before a success, we can take the inverse. A low success rate will result in a high expected number of transmissions,

which happens to fit the standard model of a cost function! We can add these ET values along the path, selecting paths

that minimize the number of transmissions required to forward a message to the sink. This directly minimizes energy cost.
This style of routing is called MET, for Minimum Expected Transmissions.
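
Here's a sketch of that conversion in C, using 8-bit fixed point (256 represents a success probability of 1.0); the
representation is an assumption, not part of any particular protocol:

#include <stdint.h>

/* Expected-transmissions cost from inbound and outbound success rates,
   both in 8-bit fixed point (256 == 100%). The result is also 8.8 fixed
   point, so 256 means one expected transmission. */
uint16_t et_cost(uint16_t p_in, uint16_t p_out) {
    uint32_t p_link = ((uint32_t)p_in * p_out) >> 8;  /* data AND ack must succeed */
    uint32_t cost;
    if (p_link == 0)
        return 0xFFFF;                     /* unusable link: infinite cost */
    cost = ((uint32_t)256 << 8) / p_link;  /* the inverse: 1 / p_link */
    return (cost > 0xFFFF) ? 0xFFFF : (uint16_t)cost;
}

For example, a link with 50% success in each direction yields a combined success probability of 0.25 and a cost of four
expected transmissions. Adding these costs along a path gives the expected transmission count to the sink.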


MET routing has been shown to be very good at constructing trees. However, because it depends on a statistical link

estimator, it can be quite slow to converge. Trees based on minimum signal strength or minimum bit errors can be
computed quickly, but are hard to reason about.


In any of these trees, temporary inconsistencies in the distance values could result in a cycle being formed. Random

beaconing, due to its long convergence time, is more likely to result in cycles because distance information could remain
inconsistent for longer. Flood beaconing helps to prevent long cycles before they occur. A simple explanation is

this: if nodes at hop n all transmit before nodes at hop n+1, the n+1 nodes will present higher distances than the n-hop
nodes, and thus will never be selected by a node with a lower hopcount.


Most tree-building algorithms at least prevent one-hop cycles by including the node's next hop in the route building

message. Each node can then refuse to select a neighbor that has already selected it as a next hop. In addition,

snooping on packets sent by other nodes can let a node determine the next hops of its neighbors, which can also be
used to prevent short cycles.


The negative effects of long cycles can be prevented by including a maximum hop count value in the message, and

decrementing that value each time the message is forwarded. If the value ever reaches 0, the message is dropped from

the network. However, this also limits the maximum hop count of the network. An alternate approach is to give each

message a sequence number, and increment it every time a new message is sent into the routing layer. Every other

node then records the latest sequence number heard from its neighbors. If a node is ever given a message to forward
that has an older sequence number, it can detect that it has become part of a cycle.
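
A C sketch of both checks together; the header layout is invented for illustration:

#include <stdint.h>

/* Hypothetical routing header carrying both cycle defenses. */
struct route_hdr {
    uint8_t ttl;       /* maximum remaining hops */
    uint16_t origin;   /* node that injected the message */
    uint16_t seq;      /* per-origin sequence number */
};

/* last_seq is the newest sequence number we've recorded for this origin.
   Returns 1 if the message should be forwarded, 0 to drop it. */
int should_forward(struct route_hdr *h, uint16_t last_seq) {
    if (h->ttl == 0)
        return 0;                        /* hop budget exhausted: drop */
    h->ttl--;
    if ((int16_t)(h->seq - last_seq) < 0)
        return 0;                        /* older than we've seen: we're in a cycle */
    return 1;
}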


What started out as a simple attempt to select a 2-byte integer has turned into a whole table of neighbors, each with

distances, link estimates, and sequence numbers. More robust tree-routing protocols simply require more state. In

addition, we see that attempting to build a stable routing tree on top of an unstable connectivity graph, in the presence of

imperfect coordination and imperfect estimation, is quite challenging. The algorithm must constantly adapt to the next-
hop neighbor becoming inaccessible, and attempt to damp the fluctuations caused by doing this.


Here's an alternate approach: perform the Bellman-Ford algorithm to calculate a distance for each node. But, don't

record the next-hop neighbor that caused the node to have a given distance. Instead, each node should place its own

distance into the header of every message it sends, and send the messages to the local broadcast address. Then, any

node that hears the message compares the distance to its own distance. If the receiving node has a lower distance, it will

forward the message. This is called gradient routing, because each node has a position in a gradient centered

at the sink. In tree routing, the sender decides which receiver will forward the message. In gradient routing, the receiver
itself decides.
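
Here's a C sketch of the receiver-side decision; the message layout and the rebroadcast() call are assumptions:

#include <stdint.h>

/* Hypothetical message header: each sender stamps its own gradient cost. */
struct grad_msg {
    uint16_t sender_cost;   /* sender's distance from the sink */
    /* ... payload ... */
};

void rebroadcast(struct grad_msg *m);   /* assumed local-broadcast call */

/* Called whenever a node overhears a data message. my_cost is this node's
   own position in the gradient; the receiver decides whether to forward. */
void on_overheard(struct grad_msg *m, uint16_t my_cost) {
    if (my_cost < m->sender_cost) {
        m->sender_cost = my_cost;   /* stamp our own, lower, cost */
        rebroadcast(m);             /* carry the message downhill */
    }
    /* otherwise: stay silent and let closer nodes handle it */
}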


Do you believe that this will eventually get the message to the sink? It will, but many more redundant nodes

will attempt to forward the message along the way. But, it's much more robust to changing connectivity for the same


reason that Trickle is: every node has a chance to participate. If the best forwarder is dead, some other node is bound to

have heard the message and will also try to forward it. More research remains to be done on this technique, but it seems

that suppression techniques like those used in Trickle can help to lessen the unnecessary redundant transmissions while
still maintaining the robustness of the forwarding process. What do you think?



12.         How can naming and database techniques improve WSN applications?

Now that we've studied local communication, and examined how multi-hop communication, collection, and dissemination
can be built on top of it, we can talk about building an entire WSN application.


The simplest sort of WSN application is commonly called "sense-and-send". In a sense-and-send application, each node

starts executing as soon as it is turned on. The node continually samples a fixed set of sensors with a fixed period.

Concurrently, each node attempts to join and then maintain membership within a single collection tree. When a set of

readings is ready, the node collects them into a message and sends it up the collection tree. A waiting gateway node
records each reading as it comes in over the tree. The nodes run until their battery runs out, or they are turned off.


Here's our sense-and-send application's message:


typedef struct SenseMsg {

    uint16_t nodeid;

    uint16_t light_reading;

    uint16_t temp_reading;

} SenseMsg;

The sense-and-send configuration looks like this:

configuration SenseAndSend {

}

implementation {

    components Main, SenseAndSendM, TimerC, LightSensorC, TempSensorC, DrainC;



    Main.StdControl -> SenseAndSendM;

    Main.StdControl -> TimerC;



    SenseAndSendM.Timer -> TimerC.Timer[unique("Timer")];



    SenseAndSendM.LightADC -> LightSensorC.ADC;

    SenseAndSendM.TempADC -> TempSensorC.ADC;



    SenseAndSendM.Send -> DrainC.Send[AM_SENSEMSG];

}

The sense-and-send application is a single module:

module SenseAndSendM { ... }

implementation {



    TOS_Msg msgBuf;
    bool msgBufBusy;
    SenseMsg* senseMsg;

    void sendMsg();  // forward declaration; defined below



    command result_t StdControl.start() {
        return call Timer.start(TIMER_REPEAT, 1000);
    }



    event result_t Timer.fired() {
        uint16_t allowed_length;
        if (msgBufBusy) {
            return SUCCESS;  // previous message still in flight; skip this sample
        }
        msgBufBusy = TRUE;
        senseMsg = (SenseMsg*) call Send.getBuffer(&msgBuf, &allowed_length);
        call LightADC.getData();
        return SUCCESS;
    }



    event result_t LightADC.dataReady(uint16_t data) {
        senseMsg->light_reading = data;
        call TempADC.getData();
        return SUCCESS;
    }



    event result_t TempADC.dataReady(uint16_t data) {
        senseMsg->temp_reading = data;
        sendMsg();
        return SUCCESS;
    }



    void sendMsg() {
        senseMsg->nodeid = TOS_LOCAL_ADDRESS;
        if (call Send.send(&msgBuf, sizeof(SenseMsg)) == FAIL) {
            msgBufBusy = FALSE;  // send was refused; free the buffer
        }
    }



    event result_t Send.sendDone(TOS_MsgPtr msg, result_t result) {
        msgBufBusy = FALSE;
        return SUCCESS;
    }

}

Now that we've seen all of the pieces, you can see that constructing a sense-and-send application is actually quite

simple. The most complicated part is the tree maintenance protocol, and even there, the level of complexity can be

determined by the designer. A sense-and-send application has a useful sort of purity. The number of things it can do is

quite limited, and it cannot be changed without recompiling the application. This limits the possibility of failure, and

suggests that once the application has started running, it will continue to run.


However, this simplicity also limits the user's ability to control the network, and limits the possible responses to failures. If

a sense-and-send node stops sensing or sending, it's just gone. In addition, it limits the user's ability to retask the
network by selecting different sensors or different sample rates.


Let's say that each node has several different sensors. These sensors could be real, like a light or temperature sensor.

Or, the sensors could also be virtual. Consider a "sensor" that reports the number of packets that have been sent over

the radio, or the address of the current next hop node. Even simpler, consider a "sensor" that always reports the node's

own address, or location. We can think of useful virtual sensors until the cows come home, and create tons of them. But,

there's never enough network bandwidth available to report the values of all of our sensors, real and virtual. We need to
improve our application by giving the user the ability to select which sensors are worth reading.


Consider the following statement:


    SELECT nodeid, light, temp FROM sensors SAMPLE INTERVAL 1s

This is a query, in a query language that provides one primitive: SELECT. SELECT takes a list of sensors and returns

the results. In fact, this query encapsulates the entire sense-and-send application above ... in one line. Instead of writing

a bunch of different sense-and-send applications, we can write one query processor. The query processor can connect

to all of the different sensors. It can receive a query over the dissemination layer, start the timer, sample the sensors

given in the query, and send the result over the collection layer. The PC becomes an active participant instead of a

passive listener, constructing the query and then processing the results.


But, for SELECT to work, we need to develop an abstraction for names. Each sensor needs its own name. This is a huge

leap. Instead of letting the nesC wiring define the names of the sensors and a single static query, corresponding to
whichever sensors are wired in, we can introduce a layer of abstraction.


         Each sensor has a name.

         A query contains a list of names.
         A dispatcher can take a name and return a value from the correct sensor.


The simplest sorts of names, for our small-bandwidth devices, are numbers. Let's say that each sensor is assigned a

unique integer. Then, we can implement a dispatcher using nothing more than a switch statement. And, we've all seen

how good the nesC compiler is at creating switch statements. Instead of giving the sense-and-send application the

responsibility of wiring to a few sensors, our query processor wires to a client-side parameterized interface. This means

that every sensor has to provide the same interface. Remember this interface?

interface Attribute {

    command result_t get();

    event result_t getDone(uint16_t data);

}


It's the other key piece to implementing a query system. It defines a simple type system for sensor values: they're all 2-

byte integers. And, it defines a common set of functions for accessing that data.


Now, we can write one query processing component. This query processing component wires to the dissemination layer,

and uses it to receive query messages then save them. A query message contains a list of integers and a sample rate.

When processing the query, the query processor iterates through the list of attributes, and calls Attribute.get() with the

correct parameter for each sensor. As the getDone() events arrive, the query processor copies the values into a result
message. When all sensors have been read, the result message is sent out.
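
Here's roughly what that looks like, sketched in C with the split-phase get()/getDone() collapsed into a synchronous
call for brevity. The attribute numbers and sensor hooks are invented for illustration:

#include <stdint.h>

/* Invented attribute numbers; the real assignment is up to the designer. */
enum { ATTR_NODEID = 0, ATTR_LIGHT = 1, ATTR_TEMP = 2 };

uint16_t read_nodeid(void);   /* assumed sensor hooks, provided elsewhere */
uint16_t read_light(void);
uint16_t read_temp(void);

/* The dispatcher: a numeric name in, a 2-byte value out. This is the
   switch statement that a parameterized interface effectively becomes. */
uint16_t attr_get(uint8_t name) {
    switch (name) {
        case ATTR_NODEID: return read_nodeid();
        case ATTR_LIGHT:  return read_light();
        case ATTR_TEMP:   return read_temp();
        default:          return 0xFFFF;   /* unknown name */
    }
}

/* One round of query processing: read each named attribute in turn.
   result[] must hold at least n_names entries. */
void run_query(const uint8_t names[], int n_names, uint16_t result[]) {
    int i;
    for (i = 0; i < n_names; i++)
        result[i] = attr_get(names[i]);
}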


Now, we can implement three attributes: one that returns the node ID (conveniently a 2-byte integer), one that reads the

light sensor and returns the result, and one that reads the temperature sensor and returns the result. We can wire these

attributes into the query processing component, compile the application, and run it. Then, we can write a client program

that takes a list of sensor numbers from the user, builds a query, sends it, and as the results come in, displays them.

Instead of writing a new sense-and-send application, the user can build a new query, and get results from all three

sensors, one, or none. When we write new attributes, we can wire them into the query processor, and they instantly
become available.


What are the tradeoffs here? First, the query injection. If a node doesn't hear the query, it won't respond. So, we want

reliable dissemination to ensure that every node will eventually hear the query. Second, the query processing. If a node

saves the query into RAM and it later reboots, it will lose the query and stop responding. Reliable dissemination can take

care of this also, because the node can re-acquire the query when it boots up again. Or, the node can save the query to

permanent storage and begin to execute it when it comes back up again. Third, the code overhead. A query processing

application could be larger than a sense-and-send application. Fourth, the bandwidth overhead. It costs energy to

disseminate the query, energy that wouldn't be spent if the query was just wired in. The response message will probably

also have to contain some kind of header that identifies the attributes that have been selected, or an identifier that the
client can use to match the request with the response. This costs some bandwidth too.


So, the query processing system is less efficient. But, it lets us design a flexible and retaskable sensor network, and
makes it much easier to deploy a new application. See why you might want to do it?


There are whole classes of application improvements which we could not have done without a query system. Let's
examine a few.


In the sense-and-send application, our sampling period was fixed. In our query, the period was also fixed, but it doesn't
have to be. Consider the following query:


  SELECT nodeid, light, temp FROM sensors LIFETIME 30d

This query, instead of specifying a sample rate, specifies a minimum lifetime for the query. We then want to sample as

quickly as we can, provided that the nodes will not run out of energy before the lifetime has elapsed. To compute this

sample rate, we need metadata about each attribute. How many joules of energy does it take to acquire a reading? The

nodeid sensor is probably a lot cheaper than the temperature sensor, because it only requires copying some bytes from


RAM to RAM. Then, because the nodes must send and forward the results, we need to estimate how many joules it

takes to send and receive a message. We can then add the sampling cost to the sending cost to compute how much

energy it takes to send a reading. We can then look at the tree structure to calculate how much energy each node will

need to spend on forwarding. Finally, we can divide the available energy per unit time by the energy consumed per reading, and

derive a maximum sampling rate for each node. The query might also include a minimum meaningful sampling rate, and

warn the user if a node can't satisfy it.
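
The arithmetic itself is simple. Here's a sketch, in which every input is an assumption the deployment would have to
supply or estimate:

/* Back-of-the-envelope LIFETIME math. Returns a maximum rate in samples/second. */
double max_sample_rate(double battery_joules,     /* energy remaining */
                       double lifetime_seconds,   /* requested LIFETIME */
                       double joules_per_sample,  /* summed over selected attributes */
                       double joules_per_send,    /* one result message */
                       double joules_per_forward, /* forwarding one descendant message */
                       int descendants) {         /* nodes routing through us */
    double per_reading = joules_per_sample + joules_per_send
                       + joules_per_forward * (double)descendants;
    double budget = battery_joules / lifetime_seconds;  /* joules/second we may spend */
    return budget / per_reading;
}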


Next, we can extend our query language to include predicates. Let's say we are interested in the readings every second,

but only when it's bright out. Here's a new query:


    SELECT nodeid, light, temp FROM sensors WHERE light > 0 SAMPLE

      INTERVAL 1s

Now, our query processor has to decide in which order to read the sensors. To test the predicate, it has to read light first.

Then, if the predicate passes, it can read the rest and send the result. This problem of selecting the ordering of sensors

to satisfy a predicate can become a complex optimization. Trying to compute a lifetime along with a predicate might

actually be impossible. But, there's always more work for database researchers to do, especially in the sort of "streaming

databases" or "acquisitional query processors" that we can build on top of sensor networks.


Every query language limits the sort of processing we can perform. For example, we can't express a predicate that

includes an FFT and some smoothing if all we have is comparisons to thresholds. This sort of processing is very useful

for event detection. In this case, we may want to create a new attribute that includes the code to perform these

calculations. When we read the attribute, the code can perform the computation and return the latest result. But, if we're

really interested in whether the computation detected an event or not, and not the numeric value, we're kind of at a loss.

If we call get(), and the computation hasn't detected anything, what should it return? We could use a failure code for

NO_RESPONSE. But, let's say that the computation produces some sort of confidence value. We could have the
attribute return that confidence, and then submit a simple comparison query that only returns a result when the
confidence is over a threshold.


If it's really not possible to specify the detection condition in terms of a query, we may want to take an alternate

approach: named events. Just like an attribute is a named data item that can be gotten, a named event can be signaled

when it happens. However, an event might have data associated with it, so it could also be considered under the notion
of attribute. Let's extend our Attribute interface:


interface Attribute {

    command result_t get();

    event result_t getDone(uint16_t data);

    event result_t changed();

}

By adding the changed() event, the attribute code is no longer at the mercy of the query processor. If it decides that

something interesting has happened, it can signal the changed() event, which the query processor can listen for. Then,

the query processor can retrieve the latest data, put it in a message, and send it up the tree.

Finally, if an attribute can be gotten, perhaps it should be settable too. We could then submit queries like this:


    UPDATE sensors SET nodeid = 12 WHERE nodeid = 10

This brings parameter modification under the framework of the query system. We can extend the Attribute interface like

this:

interface Attribute {

    command result_t get();

    event result_t getDone(uint16_t data);



    command result_t set(uint16_t data);

    event result_t setDone();



    event result_t changed();

}

Finally, we might reconsider our earlier decision to limit the type of an attribute to a 2-byte integer. An attribute could have

an arbitrary type with an arbitrary size.

interface Attribute {

    command uint32_t size();



    command result_t get();

    event result_t getDone(void* data);



    command result_t set(void* data);

    event result_t setDone();


    event result_t changed();

}

Actually handling these arbitrarily sized attribute values will demand more of our in-node buffering and our network layer.

Isn't this starting to look a bit like files? Or database BLOBs? Or HTTP GET/POST? It all hangs together.




13.       How can a network of sensors localize itself in time?

As we've discussed, sensor networks are designed to be tightly coupled to the physical world. Every node should have a

known location in space, which is easy to conceptualize. Just as important but harder to understand, every node should

have a known location in time. For this lecture, we'll talk about temporal localization, or timestamping. Just determining

the relative location of each node in space or in time is a challenge. Fortunately, nodes tend not to move in space.

Unfortunately, they often drift in time. This makes temporal localization even harder, because it must be performed

constantly. On the other hand, a node can't move itself in space, but it can move itself in time. When multiple nodes attempt to move to

the same place in time, we call it time synchronization. Temporal localization and time synchronization are both important
procedures that we'll discuss today.


Why is temporal localization important? Post-facto data correlation. When two different WSN nodes collect data, it's

important to be able to determine the time at which each piece of data was collected. When the times are right, it

becomes much easier to compute functions over the data. For example, a time-correlated series of event detections from
heat detecting nodes could be used to establish a position and velocity for the wavefront of a forest fire.


Why is temporal synchronization important? A priori correlation of data acquisition times. Sometimes, knowing when two
pieces of data were acquired is not enough -- we want them to be acquired at exactly the same time.


The challenges with temporal localization and synchronization in sensor networks arise from the distributed nature of

such systems. Each node has its own clock, its own temporal location. Within a single node, there is a clear ordering of

events, but no such ordering exists across multiple nodes. Different clocks may have no common reference. In addition,
different clocks may be moving in time at different rates -- drift.


Let's start by looking at a node clock. A node clock is based on a crystal or oscillator that vibrates with a known

frequency. A piece of counter hardware is attached to the oscillator, which contains a counter register and a holder

register. Each time the oscillator vibrates, the counter is decremented by one. When the counter reaches zero, a timer

interrupt is sent to the processor. The counter is then reloaded from the holder register, and the cycle begins again.

When the timer interrupt is received, the processor may increment a separate counter in RAM, which can be read by
other components when they need a local time value for timestamping.


The contents of the holder register can be changed by the processor to generate interrupts at different rates. Many

clocks actually contain multiple holder registers, making it possible to generate interrupts at a small number of different

rates. However, there are never enough holder registers to generate interrupts at all of the different rates needed by all

the system components, so we implement a timer multiplexing component. This is TimerC. TimerC acts as a virtual clock,

containing compare and holder registers for each of the different system components that use it. TimerC then sets the

hardware timer to generate interrupts at the slowest rate that is still fast enough to satisfy all of the interrupt rates desired

by the different software components. As the hardware timer fires, the software timer decrements each virtual compare
register appropriately, and signals the appropriate Timer.fired() event when one expires.
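
Here's a sketch of that multiplexing logic in C. The structure is invented for illustration and is not TimerC's actual
source:

#include <stdint.h>

#define MAX_TIMERS 8

struct vtimer {
    uint32_t remaining;  /* ticks until this virtual timer fires */
    uint32_t period;     /* reload value for a repeating timer */
    int active;
};

static struct vtimer timers[MAX_TIMERS];

void timer_fired(int id);   /* assumed upcall, like Timer.fired() */

/* Called from the single hardware timer interrupt; elapsed is the number
   of ticks since the last interrupt. Each virtual register counts down. */
void hw_timer_interrupt(uint32_t elapsed) {
    int i;
    for (i = 0; i < MAX_TIMERS; i++) {
        if (!timers[i].active)
            continue;
        if (timers[i].remaining <= elapsed) {
            timer_fired(i);
            timers[i].remaining = timers[i].period;  /* repeating reload */
        } else {
            timers[i].remaining -= elapsed;
        }
    }
}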



Here are a few definitions that are needed to talk about time:


Time - The time of a clock on node p is given by the function C_p(t). In a perfect, synchronized clock, C_p(t) = t, where t
represents "true" time.


Frequency - Frequency is the rate at which a clock progresses: the derivative of the clock value with respect to true
time, C_a'(t). Frequency may change over time.


Offset - Offset is the difference between the time reported by a clock and the "true" time. This is equivalent to temporal

location. The offset of a clock C_a is given by C_a(t) - t. Two clocks have a relative offset, which may also change over
time. The relative offset of two clocks at time t is given by C_a(t) - C_b(t).


Skew - The skew is the difference in frequencies between a clock and a perfect clock. The relative skew of two clocks at
time t is defined by (C_a'(t) - C_b'(t)).


Drift - The drift rate of a clock is the second derivative of the clock value with respect to time: how fast is the clock
frequency changing over time. The relative drift of two clocks is given as (C_a''(t) - C_b''(t)).


In a perfect clock, the change in the clock value relative to the change in real time (dC/dt) is exactly 1. When this value is

greater than 1, the clock runs fast, and when it is less than 1, the clock runs slow. Clock manufacturers often specify a
maximum skew rate, such that 1 - p <= (dC/dt) <= 1 + p.


A localization process is based on the following primitive: ranging. In the temporal case, this means that two nodes

should be able to determine the relative offset of their clocks. In addition, each node should be able to estimate the value

of the other node's clock. If each node can determine its offset from a node with the "reference" time, the network is
localized in time.


In response to temporal localization, we can perform a temporal synchronization. In synchronization, clocks are adjusted

to run in tandem, effectively "moving" every node in time to the same reference point. However, synchronization is not

strictly necessary if the goal of temporal localization is after-the-fact correlation of collected data. If multiple nodes need
to collect data at the same real time, however, synchronization is necessary.


Here's the simplest time synchronization method. Assume that a server exists, with the reference temporal location, and
that a client exists with no temporal location.


The server copies the current value of its clock into a message, and sends it to the client.


The client then sets its own clock to this value.


Unfortunately, sending a message takes some time, which means that the client's clock will always have a negative
offset from the server's clock.




Cristian's method can be used to perform time synchronization between a client and a server while accounting for
message delay:


At the client's time T_0, it sends a message to a server.


The server then responds with a message containing its own clock value S_t.


The client receives this message at local time T_1.


The difference T_1 - T_0 is the round-trip time for messages sent between the client and the server.


Half of this time is an estimate of the time required to send a single message from the server to the client. If the client
receives a message with time S_t, the time that has elapsed since that timestamp was made can be estimated as
(T_1 - T_0)/2.


Thus, the client can then set its clock to S_t + (T_1 - T_0)/2.
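
One round of Cristian's method, sketched in C; the two helper calls are assumed, not real TinyOS interfaces:

#include <stdint.h>

uint32_t local_time_ms(void);      /* assumed: read the local clock */
uint32_t query_server_time(void);  /* assumed: blocking request/response */

/* One round of Cristian's method: the value to set the client's clock to. */
uint32_t cristian_estimate(void) {
    uint32_t t0 = local_time_ms();
    uint32_t s  = query_server_time();  /* server's clock, read mid-round-trip */
    uint32_t t1 = local_time_ms();
    return s + (t1 - t0) / 2;           /* server time plus one-way delay estimate */
}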


Unfortunately, RTT may not be constant. Cristian's method should be performed a few times to compute an average RTT.

In addition, travel time from client to server and from server to client may not be equal. Thus, the estimate of single-
message transit delay may be wrong.


The NTP protocol is a more advanced version of Cristian's method, which computes a relative offset between two clocks
while taking delays into account.


At time T_c1, the client sends a message to a server with this timestamp.


When the server receives this message, it saves a timestamp from its own clock: T_s1.


The server then sends a message to the client at time T_s2. This message contains the timestamps T_c1, T_s1, and
T_s2.


The client then receives this message at T_c2.


Let value a = T_s1 - T_c1 and value b = T_s2 - T_c2.


If we assume that the difference in message delays for the client->server message and the server->client message is

small, then we can compute the temporal offset as (a + b)/2. We can compute the roundtrip delay as a - b.


Because delay is variable, the Network Time Protocol performs this protocol repeatedly, and stores a sequence of offset

and delay values (O_i, D_i). The offset corresponding to the minimum delay is then chosen, and the client's clock is
adjusted.
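
Here's a sketch of that bookkeeping in C, including the minimum-delay filter:

/* One NTP-style sample from a four-timestamp exchange. Timestamps are
   signed ticks; the names match T_c1..T_c2 in the text. */
typedef struct {
    long offset;
    long delay;
} ntp_sample_t;

ntp_sample_t ntp_sample(long t_c1, long t_s1, long t_s2, long t_c2) {
    long a = t_s1 - t_c1;          /* offset plus client->server delay */
    long b = t_s2 - t_c2;          /* offset minus server->client delay */
    ntp_sample_t s;
    s.offset = (a + b) / 2;
    s.delay  = a - b;
    return s;
}

/* Keep the offset that arrived with the smallest round-trip delay. */
long best_offset(const ntp_sample_t samples[], int n) {
    int i, best = 0;
    for (i = 1; i < n; i++)
        if (samples[i].delay < samples[best].delay)
            best = i;
    return samples[best].offset;
}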



However, Cristian's method and the NTP protocol require one message exchange per pair of nodes. In a large sensor

network, the sending and forwarding costs required to accomplish this are probably too high. Several temporal

localization and synchronization protocols have been developed specifically for the sensor network space. Let's look at a
few.


The Reference Broadcast Synchronization (RBS) protocol from UCLA uses the broadcast nature of RF communications

to estimate the temporal offsets between multiple nodes. When a radio message is sent, it will arrive at many different

nodes simultaneously, with nearly identical delay. Technically, RF does propagate at a fixed speed, so different

distances between the sender and different receivers will result in different propagation delays, but these differences are
so small as to be negligible. Here is the RBS protocol:


A transmitter broadcasts a reference packet to two (or more) receivers.


Each receiver records its local time at which the packet was received.


The receivers exchange the times at which they received the packet.


The temporal offset (temporal distance) between the two receivers is computed as the difference of the local times at
which the same message was received.


This protocol has an elegant sort of simplicity. One broadcast packet by a reference source can enable an entire radio

neighborhood of nodes to determine their relative temporal locations. The offsets can be exchanged only when
necessary, for synchronization or post-facto correlation.


However, the protocol depends heavily on each receiver recording a timestamp as soon as the message is received.

Thus, RBS requires real-time coordination from the underlying radio stack. If the node is busy, there may be
nondeterministic delay between the time that the message is received and that the timestamp is taken.


In practice, each RBS node records timestamps for several reference broadcasts. The offset between two nodes is then

calculated as the average of the time differences. By recording multiple timestamps, RBS can also estimate the
difference in skew between two clocks using a least-squares fit to a sequence of increasing phase offsets.
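
The fit itself is ordinary least squares. A sketch:

/* Least-squares fit of a peer's receive timestamps against our own:
   peer ~= skew * local + offset. Needs n >= 2 distinct samples. */
void rbs_fit(const double local[], const double peer[], int n,
             double *skew, double *offset) {
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    int i;
    for (i = 0; i < n; i++) {
        sx  += local[i];
        sy  += peer[i];
        sxx += local[i] * local[i];
        sxy += local[i] * peer[i];
    }
    *skew   = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    *offset = (sy - *skew * sx) / n;
}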


RBS has been shown to compute offsets with a precision of several microseconds. RBS only computes the relative

location of each clock, and is thus more useful for post-facto timestamping than time synchronization. In a single-hop

network of n nodes, a full exchange of offsets requires O(n^2) messages. For multi-hop networks, nodes in different

radio neighborhoods must be synchronized to a broadcast by a gateway node in both, leading to a further problem of
selecting appropriate gateway nodes. Actually doing multi-hop time synchronization with RBS doesn't scale well.


An alternative to exchanging all offsets is hop-by-hop updating of timestamps. If node 1 sends an event which occurred

at time t_1 to node 2, node 2 can use its knowledge of offsets to change the timestamp from node 1's clock to node 2's

clock. Therefore, by the time the message leaves the network, it will be synchronized to the root's time. If this is done for



every hop, the timestamps themselves can carry the synchronization information out of the network. However, this only
applies to post-facto synchronization.


The Flooding Time Synchronization protocol (FTSP) was developed at Vanderbilt University. The goal in developing

FTSP was to make a multihop time synchronization algorithm that scales. The FTSP primitive is pairwise synchronization,

like Cristian's method. Cristian's method suffers from variability in message delay, and because it uses 2 messages,

variability in turnaround time on the server. FTSP uses a single message, containing the server's timestamp, and

reduces the message delay down to a small constant value that can be easily subtracted from that timestamp to obtain
the proper client timestamp. FTSP minimizes these delays through clever interaction with the radio hardware.


When sending an FTSP beacon, the sending node begins to transmit the preamble bits. As soon as the last preamble bit

is transmitted, the sending node takes a timestamp, and stores it at the end of the message buffer that still has yet to be

transmitted. The receiving node, as soon as it receives the last bit of the preamble, takes a timestamp from its own clock.

When the message is passed up, it contains two timestamps: sender and receiver. They were both taken exactly 1 bit-

time apart, minimizing message delay, and no turnaround has been performed. The receiver can correct for this bit-time
by subtracting it from the receiver's timestamp. Effectively, then, the two timestamps have been taken at the same time,

and can be directly compared to determine the offset between the sender's clock and the receiver's clock. The receiver
can then correct its own clock, or simply maintain the offset.
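
The offset computation is then a one-liner. BIT_TIME is the platform's known one-bit transmission time; the value
below is made up:

#define BIT_TIME 8   /* one bit-time in clock ticks; this value is an assumption */

/* FTSP pairwise offset from a single beacon. sender_ts rode inside the
   message; receiver_ts was taken one bit-time later, so remove it first. */
long ftsp_offset(long sender_ts, long receiver_ts) {
    return sender_ts - (receiver_ts - BIT_TIME);
}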


The FTSP works over multiple hops by selecting one node as the synchronization root, defining a global time, and

computing all other nodes' offsets from that time. Leader election is used to select the node with the lowest unique
address as the root. Nodes one hop from the root synchronize with the root and then retransmit, and the root's time floods through
the network. Each node sends its beacons asynchronously, as in routing tree formation.


This method is robust to network structure and mobility, and requires a minimal amount of communication. It has been

shown to synchronize nodes to within 1.5us in a single hop. This method does suffer from increasing inaccuracy per hop,
and experimentally, error increases by 0.5us per hop.


With microsecond-level time synchronization, it becomes possible to compare the arrival times of a single acoustic signal

on two different nodes. By knowing that the signal arrived at node 1 at time T1, and node 2 at time T2, the distances

between the nodes and the source can be estimated, and the distances between the nodes can be triangulated. This can
be used to calculate a spatial coordinate for every node in a network, which we'll talk about next lecture.


Thus, really good time-sync can lead to really good space-sync. How about that?




14.       Project 1 - Design a Wireless Sensor Network

Draft due Wednesday, June 29th, by email

Final due Friday, July 6th, to be presented in class


Now that you've seen a few examples of wireless sensor networks in action, it's time for you to design your own. In your
writeup, answer the following questions:


        What is the phenomenon you want to study?
        What is the spatial and temporal range of the phenomenon?
        What sensing density would be optimal for monitoring it?


        Do the sensors have to form a mesh network?
        If so, what radio range would be necessary?
        If not, how do you get data out of the sensors?
        How does data get from the network to the people who need it?
        How could the data be visualized and analyzed?


        What decisions could be made by people, based on the sensor data?
        Could any of the decisions be made electronically?
        Would any automatic responses be useful?
        Who would want the information from your network?
        What would they gain from having it?
        Who shouldn't have the information?


Some things to consider monitoring:


Water, earth, climate, plants, animals, people, households, buildings, machines, disasters, utility networks,
factories, farms.


Keep the following motto in mind: "Making the Invisible Visible".


A few pages of text should be enough to get your point across.


I'll email a few comments on your drafts before you present on Friday.


Aim for a 5-10 minute presentation. We'll discuss each project afterwards.




15.       Project 2 - Data acquisition, local display, and component abstraction.
Due date: Monday, July 18th, before 1:30pm (class time).


By now, we've discussed how TinyOS provides abstractions for local input from buttons and sensors and
local output to the LEDs. We've seen "Hello World" programs designed to use all of these abstractions.
We've also seen how nesC interfaces make these abstractions possible.


You're going to be writing a small application that takes input from local sensors, the user button, and the
timer, passes it through common nesC interfaces, does some local processing, and displays the results on
the LEDs.


First, the abstraction. Start by writing this interface:


  interface Attribute {
      command result_t get();
      event result_t getDone(uint16_t data);
  }
This interface provides a single command, used for getting a piece of data from another component, and a
single event used for returning that data.


You'll then write 4 components that provide this interface:


      1. A component that maintains an internal 16-bit counter. This component will return the current value
          of the counter to a get() request by signaling getDone(), then increment the counter. A stream of
          get() requests should provide a monotonically-increasing stream of numbers.
      2. A component that returns a random 16-bit integer to each get() request. Use the Random interface
          of the RandomLFSR component for this one. You'll find it in tos/system.
      3. A component that returns the current battery voltage. Use the ADC interface of the InternalVoltageC
          component, located in tos/platforms/msp430.
      4. A component that returns the current temperature. Use the ADC interface of the InternalTempC
          component, located in tos/platforms/msp430.


Now, you have 4 separate "sensor" attributes, two real and two imaginary. We'll be using these four
"sensors" for the rest of the class. In the components providing these interfaces, your get() command should
not signal getDone() directly. It should post a task that signals getDone(). You'll need to figure out how to
transfer the sensor reading from async context to task context using a local variable.


Then, write a driver program called Display to display readings from these four sensors. Your driver program
should start a repeating timer with a period of 1 second, and then each second, take a reading from one



sensor component and display the lower 3 bits on the LEDs. Start with the counter -- it's the one that's the
most obvious when you get it right. It will count up in binary.


Then, extend your driver program to switch between the four sensors by using the MSP430Event.fired()
event from the UserButtonC component. Each time the user button is pressed, select the next sensor as
your source of readings. Switch between the sensors in the order provided above. You'll probably want to
use a parameterized interface for this.


By the time you're finished, you should be able to turn the mote on and see it count up in binary. Then you
should be able to press the user button and see it switch to random numbers. Then you should be able to
press it again, and see the low 3 bits of voltage, and again, and see the low 3 bits of temperature. These are
likely to be pretty random also. With a final press of the user button, your mote should return to counting in
binary at the point where it stopped before. Next assignment, we'll look at the real values by sending
messages out on the radio.


Email me a tar.gz file of your application, containing a Makefile, a top-level configuration called Display.nc,
and 4 configurations and modules representing the 4 sensors. It should compile out of the box, with "make
telosb", and it should run correctly on my test mote.


Good luck! If you have any questions, please email me.




16.        Project 3 - Local Neighborhood Data Sharing
Due Date: Thursday, July 28th, at midnight.


For this project, you will create an application that measures the connectivity of a wireless sensor network.


To do this, each node has to:


          maintain a table of its neighbors
          estimate the quality of links from its neighbors to it


SIMULATOR


You should get this program working in the simulator before ever running it on the nodes. The simulator can
take a network connectivity graph as input. Generate it with the following command:


java net.tinyos.sim.LossyBuilder -d 4 4 -s 20 -o lossy.nss
This command generates a 4x4 grid, with each node spaced "20" apart. At this spacing, each node will have
about 4 good neighbors, and a few more bad ones.


Run the simulator with the following command:


./build/pc/main.exe -rf=lossy.nss 16
This launches 16 nodes, and the lossy.nss file will set up the grid connectivity.


As part of the debugging process, you will want to write a function that prints out the contents of the neighbor
table using a for loop and dbg() statements. You will also want to print a dbg() line each time a message is
sent or received.


BASIC SENDING AND RECEIVING


You can build the link estimator from your Counter attribute.


Start by creating a component that broadcasts the following message every second:


enum {
     AM_SHAREMSG = 1,
};


typedef struct ShareMsg {
     uint16_t source;
     uint16_t data;
} ShareMsg;
Fill in the source field with TOS_LOCAL_ADDRESS, and the data field with the latest value of the counter
attribute.


Install the TOSBase application on a node and attach it to the PC in order to receive the messages from the
nodes. Make sure each node is counting correctly.


Next, create a ReceiveMsg handler for these messages. You may want to blink a LED when this message is
received, as a test.


NEIGHBOR TABLE MANAGEMENT


Now, each node can become aware of its neighbors, and you can fill in the neighbor table.


Create a small neighbor table of 4 nodes.


enum {
     SHARE_NEIGHBOR_COUNT = 4,
};
Each time a message is received, apply the Frequency algorithm as discussed in class. This will determine
whether to insert the neighbor, and which neighbor it should replace.


Now, implement a second SendMsg interface that sends this message:


enum {
     AM_REPORTMSG = 2,
};


typedef struct link_est_t {
     uint16_t addr;
     uint16_t success;
} link_est_t;


typedef struct ReportMsg {
     uint16_t source;
     link_est_t nbrHood[SHARE_NEIGHBOR_COUNT];
} ReportMsg;
After sending your ShareMsg, fill in the ReportMsg. Put the node's address in source, and each neighbor's
address in nbrHood.addr. You don't have anything for the success rate field yet. Send this message after the
node finishes sending the ShareMsg.


LINK QUALITY ESTIMATION


Now that your neighbor table is being managed, you can start to calculate a message success rate for each
node.


Every time a ShareMsg is received, check whether the node has been heard from.


If the node has not yet been heard from, store the counter value and set the received and sent counters to 1.


If the node has been heard from, compare the new counter value to the previously stored counter value for
that node. If the difference is greater than one, messages have been lost. Update the number of messages
received from that node, and then update the number of messages that were actually sent and should have
been received. Finally, store the counter value for next time.


You can compute the success rate with fixed-point division. Shift the number of received packets left by 8
bits, then divide by the number of sent packets. A 100% success rate will then be 256, and a 0% success
rate will be 0.


Place the success rate for each neighbor into the ReportMsg before sending.


Now, you can read each node's ReportMsg on the PC to determine its estimates of the link quality between it
and its neighbors.


DELIVERABLES


Hand in a tar.gz containing these files:


        Makefile
        Share.h
        Share.nc
        ShareM.nc
        Attribute.nc
        (the files defining your Counter component)


Email it to me by midnight on Thursday, July 28th. This project is quite a bit harder than the last one, so you
should start early.


If you have any questions, please email me.



