							    AN INTERVAL ROUTING ENABLED PUBLISH/SUBSCRIBE
COMMUNICATIONS PROTOCOL FOR THE AD-HOC SENSOR NETWORK
                    ENVIRONMENT




                                  by

                         Virendra J. Marathe




            A thesis submitted in partial fulfillment of the
            requirements for the Master of Science degree
                         in Computer Science
                       in the Graduate College
                      of The University of Iowa




                             August 2003




          Thesis Supervisor: Associate Professor Ted Herman
                             Graduate College
                           The University of Iowa
                              Iowa City, Iowa


                      CERTIFICATE OF APPROVAL
                      ___________________________

                            MASTER’S THESIS
                            _________________




This is to certify that the Master’s thesis of

                          Virendra Jayant Marathe

has been approved by the Examining Committee for the thesis requirement
for the Master of Science degree in Computer Science at the August 2003
graduation.



Thesis Committee:
                        Ted Herman, Thesis Supervisor



                        Douglas Jones, Member



                        Sukumar Ghosh, Member



                        Sriram Pemmaraju, Member
To my beloved Gurudev,
 Aaie, Baba and Deepti




                              ACKNOWLEDGEMENTS


   I would like to express my sincere gratitude to my thesis supervisor Prof. Ted

Herman for his invaluable support and guidance all through this research endeavor. This

thesis would not have come to life without his initiative, patience and wisdom.

   I am grateful to my academic advisor Prof. Douglas Jones for his enthusiastic moral

support throughout. I would also like to thank Prof. Sukumar Ghosh and Prof. Sriram

Pemmaraju for serving on my thesis committee.

   Finally, I would like to thank my family and friends for their support in all respects.




This work has been supported by the Defense Advanced Research Projects Agency

Contract F33615-01-C-1901.




                                     ABSTRACT


Sensor networks have recently attracted considerable attention in the research community
because of their wide applicability. The limited resources available to sensor nodes
(memory, energy, etc.) pose significant challenges in building routing infrastructures for
sensor networks. The ad hoc nature of sensor networks also presents challenges in fault
tolerance. This thesis discusses a Publish/Subscribe routing scheme for the ad hoc sensor
network environment and explores the limitations, and hence the challenges, that the ad hoc
nature of sensor networks imposes on the development of routing infrastructures. We first
discuss a pre-existing distributed Brokering Infrastructure for Publish/Subscribe. That
infrastructure supports data aggregation, loop-free routing and fault tolerance for
Publish/Subscribe in the ad hoc sensor network environment. We point out a potential
scalability drawback of its algorithm: the size of the routing tables. We then propose an
alternative infrastructure for the Brokering System based on Interval Routing. Interval
routing is a memory-efficient routing methodology for distributed networks that leads to
routing tables of size O(d), where d is the degree of the network (number of immediate
neighbors at every node). Interval routing is static in nature, and hence needs some add-on
services to be useful in the ad hoc environment of sensor networks. The services of
neighbor detection, leader election, and support for fault detection are used in this
context. The proposed alternative is inspired primarily by the limited resources, and more
specifically the limited memory, of sensor nodes. A detailed discussion of the design is
followed by a description of its implementation. The implementation was carried out in a
simulation environment called the TinyOS Simulator, which mimics the behavior of ad hoc
sensor networks well enough for testing the behavior of routing protocols. The
specification of the implementation is followed by a fairly detailed review of the
experimental results, which serves as the evaluation of the protocol's behavior. The
results of the simulation experiments reveal the effectiveness of our new routing
infrastructure in terms of memory efficiency and fault tolerance, as well as acceptable
routing efficiency.




                                         TABLE OF CONTENTS


LIST OF FIGURES

LIST OF TABLES

CHAPTER

1. INTRODUCTION

2. BACKGROUND STUDY
2.1    A SURVEY OF SENSOR NETWORKS RESEARCH
       2.1.1    Hardware Architecture
       2.1.2    Systems Software Infrastructure
2.2    THE PUBLISH/SUBSCRIBE COMMUNICATIONS PARADIGM
       2.2.1    The Publish/Subscribe Idea
       2.2.2    Comparison with Traditional Communication alternatives
       2.2.3    Types of addressing in Publish/Subscribe

3. MOTIVATION
3.1    A DISTRIBUTED BROKERING SYSTEM INFRASTRUCTURE FOR PUBLISH/SUBSCRIBE
3.2    SCALABILITY ISSUE WITH THE BROKERING SYSTEM INFRASTRUCTURE

4. THE NEW DESIGN
4.1    GOALS FOR THE NEW DESIGN
4.2    THE DESIGN OVERVIEW
4.3    INTERVAL ROUTING
       4.3.1    Construction of the Interval Routing Structure
       4.3.2    Drawback in Interval Routing
       4.3.3    Frond Links
4.4    DISTRIBUTED INTERVAL ROUTING STRUCTURE CONSTRUCTION
4.5    ROUTING MESSAGES
4.6    FAULT TOLERANT ROUTING
4.7    UNDERLYING SERVICES
       4.7.1    Neighbor Detection
       4.7.2    Initiator (Leader) election and Interval Routing Structure construction
       4.7.3    Detection and propagation of changes in the network
4.8    THE NEW BROKERING SYSTEM
4.9    THE PUBLISH/SUBSCRIBE LAYER
       4.9.1    Scarce Subscribers
       4.9.2    Abundant Subscribers
       4.9.3    An Intermediate Scenario

5. IMPLEMENTATION
5.1    THE EXECUTION ENVIRONMENT - TINYOS DESIGN
5.2    THE PROGRAMMING MODEL
5.3    THE IMPLEMENTATION ARCHITECTURE
       5.3.1    The Application Layer
       5.3.2    The Publish/Subscribe and Brokering System Layer
       5.3.3    The Interval Routing Structure Layer
       5.3.4    The Routing Layer
5.4    THE API SPECIFICATION
       5.4.1    The App component
       5.4.2    The PubSub component
       5.4.3    The IRS component
       5.4.4    The Route component
       5.4.5    The Main component
       5.4.6    The PubSubI interface
       5.4.7    The IRSI interface
       5.4.8    The RouteI interface

6. RESULTS AND DISCUSSIONS
6.1    MEMORY USAGE
6.2    DISTRIBUTION OF MESSAGES
6.3    DISTRIBUTION OF MESSAGES OVER TIME
6.4    BROKER DISTRIBUTION OVER TIME
6.5    EFFECTIVENESS OF ROUTING AND UNIFORM DISTRIBUTION OF PUBLICATIONS
6.6    DISTRIBUTION OF BROKERS OVER INTERVAL ROUTING STRUCTURES
6.7    FAULT TOLERANCE
6.8    EFFECTIVENESS UNDER ARBITRARY FAULTS
6.9    ROUTING EFFICIENCY (CHAINED VS RANDOM STRUCTURES)
6.10   ROUTING EFFICIENCY (GRID VS HYPERCUBE TOPOLOGIES)

7. CONCLUSION
7.1    FUTURE RESEARCH OPPORTUNITIES

BIBLIOGRAPHY




                                               LIST OF FIGURES


Figure

2.1    A picture of a Spec prototype sitting on top of the previous generation of
       UC Berkeley Motes, Mica node [Source: The Smart Dust project in UC
       Berkeley]
2.2    Another Spec picture, showing the size of the Spec and implying the success
       of the Smart Dust project [Source: The Smart Dust project in UC Berkeley]
3.1    A content-based Publish/Subscribe system with a distributed Broker routing
       infrastructure. The infrastructure contains two Broker Advertisement Trees
       rooted at Brokers 1 and 3 respectively
4.1    The layered architecture design of the new Interval Routing enabled
       Brokering System based Publish/Subscribe communications protocol
4.2    An example Depth-first Interval Routing tree over a distributed network of
       10 nodes
4.3    Intervals of Node 2 from Figure 4.2
4.4    The Init algorithm for initializing the process of constructing the
       Interval Routing Structure
4.5    The RcvAdvertMsg algorithm to handle the event of reception of the
       AdvertMsg message
4.6    The RcvAdvertAlreadyExistsMsg algorithm to handle the event of reception
       of the AdvertAlreadyExistsMsg message
4.7    The RcvSubtreeBelowDoneMsg algorithm to handle the event of reception of
       the SubtreeBelowDoneMsg message
5.1    The architecture of the implementation of the Interval Routing enabled
       Publish/Subscribe communications protocol. The figure contains
       configurations, modules and the messages exchanged between the modules
5.2    A block diagram of the App component
5.3    A block diagram for the PubSub component
5.4    The block diagram for the IRS component
5.5    A block diagram for the Route component
5.6    Main configuration component of the Interval Routing enabled
       Publish/Subscribe protocol application
5.7    The Publish/Subscribe Interface linking the App module component to the
       Publish/Subscribe Brokering System module component
5.8    The Interval Routing Structure Interface linking the Publish/Subscribe
       Brokering System module component to the Interval Routing Structure
       module component
5.9    The Route Interface linking the Interval Routing Structure module
       component to the Route module component
6.1    The data structures representing the Interval Routing Structure Table at
       every node in the network for one Interval Routing Structure, and the
       Most Favored Broker Set for all the Interval Routing Structures
       collectively
6.2    A graph representing the distribution of Publications, Broker
       Advertisements and Interval Routing Structure construction messages for
       networks of random topologies
6.3    Distribution of Publications, Broker Advertisements and Interval Routing
       Structure construction messages over time for a network of a random
       topology
6.4    A graph representing the number of brokers coexisting in the system over
       time in networks of random topologies
6.5    A graph showing the increase in effectiveness of reception of
       publications with increased subscribers. It also shows the uniform
       distribution of messages over subscribers. The network has a grid topology
6.6    A grid topology network of 100 nodes. The Fault Zone above the group of
       nodes that fail simultaneously
6.7    The graph showing the behavior of the network at the occurrence of a mass
       failure of nodes
6.8    A graph representing the effectiveness of our Publish/Subscribe system
       under arbitrary faults
6.9    A graph showing the routing efficiency for the chain-like interval
       routing structure scenario and a random interval routing structure
       scenario
6.10   A graph comparing the routing efficiency over Interval Routing Structures
       of a grid topology with a hypercube topology. In both cases, the node
       numbering is random




                        LIST OF TABLES


Table

6.1   Distribution of Brokers over Interval Routing Structures
      (Network Size = 150)




                                     CHAPTER 1

                                  INTRODUCTION


Technological advances in the past few decades have shown Moore’s Law to be remarkably
accurate. These advances have produced very small and cheap processing units with
reasonably high computational power and storage capacity. The direct implication is that
very tiny processing units capable of fairly sophisticated computations are now possible.
The overall result has been the evolution of a diverse breed of embedded systems. Parallel
advances in wireless communications and micro-electro-mechanical systems (MEMS)
technology have resulted in the development of significantly complex wireless networked
embedded systems as well.

   The evolution of sensing devices has not been as explosive as that of semiconductor
technology, and much remains to be done in that direction. Even so, thermal, light,
magnetic, acoustic and other sensing devices are becoming commonplace in our everyday
lives. They can be used fairly effectively to gauge the activities in their deployment
environment (though far more advanced sensing technology is still needed). For example, in
a real-time battlefield scenario, magnetometer sensors can be used to detect the presence
of metal in the surroundings, thus aiding in monitoring potential enemy activity in the
region of interest. Such sensing devices, coupled with sophisticated wireless networked
embedded systems, have given birth to a new class of distributed wireless sensor networks
consisting of sensor nodes with significantly high computational power. The overall effect
is the possibility of building highly sophisticated applications over distributed sensor
networks that perform sensing, actuating and computational tasks on a massive scale. The
applicability and research potential of sensor networks are easy to recognize and have
attracted a tremendous amount of interest in the research community lately.

   The real-time battlefield example given above is incomplete in the sense that it does
not address the problem of how to communicate the sensed data to the base-stations that
are interested in monitoring the region in which the sensor nodes are deployed. Here, the
sensor network plays the role of communicating the sensed data to these base-stations.
This implies that each sensing device is either attached to or embedded in a sensor node,
and that a number of sensor nodes (potentially even hundreds of thousands) are deployed in
the region of interest. The sensed data is relayed to the base-stations using some
underlying routing infrastructure. The subject of this thesis is such a routing
infrastructure. Our routing infrastructure is engineered to achieve memory-efficient
routing as well as fault tolerance in the ad hoc sensor network environment.

   The deployment environments of sensor networks are extremely diverse, ranging from the
safe confines of office corridors to the extremely hostile and inaccessible regions of
enemy territory in real-time battlefields. Because of this variance, sensor networks can
range from being static, at one end of the spectrum, to being highly dynamic and ad hoc at
the other. Most of the research effort, though, has been directed towards the study of ad
hoc sensor networks. The dynamic nature of these sensor networks, which also includes
unexpected occurrences of faults, mandates the use of adaptive infrastructure software
implementations.




   Additionally, the sensor nodes are tiny. All the sensing, computational and relay
components are embedded in the small space available on a sensor node. The ultimate vision
of deploying dust-particle-size sensor nodes, called Smart Dust, has already been
contemplated [1]. This small size restricts the resources available to a sensor node. In
the most general case, sensor nodes are powered by batteries or solar cells, which
significantly restricts the energy that sensing and actuating activities may drain. Memory
is another such limited resource in sensor nodes. These limitations play extremely
important roles in the overall architecture and functioning of sensor network
environments. All these factors have spawned a large number of innovative activities in
the research community in the recent past.

   On close observation it is evident that the science and philosophy applicable to the
arena of sensor networks has three core aspects: sensing of phenomena in the environment,
resource management, and communications. The challenges posed by the uniqueness of the
sensor network environment, which will be discussed in detail in the next chapter, have
opened several directions for research. Sensing technology needs to advance well beyond
what we have today, especially with regard to the sensor network environment. A great deal
of research is currently under way on resource management. In parallel, considerable
research is going on in the field of communications as well. This thesis is a study of a
communications infrastructure for distributed sensor networks. Our communications
infrastructure performs the primary task of communicating messages across the distributed
system and also addresses the issue of efficient resource management. Memory is the main
resource that our infrastructure concentrates on. Additionally, the communications
infrastructure adapts to the dynamic and ad hoc environment that is so predominant in the
sensor networks regime.

   This thesis proposes and evaluates the design of a routing infrastructure for the ad
hoc environment of sensor networks. Chapter 2 is a literature review together with a brief
discussion of Publish/Subscribe. It starts with an overall review of the research
activities in sensor networks that have been predominant in the recent past. The
discussion then narrows its scope to communications. A short discussion of the
Publish/Subscribe communications paradigm in the context of sensor networks follows. Our
research work has Publish/Subscribe at its heart.

   Chapter 3 is dedicated to the discussion of a distributed Brokering System routing
infrastructure for Publish/Subscribe that was proposed in [30]. We start by presenting
this design idea. A brief analysis of the design follows, and its potential drawbacks are
pointed out. One of the drawbacks is scalability from the perspective of memory
utilization. This shortcoming has been the main inspiration for the work of this thesis.

   Chapter 4 is a detailed elaboration of the design of the Interval Routing enabled
Publish/Subscribe communications protocol that is the subject of this thesis. It starts
with a statement of the new design goals. This is followed by a review of a
memory-efficient routing infrastructure called Interval Routing. The text then discusses a
distributed implementation of Interval Routing. Distributed Interval Routing alone cannot
meet all the goals we set or the challenges posed by the sensor network environment; some
add-on services are also required. The services we have used are then described. This
communications infrastructure has a layered architecture.

   Chapter 5 gives the implementation specifics. The implementation has been done as a
simulation of the sensor network environment. Chapter 6 presents the results and related
discussions. Chapter 7 summarizes the work and points out some directions for future
research.




                                     CHAPTER 2

                             BACKGROUND STUDY


Advances in technology in the recent past have led to the genesis of modern-day sensor
networks. The hunger for more extensive functionality at the cost of fewer and fewer
resources has been the driving force behind the tremendous amount of research that has
been going on. Sensor networks are still in their infancy, and a massive amount of
research is currently under way on all facets of sensor networks. This chapter briefly
reviews the different current-day sensor network issues and the corresponding research
activities. After the review we delve deeper into the background of the Publish/Subscribe
communications paradigm. This work is the design, implementation and study of a routing
infrastructure for the Publish/Subscribe communications paradigm in sensor networks.


2.1     A Survey of Sensor Networks Research

From its very genesis, the sensor network idea has opened up a host of diverse research
opportunities. The research activities can be grouped into two main arenas: hardware
architecture and systems software infrastructure.


2.1.1    Hardware Architecture

There are several issues and high-potential research areas for hardware design and study
in sensor networks. Significant research [2] has been going on lately to achieve the
ultimate envisioned goal of creating dust-particle-size sensor nodes called Smart Dust
[31]. The primary goal of this work has been small size for the sensor nodes. Figure 2.1
shows a single-chip prototype of Smart Dust, Spec. Figure 2.2 illustrates the success that
the Smart Dust project is achieving.




               Figure 2.1 – A picture of a Spec prototype sitting on top of
               the previous generation of UC Berkeley Motes, Mica node
               [Source: The Smart Dust project in UC Berkeley].




               Figure 2.2 – Another Spec picture, showing the size of the
               Spec and implying the success of the Smart Dust project
               [Source: The Smart Dust project in UC Berkeley].




   The high-level hardware specifications for the Spec are that it measures approximately
2 mm x 2.5 mm and has an AVR-like RISC core, 3K of memory, an 8-bit on-chip ADC, an FSK
radio transmitter, a paged memory system, communication protocol accelerators, register
windows, a 32 kHz oscillator, an SPI programming interface, an RS232 compatible UART, a
4-bit input port, a 4-bit output port, encrypted communication hardware support,
memory-mapped active messages, an FLL-based frequency synthesizer, over-sampled
communication synchronization, etc. It has definitely been a staggering achievement.

   Additionally, there are hardware components of sensor nodes that have a considerable
amount of influence on the behavior of today’s sensor networks as a whole [14]. These are
storage, power supply, sensing devices, and radio.


•    Storage – The amount of storage capacity required is highly dependent on the
     nature of the application. Conversely, the storage capacity of individual sensor
     nodes significantly impacts the network architecture and communications protocols.
     As technology progresses, storage will become cheaper and more feasible for the
     tiny sensor nodes. A corresponding trend of growing applications on individual
     sensors will nevertheless mandate the best possible usage of memory. For example,
     the Spec sensor node has a mere 3K of memory, which would definitely be
     insufficient for running substantially sophisticated applications on it. It is
     relevant to mention here that optimum memory utilization for the middleware
     routing protocol has been the prime motivation of this thesis.


•    Power supply – Energy conservation has been by far the most exhaustively
     researched aspect of sensor networks [1, 5, 11]. There are primarily two ways in
     which energy can be supplied: high-density or rechargeable batteries, and
     harvesting energy from the environment (solar cells, conversion of vibrations to
     electrical energy [24], etc.). The trend, though, has been towards using batteries.
     The power supply also significantly influences the nature of communications in
     sensor networks. It has been generally accepted that energy conservation is going
     to be the main bottleneck for the usability of sensor networks.


•       Sensors – The sole task of sensor nodes at the core level is sensing. The technology

        of sensors is not as developed as semiconductor technology today; it still has a

        long way to go. A listing of issues with sensors has been provided in [14].


•       Radios – The most widely accepted communications medium in the sensor

        networks arena has been radios. This is due to the inherent requirement of a

        wireless communications medium in sensor networks, and also the broadcast-like

        nature of radio communication. The radio acts as the main energy drain in the low-

        powered sensor nodes [32, 33]. The radio architecture does significantly impact the

        network architecture and the structure of the MAC protocols as a whole. The most

        notable energy loss occurs when listening to the radio channel. A general trend has

        been towards putting the radio into sleep mode intermittently for a reasonable amount

        of time [35].

2.1.2    Systems Software Infrastructure

On the systems software front there has been an incredibly diverse explosion of research

in sensor networks recently. Different important aspects of and issues in the sensor

network scenario have inspired all these research activities. Most of these research

activities are communications-centric. Roughly the research efforts can be broken up into

the following categories – system software support, energy efficient routing, adaptation




to highly dynamic environments, scalability, non-GPS based localization, load-balancing

and clock synchronization.

•    System software support – For system software support, the Tiny OS group has

     made the first effort at developing a lightweight operating system, the Tiny OS [19,

     9]. This effort was made to provide an overall system architecture that supports

     efficient modularity and concurrency-intensive operations. A programming

     environment [16] for the Tiny OS environment has also been provided for

     structured middleware and applications development. It would be apt to mention

     here that the implementation of our Interval Routing enabled Publish/Subscribe

     communications protocol has been done in a Tiny OS simulation environment.

     Efforts have also been made to implement energy efficient runtime

     reprogrammability [27].

•    Energy efficient routing – Energy efficiency has been emphasized repeatedly

     since the very beginning of this thesis document. The scarce availability

     of energy mandates its utilization in a very thrifty manner [34, 37, 43, 44]. The

     energy efficiency constraint has had the largest influence on the routing

     infrastructures proposed for sensor networks so far. It is the energy conservation

     constraint that forces the routing algorithms in sensor networks to be multi-hop in

     nature. Most of the research effort so far in sensor networks has been, directly or

     indirectly, concerned with energy conservation.

•    Adaptation to highly dynamic environments – The practical utility of sensor

     networks has been envisioned to be potentially in hostile environments (e.g. real-

     time battlefields). These surroundings increase the overall dynamism in the




    environment of the sensor networks. The radio communications medium is

    relatively prone to environmental disturbances. Such temporal disturbances

    introduce transient faults on the links between sensor nodes. Even though the sensor

    nodes are not mobile, the links between them keep going down and coming up.

    Additionally, sensor nodes die out due to depleting energy; new nodes may also be

    deployed in the network. In real scenarios sensor nodes would be generally

    deployed in an almost random fashion, hence a uniform distribution of nodes in the

    area of study is not guaranteed. Consequently, the topology of sensor networks is

    very arbitrary to start with, and keeps on altering in due course. Sensor networks are

    thus ad hoc in nature. The dynamic nature of sensor networks raises the issue of

    faults, both transient and permanent. The communications infrastructure for such

    environments must be highly fault resilient. One can rarely find a communications

    protocol developed with sensor networks in mind that is not adaptive to dynamics

    of the environment. In general, every communications protocol for sensor networks

    includes support for fault-tolerance as a secondary goal, although there have been

    efforts [22] that have been focusing on fault-tolerance as the main thesis of their

    research work. Our Interval Routing enabled Publish/Subscribe routing

    infrastructure also supports adaptation to dynamic environments as a secondary

    goal.

•   Scalability – Future sensor networks consisting of hundreds of thousands of sensor

    nodes have been envisioned by researchers. The characteristic of such sensor

    networks that comes to mind evidently is their scalability. The ad hoc nature of

    sensor networks and the limited resources in individual sensor nodes raise the




    question of how far a proposed communications infrastructure can scale.

    Achieving high scalability in such an environment is very difficult.

    There are several factors like limited power, storage, etc. to be considered

    while developing a scalable communications infrastructure. A general approach

    towards building scalable routing infrastructures has been by using localized

    algorithms [13] or cluster-based infrastructures [45, 46].

•   Non-GPS based localization – Localization, determining where a sensor node is

    physically located in the network, is extremely useful in large sensor network

    systems, and very crucial for many applications too. For example, localization

    opens up new ways of reducing power consumed in the multi-hop wireless sensor

    network environments. Additionally, localization can also help in determining the

    activity regions in the sensor network deployment area. Traditionally, the

    localization problem has been solved using GPS. But the use of GPS in low-

    powered sensor network environments is discouraged by the high energy

    requirements of GPS devices. Non-GPS based low-cost localization systems [7] have been

    studied. Some such systems also make use of consistently powered reference points

    [6].

•   Load balancing – Load balancing is an extremely crucial feature for any

    communications infrastructure on a distributed system. In sensor networks, other

    than dealing with communications bottlenecks, load balancing also influences the

    battery lives of individual sensor nodes. If a small group of sensor nodes is used

    excessively for communication, the batteries of those individual nodes will surely

    drain out prematurely causing serious fissures in the physical sensor network




     topology that could potentially result in partitioning of the network or even the

     failure of the routing infrastructure as a whole. A good communications

     infrastructure distributes the load of communications evenly across the network.

     Very little work [18] has been done in researching the load balancing aspect of

     sensor network systems. Our thesis is based on a load-balanced model proposed by

     [30]. This model is discussed further in a later section.

•    Clock synchronization – Clock synchronization is one of the most recent activity

     areas in sensor networks. Clock synchronization is a very crucial aspect of any

     distributed system. There are several uses of the synchronized time across

     distributed sensor network systems. For example, to integrate a time-series of

     proximity detections to compute the velocity of a moving object, to suppress

     redundant messages by determining that they describe duplicate detections of the

     same event, etc. Very little notable work [12] has been done on clock

     synchronization in sensor networks. The unique environmental aspects and

     constraints of sensor networks make the currently existing clock synchronization

     algorithms not satisfactorily applicable.

    The aspects of sensor networks briefly described above span the sensor network

research efforts currently under way. Further aspects may well be

discovered and gain research importance in the future. From the next

section onwards we become more specific to the subject of our thesis work.




2.2       The Publish/Subscribe Communications paradigm

As has already been mentioned, this thesis is a report of the study on a newly developed

hybrid communications protocol for sensor networks. But at its heart, this protocol is a

Publish/Subscribe protocol. So what is Publish/Subscribe?


2.2.1      The Publish/Subscribe Idea

The Publish/Subscribe paradigm is a communications infrastructure connecting

independent nodes in a distributed system. It is a conceptually easy to understand and

implement paradigm. Publish/Subscribe middleware has been studied for almost a decade

now and even commercially accepted solutions exist [10, 39]. Publish/Subscribe systems

contain a group of nodes, called publishers, which provide information in the form of

publications; and another group of nodes, called subscribers, which disseminate

subscriptions (certain criteria to be applied on the publications). The publisher nodes

publish messages (or events), called publications, in the system. These publications are

routed to the appropriately matching subscriber nodes. The underlying infrastructure is

responsible for the efficient and timely delivery of the publications to the subscribers

whose subscriptions match the publications made.

      On close observation one can notice that Publish/Subscribe is an event driven system.

Information is relayed and processed in the form of events. Publish/Subscribe systems are

appropriately termed “Event Notification Systems” or “Event Notification Services”.

There are two main approaches towards implementing Publish/Subscribe systems:

      •    Using Brokering Systems – In this kind of infrastructure the system contains

           special intermediate nodes between the publishers and subscribers called brokers.




          The brokers are used for routing publications to subscribers. A distributed system

          of brokers for routing publications [20, 30] is a preferred approach.

   •      Using Non-Brokering Systems – These are flatter, peer-to-peer type infra-

          structures. Directed Diffusion [21, 13] is a very good example of a peer-to-peer

          type of Publish/Subscribe system. Directed Diffusion does not explicitly mention

          Publish/Subscribe, but it is evident that the sinks are the subscribers.

          The interests disseminated in the network are synonymous with the subscriptions

          made. The reinforcement mechanism yields a direct path from the source

          (publisher in the Publish/Subscribe context) to the sink for the events detected at

          the source. This is more of a peer-to-peer model for implementing

          Publish/Subscribe.


2.2.2     Comparison with Traditional Communication alternatives

In today’s age of ever-growing distributed systems, where inter-process communication is

becoming increasingly important, Publish/Subscribe surpasses the traditional

communications approaches in terms of flexibility and scalability.


2.2.2.1    Traditional Communication approaches

Traditionally, the communication between nodes in a distributed system can be roughly

categorized into the following models:

   •      Client-Server model – This model represents an inherently synchronous

          communications model where a client makes a request to the server, some internal

          computation is invoked in the server and a reply is dispatched to the client. A

          small degree of asynchrony can be added to this model by introducing the




          callback mechanism from the server to the client. Even then, the drawback of

          non-anonymity cannot be removed from the client-server model.

   •      Shared Resource model – Classical approaches like distributed shared memory

          and message-queues can be categorized in this model. The main drawback here is

          that extremely lightweight units, like the sensor nodes, find the synchronization

          needed for accessing shared resources to be very inconvenient.

   •      Broadcast model – In the broadcast model, the nodes interact with each other

          using broadcast messages. This model demands prohibitively high energy

          resources, and hence is ruled out for the sensor network environment.

   •      Multicast model – The multicast communications model is a very scalable type of

          communications technique. It can be extremely useful for the sensor network

          environment for conservation of energy. It will be clear later in this text that our

          proposed Publish/Subscribe model does exploit the multicasting opportunity

          offered by the sensor network environment.


2.2.2.2    The Publish/Subscribe approach

Although the subscribers may seem synonymous to clients and publishers to servers in

the traditional client-server model, the Publish/Subscribe communications model differs

from the client-server model significantly. In Publish/Subscribe systems the

communication can be anonymous in nature. In such scenarios, the communicating nodes

do not have the slightest idea about the destination of a sent message and the source of

the received message. The paradigm does not force the request-response communications

mechanism that is so predominant in the client-server model. The subscribers do not have

to make repeated relays of their subscriptions; once a subscription is made, it will persist



in the system. It is the responsibility of the system to direct the publications made to the

appropriate subscribers. This results in the subscriber receiving multiple publications for

a single subscription made. If the subscriber does not want to continue receiving

publications matching a specific subscription it has already made, it simply needs to relay

an unsubscribe message in the distributed system for that subscription. While sending

over this unsubscribe message into the system, the subscriber does not need to know

anything at all about how the unsubscribe message is communicated and assimilated in

the system.
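
   The subscribe-once, receive-many behaviour described above can be illustrated with a small in-memory sketch. It is not the protocol developed in this thesis: the class PubSubSystem and its method names are hypothetical, and a single Python object stands in for the entire distributed routing infrastructure.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Dict[str, Any]
Predicate = Callable[[Event], bool]
Callback = Callable[[Event], None]


class PubSubSystem:
    """Hypothetical stand-in for the routing infrastructure (illustration only)."""

    def __init__(self) -> None:
        self._next_id = 0
        self._subscriptions: Dict[int, Tuple[Predicate, Callback]] = {}

    def subscribe(self, predicate: Predicate, callback: Callback) -> int:
        """Register a subscription once; it persists until unsubscribed."""
        sub_id = self._next_id
        self._next_id += 1
        self._subscriptions[sub_id] = (predicate, callback)
        return sub_id

    def unsubscribe(self, sub_id: int) -> None:
        """Withdraw a subscription; unknown ids are ignored."""
        self._subscriptions.pop(sub_id, None)

    def publish(self, event: Event) -> None:
        """Hand the event to every matching subscriber.

        The publisher gets nothing back: no receiver identity and no
        acknowledgement, mirroring the anonymity described in the text.
        """
        for predicate, callback in list(self._subscriptions.values()):
            if predicate(event):
                callback(event)


received: List[Event] = []
system = PubSubSystem()
sub_id = system.subscribe(lambda e: e.get("temperature", 0) > 100, received.append)

system.publish({"temperature": 120})   # matches the subscription
system.publish({"temperature": 80})    # does not match, so it is not delivered
system.publish({"temperature": 150})   # matches the same, still-active subscription
system.unsubscribe(sub_id)
system.publish({"temperature": 200})   # no longer delivered
```

   The subscriber registers its interest exactly once and then receives two publications; after the unsubscribe call, later publications are silently dropped for it.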

   As already stated, Publish/Subscribe is event-driven in nature. Consequently, it is

inherently asynchronous in nature. There is no acknowledgement mechanism implied in

the specification of Publish/Subscribe. The publishers do not need to wait for acknow-

ledgements after sending out publications. The subscribers also need not wait for

publications after making subscriptions in the system; the publications will be received as

events. Due to this anonymity and asynchrony in communication, Publish/Subscribe is

suited for dynamic distributed environments – making it, all the more, a good choice for a

communications infrastructure between nodes in the ad hoc sensor network environment.

   The Publish/Subscribe communications paradigm is data-centric. Subscriptions are

made for matching events (or publications) only. The routing of publications is done

using these event-criteria describing subscriptions alone. The communication in sensor

networks mostly needs to be data-centric. For example, in a sensor network deployed to

monitor an active volcanic region, the sensor nodes would be sensing temperature along

with other parameters that characterize the volcanic activity. The base-stations (or

subscribers in the Publish/Subscribe scenario) make subscriptions for values of the




sensed parameters that cross a threshold level. Several

applications also need to know the location of the activity sub-region in the

sensor network. This location information can also be embedded in the publications made

by the publishers.

   Publish/Subscribe is a multicasting system in nature. The sensor networks are

distributed in nature, and individual sensor nodes have limited resources available. The

Publish/Subscribe system has to scale to support a large number of publishers and

subscribers in the system. The communications bandwidth must be used efficiently.

Another implication is that the sensor nodes should consume as little energy as possible for

communication. Multicasting in Publish/Subscribe provides significant leverage in

sensor networks for communications bandwidth usage and minimum energy consumption

criteria. Multicasting also helps Publish/Subscribe to scale to large sensor networks. The

broadcast nature of the radio medium of communication coupled with the limited range

of radio signals (forced by the efficient energy consumption requirement) makes

multicasting easy to implement in the sensor network environment.

   Most importantly, Publish/Subscribe is very dynamic in nature. Publishers and

subscribers can freely join in and leave the system at their will. The degree of dynamism

in the system depends on the underlying routing infrastructure. The Publish/Subscribe

system is also supposed to be fault resilient. Again, it is the responsibility of the

underlying routing infrastructure to ensure reasonable fault-tolerance.

   The application layer view of Publish/Subscribe is also very simple, making it very

easy to use in a distributed application. It must be clear by now that the Publish/Subscribe

systems have a layered architecture. On the top is the application layer that knows only




about publications and subscriptions, and acts as a publisher or subscriber

accordingly. Below the Publish/Subscribe API is the routing infrastructure. The

Publish/Subscribe specification does not mandate any specific architecture for the

underlying routing infrastructure. It is here that many novel ideas can be studied and

implemented. The routing infrastructure in turn could be multi-layered. It is appropriate

to mention here that our Publish/Subscribe routing infrastructure is such a multi-layered

architecture.

    All the above features of Publish/Subscribe systems exhibit complete decoupling of

communicating parties in space, time and flow as stated in [30]. They also make the

Publish/Subscribe paradigm an ideal communications model for sensor networks.


2.2.3    Types of addressing in Publish/Subscribe

•   Group-based addressing – In the earliest Publish/Subscribe systems, the subscription

    criteria used to be group-based (also called channel-based). In these systems, each

    node, publisher or subscriber, participates as a member in one or more predetermined

    groups. Subscribers subscribe to (and publishers publish in) all the groups they are

    members of. This grouping leads to restricted access in the system. Subscribers

    cannot receive publications from some publishers since they have no group in

    common. This approach does not support the data-centric communication of messages

    that Publish/Subscribe advocates.

•   Subject-based addressing – Subject-based (also called topic-based) Publish/Subscribe

    systems came next. These systems provided a little more flexibility. The

    publications/subscriptions have a subject in these systems. The subject belongs to a




    pre-defined namespace of subjects. Subscribers subscribe for subjects that they are

    interested in, and publishers publish messages with subjects.

•   Content-based addressing – The later Publish/Subscribe systems have been content-

    based [4]. The subscription matching criteria are extracted from the message content

    itself. The content-based approach provides maximum flexibility in giving the

    subscription criteria. The subscriptions can be made more elaborate and composite

    now. The content-based approach is of considerable importance to the sensor network

    environment as it renders much-desired flexibility in giving subscription criteria.
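
   As an illustration of content-based addressing, the following sketch matches publications against a subscription expressed as a conjunction of attribute constraints. The tuple representation, the function name matches, and the attribute names (temperature, region, node) are assumptions made for this example; they are not taken from [4] or from the protocol developed in this thesis.

```python
import operator
from typing import Any, Dict, List, Tuple

# Assumed representation: a constraint is (attribute, operator symbol, value)
# and a subscription is a conjunction (logical AND) of such constraints.
Constraint = Tuple[str, str, Any]
Publication = Dict[str, Any]

OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


def matches(publication: Publication, subscription: List[Constraint]) -> bool:
    """Return True if the publication content satisfies every constraint."""
    for attribute, op_symbol, value in subscription:
        if attribute not in publication:
            return False  # a missing attribute cannot satisfy a constraint
        if not OPS[op_symbol](publication[attribute], value):
            return False
    return True


# Hypothetical subscription: high temperature readings from one sub-region.
hot_in_region: List[Constraint] = [
    ("temperature", ">", 400),
    ("region", "==", "north-rim"),
]

pub1 = {"temperature": 450, "region": "north-rim", "node": 17}
pub2 = {"temperature": 450, "region": "south-rim", "node": 4}
pub3 = {"temperature": 120, "region": "north-rim", "node": 9}

for pub in (pub1, pub2, pub3):
    print(pub["node"], matches(pub, hot_in_region))
```

   Only the first publication satisfies both constraints. Note that the matching criteria are evaluated against the message content itself, with no group or subject label involved.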


A more detailed elaboration on the aptness of Publish/Subscribe in sensor networks can

be found in [30]. Overall, it can be concluded that Publish/Subscribe is a very elegant

communications paradigm for sensor networks.




                                     CHAPTER 3

                                    MOTIVATION


On close observation of the sensor network environment, it can be realized that there are

just two tasks of the sensor networks at the core of their functionality – sensing

information and communicating it. Communication is central, and actually the only

subject of this thesis. In chapter 2 it was also proposed that the choice of the

Publish/Subscribe communications paradigm is very apt to the sensor network

environment. In this chapter, we discuss the architecture of an already studied

Publish/Subscribe communications infrastructure [30]. The architecture is of a distributed

brokering system infrastructure. A brief discussion is also made about the utility of this

infrastructure and how it is appropriate to the dynamic sensor network environment.

Following this discussion, a significant scalability drawback is pointed out. This

shortcoming of the protocol (as we may rightly call it) was the main inspiration for the

current thesis effort.


3.1   A Distributed Brokering System Infrastructure for Publish/Subscribe

An approach to the routing infrastructure for Publish/Subscribe in sensor networks based

on a Brokering System has been studied and implemented in [30]. This section is

dedicated to the review of this approach. This brokering system infrastructure has been an

initial backbone and inspiration for the effort of our thesis. The infrastructure that is

being discussed in this section is based on content-based messaging, fault-tolerance,

multicasting and loop-free routing aspects required for Publish/Subscribe in the

dynamic and distributed sensor networks.


               Figure 3.1 – A content-based Publish/Subscribe system
               with a distributed Broker routing infrastructure. The
               infrastructure contains two Broker Advertisement Trees
               rooted at Brokers 1 and 3 respectively.

   The Distributed Brokering System contains a set of special broker nodes distributed

randomly in the network. The set of broker nodes is dynamic and keeps on changing over

time. The number of brokers in the system is roughly a pre-determined fraction of the

network size. Initially, a fraction of the nodes in the sensor network volunteer to act as

brokers. This volunteering is highly probabilistic and the probability of a node becoming

a broker is pre-configured. A spanning tree for each broker, called the broker’s

advertisement tree, is constructed. The advertisement tree is constructed in a more or less

breadth-first search manner, so that the depth of the advertisement tree is at most

the diameter of the sensor network. Every advertisement tree is rooted at the

broker that initiated the construction of that tree. Figure 3.1 shows a network of 6 nodes.


It contains the advertisement trees for two brokers in the network. The collection of these

advertisement trees is the routing infrastructure for nodes communicating in the

Publish/Subscribe system on this network.
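
   The breadth-first construction of an advertisement tree can be sketched as follows. This is a centralized simplification: in the actual network the tree would be built by nodes exchanging advertisement messages, whereas here an ordinary breadth-first search runs over a known adjacency list. The six-node ring topology, the function names, and the parent-map representation are assumptions made for illustration; the topology is not that of Figure 3.1.

```python
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[int, List[int]]          # node -> list of radio neighbours
Tree = Dict[int, Optional[int]]       # node -> parent (the root maps to None)


def build_advertisement_tree(graph: Graph, broker: int) -> Tree:
    """Breadth-first spanning tree rooted at the given broker."""
    parent: Tree = {broker: None}
    queue = deque([broker])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in parent:   # first advertisement heard wins
                parent[neighbour] = node
                queue.append(neighbour)
    return parent


def depth(tree: Tree, node: int) -> int:
    """Number of hops from the node up to the root of the tree."""
    hops = 0
    while tree[node] is not None:
        node = tree[node]
        hops += 1
    return hops


# Hypothetical topology: six nodes connected in a ring (network diameter 3).
graph: Graph = {1: [2, 6], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5, 1]}

tree_of_1 = build_advertisement_tree(graph, broker=1)
tree_of_3 = build_advertisement_tree(graph, broker=3)

max_depth_1 = max(depth(tree_of_1, n) for n in graph)
max_depth_3 = max(depth(tree_of_3, n) for n in graph)
```

   Both trees span all six nodes, and neither is deeper than the network diameter, which is the property the breadth-first construction is meant to provide.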

   So how does the routing of publications happen in the distributed brokering system

infrastructure? Each node in the system designates a single broker node as its most

favored broker in the system. The choice of the most favored broker depends on the

proximity of the node to the broker and the reliability of the links leading to it. Since

every broker advertisement tree is a spanning tree, ideally each node has the information

of all the brokers in the system. This allows every node to choose its best possible

most favored broker from the set of available brokers. Consequently, each node selects

the nearest and most reliable broker in the system as its most favored broker. In the

example shown in Figure 3.1, nodes 2 and 4 could designate broker node 3 as their most

favored broker; and nodes 5 and 6 could designate broker node 1 as their most favored

broker. A broker node designates itself as its most favored broker.
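
   The selection of a most favored broker can be sketched as a minimization over the entries a node holds for the known brokers. The text only states that the choice depends on proximity and link reliability; the specific cost used below (hop count divided by an estimated path reliability) is an assumption made for this example, as are the names BrokerEntry, cost and most_favored_broker.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BrokerEntry:
    hops: int            # distance to the broker along its advertisement tree
    reliability: float   # estimated delivery probability of the path, in (0, 1]


def cost(entry: BrokerEntry) -> float:
    """Assumed cost: fewer hops and more reliable paths are both preferred."""
    return entry.hops / entry.reliability


def most_favored_broker(table: Dict[int, BrokerEntry]) -> int:
    """Pick the broker with the lowest cost; ties go to the lower broker id."""
    return min(table, key=lambda broker: (cost(table[broker]), broker))


# Hypothetical entries held by three different nodes for brokers 1 and 3.
table_of_node_2 = {1: BrokerEntry(hops=1, reliability=0.5),
                   3: BrokerEntry(hops=1, reliability=0.9)}
table_of_node_5 = {1: BrokerEntry(hops=2, reliability=0.9),
                   3: BrokerEntry(hops=3, reliability=0.9)}
table_of_broker_3 = {1: BrokerEntry(hops=2, reliability=1.0),
                     3: BrokerEntry(hops=0, reliability=1.0)}

choice_2 = most_favored_broker(table_of_node_2)
choice_5 = most_favored_broker(table_of_node_5)
choice_b3 = most_favored_broker(table_of_broker_3)
```

   With these made-up values node 2 favors broker 3 (equal distance, better links), node 5 favors broker 1 (closer), and broker 3 favors itself because its own entry has zero hops.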

   A publisher node sends every publication as a unicast message to its most favored

broker over that broker’s advertisement tree. It would be the responsibility of this most

favored broker to forward the publication to the appropriate subscribers. There are two

extreme scenarios that can be considered while designing the routing of publications from

brokers to the subscribers in the sensor network.

   In the first scenario, the number of subscribers in the system is very low, and the

number of subscriptions in the system is also very low. In such a case, it would be

feasible enough for each node in the system to maintain the list of subscriptions locally. It

is necessary that the subscribers, in a multi-hop fashion, broadcast the subscriptions over




the entire network using the most favored broker’s advertisement tree. The brokers will

use the subscription tables, thus formed, to direct the publications to the appropriate

subscribers using their individual advertisement trees.

   The second extreme scenario assumes that the subscribers are a fairly large

percentage of nodes in the network. Correspondingly the number of subscriptions is also

high in the system. Maintaining a global subscription table at each individual node will

not be feasible in this setup. Publications can be disseminated over the network in a

multicast fashion using broker advertisement trees. Here the publications are virtually

broadcast to all the nodes in the network. There is no need to communicate the

subscriptions made by a subscribing application over the network. The Publish/Subscribe

layer itself maintains the subscription table for each subscriber node. At first, this

might seem to be a very wasteful communications method. But given the assumption that

a fairly large percentage of nodes in the network are subscribers, the approach turns out

to be sensible.

   Given the two extreme approaches mentioned, there can be some hybrid approaches

that could be used for routing publications to the relevant subscribers. But all these

approaches use the underlying distributed brokering system for routing messages (either

publications or subscriptions).
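
   The trade-off between the two extreme scenarios can be made concrete by counting how many links of one advertisement tree carry a given publication. The sketch below is an illustration under simplifying assumptions: a single tree given as a parent map, one transmission per tree link, and the function names targeted_edges and flood_edges, none of which come from [30].

```python
from typing import Dict, Optional, Set

Tree = Dict[int, Optional[int]]   # node -> parent (the root broker maps to None)


def targeted_edges(tree: Tree, subscribers: Set[int]) -> int:
    """First scenario: the publication travels only towards matching subscribers.

    Counts the distinct tree links on the paths from the root to the subscribers.
    Each link is identified by its child endpoint.
    """
    used: Set[int] = set()
    for subscriber in subscribers:
        node = subscriber
        # Walk up until the root or an already counted link is reached.
        while tree[node] is not None and node not in used:
            used.add(node)
            node = tree[node]
    return len(used)


def flood_edges(tree: Tree) -> int:
    """Second scenario: the publication is sent over every link of the tree."""
    return len(tree) - 1


# Hypothetical advertisement tree rooted at broker 1.
tree: Tree = {1: None, 2: 1, 6: 1, 3: 2, 5: 6, 4: 3}

few = targeted_edges(tree, {2})              # one nearby subscriber
deep = targeted_edges(tree, {4})             # one subscriber three hops away
many = targeted_edges(tree, {3, 4, 5, 6})    # most nodes subscribe
flood = flood_edges(tree)
```

   With few subscribers the targeted delivery uses far fewer links than flooding; when most nodes subscribe, the two counts coincide, which is why disseminating publications over the whole tree is sensible in the second scenario while avoiding per-node subscription tables.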

   As the routing is done over a tree structure, there are no issues of loops in the routing

algorithm. Since different nodes have different most favored brokers, the distribution of

messages going from publishers to their most favored brokers and subsequently to the

appropriate subscribers is generally not concentrated. This exhibits a reasonable amount

of load balancing in the system as well. Load balancing is a very crucial aspect for any




routing infrastructure in sensor networks since it prevents the premature death of nodes

due to excessive depletion of energy reserves. The existence of multiple broker

advertisement trees also aids in providing fault tolerance in the infrastructure. Whenever

a node fails to communicate with its most favored broker due to the failure of some link

or node in between, it has the option of choosing another broker as its most favored

broker. All the publications from that node are subsequently routed to its new most

favored broker. Ideally, every subscriber node is a part of the advertisement tree of each

broker. Hence, a subscriber node keeps receiving publications from all the brokers over

their advertisement trees. Even if there is a fault over the path to one broker, the

subscriber keeps receiving publications from other brokers. As might be clear by now,

this approach does not provide 100% fault tolerance – all the publications sent by a

publisher to the old, unreachable, most favored broker (due to some fault in the path to

that broker) are lost till the point in time when the publisher decides to designate another

broker in the system as its most favored broker. Nevertheless, it does provide a

reasonable amount of fault tolerance.

   In the dynamic sensor network environment, nodes will eventually start failing (due

to the depleting battery charge) and new nodes might be added to the network (in case a

new set of sensors is deployed in the field). Even the broker nodes themselves may fail.

Consequently, there are continuous topological changes happening in the network. A

static broker advertisement tree infrastructure cannot assimilate these changes. A periodic

process of purging broker advertisement trees and regenerating new advertisement trees

has been implemented in this design. For maintaining a random distribution of the

brokers in the network, some non-broker nodes randomly (with a pre-configured




probability) choose to become brokers. Correspondingly, the previously existing brokers

step down from their broker status to a non-broker status. This is followed by the process

of purging out the pre-existing broker advertisement trees and building up the broker

advertisement trees for the new broker nodes. The purging and regeneration process of

broker advertisement trees makes the entire infrastructure highly adaptive to dynamic

environments, which is desirable in the sensor network environment.
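
   One round of the periodic purge-and-regenerate process can be sketched as below. The sketch covers only the choice of the new broker set; purging and rebuilding the advertisement trees is omitted. The function name regeneration_round, the node identifiers and the seed are assumptions made for this example. The text does not say what happens when no node volunteers in a round, so the sketch simply returns an empty set in that case.

```python
import random
from typing import Iterable, Set


def regeneration_round(nodes: Iterable[int],
                       current_brokers: Set[int],
                       probability: float,
                       rng: random.Random) -> Set[int]:
    """Return the broker set for the next period.

    Every non-broker node volunteers independently with the pre-configured
    probability; all current brokers step down to non-broker status.
    """
    return {
        node for node in nodes
        if node not in current_brokers and rng.random() < probability
    }


# Hypothetical network of 100 nodes with three current brokers.
nodes = list(range(1, 101))
old_brokers = {3, 40, 77}
rng = random.Random(2003)   # fixed seed so that the sketch is repeatable

new_brokers = regeneration_round(nodes, old_brokers, probability=0.05, rng=rng)
```

   Because each node decides independently, the size of the new broker set only approximates the configured fraction of the network size, and it varies from round to round.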


3.2   Scalability issue with the Brokering System Infrastructure

The architecture of the brokering system discussed in the previous section, though fault-

tolerant and robust, has one major drawback – scalability. As has been previously

mentioned, the number of brokers in the system is a fraction of the network size. This

number increases linearly with the network size. Since each

broker creates a distributed spanning advertisement tree of its own, each node in the

network ideally knows about the existence of every broker. The knowledge is maintained

as an entry in the node’s routing table. Hence, the routing table at each node has one

entry for every broker in the system. An important scalability issue in terms of routing

table size at each node now arises. For example, if the broker-fraction in the network is

5% and the network size is ten thousand nodes, then the routing table at each node will

have 500 entries – a substantially large routing table! The scalability issue becomes

even more crucial due to the scarce amount of on-chip memory (in the order of KBs)

available for the sensor nodes. Additionally, it should be observed that a wider

distribution of messages (hence, more load balancing and fault tolerance) across the

system requires a larger broker-fraction. Surely a brokering system with

10% brokers will provide more load balancing and fault tolerance than a brokering



                                           26
system with 5% brokers. The memory used by the routing infrastructure as a whole must

be a very small fraction, so that a significant amount of memory may be devoted to the

applications using the routing infrastructure.
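
The arithmetic of this example can be sketched in a few lines of Python. The function
name is hypothetical and the larger and smaller network sizes are added only to show the
linear growth; the 5% and 10% broker-fractions and the ten-thousand-node network are the
figures used in the text.

```python
def broker_table_entries(network_size: int, broker_percent: int) -> int:
    """Routing table entries per node when each node keeps one entry per broker."""
    return network_size * broker_percent // 100


# The table grows linearly with the network size for a fixed broker-fraction.
for size in (1_000, 10_000, 100_000):
    print(size, broker_table_entries(size, 5), broker_table_entries(size, 10))
```

The sketch counts entries only; the memory taken by one entry is not stated in the text
and is therefore left out.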

   This scalability issue has motivated the design of a new brokering system based

on Interval Routing, which is the central theme of this thesis.




                                      CHAPTER 4

                                 THE NEW DESIGN


4.1   Goals for the New Design

The redesign effort for a scalable Publish/Subscribe communications protocol for ad hoc

sensor networks that is efficient in memory, bandwidth and energy has three goals:

1.    Memory conservation: The routing infrastructure must use a limited amount of

      memory (possibly of the order of O(1)).

2.    Adaptability to dynamic environments: The new infrastructure must be highly

      adaptive and fault-tolerant to the dynamic environment of sensor networks.

3.    Energy conservation: The new infrastructure must not be wasteful in bandwidth and

      energy utilization. This last goal of maximum energy conservation and minimum

      bandwidth usage is hard to attain since there is no centralized knowledgebase about

      the subscriptions made in the network.


4.2   The Design Overview

Our newly proposed Interval Routing enabled Brokering System based Publish/Subscribe

communications protocol is a multi-layered structure as shown in Figure 4.1. Interval

routing is a memory-efficient routing technique for distributed systems, and the interval

routing algorithm is the crucial basis of the new design. This algorithm defines a routing

infrastructure for messages to be relayed across the network. A dynamic and fault-tolerant

interval routing algorithm has not yet been devised. Our design uses services such as

neighbor detection, leader election and fault detection to make interval routing function

in the dynamic setup of sensor networks. Interval routing is static in nature, and it is

very difficult to break down the rigid structure of the routing algorithm to accommodate

dynamic topological changes in the environment. A single interval routing structure is

not sufficient for dealing with the dynamism of the sensor network environment, so the

use of multiple interval routing structures in the network is proposed. This approach

leads to better fault tolerance. Additionally, to

assimilate the permanent topological changes in the network, a periodic process of

purging out one interval routing structure at a time and rebuilding it is needed. The inter-

purge period may be pre-configured or set dynamically.

   The already reviewed brokering system is layered on top of the dynamic interval

routing infrastructure. There is no need for separate broker advertisement trees in the

brokering system now, since the routing structures are already made available by the

interval routing structures. Each broker advertisement tree is overlaid on one of the

interval routing structures. The brokers exist in the same quantities as in the previous

brokering system. The routing of messages in the new brokering system can be done in

several application-dependent ways. We discuss a few of these methodologies. The

Publish/Subscribe layer is stacked on top of the new brokering system. The applications

sitting above the Publish/Subscribe layer need only make the publication and

subscription calls to the Publish/Subscribe layer.




                               Application Layer

              publications,                   matching publications
              subscriptions

                            Publish/Subscribe Layer

              publications,                   publications
              subscriptions

                               Brokering System

              brokering messages,
              publications, subscriptions

                          Interval Routing Structure

              interval routing messages,      interval routing nodeId
              brokering messages,
              publications, subscriptions

                           Network          Neighbor Detection Service

               Figure 4.1 – The layered architecture design of the new
               Interval Routing enabled Brokering System based
               Publish/Subscribe communications protocol.



4.3   Interval Routing

The first goal of this research work is to reduce the size of the routing table at each node,

which linearly increases with the number of brokers in the brokering system discussed in

Chapter 3. The interval routing architecture [25, 26, 38] is a very good choice for

accomplishing this goal. Interval routing is a highly memory efficient and scalable

routing technique for communication in distributed systems. It has been the subject of a

considerable amount of research for more than a decade and has also been adopted

in a commercial routing chip [29]. In interval routing, the routing table size at each node

in the network is equal to the number of immediate neighbors of that node (the degree of

the node). The most obvious implication of this is the scalability of the interval routing

algorithm, since the routing table does not grow with the number of nodes in the

network. The general approach in the brokering system infrastructure results in a linear

growth of the number of brokers with the network size, and consequently in a linear

growth of the routing table size at each node as well. The right amalgamation of the

Brokering System and Interval Routing can prevent this increase in memory consumption

as the network grows.


4.3.1   Construction of the Interval Routing Structure

In brief, the interval routing algorithm works as follows. Starting from a pre-designated

node, a depth-first spanning tree is constructed in the distributed system. The pre-

designated node is called the Initiator node. During the construction of this tree, the

newly added nodes are assigned unique interval routing node numbers as shown in

Figure 4.2. The node numbers start from 0 (the node number assigned to the initiator

node) and increase by one with the inclusion of each new node in the interval routing

tree. Also, during the node numbering process, each outgoing link from a node to its

immediate neighbors is labeled with a link number. The collection of these link numbers

constitutes the routing table at each node. The links are labeled such that each outgoing

link routes a message to an Interval of the interval routing node numbers (see Figure 4.3).

Each link leads a message to a different node interval; collectively, the intervals of all

the outgoing links span all the nodes in the interval routing tree structure. No node

intervals overlap.


                                           0

                1                          2                     8

                 3                     4            7             9

                                       5

                     6


               Figure 4.2 – An example Depth-first Interval Routing tree
               over a distributed network of 10 nodes.


4.3.2   Drawback in Interval Routing

There is a penalty to be paid for this reduction in memory consumption, though. This

penalty is the sub-optimal routing path for a message from its source to its destination.

For example, in Figure 4.2, the path from node 6 to 7 could be 6-5-4-2-7 instead of the

shortest path 6-2-7. In the broker advertisement tree based model there is generally a

shortest path between the source node and its most favored broker. In interval routing,

though, the message normally follows a sub-optimal path between the source node and its

most favored broker along the interval routing structure. Additionally, interval routing

cannot assimilate any of the dynamic changes in the topology of the distributed system.


                       [8,2)                           [6,7)


                         [3,4)         [4,6)         [7,8)

               Figure 4.3 – Intervals of Node 2 from Figure 4.2.


4.3.3   Frond Links

The overhead of sub-optimal routing can be decreased by the use of frond links at each

node in the interval routing tree structure. The dotted links in Figure 4.2 are the frond

links. These frond links are shortcuts over the interval routing structure between a node

and any of its ancestor nodes. For example, in Figure 4.2, the path from node 6 to 3 is

6-2-3, which uses the frond link between nodes 6 and 2, rather than the path 6-5-4-2-3,

which uses only the links of the interval routing tree structure. Henceforth, the interval

routing tree structure along with its frond links will be called the interval routing

structure in this thesis. The fronds are formed solely owing to the proximity of the two

nodes connected by them. The depth-first nature of the node number assignment in the

interval routing structure means that fronds exist only between a node and one of its

ancestors. This property of the interval routing structure, coupled with the message

routing technique, rules out the possibility of cycles in the path of messages traversing it.




      It is reasonable to assume that the increase in the network size does not lead to the

increase in the network density. This assumption makes interval routing a highly scalable

routing infrastructure, in terms of memory utilization. The interval routing algorithm has

two main phases of execution – the distributed interval routing structure construction

with node and link numbering, and the routing of messages over this architecture. More

details on interval routing can be found in [25]. We present a distributed algorithm

for the interval routing structure construction procedure in the next section.


4.4    Distributed Interval Routing Structure Construction

This section is about the interval routing structure construction procedure in a distributed

fashion. A special Initiator node initiates the interval routing structure construction

algorithm. Pre-designating an initiator node in static networks is possible. The sensor

networks are ad hoc in nature, and hence the pre-designation of an initiator is not

feasible. The initiator node has to be determined dynamically. This problem is similar to

the conventional leader election problem in ad hoc distributed systems. The problem of

initiator election is dealt with in Section 4.7.2. For simplicity, it is assumed here that just

one initiator exists in the sensor network.

      The interval routing structure has the initiator node as the root. The node numbering

in the structure starts from this initiator node, which gets the node number 0. The

distributed interval routing structure construction is a depth-first algorithm that uses the

token passing methodology. The token in the algorithm is the interval routing node

number to be assigned to the next node to be included in the interval routing structure.

The construction algorithm starts off with the token at the initiator node with the value 0.

By default all the nodes in the network wait for the token. The node holding the token at

any time does the processing and is called the current node for as long as it holds the

token. The processing done by the current node consists of assigning itself the interval

routing node number given by the value of the token (if it is being visited for the first

time), assigning values to its links, selecting the next link over which to forward the

token, and sending the token over that link. There are three types of messages a node

may receive: AdvertMsg, AdvertAlreadyExistsMsg and SubtreeBelowDoneMsg. Each type

of message carries the token. All the nodes in the system, except for the initiator node,

do processing only on receiving one of these three messages. The initiator node starts off

the algorithm without receiving any message; thereafter, it too does processing only on

receiving one of these three messages. The algorithm below is event driven and

distributed in nature. The pseudo code presented below executes at node p in the

distributed system.

   The distributed construction algorithm (shown in Figure 4.4 through Figure 4.7) is

somewhat like a PIF (Propagation of Information with Feedback) protocol. The initiator

node initiates the interval routing structure construction task, which propagates over the

entire network in a wave-like form and returns back to the initiator eventually, just like a

PIF protocol. Of course ours is not a self-stabilizing algorithm. The initializations for

tasks to be carried out after the construction of the interval routing structure can be made

during the construction process itself. But faults in nodes could result in corrupted values

at several nodes. For such a case, the initiator can use the termination detection event to

execute a self-stabilizing PIF algorithm in the network to make initializations [8].




    In this algorithm, fault-tolerant interval routing structure construction is not

considered. A fault-tolerant variant of the algorithm is briefly discussed later in this

thesis.


Initial state of node p:
VisitedSet = {}, UnvisitedSet = {set of all the immediate
neighbors of p}, visited = false, token = 0, token is generated by
the initiator node.



<Init>:
/* event occurring at the initiator node only, to trigger off the
 * algorithm
 */
Begin
        p.visited = true;
        p.irn = token;
        token = token + 1;
        if (p.UnvisitedSet = {})
                call Terminate;
        else
        Begin
                Select node q from p.UnvisitedSet;
                Remove q from p.UnvisitedSet;
                Add q to p.VisitedSet;
                p.link[q] = token;
                Send message <AdvertMsg, token, p> to q;
        End
End

                   Figure 4.4 – The Init algorithm for initializing the process
                   of constructing the Interval Routing Structure.




Received message <AdvertMsg, token, r>:
/* this message is received by node p when the sender tries to
 * include it in the interval routing structure
 */
Begin
        if (p.visited = false)
        Begin
                 p.parent = r;
                 p.visited = true;
                 p.irn = token;
                 token = token + 1;
                 Remove r from p.UnvisitedSet;
                 Add r to p.VisitedSet;
                 if (p.UnvisitedSet = {})
                 Begin
                         p.link[p.parent] = token;
                         Send message <SubtreeBelowDoneMsg, token, p> to p.parent;
                 End
                 else
                 Begin
                         Select q from p.UnvisitedSet;
                         Remove q from p.UnvisitedSet;
                         Add q to p.VisitedSet;
                         p.link[q] = token;
                         Send message <AdvertMsg, token, p> to q;
                 End
        End
        else
        Begin
                 /* p is already visited: label the link to the sender r
                  * and return the token to r
                  */
                 p.link[r] = r.irn;
                 Send message <AdvertAlreadyExistsMsg, token, p.irn, p> to r;
        End
End

              Figure 4.5 – The RcvAdvertMsg algorithm to handle the
              event of reception of the AdvertMsg message.




Received message <AdvertAlreadyExistsMsg, token, irn, r>:
/* this message is received by node p when the sender, r, wants to
 * tell p that it is already included in the interval routing structure
 */
Begin
         p.link[r] = irn;
         if(p.UnvisitedSet = {})
         Begin
                  if (p <> initiator)
                  Begin
                          p.link[p.parent] = token;
                          Send message <SubtreeBelowDoneMsg, token, p> to p.parent;
                  End
                  else
                  Begin
                          call Terminate;
                  End
         End
         else
         Begin
                  Select q from p.UnvisitedSet;
                  Remove q from p.UnvisitedSet;
                  Add q to p.VisitedSet;
                  p.link[q] = token;
                  Send message <AdvertMsg, token, p> to q;
         End
End

              Figure 4.6 – The RcvAdvertAlreadyExistsMsg algorithm to
              handle the event of reception of the AdvertAlreadyExists-
              Msg message.




Received message <SubtreeBelowDoneMsg, token, r>:
/* this message is received by node p when its child declares that
 * the sub-tree rooted at that child has been completely scanned
 */
Begin
        if (p.UnvisitedSet = {})
        Begin
                if (p <> initiator)
                Begin
                        p.link[p.parent] = token;
                        Send message <SubtreeBelowDoneMsg, token, p> to p.parent;
                End
                else
                Begin
                        call Terminate;
                End
        End
        else
        Begin
                Select q from p.UnvisitedSet;
                Remove q from p.UnvisitedSet;
                Add q to p.VisitedSet;
                p.link[q] = token;
                Send message <AdvertMsg, token, p> to q;
        End
End

              Figure 4.7 – The RcvSubtreeBelowDoneMsg algorithm to
              handle the event of reception of the SubtreeBelowDoneMsg
              message.
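
The four handlers of Figures 4.4 through 4.7 can be exercised with a small sequential
simulation. The Python sketch below is an illustration under stated assumptions, not the
thesis implementation: the function and variable names are hypothetical, message delivery
is modeled by a queue that never loses a message, "Select" always takes the first entry
of the unvisited list, and the example graph is the tree of Figure 4.2 with its edges
inferred from the paths quoted in Sections 4.3.2 and 4.3.3 (including the frond between
nodes 6 and 2). In this sketch an already visited node only answers an AdvertMsg; it
records the label for that link when it later probes the neighbor itself.

```python
from collections import deque


def build_structure(adj, initiator):
    """Simulate the token-passing construction.

    adj maps each node to the list of its immediate neighbors. Returns
    (irn, link): irn[p] is p's interval routing number and link[p][q] is
    the label p holds for its link to neighbor q.
    """
    unvisited = {p: list(nbrs) for p, nbrs in adj.items()}
    irn, parent = {}, {}
    link = {p: {} for p in adj}
    # In-flight messages: (kind, token, sender_irn, sender, receiver).
    inbox = deque()

    def pass_token(p, token):
        # Common tail of all handlers: probe the next unvisited neighbor,
        # or report completion to the parent, or terminate at the initiator.
        if unvisited[p]:
            q = unvisited[p].pop(0)
            link[p][q] = token
            inbox.append(("AdvertMsg", token, None, p, q))
        elif p != initiator:
            link[p][parent[p]] = token
            inbox.append(("SubtreeBelowDoneMsg", token, None, p, parent[p]))

    # <Init>: the initiator takes number 0 and forwards token value 1.
    irn[initiator] = 0
    pass_token(initiator, 1)

    while inbox:
        kind, token, sender_irn, r, p = inbox.popleft()
        if kind == "AdvertMsg" and p not in irn:
            parent[p] = r
            irn[p] = token
            unvisited[p].remove(r)
            pass_token(p, token + 1)
        elif kind == "AdvertMsg":
            # p is already in the structure: return the token with p's number.
            inbox.append(("AdvertAlreadyExistsMsg", token, irn[p], p, r))
        elif kind == "AdvertAlreadyExistsMsg":
            link[p][r] = sender_irn
            pass_token(p, token)
        else:  # SubtreeBelowDoneMsg
            pass_token(p, token)
    return irn, link


# Assumed edge set for Figure 4.2 (tree edges plus the frond 6-2).
FIGURE_4_2 = {
    0: [1, 2, 8], 1: [0], 2: [0, 3, 4, 6, 7], 3: [2], 4: [2, 5],
    5: [4, 6], 6: [5, 2], 7: [2], 8: [0, 9], 9: [8],
}

irn, link = build_structure(FIGURE_4_2, 0)
print(sorted(link[2].values()))
```

With this edge set the labels computed at node 2 are 3, 4, 6, 7 and 8, which are the
lower ends of the intervals shown in Figure 4.3.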




4.5    Routing Messages

The depth-first nature of the interval routing structure construction algorithm determines

the nature of the routing of messages in the network. At every node, each link connecting

that node to its neighbors is assigned a unique interval of interval routing node numbers

that a message can reach over that link. The intervals cover the entire set of nodes in the

interval routing structure and do not overlap. Given a node p, the interval over a link l is

the range of numbers from the interval routing link number of that link up to, but not

including, the smallest interval routing link number at p that is greater than it:


Interval(Link(l)) = [Link(l), min(Link(i)))
                       where i ∈ NeighborSet(p) and Link(i) > Link(l)


      If no such Link(i) is found, then the interval is computed as:


Interval(Link(l)) = [Link(l), min(Link(i)))
                       where i ∈ NeighborSet(p)


      As one can notice, the latter interval wraps around the maximum assigned interval

routing number in the interval routing structure.
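
As a rough illustration of the two interval definitions and of forwarding a message over
the link whose interval contains its destination, consider the Python sketch below. The
function names are hypothetical, intervals are treated as half-open as in Figure 4.3, and
the routing tables are hand-derived for the tree of Figure 4.2 under the assumption that
its edges are 0-1, 0-2, 0-8, 2-3, 2-4, 2-7, 4-5, 5-6 and 8-9 plus the frond 6-2; they are
not taken from the thesis.

```python
def interval_of(labels, l):
    """Half-open interval [l, hi) for link label l among a node's labels."""
    greater = [x for x in labels if x > l]
    # Second definition: with no greater label, wrap to the smallest label.
    hi = min(greater) if greater else min(labels)
    return (l, hi)


def contains(interval, dest):
    lo, hi = interval
    if lo < hi:
        return lo <= dest < hi
    return dest >= lo or dest < hi  # interval wraps past the maximum number


def next_hop(link, dest):
    """link maps neighbor -> label; return the neighbor covering dest."""
    labels = list(link.values())
    for neighbor, label in link.items():
        if contains(interval_of(labels, label), dest):
            return neighbor
    raise ValueError("no link covers the destination")


def route(tables, src, dst):
    path = [src]
    while path[-1] != dst:
        path.append(next_hop(tables[path[-1]], dst))
    return path


# Assumed per-node tables (neighbor -> label); node ids equal node numbers.
TABLES = {
    0: {1: 1, 2: 2, 8: 8}, 1: {0: 2}, 2: {0: 8, 3: 3, 4: 4, 6: 6, 7: 7},
    3: {2: 4}, 4: {2: 7, 5: 5}, 5: {4: 7, 6: 6}, 6: {2: 2, 5: 7},
    7: {2: 8}, 8: {0: 10, 9: 9}, 9: {8: 10},
}

print(route(TABLES, 6, 3))  # uses the frond between 6 and 2
print(route(TABLES, 6, 7))  # follows the tree links
```

Applied literally, the wrap-around definition gives the interval [8,3) for the parent link
of node 2, whereas Figure 4.3 shows [8,2); the two differ only in whether node 2's own
number is included, which does not affect forwarding.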

      Given a destination node number, choosing a link over which a message is to be

routed is a trivial task – select the link to whose interval the destination node number

belongs. Of course, this algorithm does not give the optimal path to the destination node.

But in certain instances the frond links serve as reasonably good shortcuts over the

interval routing structure. It should be noted that interval routing is fundamentally a

static routing protocol. This static nature is the only impediment to using interval

routing in dynamic and ad hoc environments. In spite of this, it is conjectured that, with

some add-ons and extra services, interval routing can be used effectively in dynamic

environments. The remaining sections of this chapter primarily explore a dynamic

publish/subscribe routing infrastructure design based on interval routing.


4.6    Fault Tolerant Routing

Interval routing has been subjected to considerable study since its initial proposal [38].

But most of the research has been done in the direction of optimizing the routing

overhead [41, 42, 36]. Multi-label interval routing schemes have been suggested for

optimal routing as well [23]. There have been very few efforts toward the study of fault

tolerance. The algorithm on Prefix Routing [3] deals with arbitrary node insertions in

dynamic networks, but not with node deletions. An effort has been made toward

implementing fault-tolerant dynamic networks in [15], using a multi-node label interval

routing scheme. The approach taken in our design is to use multiple interval routing

structures to provide fault-tolerance.

      Due to the nature of the interval routing structure, there exists just one path taken by a

message, using the routing algorithm, from any source node to any destination node.

A fault in a link or node leaves one set of nodes unable to communicate with another set

of nodes in the network, and in the worst case the network gets partitioned. This

reveals the susceptibility of the interval routing algorithm to faults. It can be easily

observed that the interval routing algorithm given in the previous two sections is not

fault tolerant and is very hard to make so. Instead, what is suggested in this thesis is to

use a constant number of interval routing structures. At any time after initialization,

there must be a constant number, k, of interval routing structures existing in the network.

The deployment density of the nodes in the network can determine the value of k.

   Since k is a constant, the combined size of the k routing tables at a node is O(1)*O(d),

which is the same as O(d), where d is the degree of the network. If a source node is

unable to route a message to a destination node over an interval routing structure, it

chooses another interval routing structure to do so. There is a fair possibility that, in a

network of reasonable connectivity, there will exist multiple paths over multiple interval

routing structures from a source node to a destination node. End-to-end fault detection

is explored with the neighbor detection service later.
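
The fallback from one interval routing structure to another can be sketched as follows.
This is a hedged Python illustration rather than the thesis design: all names are
hypothetical, a centralized depth-first labeling stands in for the distributed
construction, the set of failed links is supplied by hand in place of a fault detection
service, structures are simply tried in a fixed order, and the example graph is the
assumed edge set of Figure 4.2 (tree edges plus the frond 6-2).

```python
def build(adj, root):
    """Centralized depth-first numbering and link labeling from a given root."""
    irn, link = {}, {p: {} for p in adj}

    def visit(p, parent):
        irn[p] = len(irn)                   # next unused node number
        for q in adj[p]:
            if q not in irn:
                visit(q, p)
        if parent is not None:
            link[p][parent] = len(irn)      # first number after p's subtree

    visit(root, None)
    for p in adj:
        for q in adj[p]:
            link[p].setdefault(q, irn[q])   # children and fronds: neighbor's number
    return irn, link


def next_hop(labels, dest):
    """Neighbor with the greatest label not above dest, wrapping if none."""
    below = [q for q in labels if labels[q] <= dest]
    pool = below or list(labels)
    return max(pool, key=lambda q: labels[q])


def path(structure, src, dst, failed, limit=64):
    """Route over one structure; None if a failed link lies on the way."""
    irn, link = structure
    hops = [src]
    while hops[-1] != dst and len(hops) <= limit:
        here = hops[-1]
        nxt = next_hop(link[here], irn[dst])
        if frozenset((here, nxt)) in failed:
            return None
        hops.append(nxt)
    return hops if hops[-1] == dst else None


def route_with_fallback(structures, src, dst, failed):
    """Try each structure in turn; return (structure index, path) or None."""
    for index, structure in enumerate(structures):
        hops = path(structure, src, dst, failed)
        if hops is not None:
            return index, hops
    return None


GRAPH = {
    0: [1, 2, 8], 1: [0], 2: [0, 3, 4, 6, 7], 3: [2], 4: [2, 5],
    5: [4, 6], 6: [5, 2], 7: [2], 8: [0, 9], 9: [8],
}
structures = [build(GRAPH, 0), build(GRAPH, 6)]     # k = 2, roots chosen by hand
broken = {frozenset((4, 5))}                        # the link 4-5 has failed
print(route_with_fallback(structures, 6, 7, broken))
```

In this example the structure rooted at node 0 routes from node 6 to node 7 through the
failed link 4-5, while the structure rooted at node 6 reaches node 7 through node 2.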

   Another issue is of arbitrary node insertions in the network. Prefix Routing [3]

handles this issue at the cost of increased size of the node Ids. What is proposed in our

design is a simple trick: number the nodes, during the interval routing structure

construction, with constant gaps of a power of 2 in between (something like 2^4).

Whenever an already existing node detects a newly inserted node in the network, it will

assign the new node a number from the available number space it has between itself and

the next node in the network. This idea would allow a reasonable amount of assimilation

of newly inserted nodes in the network.
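
A minimal sketch of this numbering trick, in Python, is given below. The names are
hypothetical, and the policy of handing out the lowest free number in the gap is an
assumption; the text only requires that the new number come from the space between the
detecting node and the next node. The sketch covers the number assignment only and does
not update any link labels.

```python
GAP = 2 ** 4  # spacing between consecutively numbered nodes


def initial_number(visit_index, gap=GAP):
    """Number given to the node visited at position visit_index during construction."""
    return visit_index * gap


def number_for_new_node(used, detector, gap=GAP):
    """Pick a free number between the detecting node and the next originally
    numbered node; return None when that gap is exhausted."""
    limit = (detector // gap + 1) * gap
    for candidate in range(detector + 1, limit):
        if candidate not in used:
            used.add(candidate)
            return candidate
    return None


used = {initial_number(i) for i in range(3)}   # three nodes numbered 0, 16, 32
print(number_for_new_node(used, 16))           # first node inserted next to node 16
```

Once the fifteen free numbers between two originally numbered nodes are used up, the
function returns None, which corresponds to the point where only a purge and
regeneration of the structure can assimilate further insertions.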

   The two strategies mentioned above, for dealing with arbitrary node insertions and

node/link deletions, work only for a limited time. After a significant number of

insertions and deletions, the algorithm will not be able to sustain further modifications in

the topology of the network. For this reason, a periodic interval routing tree purging and

regeneration process occurs. In this process, an interval routing tree is selected and

purged out completely. This is followed by the generation of a new interval routing tree

in the network. This new tree will naturally be based on the most recent topology of the

network. The inter-purge period has to be reasonably large so as to avoid making the

process computationally expensive.


4.7     Underlying Services

The following underlying services are needed for the functioning of our interval routing

infrastructure:

•       Neighbor detection

•       Initiator (Leader) election

•       Interval routing structure construction

•       Detection and propagation of changes in the network


4.7.1    Neighbor Detection

Neighbor Detection is the most fundamental service needed in the Interval Routing

enabled Publish/Subscribe protocol. The entire interval routing structure construction

algorithm depends on the knowledge of the neighbors of each node in the network.

Additionally, detection of new links/nodes and faults can also be done using the neighbor

detection service. This service determines the overall topology of the network, which

keeps on changing in due course. The neighbor detection service must execute

periodically in the background to monitor the on-going changes in the network topology.

      At first glance, there does not seem to be much to discuss here. But the subtle

issues that come up due to the ad hoc nature of the sensor networks call for a more

elaborate discussion. The simplest and most obvious approach to this problem is

Heartbeats [17]. Each node periodically broadcasts a signal notifying its presence in

the network to its neighbors. Each node in turn receives broadcasts from its neighbors

notifying their presence. Each node maintains a table of its immediate neighbors and

updates it according to these messages received, thus keeping track of its neighbors.


Issues – The sensors in sensor networks communicate with each other using radio

transceivers. The messages are communicated as radio signals. Here are the problems

with this:

•    Low power radio signals can easily succumb to external noise.

•    Even if we consider the sensors to be stationary, the environment around them keeps

     changing. Correspondingly, the signal strengths of the sensors also vary.

•    In addition, due to the power conservation requirement in the sensors, the range of

     the sensors, in terms of distance of propagation of radio signals, is also very low.

•    There is another issue of signal loss due to collisions of messages.

    All this put together leads to the existence of asymmetric links in the network

between neighboring sensors. Sometimes the links could even be unidirectional! The

topology of the network keeps changing. Links disappear and reappear even if no

preexisting nodes are removed from the network and no new nodes are added to it. In the

presence of such an ad hoc and dynamic environment, the implementation of the

Neighbor Detection service becomes somewhat non-trivial.


Neighbor Detection algorithm design – The first step in this design is to define the idea of

a Link between neighboring nodes. Since unidirectional links are useless, only

bi-directional links will be considered as existing links in the network. In addition, these

links should transmit messages successfully with a reasonably high probability. For this

purpose, a metric called the Strength of a link is defined. The link strength should be

above a threshold level for the link to be usable. This link strength has to be a composite

of the link strengths in both directions.

    Consider a link l between nodes a and b. Link l is composed of two halves, the links

< a,b > and < b,a >. The former is named l1 and the latter l2. Link l is acceptable if:


        strength(l1) >= threshold, and strength(l2) >= threshold


    The value of threshold could be defined empirically.

    Now, to the problem of calculating the strength of links l1 and l2. The crudest method

would be Sampling, which is in fact an empirical method. It is conjectured that sampling

could give a fairly good estimate of the strength of a link. This method would also take

into account, to a reasonable extent, the impact of collisions on the strength of the links.

    In the example above, node a receives signals from node b regarding its presence in

the neighborhood. This helps node a to get a rough estimate of the strength of link l2. But

it also needs to know the strength of link l1, which can happen only when node b informs

node a about the strength of link l1. This necessitates a message from node b, going over

link l2, stating the detected strength of link l1. The same holds for node b learning the

strength of link l2.
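
The sampling idea and the two-sided threshold test can be sketched in Python as follows.
The class and its fields are hypothetical, strength is taken to be the fraction of
heartbeats received within a sampling window, and the report of the reverse strength is
assumed to be piggybacked on the heartbeat itself; none of these choices is prescribed by
the text, which leaves the threshold and the sampling details to be set empirically.

```python
class LinkEstimator:
    """Bookkeeping at one node for heartbeat sampling over a fixed window."""

    def __init__(self, node_id, window, threshold):
        self.node_id = node_id
        self.window = window          # heartbeat periods per sampling window
        self.threshold = threshold    # minimum acceptable strength per direction
        self.heard = {}               # neighbor -> heartbeats received
        self.reported = {}            # neighbor -> our outgoing strength, as seen by it

    def on_heartbeat(self, sender, strengths_seen_by_sender):
        """Count a heartbeat and record what the sender says about our half-link."""
        self.heard[sender] = self.heard.get(sender, 0) + 1
        if self.node_id in strengths_seen_by_sender:
            self.reported[sender] = strengths_seen_by_sender[self.node_id]

    def incoming_strengths(self):
        """Sampled strength of each incoming half-link (l2 in the text)."""
        return {n: min(1.0, c / self.window) for n, c in self.heard.items()}

    def usable_neighbors(self):
        """Neighbors whose link meets the threshold in both directions."""
        incoming = self.incoming_strengths()
        return sorted(
            n for n, s in incoming.items()
            if s >= self.threshold and self.reported.get(n, 0.0) >= self.threshold
        )


a = LinkEstimator("a", window=10, threshold=0.7)
for _ in range(8):
    a.on_heartbeat("b", {"a": 0.9})   # b hears a well, and a hears b 8 times in 10
for _ in range(10):
    a.on_heartbeat("c", {"a": 0.2})   # c is heard clearly but rarely hears a
print(a.usable_neighbors())
```

Node c in the example is rejected even though every one of its heartbeats arrives,
because the reverse half of the link falls below the threshold.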

    Along with the neighbor detection service, the end-to-end path verification service is

also very useful. Given a source node a and a destination node b, and a path of nodes

a=n1, n2, n3, ..., nk=b from a to b, it is useful to have the knowledge about the functional

existence of the path. This can be done using end-to-end heartbeat signal propagation.


End-to-end path verification is particularly useful in detecting faults along a path between

nodes, so that an alternative path can be chosen.


4.7.2   Initiator (Leader) election and Interval Routing Structure construction

The interval routing structure construction algorithm is initiated by one pre-designated

node in the network. Such a presumption is fair for static networks. But for dynamic

networks, and especially the ad hoc sensor networks, such a presumption cannot be made.

The initiator must be dynamically elected from the network. This is the problem of leader

election in ad hoc networks, for which algorithms have been suggested [28]. This section

discusses our algorithm, which merges the two subtasks of leader election and interval

routing tree construction into a single task.

   The algorithm starts off with a subset of nodes in the network voluntarily electing

themselves as leaders. Each leader initiates the depth-first interval routing structure

construction algorithm. If a node that is part of one interval routing structure receives a

message from another structure, it selects one of the two trees to join. This selection is

done using the unique Ids of the leaders. The leader with the greater (or lesser, depending

on the policy) node Id prevails. In this way, the structures dominate over each other till

one interval routing structure prevails over all other structures.

   The ad hoc nature of the sensor networks introduces faults in the network. Similarly,

it results in the dynamic addition of new nodes and links in the network as well. Addition

of new nodes can be taken care of as specified in the fault tolerance section of interval

routing. Failure of links can be determined by the neighbor detection service. There are

two scenarios to be considered.




•    The link (beyond which the depth-first structure construction has gone ahead, but

     not returned yet) fails – In this scenario, the node (on the initiator side of the failed

     link) detecting this failure will restart the depth-first construction, from where it

     spawned the previous depth-first construction over the failed link in the past. The

     node on the other side of the tree that detects the failure of that link propagates a

     message discarding the previous node numbering.

•    The initiator node dies – It is important to notice here that if the initiator has

     multiple children, it is the only connecting node between the sub trees rooted at

     these children. In this case, the failure of the initiator results in the

     formation of multiple disjoint networks. The immediate children of the initiator

     become the roots of their individual interval routing sub-structures in the network.

     Due to the ad hoc nature of the sensor networks, there is a strong possibility that in

     due course a new link/node pops up in the network that joins these divided parts of

     the network. In such a case, the newly detected links turn out to be frond links, and

     no significant change in the interval routing structure is required. This is true for

     preexisting interval routing structures in the network. Newly created structures,

     after the division, span over the individual sub-networks only. But the interval

     routing structures created after the merge are safe from this division.

The algorithm above ensures the construction of one interval routing structure in the

network. But the aim is to construct a constant number, k, of structures. There are two

methods for doing this. The distribution of these trees should be uniform for the

infrastructure to be most effective. At the end of constructing the first structure, its root

(initiator) has a reasonable estimate of the number of nodes, N, in the network. The

root then broadcasts this structure-size to all the nodes in the network. Since there could

be links in the structure that have failed by this time, it is a good idea to flood this

structure-size message over the network. Each node, after receiving this value of N,

checks to see if its node Id is a multiple of N/k. If so, then that node selects itself to be the

initiator of the ith tree, where i is the quotient of (node Id / (N/k)). In this way, the

initiators for all the (k-1) trees can start their tree construction algorithms more or less

simultaneously, and these algorithms can run in parallel.
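
   The initiator selection rule can be sketched as follows. The sketch assumes that node Ids are integers, that N/k is taken as an integer quotient, and that a quotient of k or more is ignored; none of these details is fixed by the text above, and the function name is hypothetical.

```python
from typing import Optional


def initiator_index(node_id: int, n_estimate: int, k: int) -> Optional[int]:
    """Return i if this node should initiate the i-th tree, else None."""
    stride = n_estimate // k          # integer N/k (rounding is an assumption)
    if stride == 0 or node_id % stride != 0:
        return None
    i = node_id // stride
    return i if i < k else None       # ignore Ids beyond the k-th stride
```

With N estimated as 100 and k = 4, the nodes with Ids 0, 25, 50 and 75 obtain the indices 0, 1, 2 and 3, and every other node returns None. The spread of initiators is only as uniform as the spread of node Ids over the network.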


4.7.3    Detection and propagation of changes in the network

The end-to-end fault detection technique as explained in the neighbor detection service is

sufficient in detecting a fault between two communicating nodes in the network. This can

aid in selecting the interval routing tree over which the nodes can communicate.


4.8     The New Brokering System

As shown in Figure 4.1, the Brokering System is layered above the Interval Routing

Layer. Now each individual broker does not need to create its separate advertisement

tree. The broker chooses one of the available interval routing structures on which it

overlays its advertisement tree. The broker then relays its advertisements over this host

interval routing structure. The choice of the host interval routing structure can be

arbitrary. As a consequence, all the broker nodes have their advertisement trees overlaid

on one host interval routing structure or another. Because the choice is arbitrary, there is

a reasonable probability that each interval routing structure always hosts a small fraction

of the broker advertisement trees. All the messages to be sent via a broker

node are relayed over that broker’s advertisement tree, which is in fact an interval routing

structure.

    Each node in the network chooses its own most favored broker for every known

interval routing structure. Every node, among the advertisements it receives over the k

known interval routing structures, selects one favorite broker over each structure. Thus,

every node maintains information of k favorite brokers spanning over k interval routing

structures. Out of these k favorite brokers, the node selects its most favored broker

depending on the factors of hop count and reliability. The knowledge of k brokers helps

in fault tolerance – if a node is unable to communicate with its most favored broker, it

can select another broker as its new most favored broker. The selection of this new broker

turns out to be very effective since the broker belongs to a different interval routing

structure.
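
    A minimal sketch of this bookkeeping is given below. The ranking (fewer hops first, then higher reliability), the reliability score in [0, 1] and all class and method names are assumptions made for illustration; the text above only states that hop count and reliability are the deciding factors.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class BrokerInfo:
    broker_id: int
    hop_count: int
    reliability: float  # assumed score in [0, 1]; higher is better


def cost(b: BrokerInfo) -> Tuple[int, float]:
    # Assumed ranking: fewer hops first, then higher reliability.
    return (b.hop_count, -b.reliability)


class FavoredBrokers:
    def __init__(self) -> None:
        # interval routing structure id -> favorite broker on that structure
        self.per_structure: Dict[int, BrokerInfo] = {}

    def on_advert(self, structure_id: int, b: BrokerInfo) -> None:
        """Keep the best broker seen so far on each structure."""
        current = self.per_structure.get(structure_id)
        if current is None or cost(b) < cost(current):
            self.per_structure[structure_id] = b

    def most_favored(self) -> Optional[Tuple[int, BrokerInfo]]:
        if not self.per_structure:
            return None
        return min(self.per_structure.items(), key=lambda kv: cost(kv[1]))

    def fail_over(self) -> Optional[Tuple[int, BrokerInfo]]:
        """Drop the unreachable most favored broker and pick the next best."""
        best = self.most_favored()
        if best is not None:
            del self.per_structure[best[0]]
        return self.most_favored()
```

Since each entry belongs to a different interval routing structure, the broker chosen by fail_over is reached over a different tree than the one that failed.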

    The use of several brokers and their advertisement trees might be questioned. The

argument is that if each node invariably uses one interval routing structure, then what is the need

of having the brokering infrastructure at all? Why not use the interval routing structures

themselves to do the routing of publications and subscriptions? The most important role

of broker nodes conjectured for the future is as aggregation points. Since several sensor

nodes in the network will have a common most favored broker, the information sensed by

these sensor nodes can be aggregated and collectively analyzed at their most favored

broker or aggregation points, as we may rightly call them. In such a scenario, the term

aggregation point is more suitable for what we have termed broker nodes so far.

Additionally, having several brokers also equips the routing infrastructure with better

load balancing, but the number of interval routing structures in the network always

restricts this.

      The sensor network is very dynamic by nature. The topology of the network keeps

altering in due course of time. These topological changes are assimilated by the interval

routing structure as has been discussed in the past few sections. The Brokering System

also needs to adapt to the alterations in the environment of the sensor networks. The

broker advertisement tree infrastructure is itself static in nature like the interval routing

infrastructure. Having several brokers and their advertisement trees does provide support

for fault-tolerance, but not completely. Nor does it help in absorbing newly deployed

nodes in the environment. To do so, the broker advertisement tree infrastructure needs to

be rebuilt. Additionally, even broker nodes might fail eventually – due to some

permanent fault or depleting power. To accommodate all these changes in the network, a

periodic process of purging out broker advertisement trees and creating new

advertisement trees becomes necessary. The old brokers step down from their broker

status and new non-broker nodes choose to be brokers probabilistically. Consequently, the

new brokering system is also dynamic.


4.9     The Publish/Subscribe Layer

There can be several approaches taken to design a Publish/Subscribe layer on top of the

Brokering System layer. This section discusses three such approaches. All these

approaches are applicable for different scenarios.


4.9.1    Scarce Subscribers

In the first scenario, the number of subscribers and consequently, subscriptions, is very

low. An example for such a situation is the use of base-stations in a sensor network. The

subscriptions are made only by the base-stations. Since in most scenarios, the number of

subscriptions would be very few, it is feasible to store a local copy of all the subscriptions

made by the subscribers (base-stations in our example) at every node in the network. In

effect, the subscriber nodes broadcast the subscriptions over the entire network. Publisher

nodes send unicast publish messages to their most favored brokers (most favored

aggregation points). After aggregating publications the aggregation point forwards the

aggregate publication, as a unicast message, over its most favored interval routing

structure to the appropriate subscribers.


4.9.2   Abundant Subscribers

The second scenario is the other extreme. When the number of subscribers and similar

subscriptions in the network is high, multicasting a publication made to virtually all the

nodes in the network seems to be a good idea. In such a scenario, publishers send their

unicast publications to their most favored aggregation points (brokers), over that

aggregation point’s most favored interval routing structure. This aggregation point, after

doing aggregations on a reasonable number of received unicast publications, multicasts

the aggregate publication to all the nodes in the network over its most favored interval

routing structure.


4.9.3   An Intermediate Scenario

The third approach is an intermediate one. The scenario is that subscribers are distributed

arbitrarily in the network. No concrete information is available about the number of

subscribers in the network. There could be many or a few of them. There is no centralized

knowledgebase that contains the information about the subscriptions made in the

network. The information of the subscriptions made is stored in a distributed fashion at

the broker nodes. The idea is to make the subscribers send their subscriptions to their

most favored broker. In such conditions, the brokers can be better termed as subscription

brokers. Each subscription broker maintains a list of subscriptions it has received from

the nodes that have designated it to be their most favored subscription broker. Whenever

a subscription broker receives a publication, it can scan through the list of subscriptions it

has and forward the publication to each subscriber whose subscription matches the

publication. An important fact to be noted here is that a subscription broker need not have

any idea about which nodes have designated it to be their most favored subscription

broker. This gives significant flexibility for nodes to designate any subscription broker as

their most favored subscription broker. It is the responsibility of a subscriber to

unsubscribe from its most favored subscription broker before designating another one as

its new most favored subscription broker; and also send new subscriptions to its new

most favored subscription broker.
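
   The list kept by a subscription broker can be sketched as below. The sketch assumes that a subscription is an inclusive integer range and that each subscription message carries the subscriber's node Id; the class and method names are hypothetical.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # assumed subscription form: inclusive (low, high)


class SubscriptionBroker:
    def __init__(self) -> None:
        # subscriber node Id -> ranges that subscriber has registered here
        self.subs: Dict[int, List[Range]] = {}

    def subscribe(self, subscriber: int, rng: Range) -> None:
        self.subs.setdefault(subscriber, []).append(rng)

    def unsubscribe(self, subscriber: int, rng: Range) -> None:
        ranges = self.subs.get(subscriber, [])
        if rng in ranges:
            ranges.remove(rng)
        if not ranges:
            self.subs.pop(subscriber, None)  # forget idle subscribers

    def on_publication(self, value: int) -> List[int]:
        """Return the subscribers the publication should be forwarded to."""
        return sorted(s for s, ranges in self.subs.items()
                      if any(lo <= value <= hi for lo, hi in ranges))
```

The broker learns about a subscriber only through the subscriptions it receives, which is why a node that changes its most favored subscription broker has to unsubscribe from the old one and resend its subscriptions to the new one.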

   The discussion so far talks about how subscriptions are dealt with and how

publications are routed after reaching a subscription broker. Subscriptions are distributed

all over the network. The publisher has no idea about these subscriptions. The

Publish/Subscribe approach described in section 4.9.2 shows one approach of flooding

the publications over the broker advertisement trees in the entire network. The new

approach being discussed here uses the fact that the only nodes that collectively can give

the entire picture about the subscriptions made are the subscription brokers. A publisher

sends all its publications ideally to all the subscription brokers in the network over its

most favored interval routing structure in a multicast fashion. At first glance, this selective

broadcast might seem to be the same as the multicasting case in section 4.9.2. But it is

not so. A very small fraction of the nodes in the network are subscription brokers. If a

node has to route a publication over a link it must have an idea of whether there is at least

one subscription broker downstream that link. If there is no subscription broker

downstream, then the publication need not be sent over that link. Determining whether

there is a subscription broker downstream is very simple. During the subscription broker

advertisement dissemination process, each node receiving an advertisement over a link

can mark that link to be leading to a subscription broker. When that node receives a

publication, it has to scan through the links to its neighbors to check the existence of any

link leading to a subscription broker downstream. If the number of brokers in the network

is significantly large, the bandwidth usage of this approach will be very similar to the

bandwidth usage of the multicasting scenario described above. With a significantly small

number of subscription brokers the resulting bandwidth usage is conjectured to be

significantly less.
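
   The link marking and the resulting pruned forwarding can be sketched as follows. Links are identified here by the neighbor at their far end, and a publication is never sent back over the link it arrived on; both choices, like the class and method names, are assumptions of this sketch rather than details taken from the design.

```python
from typing import Iterable, List, Optional, Set


class ForwardingNode:
    def __init__(self, neighbors: Iterable[int]) -> None:
        self.neighbors: List[int] = list(neighbors)
        self.broker_links: Set[int] = set()  # links that lead to a broker

    def on_broker_advert(self, from_neighbor: int) -> None:
        """An advertisement arrived over this link, so a broker lies beyond it."""
        self.broker_links.add(from_neighbor)

    def next_hops(self, arrived_from: Optional[int]) -> List[int]:
        """Links over which a publication should be forwarded."""
        return [n for n in self.neighbors
                if n in self.broker_links and n != arrived_from]
```

A node with no marked links forwards nothing, which is where the bandwidth saving over the full multicast is expected to come from when subscription brokers are few.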


    Most of the design proposed so far has been implemented in a sensor network

simulation environment called the TinyOS Simulator. The next chapter goes into the

detailed specification of this implementation. Some features in the design have not been

implemented since those turn out to be mostly optimizations to the basic design.



                                      CHAPTER 5

                                 IMPLEMENTATION


This chapter is a detailed elaboration of the implementation of the Interval Routing

enabled Publish/Subscribe communications protocol discussed so far. Section 5.1 contains

a brief review of the TinyOS design. Section 5.2 contains the programming model of a

typical application on TinyOS. To implement our protocol, an application was developed

in the TinyOS environment. Section 5.3 outlines the modular layout of this application.

Lastly, section 5.4 gives a detailed API specification of the various components of

the application.


5.1    The Execution Environment - TinyOS Design

TinyOS, an acronym for Tiny Operating System, has been the first ever effort to build an

operating system for studying the behavior of the sensor networks. TinyOS is an

execution model supporting highly concurrent, non-blocking computations that are vital

in the real-time sensor network environment. TinyOS has been designed keeping in mind

the constraints posed by the sensor nodes. The first prototype of TinyOS was just 132

bytes in size! Subsequently, the size has increased – but in coherence with the improving

technological support for sensor nodes.

      TinyOS is primarily event driven in nature – exactly the kind of behavior the sensor

nodes exhibit. It supports two kinds of computational entities: tasks and event handlers.

Tasks are the computational entities performing the major work. They are unit

computations that are executed to completion when spawned. Due to this run-to-

completion nature of tasks, they cannot block on a resource – doing so would stall the entire

system. An event is a generalization of an interrupt handler, a call that is made from a

lower-level component to a higher-level component in response to some actual event.

Event handlers are short computational bursts. They are directly or indirectly invoked by

hardware events like external interrupts, timer events, etc. In addition to doing some

minimal computations, event handlers can also spawn new tasks.

      The execution environment in TinyOS is multi-threaded - or multi-tasked as we may

rightly call it. The run-to-completion nature of tasks makes it possible to use a single stack

for all the tasks. Tasks cannot preempt each other, nor can they preempt event

handlers. But the event handlers can preempt tasks. Event handlers are conceptually

thought to be very short computational bursts. The life of tasks is relatively longer. The

tasks are scheduled in a FIFO manner. The TinyOS scheduler is power aware. It puts the

processor to sleep when there are no tasks in the task-queue, but leaves the I/O

peripherals operative. Thus, the processor remains in the sleep state till an I/O interrupt is

generated.
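
      The task model can be illustrated with a toy Python scheduler. It only models the FIFO, run-to-completion task queue and the sleep state; preemption of tasks by event handlers and real interrupts are deliberately left out, and the class is a hypothetical illustration, not the TinyOS scheduler.

```python
from collections import deque
from typing import Callable, Deque

Task = Callable[[], None]


class Scheduler:
    """Toy model of a FIFO, run-to-completion task queue."""

    def __init__(self) -> None:
        self.queue: Deque[Task] = deque()
        self.asleep = False

    def post(self, task: Task) -> None:
        """Called by event handlers (or tasks) to defer work."""
        self.queue.append(task)
        self.asleep = False

    def run(self) -> None:
        # Each task runs to completion before the next one starts.
        while self.queue:
            self.queue.popleft()()
        self.asleep = True  # nothing left: sleep until the next interrupt
```

A task posted while another task is running is appended to the end of the queue, so it runs only after all the tasks that were already waiting.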

      More information about the TinyOS design can be obtained from [19, 9].


5.2    The Programming Model

The TinyOS system, libraries, and applications are written in NesC [16, 40], a new

language for programming structured component-based applications. The NesC language

has been designed for developing applications for real-time event-driven sensor network

type of environments.

      A complete NesC application consists of a group of components connected to each

other, with the TinyOS scheduler at its heart. Components are the building blocks of the

NesC applications. There are two types of components – modules and configurations. A

module consists of application code in a C-like syntax. A configuration is a composite

component. It is basically a wiring of other defined components. The configuration

components act as the component integrators required for the application as a whole.

Every NesC application has a single top-level configuration that specifies the set of

components in the application and how they invoke one another.

   Components are linked to each other using interfaces. An interface provides an

abstract definition of the interaction between components. Interfaces are defined

independent of any of the components. For any interface to be used in any application,

there are two components related to it – the interface user and the interface provider. An

interface is a collection of function declarations categorized in two different types –

commands and events. The provider component of an interface must implement its

commands; and the user component must implement its events. This implies that the

implementation of an interface is bi-directional, i.e. the user and provider of an interface

both implement parts of it. An interface can be provided by multiple components and a

component may provide multiple interfaces; same is the case with user components.

   All the modularization semantics are a compile time issue. The real executable code

lies in the modules. Each module has its own set of functions. The module functions are

of four types – command handlers, event handlers, internal functions and tasks.

Command handlers are the implementations of the commands declared in the interfaces

provided by the module. Event handlers are the implementations of the events declared in

the interfaces used by the module. Internal functions are private to the module. Tasks are

similar to the regular functions; the only difference is in the method of making a call to a

task. Instead of explicitly calling a task, the task is posted in the TinyOS scheduler queue.

Additionally, each module can have its own private data structures - only the module’s

functions can access these data structures.

      A brief tutorial on application development in TinyOS using the NesC programming

language is available at [40].


5.3    The Implementation Architecture

As has been mentioned in the previous section, all the applications on TinyOS have

modular semantics. An application can be visualized as a hierarchy of components linked

together, starting with the MAIN component as the root, by the configuration components

using interfaces.

      Figure 4.1 shows the layered view of the conceptualized design of the Interval

Routing enabled Publish/Subscribe communications protocol. As a software infrastructure,

TinyOS is a simple scheduler equipped with some preemption rules. It does not

provide any kind of network discovery, transmission control or routing facilities.

The lightweight networking stack component provides rudimentary transmission

and reception of messages and CRC checks on the messages. Every layer shown in

Figure 4.1 was implemented in our application.

      The application implemented was divided into 4 different modules – the application

layer module, the Publish/Subscribe and Brokering System module, the Interval Routing

Structure module and the Routing module. A detailed architecture diagram is provided in

Figure 5.1.

Figure 5.1 – The architecture of the implementation of the
Interval Routing enabled Publish/Subscribe communications
protocol. The figure contains configurations, modules
and the messages exchanged between the modules.

               Figure 5.2 – A block diagram of the App component.

5.3.1   The Application Layer

The Application layer is the smallest component in our implementation. The application

layer makes arbitrary publications and subscriptions. The TinyOS simulator environment

does not provide actual simulation of sensing devices like photo sensors, heat

sensors, etc. Hence the application layer generates random numbers using the system

component Random, and publishes them. The event of making a publication is

probabilistic. The application generates a publication with a probability of 0.1 every 10

seconds. A subscription is a pair of random numbers that represents a range of

consecutive integers. Subscriptions are made with a probability of 0.05 every 10 seconds.

A publication matches a particular subscription only if it lies within that subscription’s

range. Subscriptions made can also be removed using the unsubscribe command provided

by the publish/subscribe layer.
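
   The behavior of the application layer on each timer tick can be sketched in Python as follows. The two probabilities are taken from the text; the value range 0 to 255, the inclusive range bounds and the function names are assumptions of this sketch, and the actual component is a NesC module driven by the Random and Timer components.

```python
import random
from typing import List, Optional, Tuple

Subscription = Tuple[int, int]  # range (low, high); bounds assumed inclusive

PUBLISH_PROB = 0.1      # per 10-second tick, from the text
SUBSCRIBE_PROB = 0.05   # per 10-second tick, from the text
MAX_VALUE = 255         # assumed upper bound of published values


def matches(value: int, sub: Subscription) -> bool:
    low, high = sub
    return low <= value <= high


def make_subscription(rng: random.Random) -> Subscription:
    a, b = rng.randint(0, MAX_VALUE), rng.randint(0, MAX_VALUE)
    return (min(a, b), max(a, b))


def on_tick(rng: random.Random, subs: List[Subscription]) -> Optional[int]:
    """One timer tick: maybe subscribe, maybe publish.

    Returns the published value, or None if nothing was published.
    """
    if rng.random() < SUBSCRIBE_PROB:
        subs.append(make_subscription(rng))
    if rng.random() < PUBLISH_PROB:
        return rng.randint(0, MAX_VALUE)
    return None
```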

   The application layer is wired to the next lower Publish/Subscribe layer only. The

interface between the application layer and the publish/subscribe layer is solely for

publications and subscriptions. The application layer can make both publications and

subscriptions, but receives only matching publications made by other nodes in the

system.


5.3.2   The Publish/Subscribe and Brokering System Layer

In our implementation we have integrated the Publish/Subscribe and the Brokering

System layer into one single module. Internally the layer implementation can be logically

divided into two sub-layers – the publish/subscribe sub-layer and the brokering system

sub-layer.

   The brokering system sub-layer is responsible for maintaining the broker

infrastructure. Since in the design specification the number of brokers in the system is

supposed to be a fraction of the network size, in our implementation each node chooses to

be a broker with a probability of 0.1. This implies that at any moment in time, after some

initial bootstrapping for building the interval routing structures, approximately 10% of

the nodes are brokers. Each broker spawns its own advertisement tree overlayed on one

of the interval routing structures created in the lower layer. Once a node chooses to be a

broker it chooses one of the interval routing structures as the infrastructure for its

advertisement tree, and multicasts a broker advertisement over the chosen favorite

interval routing structure. Since the broker advertisement trees are ideally spanning trees,

each node in the system receives an advertisement from every broker in the system. Since

              Figure 5.3 – A block diagram for the PubSub component.

it can be fairly assumed that every interval routing structure will be used as an

advertisement tree by at least a few brokers, each node in the system receives broker

advertisements over all these interval routing structures. Based on the proximity, each

node then selects its most favored broker for every interval routing structure. Of these

most favored brokers, the node chooses a single broker as its favorite broker.

Subsequently, all the publications made by that node are first relayed to this favorite

broker.

   Every broker (and its advertisement tree) has a fixed lifetime. After a node has lived

long enough as a broker, it volunteers to step down from its broker status. Correspondingly,

it multicasts a broker remove message over its advertisement tree that takes care of

erasing that advertisement tree itself. It should also be noted that the broker’s

advertisement tree is erased even when its favorite interval routing structure is purged

out; the broker node also steps down from its broker status in such a situation. When pre-

existing broker nodes are stepping down from their broker status, non-broker nodes may

probabilistically volunteer to become brokers. There is an implicit assumption of time

synchrony in this design.
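
   The broker life cycle can be sketched as a small state machine driven by the timer. The probability 0.1 and the two message names come from this chapter; the lifetime of 5 ticks, the class name and the convention of returning the message name as a string are assumptions made for illustration.

```python
import random

BROKER_PROB = 0.1     # probability of volunteering, from the text
BROKER_LIFETIME = 5   # assumed lifetime, in timer ticks


class BrokerRole:
    def __init__(self, rng: random.Random) -> None:
        self.rng = rng
        self.is_broker = False
        self.age = 0

    def on_timer(self) -> str:
        """Return the name of the message to multicast on this tick, or ''."""
        if self.is_broker:
            self.age += 1
            if self.age >= BROKER_LIFETIME:
                self.is_broker = False
                return "BROKER_REMOVE_MSG"   # purge the advertisement tree
            return ""
        if self.rng.random() < BROKER_PROB:
            self.is_broker, self.age = True, 0
            return "BROKER_ADVERT_MSG"       # spawn the advertisement tree
        return ""
```

The sketch omits the second reason for stepping down mentioned above, namely the purging of the broker's host interval routing structure.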

   The Publish/Subscribe layer prototype implementation has been made only for the

type 2 scenario described briefly in section 4.9.2. The assumption is that there are an

abundant number of subscribers and subscriptions in the system. The subscription list of

each node is locally maintained at the publish/subscribe layer of that node’s routing stack.

Consequently, the subscriptions are not transmitted over the network. The publications

are transmitted either as unicast or as multicast messages. A publisher node (which is not

a broker itself) sends its publication as a unicast message to its favorite broker over that

broker’s advertisement tree. The routing of the message is done by the underlying

interval routing structure. On reception of a unicast publication, the favorite broker

forwards it as a multicast publication to all the nodes in its advertisement tree. Even when

a broker needs to make a publication, it does so as a multicast publish message over its

advertisement tree.

   Each node, on receiving a multicast publication, makes a lookup in its subscription

list for a match. If a match is found, the publication is handed over to the application

layer. The multicast publications are also forwarded over the advertisement tree of the

broker that initiated the multicast. It is important to understand that all the messages are

               Figure 5.4 – The block diagram for the IRS component.

always broadcast at the radio communications level. The interval routing structure, and

the interval routing node numbers of the source and the destination nodes are used to

filter out messages to give a unicast and multicast functionality.
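
   One way to picture this filtering is the textbook interval routing rule on a tree, sketched below. It is an illustration under stated assumptions and not necessarily the exact scheme of our implementation: each node is assumed to know the interval of node numbers covered by each child's subtree, and every frame is assumed to carry an intended_hop field naming the neighbor that should process it. All names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[int, int]  # inclusive (low, high) range of node numbers


class IrsNode:
    def __init__(self, number: int, parent: Optional[int],
                 children: Dict[int, Interval]) -> None:
        self.number = number        # this node's interval routing number
        self.parent = parent        # parent's number, None at the root
        self.children = children    # child number -> child's subtree interval

    def next_hop(self, dest: int) -> Optional[int]:
        """Tree neighbor to forward to; None means deliver here or drop."""
        if dest == self.number:
            return None
        for child, (low, high) in self.children.items():
            if low <= dest <= high:
                return child
        return self.parent  # destination lies outside this subtree

    def accepts(self, intended_hop: int) -> bool:
        """Filter for a frame overheard on the broadcast radio channel."""
        return intended_hop == self.number
```

Every neighbor in radio range hears the frame, but only the node named as the intended hop processes it; the others discard it, which yields unicast behavior over a broadcast medium.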


5.3.3   The Interval Routing Structure Layer

The Interval Routing Structure layer accounts for by far the largest share of the

application code. It interacts with the Publish/Subscribe Brokering System layer above

and the Routing layer below.

   The initial bootstrapping function in the system, mentioned in the previous section, is

for the creation of multiple interval routing structures in the network. In our

implementation, we have supported the simultaneous existence of 8 interval routing

structures. Each interval routing structure occupies a slot in this set of 8 structures. One

interval routing structure is created for each slot. The slot of an interval routing structure

is that structure’s id. Each interval routing structure has its own initiator node. Say a

distributed interval routing structure of id i needs to be created. Several nodes in the

network probabilistically decide to spawn interval routing structures and become

initiators. These interval routing structures steadily grow over the network. But only one

such interval routing structure must prevail eventually for the interval routing structure

with id i. Consequently, the growing interval routing structures overwhelm each other on

the basis of a criterion so that one interval routing structure, with id i, dominates in

the end. This criterion is the node number of the initiator of every growing interval

routing structure. This node number is the physical node number of the initiator, and not

its interval routing node number. An underlying assumption is that the physical node

number is always unique for every node over the network. The interval routing structure,

of id i, with the least node number of its initiator prevails. This election for interval

routing structures happens for all the 8 slots.

   The spawning of an interval routing structure is primarily done using 3 different kinds

of messages – the IRS_ADVERT_MSG, the IRS_ADVERT_ALREADY_EXISTS_MSG and

the IRS_ADVERT_SUBTREE_DONE_MSG as has been mentioned in the design

specification in chapter 4. An important factor that was not considered for spawning

interval routing structures is the broadcast-like nature of radio communication, which

leads to collision of messages. Due to this, advertisement messages may be lost. We have

implemented an acknowledgement mechanism for the advertisement messages. A node

sending an advertisement expects to receive an acknowledgement for the advertisement

sent. If it doesn’t receive one, it presumes that the destination neighbor did not receive

the advertisement and resends it after a timeout. If the link between the source and

destination has not died out, the destination is guaranteed to receive the advertisement

in due course; it then sends back an acknowledgement. In case the

acknowledgement is garbled due to collisions or unstable environment, the advertisement

sender again times out and sends another advertisement. Recovery over transient failures

is guaranteed with this procedure. In case of permanent failures, the source node

presumes the link to the destination to be dead and chooses another neighbor to send an

advertisement to. The acknowledgement messages are the IRS_ADVERT_ACK_MSG,

the IRS_ADVERT_ALREADY_EXISTS_ACK_MSG and the

IRS_ADVERT_SUBTREE_DONE_ACK_MSG.
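
   The retransmission logic can be sketched as follows. The retry limit of 3 and the try_send callback, which stands for one transmission followed by waiting for the acknowledgement until the timeout, are assumptions of this sketch; the text above does not state after how many timeouts a link is presumed dead.

```python
from typing import Callable, List, Optional

MAX_RETRIES = 3  # assumed limit before a link is presumed dead


def send_advert(neighbors: List[int],
                try_send: Callable[[int], bool]) -> Optional[int]:
    """Offer the advertisement to each neighbor in turn.

    try_send(n) models one transmission to neighbor n plus the wait for
    its acknowledgement; it returns True if the ack arrived in time.
    Returns the neighbor that acknowledged, or None if all links look dead.
    """
    for n in neighbors:
        for _attempt in range(1 + MAX_RETRIES):
            if try_send(n):
                return n
        # No ack after all retries: presume the link dead, try the next one.
    return None
```

A lost acknowledgement is indistinguishable from a lost advertisement at the sender, so the receiver may see the same advertisement more than once and has to tolerate duplicates.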

   After the interval routing structures are created, the broker advertisement trees are

overlayed on them. The only interface between the Interval Routing Structure layer and

the Publish/Subscribe Brokering System layer above is through the unicast and multicast

messages, and the interval routing structure ids. Broker advertisements, publications and

interval routing structure ids are the only items exchanged between the two layers. There

is a heavy exchange of messages between the Interval Routing Structure layer and the

Routing layer though. All the interval routing structure messages, the neighbor detection

messages and the unicast and multicast data messages (here all the messages originating

               Figure 5.5 – A block diagram for the Route component.

from and destined to the layers above the interval routing structure are referred to as data

messages) are exchanged between the two layers.


5.3.4   The Routing Layer

The Routing layer does all the message exchanges. Other than exchange of data and

interval routing structure messages, the routing layer is also responsible for neighbor

detection. Since the TinyOS simulation yields all bi-directional links we did not go into

the implementation of the bi-directionally functional links as described in 4.7.1. The

neighbor detection policy is very simple. Each node periodically broadcasts a DECLARE

message to its immediate neighbors. The DECLARE message contains the node’s

physical node number. The neighbors, on receiving the broadcast, add the source node of

the DECLARE message to their respective neighbor sets. Each neighbor has a lifetime

associated with it. If the source node already exists in the neighbor set of a recipient node, its

life is reinitialized to 0. When the life of a neighbor crosses a threshold it is presumed that

at least the link to that neighbor has failed and the neighbor is removed from the neighbor

set. The interval routing structure layer above heavily uses the neighbor set. The routing

layer directly communicates with the system network stack to send and receive messages.
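
The aging of the neighbor set can be sketched in Python as follows. The threshold of 3 ticks and the class and method names are assumptions made for illustration; the actual policy lives in the NesC Route module.

```python
from typing import Dict, List

LIFE_THRESHOLD = 3  # assumed number of ticks a neighbor may stay silent


class NeighborTable:
    def __init__(self) -> None:
        self.life: Dict[int, int] = {}  # neighbor node number -> age in ticks

    def on_declare(self, source: int) -> None:
        """Add the sender, or reset its life to 0 if already known."""
        self.life[source] = 0

    def tick(self) -> List[int]:
        """Age every neighbor; drop and return those past the threshold."""
        expired = []
        for node in list(self.life):
            self.life[node] += 1
            if self.life[node] > LIFE_THRESHOLD:
                expired.append(node)
                del self.life[node]
        return expired
```

The list returned by tick is what the interval routing structure layer would need in order to react to failed links.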


5.4    The API Specification

Our implementation has 5 components and 3 interfaces.


Components:

•     App component – The application layer implementation.

•     PubSub component – The publish/subscribe brokering systems layer implementation.

•     IRS component – The interval routing structure layer implementation.

•     Route component – The routing layer implementation.

•     Main component – This is a configuration. It wires all the above-listed components.

      The first 4 components are all modules.


Interfaces:

•     PubSubI interface – The interface between components App and PubSub.

•     IRSI interface – The interface between components PubSub and IRS.

•     RouteI interface – The interface between components IRS and Route.


5.4.1   The App component

The App component does not need to maintain any internal data structures. Along with

the initialization functions, the App component implements two more functions:

   1. Timer.fired – This is an event handler for the system clock interrupts. This event

        handler is used by the App component to make publications and subscriptions.

   2. PubSubI.RCV_PUBLICATION – This event is signaled by the publish/subscribe

        layer below on reception of a matching publication.


5.4.2   The PubSub component

The PubSub component is a module that maintains the subscription list and the most

favored broker-list (mfb-list) data structures. Along with the initialization functions, the

PubSub component implements the following functions:

   1. Timer.fired – This is another event handler for the system clock interrupts. It

        should be noted that the NesC and the TinyOS environments allow multiple event

        handlers to be triggered on the same event (the clock interrupt here). This event

        handler is used for probabilistically changing the node to broker/non-broker

        status. A change to broker status also results in the spawning of the broker’s

        advertisement tree. This spawning is done by multicasting the

        BROKER_ADVERT_MSG message. When a change to non-broker status happens the broker’s

        advertisement tree is purged by multicasting the

        BROKER_REMOVE_MSG message.

   2. PubSubI.SUBSCRIBE – This is a command implementation to add a subscription

        to the existing subscription list. This command is called by the App component.




3. PubSubI.UNSUBSCRIBE – This is a command implementation to remove an

   existing subscription from the subscription list. This command is called by the

   App component.

4. PubSubI.PUBLISH – This is a command implementation of making a unicast or

   multicast publication. The command is called by the App component when it

   wants to make a publication. Depending on the broker or non-broker status of the

   node, a multicast or unicast publication is made respectively.

5. IRSI.IRS_ADD_MSG – This is an event handler for the event of addition of an

   interval routing structure to the list of interval routing structures. It should be

   noted that there is always one most favored broker for an interval routing

   structure.

6. IRSI.IRS_REMOVE_MSG – This is an event handler for the event of removal of

   an interval routing structure from the interval routing structure list. If the removed

   structure has a designated most favored broker, the broker is also removed from

   the broker-list consequently.

7. IRSI.IRS_RCV_UNICAST_MSG – This is an event handler for the event of

   reception of a unicast publish message sent by the IRS component below. In our

   implementation, the publisher always directs a unicast publication to its favorite

   broker. When the corresponding favorite broker receives the publication it simply

   multicasts the publication over its advertisement tree.

8. IRSI.IRS_RCV_MULTICAST_MSG – This is an event handler for the event of

   reception of a multicast message sent by the IRS component below. There are 3

   kinds of multicast messages: BROKER_ADVERT_MSG, BROKER_REMOVE_MSG

   and MULTICAST_PUBLISH_MSG. When a BROKER_ADVERT_MSG message is

   received, the broker is designated as the most favored broker (on the interval

   routing structure over which the advertisement was received) if it is closer to the

   current node than the existing most favored broker.

        When a BROKER_REMOVE_MSG message is received the broker is removed

        from the most favored broker list if it exists in that list. In such a case, the node

        will not have a most favored broker for the corresponding interval routing

        structure until an advertisement by another broker is received over the same

        interval routing structure. When a MULTICAST_PUBLISH_MSG message is

        received a lookup in the subscription-list is done and a matching publication is

        forwarded to the App layer above. Also, the message is multicast down the

        originating broker’s advertisement tree.
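
The bookkeeping for the most favored broker list on reception of BROKER_ADVERT_MSG and BROKER_REMOVE_MSG can be sketched in C as below. This is an illustration under stated assumptions, not the thesis code: the MfbList type, the NO_BROKER sentinel and the function names are hypothetical, and the sketch assumes that a node with no current most favored broker on an interval routing structure adopts the first broker that advertises on it.

```c
#include <assert.h>
#include <stddef.h>

#define IRS_COUNT 8        /* number of interval routing structures */
#define NO_BROKER (-1)     /* assumed sentinel: no most favored broker */

typedef struct {
    int mfb_id[IRS_COUNT];    /* most favored broker per IRS, or NO_BROKER */
    int mfb_hops[IRS_COUNT];  /* hop count to that broker */
} MfbList;

void mfb_init(MfbList *l)
{
    assert(l != NULL);
    for (int i = 0; i < IRS_COUNT; i++) {
        l->mfb_id[i] = NO_BROKER;
        l->mfb_hops[i] = 0;
    }
}

/* BROKER_ADVERT_MSG: adopt the advertising broker when the IRS has no
 * most favored broker yet, or when the new broker is strictly closer.
 * Returns 1 if the broker was adopted, 0 otherwise. */
int mfb_on_advert(MfbList *l, int irs, int broker, int hops)
{
    assert(l != NULL && irs >= 0 && irs < IRS_COUNT);
    if (l->mfb_id[irs] == NO_BROKER || hops < l->mfb_hops[irs]) {
        l->mfb_id[irs] = broker;
        l->mfb_hops[irs] = hops;
        return 1;
    }
    return 0;
}

/* BROKER_REMOVE_MSG: clear the entry only if the removed broker is the
 * current most favored broker on that IRS. Returns 1 if it was cleared. */
int mfb_on_remove(MfbList *l, int irs, int broker)
{
    assert(l != NULL && irs >= 0 && irs < IRS_COUNT);
    if (l->mfb_id[irs] != broker)
        return 0;
    l->mfb_id[irs] = NO_BROKER;
    l->mfb_hops[irs] = 0;
    return 1;
}
```

After a removal the entry stays empty until another advertisement arrives over the same interval routing structure, which matches the behavior described in item 8.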


5.4.3   The IRS component

The IRS component is a module that maintains the table for interval routing structures

existing in the system at any time. It also maintains the neighbor-set data structure. The

interval routing structure is spawned using this neighbor-set. Along with the initialization

functions, the IRS component implements the following functions:

   1. Timer.fired – This event handler is used for nodes to volunteer themselves as

        initiators and spawn an interval routing structure. It is also used to measure the

        life of an existing interval routing structure at the initiator node and purge it out if

        the structure has lived longer than its expected lifetime.

   2. IRSI.IRS_SEND_UNICAST_MSG – This is a command implementation to send

        a unicast message over the specified interval routing structure to a single

        destination node. This command is called by the PubSub component above. In our

   implementation, the unicast message is always a publication.

3. IRSI.IRS_SEND_MULTICAST_MSG – This is a command implementation to

   send a multicast message over the specified interval routing structure. The

   multicast message is subsequently relayed over the entire network. This command

   is called by the PubSub component above. A multicast message is a broker

   advertisement, a broker remove or a publication message.

4. IRS_SEND_ADVERT_MSG – This is a command implementation to send

   an advertisement to a specific neighbor. An advertisement is sent in two

   scenarios:

   i. When the current node decides to become an initiator and starts spawning its

      interval routing structure, it chooses a neighbor to forward the

      IRS_ADVERT_MSG to.

  ii. When the current node receives an IRS_ADVERT_MSG, it forwards the

      advertisement to a neighboring node.

5. IRS_SEND_ADVERT_ACK_MSG – This command implementation is responsible

   for acknowledging a received IRS_ADVERT_MSG to the sender of that

   message.

6. IRS_SEND_ADVERT_ALREADY_EXISTS_MSG – This is a command to send

   an advertisement-already-exists message to the sender of an advertisement

   message. If a node receiving an IRS_ADVERT_MSG finds that it is already a

   part of the corresponding interval routing structure, it replies with an

   advertisement-already-exists message.




7. IRS_SEND_ADVERT_ALREADY_EXISTS_ACK_MSG – This command is

   responsible for acknowledging a received

   IRS_ADVERT_ALREADY_EXISTS_MSG message to the sender.

8. IRS_SEND_ADVERT_SUBTREE_DONE_MSG – This command implementation

   is responsible for sending a sub-tree done message to the current node’s

   parent in the interval routing structure being spawned. When the current node

   notices that it has no more neighbors that have not been included in the interval

   routing structure being constructed, it sends the

   IRS_ADVERT_SUBTREE_DONE_MSG to its parent.

9. IRS_SEND_ADVERT_SUBTREE_DONE_ACK_MSG – This command acknowledges

   a received IRS_ADVERT_SUBTREE_DONE_MSG message to its

   sender.

10. RouteI.IRS_RCV_ADVERT_MSG – This is an event handler responsible for

   handling the event of reception of an IRS_ADVERT_MSG. This function verifies

   whether the message was destined to the current node. Next, the eligibility of the

   interval routing structure (that is being spawned) to be added as a table is verified.

   If the interval routing structure is acceptable there are 3 scenarios that come up:

   i. The current node is already a part of the interval routing structure over which

      it received this advertisement. In such a case, the current node updates the

      link-entry for the sender in the corresponding interval structure table and

      replies with an IRS_ADVERT_ALREADY_EXISTS_MSG.

   ii. The current node is not a part of the interval routing structure, but it does not

      have any other neighbors to forward the IRS_ADVERT_MSG message to.




      First, the interval routing structure table is created; next, the link-entry of the

      sender is updated in the newly created table; and finally an

      IRS_ADVERT_SUBTREE_DONE_MSG is sent in reply to the sender.

  iii. The current node is not a part of the interval routing structure, and it does have

      neighbors potentially not yet visited by the interval routing structure. After creating

      an interval routing structure table and a link-entry for the sender, the current

      node selects one neighbor and forwards the IRS_ADVERT_MSG to that

      neighbor.

11. RouteI.IRS_RCV_ADVERT_ACK_MSG – This is an event handler for the event

   of reception of an IRS_ADVERT_ACK_MSG message. The acknowledgement is

   a response to the IRS_ADVERT_MSG sent by the current node. Acknowledgements

   have been included to provide robustness to the interval routing structure

   construction algorithm. After sending an advertisement the current node starts a

   timeout for an advertisement acknowledgement. If the acknowledgement is

   received before the timeout the current node leaves the wait state. Of course,

   this is all asynchronous in nature, so the current node does not literally enter a wait

   state. It increments the timeout value every second, and after a threshold is

   crossed, it retransmits the advertisement.

12. RouteI.IRS_RCV_ADVERT_ALREADY_EXISTS_MSG – This is an event

   handler responsible for handling the event of reception of the

   IRS_ADVERT_ALREADY_EXISTS_MSG. If the current node is the destination of the

   advertisement, the message is accepted, an acknowledgement is sent and one of the

   following operations is carried out:




   i. If the current node finds a neighbor that has potentially not been included in

      the currently spawning interval routing structure it forwards an

      IRS_ADVERT_MSG to that neighbor.

  ii. If the current node does not find any neighbor that has not been included in

      the currently spawning interval routing structure it sends the

      IRS_ADVERT_SUBTREE_DONE_MSG to its parent.

13. RouteI.IRS_RCV_ADVERT_ALREADY_EXISTS_ACK_MSG – This is an

  event handler for handling the event of reception of the

  IRS_ADVERT_ALREADY_EXISTS_ACK_MSG message. The acknowledgement is a response to the

  IRS_ADVERT_ALREADY_EXISTS_MSG sent by the current node.

14. RouteI.IRS_RCV_ADVERT_SUBTREE_DONE_MSG – This is an event

  handler responsible for handling the event of reception of the

  IRS_ADVERT_SUBTREE_DONE_MSG. If the current node is the destination of the advertisement,

  the message is accepted, an acknowledgement is sent and one of the following

  operations is carried out:

   i. If the current node finds some other potentially unvisited neighbor, it forwards

      an IRS_ADVERT_MSG to that neighbor.

  ii. If the current node does not have an unvisited neighbor then it forwards an

      IRS_ADVERT_SUBTREE_DONE_MSG to its parent in the interval routing

      structure.

15. RouteI.IRS_RCV_ADVERT_SUBTREE_DONE_ACK_MSG – This is an event

  handler for handling the event of reception of the

  IRS_ADVERT_SUBTREE_DONE_ACK_MSG message. The acknowledgement is a response to the

        IRS_ADVERT_SUBTREE_DONE_MSG sent by the current node.

   16. RouteI.IRS_RCV_DECLARE_MSG – This is an event handler for handling the

        event of reception of the IRS_DECLARE_MSG message. IRS_DECLARE_MSG

        is the message used for neighbor detection. This event handler adds the sender

        node to the neighbor-set of the current node.
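
The per-node decisions made during interval routing structure construction (items 10, 12 and 14 above) can be summarized in a short C sketch. It is an illustration only: the IrsNodeState type, the IrsAction values and the function names are hypothetical, neighbors are identified by an index into a local array rather than by physical node number, and acknowledgements and retransmissions are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NEIGHBORS 8

typedef enum {
    REPLY_ALREADY_EXISTS,  /* send IRS_ADVERT_ALREADY_EXISTS_MSG to the sender */
    SEND_SUBTREE_DONE,     /* send IRS_ADVERT_SUBTREE_DONE_MSG to the parent   */
    FORWARD_ADVERT         /* forward IRS_ADVERT_MSG to the neighbor in *next  */
} IrsAction;

typedef struct {
    bool member;            /* node already holds a table for this IRS       */
    int  parent;            /* neighbor index of the parent, -1 if none      */
    int  neighbor_count;    /* entries of tried[] in use                     */
    bool tried[NEIGHBORS];  /* neighbor has sent us, or been sent, an advert */
} IrsNodeState;

/* Index of the first neighbor not yet covered by an advert, or -1. */
static int next_untried(const IrsNodeState *s)
{
    for (int i = 0; i < s->neighbor_count; i++)
        if (!s->tried[i])
            return i;
    return -1;
}

/* Pick the next neighbor to advertise to, or report the sub-tree done. */
static IrsAction continue_construction(IrsNodeState *s, int *next)
{
    *next = next_untried(s);
    if (*next < 0)
        return SEND_SUBTREE_DONE;
    s->tried[*next] = true;
    return FORWARD_ADVERT;
}

/* IRS_ADVERT_MSG received from neighbor index `from`.
 * *next is written only when the node was not yet a member. */
IrsAction on_advert(IrsNodeState *s, int from, int *next)
{
    assert(s != NULL && next != NULL);
    assert(from >= 0 && from < s->neighbor_count);
    s->tried[from] = true;            /* record the link entry for the sender */
    if (s->member)
        return REPLY_ALREADY_EXISTS;  /* scenario i */
    s->member = true;                 /* create the table; sender is the parent */
    s->parent = from;
    return continue_construction(s, next);  /* scenarios ii and iii */
}

/* IRS_ADVERT_ALREADY_EXISTS_MSG or IRS_ADVERT_SUBTREE_DONE_MSG received
 * in response to an advert this node forwarded earlier. */
IrsAction on_child_reply(IrsNodeState *s, int *next)
{
    assert(s != NULL && next != NULL);
    return continue_construction(s, next);
}
```

Taken together, these rules amount to a depth-first traversal of the network: each node advertises to one neighbor at a time and reports SEND_SUBTREE_DONE upward only after every neighbor has been covered. At the initiator, which has no parent, that outcome would mark the end of construction.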


5.4.4   The Route component

The Route component is a module that maintains the neighbor-set of the current node.

The neighbor-set contains all those nodes from which the current node has received the

DECLARE_MSG. Along with the initialization functions, the Route component

implements the following functions:

   1. Timer.fired – This timer interrupt event handler is used to broadcast the

        DECLARE_MSG message to the immediate neighbors periodically.

   2. The IRS construction messages – The following is the set of commands in the

        Route component to send messages for the construction of an interval routing

        structure:

        i. RouteI.SEND_IRS_ADVERT_MSG – Sending the IRS_ADVERT_MSG

           message.

        ii. RouteI.SEND_IRS_ADVERT_ACK_MSG – Sending the

           IRS_ADVERT_ACK_MSG message.

        iii. RouteI.SEND_IRS_ADVERT_ALREADY_EXISTS_MSG – Sending the

           IRS_ADVERT_ALREADY_EXISTS_MSG message.




  iv. RouteI.SEND_IRS_ADVERT_ALREADY_EXISTS_ACK_MSG – Sending

      the IRS_ADVERT_ALREADY_EXISTS_ACK_MSG message.

  v. RouteI.SEND_IRS_ADVERT_SUBTREE_DONE_MSG – Sending the

      IRS_ADVERT_SUBTREE_DONE_MSG message.

  vi. RouteI.SEND_IRS_ADVERT_SUBTREE_DONE_ACK_MSG – Sending the

      IRS_ADVERT_SUBTREE_DONE_ACK_MSG message.

  The following is the set of event handlers in the Route component for the events

  of reception of messages for the construction of an interval routing structure.

   i. ReceiveMessage.RcvIRSAdvertMsg – To receive the IRS_ADVERT_MSG

      message.

  ii. ReceiveMessage.RcvIRSAdvertAckMsg – To receive the

      IRS_ADVERT_ACK_MSG message.

  iii. ReceiveMessage.RcvIRSAdvertAlreadyExistsMsg – To receive the

      IRS_ADVERT_ALREADY_EXISTS_MSG message.

  iv. ReceiveMessage.RcvIRSAdvertAlreadyExistsAckMsg – To receive the

      IRS_ADVERT_ALREADY_EXISTS_ACK_MSG message.

  v. ReceiveMessage.RcvIRSAdvertSubtreeDoneMsg – To receive the

      IRS_ADVERT_SUBTREE_DONE_MSG message.

  vi. ReceiveMessage.RcvIRSAdvertSubtreeDoneAckMsg – To receive the

      IRS_ADVERT_SUBTREE_DONE_ACK_MSG message.

3. RouteI.SEND_IRS_UNICAST_MSG – This command is used to send the

  IRS_UNICAST_MSG message to a neighboring node.




    4. RouteI.SEND_IRS_MULTICAST_MSG – This command is used to send the

        IRS_MULTICAST_MSG message to a neighboring node.

    5. ReceiveMessage.RcvDeclareMsg – This is the event handler invoked on reception

        of the DECLARE_MSG message.

    6. ReceiveMessage.RcvIRSUnicastMsg – This is the event handler invoked on

        reception of the IRS_UNICAST_MSG message.

    7. ReceiveMessage.RcvIRSMulticastMsg – This is the event handler invoked on

        reception of the IRS_MULTICAST_MSG message.


5.4.5   The Main component

The Main component is a configuration component, linking all the modules together in

our application. The listing of the source code of the Main component is given in

Figure 5.6.


5.4.6   The PubSubI interface

The PubSubI interface is provided by the PubSub module and used by the App module.

The source listing of this interface is given in Figure 5.7.


5.4.7   The IRSI interface

The IRSI interface is provided by the IRS module and used by the PubSub module. The

source listing of this interface is given in Figure 5.8.


5.4.8   The RouteI interface

The RouteI interface is provided by the Route module and used by the IRS module. The

source listing of this interface is given in Figure 5.9.




/*
 * App: The main configuration component of the Publish/Subscribe Application. This
 * component defines the relationships between other components using interfaces.
 *
 * Author: Virendra Marathe
 *
*/
configuration Main {
}

implementation {

      #include "./Constants.h"

      components Main, AppM, PubSubM, IRSM, RouteM, TimerC, RandomLFSR,
      GenericComm as Comm;

      Main.StdControl -> AppM;
      AppM.Timer -> TimerC.Timer[unique("Timer")];
      AppM.PubSubStdControl -> PubSubM.StdControl;
      AppM.PubSubI -> PubSubM;
      AppM.Random -> RandomLFSR;

      PubSubM.Timer -> TimerC.Timer[unique("Timer")];
      PubSubM.IRSStdControl -> IRSM.StdControl;
      PubSubM.IRSI -> IRSM;
      PubSubM.Random -> RandomLFSR;

      IRSM.RouteStdControl -> RouteM.StdControl;
      IRSM.Timer -> TimerC.Timer[unique("Timer")];
      IRSM.RouteI -> RouteM;
      IRSM.Random -> RandomLFSR;

      RouteM.CommControl -> Comm;
      RouteM.Timer -> TimerC.Timer[unique("Timer")];
      RouteM.RcvDeclareMsg -> Comm.ReceiveMsg[DECLARE_MSG_ID];
      RouteM.RcvIRSAdvertMsg -> Comm.ReceiveMsg[IRS_ADVERT_MSG_ID];
      RouteM.RcvIRSAdvertAckMsg -> Comm.ReceiveMsg[IRS_ADVERT_ACK_MSG_ID];
      RouteM.RcvIRSAdvertExistsMsg ->
            Comm.ReceiveMsg[IRS_ADVERT_EXISTS_MSG_ID];
      RouteM.RcvIRSAdvertExistsAckMsg ->
            Comm.ReceiveMsg[IRS_ADVERT_EXISTS_ACK_MSG_ID];
      RouteM.RcvIRSAdvertSubtreeDoneMsg ->
            Comm.ReceiveMsg[IRS_ADVERT_SUBTREE_DONE_MSG_ID];
      RouteM.RcvIRSAdvertSubtreeDoneAckMsg ->
            Comm.ReceiveMsg[IRS_ADVERT_SUBTREE_DONE_ACK_MSG_ID];
      RouteM.RcvIRSPurgeMsg -> Comm.ReceiveMsg[IRS_PURGE_MSG_ID];
      RouteM.RcvIRSUnicastMsg -> Comm.ReceiveMsg[IRS_UNICAST_MSG_ID];
      RouteM.RcvIRSMulticastMsg -> Comm.ReceiveMsg[IRS_MULTICAST_MSG_ID];
      RouteM.SendMsg -> Comm;

}

              Figure 5.6 – Main configuration component of the Interval
              Routing enabled Publish/Subscribe protocol application.
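
The wiring of each receive handler to Comm.ReceiveMsg[...] in Figure 5.6 selects a handler by message identifier. For readers unfamiliar with nesC, the effect is roughly that of a dispatch table indexed by message type, sketched here in plain C. The identifier values, the handler names and the dispatch function are assumptions made for this example; the real identifiers are defined in Constants.h, which is not reproduced in this thesis, and only three of the message types are shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical identifiers; the real values live in Constants.h. */
enum {
    DECLARE_MSG_ID,
    IRS_ADVERT_MSG_ID,
    IRS_ADVERT_ACK_MSG_ID,
    MSG_ID_COUNT
};

typedef void (*ReceiveHandler)(const void *payload);

/* Counters stand in for the real work done by each handler. */
static int declare_seen, advert_seen, ack_seen;

static void rcv_declare(const void *payload)    { (void)payload; declare_seen++; }
static void rcv_irs_advert(const void *payload) { (void)payload; advert_seen++; }
static void rcv_irs_ack(const void *payload)    { (void)payload; ack_seen++; }

/* One entry per message identifier, like one wiring line per handler. */
static const ReceiveHandler handlers[MSG_ID_COUNT] = {
    [DECLARE_MSG_ID]        = rcv_declare,
    [IRS_ADVERT_MSG_ID]     = rcv_irs_advert,
    [IRS_ADVERT_ACK_MSG_ID] = rcv_irs_ack
};

/* Deliver a received message to the handler wired for its identifier.
 * Returns 1 if a handler ran, 0 if the identifier is not wired. */
int dispatch(int msg_id, const void *payload)
{
    if (msg_id < 0 || msg_id >= MSG_ID_COUNT || handlers[msg_id] == NULL)
        return 0;
    handlers[msg_id](payload);
    return 1;
}
```

In nesC the table is resolved at compile time by the wiring, so no run-time lookup structure of this kind needs to be written by hand.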


/*
 * PubSubI: The interface linking the main App component and the Publish/Subscribe Brokering
 * System components.
 *
 * Author: Virendra Marathe
 *
 */
interface PubSubI {

      event result_t RCV_PUBLICATION(void *);

      command result_t SUBSCRIBE(void *);
      command result_t UNSUBSCRIBE(void *);
      command result_t PUBLISH(void *);

}

              Figure 5.7 – The Publish/Subscribe Interface linking the
              App module component to the Publish/Subscribe Brokering
              System module component.




/*
 * IRSI: The interface linking the Publish/Subscribe Brokering System component to the
 * Interval Routing Structure component.
 *
 * Author: Virendra Marathe
 *
 */
interface IRSI {

      event result_t RCV_IRS_ADD(int, int);
      event result_t RCV_IRS_REMOVE(int);
      event result_t RCV_IRS_MULTICAST_MSG(void *);
      event result_t RCV_IRS_UNICAST_MSG(void *, int);

      command result_t SEND_IRS_MULTICAST_MSG(void *, int);
      command result_t SEND_IRS_UNICAST_MSG(void *, int, int);
}

              Figure 5.8 – The Interval Routing Structure Interface
              linking the Publish/Subscribe Brokering System module
              component to the Interval Routing Structure module
              component.



/*
 * RouteI: The interface linking the Interval Routing Structure component to the Routing
 * component.
 *
 * Author: Virendra Marathe
 *
 */
interface RouteI {

       event result_t ADD_NEIGHBOR(int);
       event result_t REMOVE_NEIGHBOR(int);
       event result_t RCV_IRS_ADVERT(void *);
       event result_t RCV_IRS_ADVERT_ACK(void *);
       event result_t RCV_IRS_ADVERT_EXISTS(void *);
       event result_t RCV_IRS_ADVERT_EXISTS_ACK(void *);
       event result_t RCV_IRS_ADVERT_SUBTREE_DONE(void *);
       event result_t RCV_IRS_ADVERT_SUBTREE_DONE_ACK(void *);
       event result_t RCV_IRS_PURGE(void *);
       event result_t RCV_IRS_UNICAST_MSG(void *);
       event result_t RCV_IRS_MULTICAST_MSG(void *);

       command result_t SEND_IRS_ADVERT(void *);
       command result_t SEND_IRS_ADVERT_ACK(void *);
       command result_t SEND_IRS_ADVERT_EXISTS(void *);
       command result_t SEND_IRS_ADVERT_EXISTS_ACK(void *);
       command result_t SEND_IRS_ADVERT_SUBTREE_DONE(void *);
       command result_t SEND_IRS_ADVERT_SUBTREE_DONE_ACK(void *);
       command result_t SEND_IRS_PURGE(void *);
       command result_t SEND_IRS_UNICAST_MSG(void *);
       command result_t SEND_IRS_MULTICAST_MSG(void *);
}

               Figure 5.9 – The Route Interface linking the Interval
               Routing Structure module component to the Route module
               component.




                                     CHAPTER 6

                         RESULTS AND DISCUSSIONS


The primary motivation for almost all research activities undertaken so far in human

history has been to solve some existing problem. The outcome of the research activity is

ideally expected to solve the problem under study. So what eventually matters is the

outcome or result of the research endeavor. Without results, any research task remains

incomplete. The next logical step after designing a solution for a problem is to prove that

the solution does what it is expected to do. In Computer Science research, there have

been two widely accepted methods for proving the correctness of a solution to a given

problem – mathematical reasoning, and implementation results. Mathematical reasoning

establishes correctness through formalization and theorem proving for the proposed

solution. Implementation results involve an implementation of the proposed solution

followed by analysis of the statistical information gathered from executions of the

implementation. Implementation results can also reveal subtle, otherwise

hidden facts about the proposed solution. We have followed the latter method

to produce substantial evidence that our proposed design does produce the desired

behavior.

   The previous two chapters of this thesis examined the details of

the design and implementation of our Interval Routing enabled Publish/Subscribe routing

protocol for communication over the ad hoc sensor network environment. This chapter

discusses some results produced by the simulations carried out with

the implemented design. By no means do these results give a comprehensive picture of


       typedef struct {
         int status;                              // a valid/invalid status flag for the table
         int irs_id;                              // the IRS number
         int root_id;                             // the unique physical id of the root node
         int irs_my_id;                           // the node’s IRS id
         int irs_life;                            // the duration of validity of the IRS
         int irs_node_count;                      // # of neighbors in the IRS table
         int node_id[NEIGHBORS];                  // the unique physical ids of the neighbors
         int irs_link[NEIGHBORS];                 // the link id (created and used by IRS routing)
         bool irs_link_exists[NEIGHBORS];         // status of link (for transient failures)
         // variables used only for IRS construction
         int ad_flag[NEIGHBORS];                  // ack flags for advertisements
         int ad_subtree_flag[NEIGHBORS];          // ack flags for sub-tree done advertisements
         int ad_exists_flag[NEIGHBORS];           // ack flags for already exists advertisements
         int irs_latest_out_id;                   // used for IRS construction
         int latest_out_id;                       // used for IRS construction
       } IRS_TABLE;

       typedef struct {
         int mfb_irs_id;                         // the most favored broker’s IRS number
         int irs_id[IRS_COUNT];                  // the IRS numbers
         int irs_my_id[IRS_COUNT];               // the node’s IRS id
         int irs_mfb_id[IRS_COUNT];              // the most favored brokers for different IRSs
         int irs_mfb_hop_count[IRS_COUNT];       // hop count to each most favored broker
       } BROKER_SET;

               Figure 6.1 – The data structures representing the Interval
               Routing Structure Table at every node in the network for
               one Interval Routing Structure, and the Most Favored
               Broker Set for all the Interval Routing Structures
               collectively.


the study of our proposed design. There are still some more aspects yet to be explored

and some tertiary problems to be solved.

   The original work on Interval Routing [25, 38] contains mathematical proofs for the

correctness of Interval Routing. Our analyses are more related to the study of the

Publish/Subscribe paradigm equipped with the Interval Routing Infra-structure for

communicating messages over the network.




6.1     Memory Usage

The primary goal of our thesis has been memory efficient routing of messages in a

Publish/Subscribe environment. The first data structure in Figure 6.1 is the Interval

Routing Structure (IRS) Table at every node. This table is presented in the NesC-

language syntax. NEIGHBORS is a constant: the maximum number of neighbors that the IRS table can

account for. A reasonable value of NEIGHBORS can be estimated for a

given environment. The size of the IRS table is always O(NEIGHBORS).

For our simulations, the value assigned to NEIGHBORS was 8, and the number of IRSs

simulated was also 8. The beauty of interval routing is that this memory requirement

for the routing table is independent of the network size. And since it can be reasonably

presumed in most cases that an increase in network size does not lead to an increase in the

network density, interval routing turns out to be a highly scalable routing methodology,

in terms of memory.

      The second data structure shown in Figure 6.1 is the data structure for the Most

Favored Broker Set over all the Interval Routing Structures collectively. This data

structure is also presented in the NesC-language syntax. It should be noted that for every

IRS table, there is one corresponding entry in the Most Favored Broker Set. This implies

that any node remembers just one broker over every IRS. IRS_COUNT is the maximum

number of IRSs that can coexist at any node at a time. The memory usage of the Broker

Set is thus O(IRS_COUNT). In our simulations, the value of IRS_COUNT

is 8.
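
To make the memory argument concrete, the two structures of Figure 6.1 can be compiled as plain C and measured with sizeof. The sketch below copies the field layout from Figure 6.1; the helper functions irs_table_field_bytes and routing_state_bytes are hypothetical names added for this illustration. The result depends on the compiler's int and bool sizes and on padding, so no single figure applies to every platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NEIGHBORS 8
#define IRS_COUNT 8

/* Field layout copied from Figure 6.1. */
typedef struct {
    int status;
    int irs_id;
    int root_id;
    int irs_my_id;
    int irs_life;
    int irs_node_count;
    int node_id[NEIGHBORS];
    int irs_link[NEIGHBORS];
    bool irs_link_exists[NEIGHBORS];
    int ad_flag[NEIGHBORS];
    int ad_subtree_flag[NEIGHBORS];
    int ad_exists_flag[NEIGHBORS];
    int irs_latest_out_id;
    int latest_out_id;
} IRS_TABLE;

typedef struct {
    int mfb_irs_id;
    int irs_id[IRS_COUNT];
    int irs_my_id[IRS_COUNT];
    int irs_mfb_id[IRS_COUNT];
    int irs_mfb_hop_count[IRS_COUNT];
} BROKER_SET;

/* Bytes occupied by the fields of one IRS table, ignoring padding:
 * 8 scalar ints, 5 int arrays and 1 bool array of NEIGHBORS entries. */
size_t irs_table_field_bytes(void)
{
    return 8 * sizeof(int)
         + 5 * NEIGHBORS * sizeof(int)
         + NEIGHBORS * sizeof(bool);
}

/* Per-node routing state: one table per IRS plus one broker set.
 * The network size does not appear anywhere in this expression. */
size_t routing_state_bytes(void)
{
    /* Padding can only add bytes to the field total. */
    assert(sizeof(IRS_TABLE) >= irs_table_field_bytes());
    return IRS_COUNT * sizeof(IRS_TABLE) + sizeof(BROKER_SET);
}
```

On a platform with a 4-byte int and a 1-byte bool, for example, one IRS table occupies 200 bytes and the broker set 132 bytes, so the eight tables and the broker set together take 8 × 200 + 132 = 1732 bytes per node, whatever the number of nodes in the network.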




               Figure 6.2 – A graph representing the distribution of
               Publications, Broker Advertisements and Interval Routing
               Structure construction messages for networks of random
               topologies.

6.2   Distribution of Messages

Figure 6.2 is the graph of the distribution of messages over networks of different sizes. It

is readily seen that the bandwidth (and consequently the energy) consumed by the

infrastructure messages (the broker advertisements and the interval routing structure

construction messages) is small relative to that of publications. The network traffic is dominated by

publications. The growth rate of infrastructure messages is very slow compared to that of

publish messages as the network size increases. These simulations are for the networks

that have a fairly large number of publishers and subscribers – hence publications are

always multicast over the broker advertisement trees to all the nodes in the network.


              Figure 6.3 – Distribution of Publications, Broker
              Advertisements and Interval Routing Structure construction
              messages over time for a network of a random topology.

6.3   Distribution of Messages over Time

Figure 6.3 above is the graph representing the distribution of messages over time. The

topology of the network is random, and the network contains 100 nodes. It can be

observed that there is a burst of IRS construction messages in the beginning. It is during

this time that all the 8 IRSs are constructed. The graph for the broker advertisements can

be observed to rise and fall at regular intervals. This is because nodes periodically

volunteer to be brokers and later step down from their broker status.

After the initial setup of IRS and Broker Advertisements, the bandwidth is mostly

dominated by the publish messages.



               Figure 6.4 – A graph representing the number of brokers
               coexisting in the system over time in networks of random
               topologies.

6.4   Broker Distribution over Time

In our protocol each node, with a probability of 0.1, volunteers to become a broker.

Correspondingly, pre-existing broker nodes step down from their broker status after a

fixed period of time. Hence, approximately 10% of the nodes in the network are

brokers at any time after the initial setup of interval routing structures. The graph above,

obtained from simulations, agrees with this. The three graphs represent simulations over

random topologies for networks of size 100, 150 and 200 nodes respectively.
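
The 10% figure can be checked with a small Monte Carlo sketch in C. The model here is a simplifying assumption and not the simulator used for Figure 6.4: every node draws independently once per broker term, becomes a broker with probability 0.1, and serves exactly one term. The xorshift generator and the function name mean_broker_fraction are likewise choices made only for this example.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rng_state = 2463534242u;  /* arbitrary nonzero seed */

/* Small xorshift pseudo-random generator, used so that the sketch does
 * not depend on the quality of the platform's rand(). */
static uint32_t xorshift32(void)
{
    uint32_t x = rng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rng_state = x;
    return x;
}

/* Assumed model: in each term every node independently volunteers with
 * probability 0.1 and serves as broker for exactly that term.
 * Returns the mean fraction of nodes that are brokers. */
double mean_broker_fraction(int nodes, int terms)
{
    long brokers = 0;
    assert(nodes > 0 && terms > 0);
    for (int t = 0; t < terms; t++)
        for (int n = 0; n < nodes; n++)
            if (xorshift32() % 10u == 0u)   /* one chance in ten */
                brokers++;
    return (double)brokers / ((double)nodes * (double)terms);
}
```

Under this model the expected number of brokers is 0.1 times the network size, that is 10, 15 and 20 brokers for networks of 100, 150 and 200 nodes, and a call such as mean_broker_fraction(150, 1000) should return a value close to 0.1.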




              Figure 6.5 – A graph showing the increase in effectiveness
              of reception of publications with increased subscribers. It
              also shows the uniform distribution of messages over
              subscribers. The network has a grid topology.

6.5   Effectiveness of Routing and Uniform Distribution of Publications

The graph above represents the publications received at subscribers. The simulations

were run with varying percentages of subscribers over a network of 100 nodes. It turns

out that the number of messages received by different subscribers is approximately the

same. This implies a uniform distribution of messages, which is a consequence of the

publications being multicast over the entire network. The overall picture shows the

effectiveness of the protocol implementation.




 Time      IRS 0      IRS 1     IRS 2     IRS 3         IRS 4    IRS 5     IRS 6     IRS 7     Total
          Brokers    Brokers   Brokers   Brokers    Brokers     Brokers   Brokers   Brokers   Brokers
  400        0          0        0         0             0        0         0         0         0
  500        1          3        0         1             1        1         1         1         9
  600        0          2        2         2             1        1         2         1         11
  700        2          3        1         0             1        0         4         1         12
  800        2          4        1         1             1        0         1         2         12
  900        0          2        2         3             1        0         2         1         11
 1000        1          2        0         0             1        2         2         2         10
 1100        3          5        3         1             0        1         2         0         15
 1200        1          0        3         1             0        0         1         2         8
 1300        0          1        3         2             1        1         3         2         13
 1400        1          2        1         0             0        1         1         1         7
 1500        3          1        5         0             1        2         3         0         15

                   Table 6.1 – Distribution of Brokers over Interval Routing
                   Structures (Network Size = 150).




6.6     Distribution of Brokers over Interval Routing Structures

Table 6.1 above shows the distribution of brokers over interval routing structures in a

simulation of a grid topology network of 150 nodes. The bandwidth in the network

is shared fairly uniformly among the different interval routing

structures. This result shows the potential for a good amount of load balancing of

network traffic. The node numbering was done in a random fashion. Eight interval

routing structures were used in the simulation. The table does not reflect complete

uniformity due to the randomized selection of the favorite IRSs for routing chosen by

broker nodes. It is over these favorite IRSs that the broker spawns its advertisement tree.

All the multicast publications are subsequently relayed over these advertisement trees.
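
One way to make the uniformity statement concrete is to total each IRS column of Table 6.1 and compare the shares. The sketch below does only that arithmetic; the names TABLE_6_1 and per_irs_totals are hypothetical, the values are copied from the table, and treating each entry as a simple count that can be summed over time is an assumption of this example rather than something the simulation code is known to do.

```python
# Rows of Table 6.1: time -> broker counts for IRS 0..7 (copied from the table).
TABLE_6_1 = {
    400:  [0, 0, 0, 0, 0, 0, 0, 0],
    500:  [1, 3, 0, 1, 1, 1, 1, 1],
    600:  [0, 2, 2, 2, 1, 1, 2, 1],
    700:  [2, 3, 1, 0, 1, 0, 4, 1],
    800:  [2, 4, 1, 1, 1, 0, 1, 2],
    900:  [0, 2, 2, 3, 1, 0, 2, 1],
    1000: [1, 2, 0, 0, 1, 2, 2, 2],
    1100: [3, 5, 3, 1, 0, 1, 2, 0],
    1200: [1, 0, 3, 1, 0, 0, 1, 2],
    1300: [0, 1, 3, 2, 1, 1, 3, 2],
    1400: [1, 2, 1, 0, 0, 1, 1, 1],
    1500: [3, 1, 5, 0, 1, 2, 3, 0],
}


def per_irs_totals(table):
    """Sum the broker counts of each IRS column over all sampled times."""
    return [sum(column) for column in zip(*table.values())]


totals = per_irs_totals(TABLE_6_1)
grand_total = sum(totals)

# Print each structure's total and its share of all counted brokers.
for irs, total in enumerate(totals):
    print(f"IRS {irs}: {total:3d} brokers ({total / grand_total:.1%})")
```

The grand total computed this way equals the sum of the "Total" column of the table, which is a quick consistency check on the transcription.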




               Figure 6.6 – A grid topology network of 100 nodes. The
               Fault Zone marks the group of nodes that fail
               simultaneously.

6.7   Fault Tolerance

The secondary goal of this thesis has been to ensure fault tolerance in the Interval

Routing enabled Publish/Subscribe communications protocol. Figure 6.6 represents a grid

topology network containing 100 nodes. After some t seconds of execution, the nodes in

the Fault Zone fail simultaneously (time synchronization is assumed in this scenario).

Because such a large group of nodes fails (36%), every interval routing structure is

disrupted, and publications made after the failure are lost due to the virtual collapse of

the routing infrastructure. This is probably one of the most demanding tests of fault

tolerance for our routing infrastructure. Note that this mass failure does not partition the

network into disjoint sub-networks: every surviving node still has a chain of links

connecting it to every other surviving node. Our communications infrastructure is designed

to deal with such faults by periodically purging the interval routing structures and

rebuilding new ones. The result of such a simulation is shown in Figure 6.7. The number of

publications received in the system drops sharply immediately after the group of nodes

fails. Later, once the reconstruction of the IRSs starts, the number of publications

received steadily increases to an acceptable level. This result demonstrates that the

second goal of the design of our communications infrastructure has been met. It is

arguably the most interesting and satisfying result of our simulations.

               Figure 6.7 – The graph showing the behavior of the network
               at the occurrence of a mass failure of nodes.
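
The observation that the mass failure leaves the surviving nodes connected can be checked with a breadth-first search over the surviving nodes. The following is a minimal sketch under stated assumptions: the grid uses 4-neighbor links, and the fault zone is taken to be a 6 x 6 block in the middle of the 10 x 10 grid (36 nodes). The exact position of the Fault Zone in Figure 6.6 is not given in the text, and the function names are hypothetical.

```python
from collections import deque


def grid_neighbors(node, rows, cols):
    """Yield the 4-neighbors of a grid node that lie inside the grid."""
    r, c = node
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield (nr, nc)


def survivors_connected(rows, cols, failed):
    """Return True if all non-failed grid nodes can still reach each other."""
    alive = {(r, c) for r in range(rows) for c in range(cols)} - set(failed)
    if not alive:
        return True
    start = next(iter(alive))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in grid_neighbors(node, rows, cols):
            if nb in alive and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == alive


# Assumed fault zone: rows 2..7 and columns 2..7 of a 10 x 10 grid (36 nodes).
fault_zone = {(r, c) for r in range(2, 8) for c in range(2, 8)}
print(len(fault_zone), survivors_connected(10, 10, fault_zone))
```

For contrast, failing an entire column of the grid would split the survivors into two groups, which is the kind of partition that no amount of rebuilding can route around.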


               Figure 6.8 – A graph representing the effectiveness of our
               Publish/Subscribe system under arbitrary faults.

6.8   Effectiveness under arbitrary faults

Figure 6.8 shows the high-level behavior of our Publish/Subscribe communications

protocol under arbitrary faults. The results are taken from six different simulations of

random topology networks containing 100 nodes, each with a different percentage of

arbitrarily failing nodes. The loss of messages due to node failures is modest when few

nodes fail (around 10%). As the percentage of failing nodes increases, messages are lost

mainly because the network becomes partitioned into disjoint sub-networks.




               Figure 6.9 – A graph showing the routing efficiency for the
               chain-like interval routing structure scenario and a random
               interval routing structure scenario.


6.9   Routing efficiency (Chained VS Random structures)

The basic Interval Routing protocol defined in [25] operates over a tree structure.

Figure 6.9 compares routing efficiency in two scenarios. One curve shows the average

number of hops taken by messages to reach their destinations when the interval routing

tree is a single chain (the worst case in terms of tree depth). The other curve shows the

average number of hops when the interval routing tree is random. Interestingly, routing

over the chain-shaped tree turns out to be more efficient than routing over a random tree.

This reflects the strong effect of frond links on message routing. The average number of

hops is also fairly close to the optimum. These averages are not exact, however, because a

fair fraction of messages (approximately 20%) is lost due to collisions.

              Figure 6.10 – A graph comparing the routing efficiency
              over Interval Routing Structures of a grid topology with a
              hypercube topology. In both cases, the node numbering is
              random.
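
To illustrate what routing over the tree structure alone looks like, the sketch below labels a tree in depth-first preorder and forwards each message using only interval comparisons: down to the child whose interval contains the destination label, otherwise up to the parent. This is a simplified, tree-only version with hypothetical function names. It deliberately omits the frond links used by the protocol in this thesis, so its hop counts do not reproduce Figure 6.9; it only shows the baseline that the frond links improve on.

```python
def label_tree(children, root):
    """Assign DFS preorder labels. Node v owns the interval
    [label[v], last[v]], which covers exactly the labels in its subtree."""
    label, last = {}, {}
    counter = 0
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            last[node] = counter - 1  # largest label handed out in this subtree
            continue
        label[node] = counter
        counter += 1
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return label, last


def route_hops(children, parent, label, last, src, dst):
    """Count hops when every node forwards using interval comparisons only."""
    target = label[dst]
    node, hops = src, 0
    while node != dst:
        if label[node] < target <= last[node]:
            # Destination lies below: take the child whose interval holds it.
            node = next(c for c in children.get(node, [])
                        if label[c] <= target <= last[c])
        else:
            node = parent[node]  # destination lies elsewhere: go up
        hops += 1
    return hops


def average_hops(children, root):
    """Average hop count over all ordered pairs of distinct nodes."""
    label, last = label_tree(children, root)
    parent = {c: p for p, cs in children.items() for c in cs}
    nodes = list(label)
    pairs = [(s, d) for s in nodes for d in nodes if s != d]
    total = sum(route_hops(children, parent, label, last, s, d) for s, d in pairs)
    return total / len(pairs)


chain = {0: [1], 1: [2], 2: [3], 3: [4]}   # five nodes in a single chain
star = {0: [1, 2, 3, 4]}                   # five nodes, depth-one tree
print(average_hops(chain, 0), average_hops(star, 0))
```

On these two five-node toy trees the tree-only averages are 2.0 hops for the chain and 1.6 hops for the star, which shows why a chain is the worst case for tree depth when no shortcut links are available.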


6.10 Routing efficiency (Grid VS Hypercube topologies)

The graph in Figure 6.10 shows the effect of increased connectivity in the network

topology on routing efficiency. Again, the nodes in the two topologies are numbered

randomly, which leads to the formation of random interval routing structures. The curves

are not exact because a fraction of the messages is lost due to collisions.

   Another important observation from our simulations is that the time required to

construct the Interval Routing Structures increases linearly with network size. This is

due to the depth-first manner in which the Interval Routing Structures are expanded. An

alternative method for faster construction has been suggested in [15]. This method first

constructs a breadth-first spanning tree over the network and then builds the interval

routing structure in parallel over that tree, so that construction takes time linearly

proportional to the diameter of the network. This efficiency comes with a major drawback:

the use of frond links in the routing infrastructure becomes highly restricted (not all

existing links in the network can be used in the infrastructure). Nevertheless, there is

scope for further research in that direction.
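
The gap between the two construction strategies can be illustrated with a rough cost model. The sketch below assumes that a sequential depth-first token crosses each of the n - 1 tree edges once forward and once back, and that a parallel breadth-first construction finishes in a number of rounds equal to the depth of the breadth-first tree, which is at most the network diameter. Both cost measures and all function names are assumptions made for this example; they are not measurements from the simulations or the exact procedure of [15].

```python
from collections import deque


def grid_graph(rows, cols):
    """Adjacency lists of a rows x cols grid with 4-neighbor links."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            adj[(r, c)] = [(r + dr, c + dc)
                           for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                           if 0 <= r + dr < rows and 0 <= c + dc < cols]
    return adj


def bfs_depth(adj, root):
    """Depth of the breadth-first tree rooted at root (assumes a connected graph)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return max(dist.values())


def dfs_token_steps(adj):
    """Hops of a sequential depth-first token over tree edges only:
    each of the n - 1 tree edges is crossed once forward and once back."""
    return 2 * (len(adj) - 1)


# Compare the two cost measures as the grid grows.
for side in (5, 10, 20):
    adj = grid_graph(side, side)
    print(side * side, dfs_token_steps(adj), bfs_depth(adj, (0, 0)))
```

Under this model, a 10 x 10 grid rooted at a corner needs 198 token hops for the depth-first traversal but only 18 breadth-first rounds, and the gap widens as the grid grows.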




                                     CHAPTER 7

                                   CONCLUSION


The Publish/Subscribe communications paradigm has recently received considerable

attention in the context of the ad hoc sensor network environment. This thesis has been

motivated primarily by the need to conserve memory for Publish/Subscribe in sensor

networks. Interval Routing is an ingenious method for memory-efficient routing in static

distributed systems. To the best of our knowledge, this thesis is the first attempt to

study the use of the Interval Routing Infrastructure for the Publish/Subscribe paradigm

over ad hoc sensor networks. The primary goal of memory conservation is met by layering

the Publish/Subscribe Brokering System on top of the Interval Routing Infrastructure. The

periodic purging and reconstruction of the interval routing structures achieves the second

goal of fault-tolerant routing for the dynamic sensor network environment. Our simulations

also indicate that maintaining the routing infrastructure does not consume a significant

amount of bandwidth or energy. Even though the interval routing structure does not provide

optimal routing, the routing performance is acceptable. The distribution of messages over

the coexisting interval routing structures is also reasonably uniform, which gives better

load balancing in the system. All in all, this thesis is a good start towards exploring

the use of Interval Routing in the Publish/Subscribe paradigm for sensor networks.




7.1    Future Research opportunities

The discussion in this thesis is by no means a complete account of the applicability of

Interval Routing enabled Publish/Subscribe systems. In fact, we believe that this work is

just the beginning for more interval routing based dynamic publish/subscribe systems.

More subtle aspects of our publish/subscribe protocol may be revealed by a deeper

understanding of the dynamics of the system in the ad hoc sensor network environment. We

foresee quite a few opportunities for further study arising from the system proposed in

this thesis.

      It may be argued that the current system is not highly scalable, since constructing

the routing infrastructure takes time linearly proportional to the network size. The idea

mentioned in the concluding paragraph of the previous chapter (breadth-first construction,

as suggested in [15]) is worth evaluating and refining. The resulting improvement in

construction time could strongly support scalability. The trade-off between construction

efficiency and overall message routing efficiency can be studied in depth.

      Due to the depth-first construction process in the current system, the proximity of

communicating nodes in the routing infrastructure does not necessarily match the

geographical proximity of the nodes. The frond links help exploit geographic proximity at

times, but not always. The existing system could be extended so that every node maintains

additional information about nearby nodes to enable more efficient routing.

      Another important improvement in routing effectiveness and performance could come

from incorporating localization into publish/subscribe. For example, if the system has

very few fixed subscribers, the nearby nodes may take turns playing the role of a broker

node. Coupling with localization gives the Publish/Subscribe system better knowledge of

the environment of the sensor network. Localization can also enhance performance by

reducing message delays, increasing throughput and conserving energy.

   An important issue that has been ignored in the discussions of this thesis is mobility.

How does our Interval Routing enabled Publish/Subscribe communications protocol

perform in the presence of mobile publishers/subscribers? In this thesis we have always

assumed that the nodes in the system are immobile. So what is needed in the existing

system to support mobility of nodes? This is a broadly open question that could be

answered after a considerable amount of study and understanding of the dynamics of the

Interval Routing enabled Publish/Subscribe systems.

   Research on routing protocols for sensor networks has focused on three major aspects:

energy conservation, fault tolerance and scalability. With regard to scalability, our

proposed routing infrastructure does not scale well, primarily because the entire system

is one flat infrastructure. Scaling to networks containing hundreds of thousands or even

millions of nodes seems difficult. So how can interval routing publish/subscribe

infrastructures ever be highly scalable? A hybrid of clustering systems with interval

routing publish/subscribe systems is worth studying in this direction. Imagine a network

of millions of nodes. Is it possible to create clusters of interval routing

publish/subscribe infrastructures that interact with each other over a higher-level

interval routing infrastructure? In other words, can we have a hierarchy of interval

routing infrastructures?




   An extremely important aspect of publish/subscribe brokering systems is the dynamic

use of brokers. Brokers play varied roles: we have identified brokers as aggregation

points and as publication/subscription brokers. In our design, the broker/non-broker

status is assigned to nodes at random. For the sake of efficiency and better load

balancing, the best possible brokering system would have clusters of nodes around brokers

that are uniformly distributed over the network. This implies a clustered organization of

the routing infrastructure. Every cluster may have a single main broker node and possibly

a few backup broker nodes. Periodically, brokers within clusters may step down and hand

over the broker role to another node in their respective clusters. How would such behavior

fit into an interval routing enabled routing infrastructure?

   A problem in the implementation of communications infrastructures that has received

little attention in this thesis is message collisions, which arise from the broadcast-like

nature of wireless communications. Even after adding redundancy to sent messages, we

observed an overall loss of about 20% of messages in our simulations. Collisions waste

bandwidth and drain power. Some kind of time division multiplexing for broadcasting

messages could be implemented in the neighborhood of every node in the system. This would

require a distributed scheduler in the sensor network. But does the scheduler assume time

synchronization between neighboring nodes? And how scalable would such a scheduler be? A

global scheduler may not be a good idea because of its weak support for scalability. Can

an idea like graph coloring be implemented? Does this lead to clusters of time-

synchronized nodes? Several such questions arise as soon as one starts to contemplate

this problem.
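
As a starting point for the graph coloring question, the sketch below assigns transmission slots with a greedy distance-2 coloring: every node takes the smallest slot not already used by any node within two hops, so that two nodes sharing a common neighbor never transmit in the same slot. This is a centralized illustration with hypothetical function names. It assumes the whole topology is known and that nodes share a common notion of slot boundaries, which are exactly the open issues raised above. It is not a distributed scheduler and not part of the system described in this thesis.

```python
def two_hop_neighbors(adj, node):
    """Return the nodes within two hops of node, excluding node itself."""
    near = set(adj[node])
    for nb in adj[node]:
        near.update(adj[nb])
    near.discard(node)
    return near


def assign_slots(adj):
    """Greedy distance-2 coloring: slot[v] differs from the slot of every
    node within two hops of v."""
    slot = {}
    for node in sorted(adj):
        taken = {slot[n] for n in two_hop_neighbors(adj, node) if n in slot}
        s = 0
        while s in taken:
            s += 1
        slot[node] = s
    return slot


# A five-node line network 0 - 1 - 2 - 3 - 4 as a small example.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
slots = assign_slots(line)
print(slots, "slots per frame:", max(slots.values()) + 1)
```

On the line network the greedy assignment needs three slots per frame and reuses slots for nodes that are at least three hops apart, which hints at how the frame length could depend on local density rather than on network size.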

   Security in sensor networks is another topic of active research. The limited resources

available on sensor nodes pose new challenges for the use of cryptography in

communications. Denial-of-service attacks are the most threatening, since they can very

easily be used to exhaust the energy reserves of sensor nodes. The security problem, a

very broad subject in itself, also applies to our interval routing enabled

publish/subscribe system.




                                  BIBLIOGRAPHY

[1]   G. Asada, T. Dong, F. Lin, G. Pottie, W. Kaiser, and H. Marcy. Wireless integrated

      network sensors: Low power systems on a chip. In European Solid State Circuits

      Conference, The Hague, Netherlands, October 1998.

[2]   B. Atwood, B. Warneke, and K.S.J. Pister. Preliminary circuits for smart dust. In

      Proceedings of the 2000 Southwest Symposium on Mixed-Signal Design, San

      Diego, California, USA, February 27-29, 2000.

[3]   E.M. Bakker, J. Van Leeuwen, and R.B. Tan. Prefix routing in dynamic networks.

      Computer Networks and ISDN Systems, 26, 403-421. 1993.

[4]   G. Banavar, T. Chandra, B. Mukherjee, J. Nagarajarao, et al. An Efficient Multicast

      Protocol for Content-Based Publish-Subscribe Systems. In Proceedings of the 19th

      International Conference on Distributed Computing Systems, 262-272. 1999.

[5]   K. Bult, A. Burstein, D. Chang, M. Dong, et al. Low power systems for wireless

      microsensors, Proceedings of the 1996 international symposium on Low power

      electronics and design, p.17-21, August 12-14, 1996, Monterey, California, United

      States.

[6]   N. Bulusu, J. Heidemann, and D. Estrin. Adaptive beacon placement. Proc. Int.

      Conf. Distributed Computing Systems, pp. 489-498, 2001.

[7]   N. Bulusu, J. Heidemann, and D. Estrin. GPS-less low cost outdoor localization for

      very small devices. IEEE Personal Communications, 7(5): 28-34, October 2000.

      Special Issue on Smart Spaces and Environments.




[8]   A. Cournier, A.K. Datta, F. Petit, and V. Villain. Snap-stabilizing PIF Algorithm in

      Arbitrary Networks. In 22nd International Conference on Distributed Computing

      Systems (ICDCS’02). 199-206. 2002.

[9]   D.E. Culler, J. Hill, P. Buonadonna, R. Szewczyk, et al. A Network-Centric

      Approach to Embedded Software for Tiny Devices. EMSOFT 2001, Oct 2001.

[10] Dale Skeen. Vitria’s Publish-Subscribe Architecture: Publish-Subscribe Overview.

      http://www.vitria.com/.

[11] M.J. Dong, K.G. Yung, and W.J. Kaiser. Low power signal processing architectures

      for network microsensors, Proceedings of the 1997 international symposium on

      Low power electronics and design, p.173-177, August 18-20, 1997, Monterey,

      California, United States.

[12] J. Elson, and D. Estrin. Time synchronization for wireless sensor networks. In Proc.

      Workshop on Parallel and Distributed Computing Issues in Wireless Networks and

      Mobile Computing, Sept. 2001.

[13] D. Estrin, R. Govindan, J. Heidemann, and S. Kumar. Next century challenges:

      Scalable coordination in sensor networks. In MOBICOM 1999, pages 263-270.

      ACM Press.

[14] J. Feng, F. Koushanfar, and M. Potkonjak. System-Architectures for Sensor

      Networks: Issues, Alternatives, and Directions. ICCD, Special Session on Sensor

      Networks, Freiburg, Germany, Sep 2002. p.112-121.

[15] X. Feng, and C. Han. A Fault-Tolerant Routing Scheme in Dynamic Networks. In

      Journal of Computer Science and Technology. vol 16, no 4, 371-380. 2001.




[16] D. Gay, P. Levis, R.V. Behren, M. Welsh, et al. The nesC Language: A Holistic

     Approach to Networked Embedded Systems. To appear in Proceedings of

     Programming Language Design and Implementation (PLDI), 2003, June 2003.

[17] S. Han, and K. Shin. Experimental Evaluation of Failure-Detection Schemes in

     Real-time Communication Networks. In IEEE Trans. On Parallel and Distributed

     Systems, vol.10, no.6, 122-131. 1999.

[18] H. Hassanein, and A. Zhou. Routing with load balancing in wireless Ad hoc

     networks. Proceedings of the 4th ACM international workshop on Modeling,

     analysis and simulation of wireless and mobile systems 2001, Rome, Italy.

[19] J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, and K. Pister. System

     architecture directions for networked sensors. In Proceedings of the 9th ACM

     International Conference on Architectural Support for Programming Languages

     and Operating Systems, pages 93-104, Cambridge, Massachusetts, Nov. 2000.

[20] Y. Huang, and H. Garcia-Molina. Publish/Subscribe in a Mobile Environment,

     MobiDE 01.

[21] C. Intanagonwiwat, R. Govindan, and D. Estrin, Directed diffusion: A scalable and

     robust communication paradigm for sensor networks. In Proceedings of the Sixth

     Annual International Conference on Mobile Computing and Networking

     (MobiCOM), 2000.

[22] F. Koushanfar, M. Potkonjak, and A. Sangiovanni-Vincentelli. Fault Tolerance in

     Wireless Ad Hoc Sensor Networks. IEEE Sensors 2002 (the first IEEE international

     conference on sensors), June 2002.




[23] E. Kranakis, D. Krizanc, and S.S. Ravi. On multilabel interval routing schemes.

     Computer Journal, 39, 133-139. 1996.

[24] J. Kymissis, C. Kendall, J.A. Paradiso, and N. Gershenfeld. Parasitic power

     harvesting in shoes. ISWC, pages 132-139, 1998.

[25] J.V. Leeuwen, and B. Tan. Interval Routing. Computer Journal, 30, 298-307. 1987.

[26] J. Van Leeuwen, and R. Tan. Compact Routing Methods: a Survey. Research

     Report. Proc. Research Meeting on Structural Information and Communication

     Complexity, Università di Siena. 1997.

[27] P. Levis, and D. Culler. Mate: a Virtual Machine for Tiny Networked Sensors. In

     Proceedings of the ACM Conference on Architectural Support for Programming

     Languages and Operating Systems (ASPLOS), 2002.

[28] N. Malpani, J. Welch, and N. Vaidya. Leader election algorithms for mobile ad hoc

     networks. In Proceedings of the 4th International Workshop on Discrete Algorithms

     and Methods for Mobile Computing and Communications. 96-103. 2000.

[29] M.D. May, P.W. Thompson, and P.H. Welch. Networks, Routers and Transputers.

     IOS Press, Amsterdam. 1993.

[30] V. Narayanmurthy. A Publish/Subscribe scheme for networked embedded systems.

     Masters Thesis, Dept. of Computer Science, The University of Iowa. August 2002.

[31] K.S.J. Pister, J.M. Kahn, and B.E. Boser. Smart Dust: Wireless networks of

     millimeter-scale sensor nodes. Electronics Research Laboratory Research

     Summary, UC Berkeley, 1999.




[32] J.M. Rabaey, M.J. Ammer, J.L.Jr. da Silva, D. Patel, et al. PicoRadio supports ad

     hoc ultra-low power wireless networking. Computer, July 2000, vol.33, (no.7): 42-

     48.

[33] G. Pottie, and W. Kaiser. Wireless integrated network sensors. Communications of

     the ACM, 43(5): 51–58, May 2000.

[34] V. Rodoplu, and T.H. Meng. Minimum Energy Mobile Wireless Networks. IEEE

     Journal on Selected Areas in Communications, Vol. 17, No. 8, pp. 1333-1344,

     August, 1999.

[35] R. Rozovsky, and P.R. Kumar. SEEDEX: A MAC protocol for ad hoc networks.

     MOBIHOC 2001, New York, NY, USA, pp. 67-75.

[36] P. Ružička. A note on the efficiency of an interval routing algorithm. Computer

     Journal, 34, 475-476. 1991.

[37] A. Salhieh, J. Weinmann, M. Kochha, and L. Schwiebert, Power efficient

     topologies for wireless sensor networks, ICPP 2001, pp. 156-163 18.

[38] N. Santoro, and R. Khatib. Labeling and implicit routing in networks. Computer

     Journal, 28, 5-8. 1985.

[39] TIBCO Inc. TIB/Rendezvous.

     http://www.tibco.com/products/rv/index.html.

[40] http://webs.cs.berkeley.edu/tos/api/tinyos-1.x/doc/tutorial/. A Tiny-OS Tutorial.

[41] S.S.H. Tse, and F.C.M. Lau. An optimal lower bound for interval routing in general

     networks. In Proceedings in 4th International Conference on Structural

     Information and Communication Complexity (SIROCCO’97), July, Ascona,

     Switzerland. 112-124. 1997.




[42] S.S.H. Tse, and F.C.M. Lau. More on the efficiency of interval routing. The

     Computer Journal, 41, 238-242. 1998.

[43] A. Wang, W.R. Heinzelman, and A.P. Chandrakasan. Energy-scalable protocols for

     battery operated micro sensor networks. IEEE Workshop on Signal Processing

     Systems, Pages 483 - 490, 1999.

[44] Y. Wei, J. Heidemann, and D. Estrin. An energy-efficient MAC protocol for

     wireless sensor networks. Proceedings of INFOCOM 2002.

[45] F. Ye, H. Luo, J. Cheng, S. Lu, et al. A Two-tier Data Dissemination Model for

     Large-scale Wireless Sensor Networks. In the Proceedings of Mobicom02, Atlanta,

     GA, September 2002.

[46] H. Zhang, and A. Arora. GS3: Scalable Self-configuration and Self-healing in

     Wireless Networks. Computer Networks, special issue on Wireless Sensor

     Networks, 2003.



