       Hierarchical Data Aggregation Model and Group Management for Wireless Sensor Networks

                   Sheng-Tzong Cheng,                                                        Red-Tom Lin
       Computer Science Information Engineering,                                  Networks and Multimedia Institute,
           National Cheng Kung University                                          Institute for Information Industry,
                    Tainan, Taiwan                                                            Tainan, Taiwan

Abstract—Disseminating data generated by sensors to sinks at           Aggregation Model to address this problem. Instead of
different locations is one of the essential functions in wireless sensor   propagating queries from each user to all the sensors, HDAM
networks (WSNs). Recent research focuses much on how a source          exploits a hierarchical framework so that the data are
node forwards sensing data directly to the sink. In this paper, we     aggregated along the way back to the sink and through different
propose a novel dissemination scheme that achieves five-layer data     network structures.
aggregation by introducing aggregator layers between sensors
and sinks. The five aggregation layers from bottom to top are             The remainder of this paper is organized as follows. Section
raw data aggregation, in-node data aggregation, group                  2 reviews related work. Section 3 describes the main design of
aggregation, aggregator aggregation and agent aggregation              the heterogeneous network architecture and five-layer data
layers. In addition, the proposed network architecture for data        aggregation. Section 4 analyzes the communication overhead
delivery comprises two different transfer tiers. The lower tier is     and power consumption. Section 5 presents simulation results
WSN and the upper tier is WLAN. The simulation results show            that evaluate the performance of our design. Section 6 concludes
that our scheme provides excellent throughput and reduces              the paper.
power consumption for WSNs.

Keywords - WSN; Hierarchical Aggregation; data dissemination

                       I.    INTRODUCTION
   Sensor networks enable distributed sensing, in which
thousands, or even tens of thousands, of small sensors are
distributed over a vast field to obtain fine-grained, high-
precision sensing data. In most cases, after being deployed,
sensor nodes are stationary at fixed locations. They collect
useful data such as acoustic, light, temperature, and seismic                 Figure 1: Wireless sensor network architecture.
measurements, and then forward the data back to the base
station or the mobile sinks. These sensors are typically powered
                                                                                            II.   RELATED WORK
by limited disposable batteries and communicate with each
other over wireless channels. Large scale sensor networks can             Distributed sensor networks have received much attention in
be deployed in adverse physical environments such as toxic             recent years. Energy-efficient data dissemination is one of the
urban locations, battlefields, or remote geographic regions.1          significant research issues being addressed. Related work to
                                                                       this paper includes Directed Diffusion [1], SPIN [2], TTDD [3],
   A user asks for sensor data by sending requests, and                and DEED [4]. Here we survey TTDD and DEED.
afterwards, the data that matches the request is
“delivered” toward that user. There are usually a large number         A. TTDD
of users interested in collecting data. Since the users are spread
over the sensor field, it is observed that transferring data from          TTDD is a grid-based data dissemination model. TTDD
sensors to users is a critical problem [1].                            provides scalable and efficient data delivery to multiple mobile
                                                                       sinks. Each source proactively builds a grid structure and
   Motivated by energy-efficiency and scaling requirements,            disseminates data to sinks along the grid lines. It does not
this paper proposes a new data dissemination architecture for          optimize the path from the source to sinks. Sinks can receive
WSNs. We introduce HDAM, a Hierarchical Data                           data on the move by flooding queries to their local cells. Queries
                                                                       from mobile sinks are confined within their local cells only.
1                                                                      This localization avoids unnecessary energy consumption and
 This research was supported by the Applied Information                network overhead from global flooding by multiple sinks.
Services Development & Integration project of Institute for
Information Industry and sponsored by MOEA, R.O.C.
B. DEED                                                                 To forward sensing data back to the agent aggregator, a
    DEED is a dynamic delay-constrained minimum-energy              dissemination tree (d-tree) is needed for disseminating data.
dissemination protocol for WSNs that minimizes energy               Routing algorithms designed for WLAN can be divided into two
consumption while satisfying delay constraints. DEED uses           categories: proactive routing protocols (e.g., DSDV [7]) and
a greedy multi-forwarding protocol as its routing protocol. The     reactive routing protocols (e.g., AODV [8]). Because the
dissemination tree (d-tree) construction is the key function in     aggregators are stationary, a proactive routing protocol is more
DEED. The d-tree finds an energy-minimizing path satisfying         suitable for constructing the d-tree. Once the d-tree is built
delay constraints from one source to multiple mobile sinks. To      after the deployment of aggregators, it is reconstructed only if
manage the delay-constrained path for mobile sinks, DEED            faulty aggregators block the return path. The d-tree is used for
refreshes paths when sinks move just as delay constraints are       both query and data forwarding.
about to be violated. In other words, the d-tree is updated in a    The following section explains how to divide nodes into
distributed way without regenerating the tree from scratch.         several groups in our network model.

                    III.   NETWORK MODEL
   Most recent research focuses on finding a path from a data
source to a sink [1] [3], seeking optimal paths between
sources and destinations. However, when the number
of data sources and sinks becomes large, the cost for building
an individual path from a data source to a sink becomes huge
because of the high power consumption and network overhead.
To deal with this problem, we design a novel data dissemination
scheme with distributed data aggregation, called HDAM                           Figure 2: Two-tier network architecture.
(Hierarchical Data Aggregation Model).
                                                                    B. Leader election
   The rest of this section presents the basic design of HDAM,
which works with the following network settings:                        The essential operation in sensor clustering is to select a set
                                                                    of group leaders among sensors in the network and to cluster
    •   A vast field is covered by a large number of                the rest of the nodes. Group leaders are responsible
        homogeneous sensor nodes which communicate with             for coordinating the nodes within their groups and
        each other through short-range radio.                       communicating with aggregators on behalf of their group. With
    •   Each sensor is aware of its own location (using GPS or      clustering, nodes transmit sensor data to their group leaders. A
        other locating techniques).                                 leader aggregates the received sensor data and forwards these
                                                                    data to the aggregator. Network lifetime is prolonged through
    •   Each sensor does the sensing task periodically. Once
        the time period expires, the sensor gets the sensing data      •   reducing the number of nodes contending for channel access, and
        and processes it for further reporting. Every sensor has       •   routing through an overlay network among group
        the same time period length, but the periods                       leaders, which has a relatively small network diameter.
        expire at different times because boot times differ.
                                                                        The goal of our approach is to prolong network lifetime.
    •   Both sensor nodes and aggregators are stationary at         We use a multi-round leader election to decide the leader of
        different fixed locations.                                  nodes. Multi-round leader election includes an initialization
    The above assumptions are consistent with the models for        phase which pre-elects a portion of nodes to be candidates (the
real sensors being built, such as Berkeley Motes [6].               potential competitors for leadership); others are excluded from
                                                                    candidacy. In the next round, only candidates take part
                                                                    in the election. Figure 3 demonstrates the operation of the two-
A. Heterogeneous network architecture                               round leader election procedure. The left side is the first round,
    HDAM consists of five layers of data aggregation. These         and the right side finds the final leaders. After the first round,
five layers are mapped to the two different network tiers,          some nodes become candidates and take part in the next round.
WLAN and WSN. The mapping and the network structure are             The next round election chooses some candidates as leaders.
shown in Figure 2.
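
The two-round leader election described in Section III-B can be sketched as follows. This is only an illustrative sketch, not the authors' algorithm: the paper does not state how candidates are pre-elected or how ties are resolved, so the code assumes that each node becomes a candidate with a fixed probability in the first round, and that in the second round candidates draw random back-off timers and a candidate becomes leader only if no earlier leader lies within its radio range. The function names and the parameters `fraction`, `radio_range`, and `max_backoff` are hypothetical.

```python
import math
import random


def pre_elect(nodes, fraction, rng):
    """Round 1 (assumed rule): each node becomes a candidate with probability `fraction`."""
    candidates = [n for n in nodes if rng.random() < fraction]
    # Keep at least one candidate so that the second round can elect a leader.
    return candidates or [rng.choice(nodes)]


def final_elect(candidates, positions, radio_range, max_backoff, rng):
    """Round 2 (assumed rule): back-off contention among candidates only."""
    backoff = {n: rng.uniform(0.0, max_backoff) for n in candidates}
    leaders = []
    # Candidates "wake up" in order of their back-off timers.
    for n in sorted(candidates, key=backoff.get):
        heard_leader = any(
            math.dist(positions[n], positions[leader]) <= radio_range
            for leader in leaders
        )
        if not heard_leader:
            leaders.append(n)
    return leaders


def two_round_election(positions, fraction=0.3, radio_range=30.0,
                       max_backoff=25.0, seed=1):
    """Return (candidates, leaders, groups); groups maps each node to its leader."""
    rng = random.Random(seed)
    nodes = sorted(positions)
    candidates = pre_elect(nodes, fraction, rng)
    leaders = final_elect(candidates, positions, radio_range, max_backoff, rng)
    # Assumption: every node joins the geographically nearest leader.
    groups = {
        n: min(leaders, key=lambda leader: math.dist(positions[n], positions[leader]))
        for n in nodes
    }
    return candidates, leaders, groups
```

Because non-candidates never contend in the second round, fewer nodes compete for the channel than in a single-round election, which is the effect the text attributes to the multi-round scheme.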
    WLAN is deployed over WSN in our network model. In the          C. Query forwarding
upper tier of Figure 2, an aggregator communicates with                Our query forwarding is based on the d-tree built in WLAN
another aggregator by using a WLAN transmission protocol. The       to ensure efficiency and robustness. When a sink (potentially
bottom tier is WSN, in which transmissions happen among             mobile) wants to get data, it floods a query within an area to
sensors, group leaders, and aggregators. The proposed               discover nearby aggregators. The sink specifies a maximum
architecture improves the efficiency of data delivery and reduces   radius, which is usually its longest radio range.
the probability of collision when a transmission between a sensor   The query flooding stops at points that are about to exceed
and an aggregator and a transmission between that aggregator        the maximum radius away from the sink. If no aggregator is
and another aggregator happen at the same time.                     found to be the agent aggregator after the query forwarding, the
sink should move around until it finds at least one aggregator in    different types of reports. By reports, we mean the processed
order to attach itself to the network and get the data of interest.  information about sensed data after aggregation. The sensory
                                                                     data generated by each sensor is sent to the leader. The
                                                                     leader sends a leader report to the aggregator. The
                                                                     aggregator then transmits this report to all “agent aggregators”
                                                                     through the d-tree constructed in WLAN. The agent
                                                                     collects reports for each sink connected to it and sends to that

        Figure 3: Two-round leader election procedure.                           Figure 5: Data forwarding architecture.

                                                                              Figure 6: Five-layer data aggregation model.

      Figure 4: Two users connecting to two aggregators.             E. Five-layer data aggregation
   When an aggregator becomes the sink’s agent upon                     Usually, the data aggregation ratio for a given system is the size
receiving the query, it is called an "agent aggregator". The         of the original data divided by the size of the aggregated data. Useful
agent aggregator then forwards this query to its neighbor            information could be lost if data aggregation is only done at the
aggregators. The upstream aggregator in turn forwards the            node level. If data aggregation is only done at a central site (e.g.,
query to its neighbors repeatedly until all of the aggregators get   an aggregator or base station), a sensor network may waste energy
the query coming from the agent aggregator. Therefore, when          in transmitting data and suffer from network congestion and
aggregators have to report the sensed data, they know which          message losses [3]. To balance energy efficiency, we propose a
aggregators are set as destinations. In other words, all non-        five-layer data aggregation architecture as shown in Figure 6.
agent aggregators need to know only which aggregators                   The first layer (T1) is the raw data aggregation layer, which
are interested in their sensing task; they need to know neither      takes the sensor readings from individual sensors. The second
the number of sinks that the agent aggregators are responsible       layer is the in-node aggregation layer, at which the sensor readings
for nor the details of the sinks (for example, the location of the   obtained at T1 are processed. Each group is represented by a group
sinks). Because all information of the sink is managed only by       leader at the third layer. The leader aggregates all the reports
its agent aggregator, if the sink moves too far away from its        from the member nodes and sends a leader report, consisting of
agent aggregator, it should find a new one.                          representative data of this group, to the aggregator. The fourth
   Figure 4 is an example of query forwarding. User 1 has            layer is the aggregator aggregation layer. The aggregator processes
already connected to its agent aggregator at the lower               the leader reports from each leader according to the queries
right. Then a new user, User 2, comes into the sensor field. It      from sinks. Finally, the fifth layer is present only at the agent
uses flooding to find its aggregator and sends a query to its agent. aggregators. An agent aggregator aggregates the individual
Then the agent aggregator spreads this query to all other            reports to obtain a final report for each sink connected to it.
aggregators through the d-tree in WLAN. The dotted lines               1) L1: raw data aggregation
show the query’s trail. The following two sections discuss the          The sensor data is the raw input to the sensor network’s
data forwarding from sensor nodes to users.                          computation work flow. It provides the base of the processing
                                                                     for any event in the network [1]. In the rest of this section, we
D. Data forwarding mechanism                                         will introduce different types of sensors and sensor data on
   Once a sensor has generated sensory data, the                     Tmote, and then discuss possible aggregation methods.
data is transmitted from sensors to sinks through our data
forwarding mechanism. The mechanism includes links from (1)            We call the sensor reading at a specific time on a specific
node to node, (2) node to aggregator, (3) aggregator to              node a sample point. When the sensor network starts, each
aggregator, and (4) aggregator to sink. Figure 5 shows the           sensor node in the network produces a sequence of sample
architecture of our data forwarding mechanism including              points. All the sample points produced by all nodes form a
global sample set. The global sample set is the complete              match more than one sink’s query. The agent aggregator knows
information about what happens in the network. If all the             which sinks are interested in this message. It buffers one copy
nodes report their sample points to sinks, sinks then can             for each interested sink. When the number of copies exceeds the
collect the global sample set and perform computation on it.          threshold, the agent sends each sink its “package”.
However, due to resource and energy limitations, it is                   Recall the four types of reports in the data forwarding
better to deliver the same information with fewer bits. One           architecture illustrated in the previous section. First, the
potential way to reduce this enormous quantity is to keep only        member report is generated by sensors after T1 and T2
a portion of the nodes active. This method is similar to SMAC: only   aggregation. Second, the group leader receives member reports
active nodes do the sensing task, which prolongs system lifetime and  and generates a leader report using T3 data aggregation. Third,
reduces the probability of collision and redundant information.       after the aggregator collects leader reports from leaders, it
Another, simpler way is to take fewer sample points: in an            packages them to generate an aggregator package by using T4
equal time period, the sensor nodes generate data at a lower          data aggregation. Finally, the agent aggregator periodically
frequency.                                                            sends agent reports to users after processing all packages with
  2) L2: in-node aggregation                                          T5 data aggregation.
   Data compressed in T1 is still too large if a transmission
happens at each sample time. One potential method of                                       IV.    PERFORMANCE ANALYSIS
aggregating the sensor data is to compute the differences
between consecutive samples and send out only the differences.        A. Power consumption model
The advantage of sending only the differences is that                    In this section, we derive a power consumption model for
the difference is usually in a small range and can be                 the communication subsystem in a WSN device. The physical
represented by fewer bits than the original sensor data. Another      communication rate is constant and assumed to be m bits per
way is to merge identical data and send out representative            second in the model. The energy consumed for sending a
data, for example, a summary or the average of these data.            packet of m bits over a one-hop wireless link can be expressed as
Aggregation method used in this layer depends on the type of
sensing task. Application types may be taken into account in                    $P(m) = P_T + P_R = P_T(m, d) + P_R(m)$
the aggregation algorithm.
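
The difference-based in-node aggregation of layer L2 can be illustrated with a minimal sketch. It assumes integer readings (for example, temperature in tenths of a degree) so that differences are exact; the function names are hypothetical and the paper does not prescribe a particular encoding.

```python
def delta_encode(samples):
    """Keep the first sample, then only the difference to the previous sample."""
    if not samples:
        return []
    encoded = [samples[0]]
    for previous, current in zip(samples, samples[1:]):
        encoded.append(current - previous)
    return encoded


def delta_decode(encoded):
    """Rebuild the original samples by accumulating the differences."""
    samples = []
    for index, value in enumerate(encoded):
        samples.append(value if index == 0 else samples[-1] + value)
    return samples


def representative(samples):
    """Alternative L2 aggregation: report one average instead of every sample."""
    return sum(samples) / len(samples)
```

For slowly changing readings such as `[250, 251, 251, 249]`, the encoded differences stay close to zero, so they fit in fewer bits than the raw values.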
                                                                      where $P(m)$ is the total power consumption, and $P_T$ and $P_R$ are the
  3) L3: group aggregation
  A simple way for sensors to report data is to send the              power consumption for transmitting and receiving. $P_T$ and
detection results to a centralized base station. In our design, the   $P_R$ can be extended to $P_T(m, d)$ and $P_R(m)$. $P_T(m, d)$ is a function of
base stations are the aggregators. However, such a centralized        both the transfer range and the number of transmitted bits,
scheme is inefficient in energy consumption and latency. It           while $P_R(m)$ depends only on the number of transmitted bits.
results in excessive power consumption because of transmitting
multiple results to a centralized aggregator.                                        Table 1: Power Consumption Parameters.
  To avoid the above issue, we divide nodes into groups. Each             PTB       Power consumption in baseband DSP circuit for transmitting
group leader will be a reporter to the aggregator. Leaders are            PRB       Power consumption in baseband DSP circuit for receiving
pre-elected in the initialization phase. Our system uses a static
leader election. Leader election will take place again when the           PTRF      Power consumption in front-end for transmitting
leader node fails. The group leaders process the data
                                                                          PRRF      Power consumption in front-end for receiving
and send only the aggregated results to the aggregator.
                                                                          PL        Power consumption of LNA for receiving
  4) L4: aggregator aggregation
   After receiving the leader reports from leader nodes, the              PTA(d)    Power consumption of the power amplifier
aggregator needs to perform further aggregation so that the
collected information forms a semantic pattern. The aggregator
                                                                        Figure 7 illustrates the internal structure of the
bridges WLAN and WSN. The aggregation functionalities can
                                                                      communication module found in a typical WSN node, and
be summarized as follows. First, the aggregator takes the input
                                                                      definition of the power consumption of each component.
from the group leaders through WSN. Second, according to the
queries received from agent aggregators, the aggregator filters
out useless messages. Third, the aggregator makes use of the
remaining messages to provide extra information for the sinks
according to the queries. The aggregator sends this information
back to other agent aggregators through the d-tree in WLAN.                         Figure 7: Communication Module Structure.

  5) L5: agent aggregation                                              Based on the structure and power consumption of each
   The agent aggregator receives messages from aggregators.           component, $P_T(m, d)$ and $P_R(m)$ can be written as
The agent disassembles them and classifies them by sink ID.                 $P_T(m, d) = P_{TB} + P_{TRF} + P_{TA}(d) = P_{T0} + P_{TA}(d)$
Once a package for one sink is completed, it is sent directly to
that sink. In other words, a message from one aggregator may                $P_R(m) = P_{RB} + P_{RRF} + P_L = P_{R0}$
where $P_{TA}(d)$ is a function of both the transmission range $d$
and the number of transmitted bits, and α is the path loss exponent,
whose value varies from 2 (for free space) to 4 (for multi-path
channel models). Since $P_{TB}$ and $P_{TRF}$ do not depend on the
transmission range, their sum can be viewed as a
constant, $P_{T0}$. Similarly, $P_{RB}$ and $P_{RRF}$ can be viewed as constants.
Since we assume the LNA is properly designed and biased to
provide the necessary sensitivity for reliable reception, $P_L$ is also a
constant. Thus $P_{RB}$, $P_{RRF}$, and $P_L$ can be modeled together as a constant, $P_{R0}$ [9].
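
The power model above can be turned into a small numerical sketch. The paper only states that the amplifier term depends on the range d and that α is the path loss exponent; the form `k_amp * d ** alpha` used below is an assumption made for illustration, and `k_amp` and all numeric values are hypothetical, not measurements from the paper.

```python
def transmit_power(d, p_t0, k_amp, alpha):
    """P_T = P_T0 + P_TA(d), with P_TA(d) assumed to be k_amp * d**alpha."""
    if not 2.0 <= alpha <= 4.0:
        raise ValueError("path loss exponent is expected to lie in [2, 4]")
    return p_t0 + k_amp * d ** alpha


def receive_power(p_r0):
    """P_R = P_RB + P_RRF + P_L, modeled as the single constant P_R0."""
    return p_r0


def link_power(d, p_t0, p_r0, k_amp, alpha=2.0):
    """Total power for one hop: P = P_T + P_R."""
    return transmit_power(d, p_t0, k_amp, alpha) + receive_power(p_r0)
```

Under this assumed form, only the amplifier term grows with distance, so shortening hops (for example, by reporting to a nearby group leader) reduces the transmit side while the receive side stays constant.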
                                                                           Figure 8: Leaders of one and two round leader election.
                  V.    EXPERIMENTAL RESULTS
    We use TOSSIM as the simulator for evaluating the
performance. In this section, the setting of control parameters
will be given. The performance of group management and data
aggregation will be illustrated as well.

    TOSSIM is a discrete event simulator for TinyOS WSN.
Instead of compiling a TinyOS application onto a Tmote, we
compile it into the TOSSIM modules running on a PC. This
allows users to debug, test, and analyze in a controlled and
repeatable environment. As TOSSIM runs on a PC, we can                   Figure 9: Leaders (fixed nodes and different back off time).
examine our TinyOS code using debuggers and other                                    Table 2: Symbols used in data aggregation.
development tools.                                                           Term                   Description             Unit   Value
                                                                              LGL       The length of Group leader report   Byte    9
B. Multi-Round Leader Election
    We evaluate multi-round leader election by comparing the                  LGM       The length of Group member report   Byte    2
number of leaders between a one-round leader election
                                                                              Lind      The length of individual report     Byte    2
and a two-round leader election. We implement the one-round
and two-round leader election algorithms and
vary the number of nodes competing for leadership from 25                  In Figure 10, we can observe that using the results of
to 70.                                                                  the election as grouping rules can reduce the traffic load. Moreover,
                                                                        using two-round leader election implies more efficient
    First, we examine how the maximum back-off time affects             transmission. In Figure 11, we divide the total number of messages
the number of leaders. In Figure 8, using the number of leaders         into member reports and leader reports. We observe that using
in the one-round election as the baseline, we can see that the          multi-round leader election leads to fewer reports. Fewer
two-round algorithm reduces the number of                               reports imply fewer transmissions.
leaders. The average ratio is between 0.5 and 0.8.
    If the back-off time is fixed as a constant, as the number of
nodes increases, the number of leaders does not rise
dramatically. Less variation in the number of leaders implies
less variation in power consumption for sensor nodes. It also
means that simpler network management suffices for the
whole WSN. Figure 9 shows the number of leaders when the
back-off time is set to 25.

C. Data Aggregation
    The application we adopted for data aggregation is
temperature reading and collection. Sensor nodes report
                                                                           Figure 10: Traffic load of different sensor organizations.
temperature once per second and the sensed data follows a
normal distribution $N(\mu, \sigma^2)$, where μ is the mean temperature     In Figure 12, we observe that the number of leader reports sent
at a location, and σ = 0.5. The setting is consistent with the          by each leader under two-round election is larger than under one-round
accuracy of typical thermistors in sensor networks [10]. All            leader election. There are fewer leaders, but the total number of nodes
nodes are divided into several groups according to the                  is the same, so the load on each leader under two-round
experiments in the previous section. Table 2 lists notation and         leader election is heavier than under one-round leader election.
values used in the experiments in Tmote-based power                     The power consumption in each member report transmission is
set to be 52.2 mW. Figure 13 shows that the power consumed in                                   VI.     CONCLUSION
transmitting member reports is small for each one. Using two-          We summarize the differences between TTDD and HDAM
round leader election can reduce the power consumption while       in Table 3. In this paper, we proposed a data dissemination
completing the same sensing task. Besides, lower power             system based on WSN and WLAN, including query and sensory
consumption implies longer system lifetime. Because power          data forwarding, by introducing aggregators between sensors
consumption is a critical problem in WSNs, dividing                and users. We divided sensors into several groups by using
sensor nodes into several groups using multi-round leader          multi-round leader election. In order to achieve more efficient
election is one of the better solutions to prolong network lifetime.   transmission, we integrated a data aggregation model into our
                                                                   data dissemination model. We observed that two-round leader
                                                                   election performs better in terms of not only traffic load but
                                                                   also power consumption. We illustrated that data forwarding
                                                                   using hierarchical data aggregation model and group
                                                                   management is more efficient than creating a direct path from
                                                                   sensors to sinks in WSN.
            (a) Average number of member reports.

                                                                         Table 3: Comparison between HDAM and TTDD.
                                                                    Aspect                      HDAM                                TTDD
                                                                    Transmission architecture   WLAN and WSN                        WSN only
                                                                    Query forwarding model      1. User; 2. Aggregator              1. User; 2. Sensor node; 3. Data source
                                                                    Group management            Multi-round leader election         None
                                                                    Dissemination tree          WLAN routing                        Grid structure
                                                                    Data aggregation            Five-tier data aggregation model    Unknown
                                                                    Mobile user management      Handled by agent aggregators        Handled by normal sensors


(b) Average number of leader reports.
Figure 11: Average member reports and leader reports.

Figure 12: Average number of leader reports.

Figure 13: Average power consumption for member transmission.

[1] C. Intanagonwiwat, R. Govindan, D. Estrin, and J. Heidemann, “Directed Diffusion for Wireless Sensor Networking,” IEEE/ACM Transactions on Networking, vol. 11, pp. 529-551, February 2003.
[2] W. Heinzelman, J. Kulik, and H. Balakrishnan, “Negotiation-based protocols for disseminating information in wireless sensor networks,” Wireless Networks, vol. 8, pp. 169-185, March 2002.
[3] F. Ye, H. Luo, J. Cheng, S. Lu, and L. Zhang, “A two-tier data dissemination model for large scale wireless sensor networks,” Wireless Networks, vol. 11, pp. 161-175, January 2005.
[4] H. S. Kim and W. H. Kwon, “Dynamic Delay-Constrained Minimum-Energy Dissemination in Wireless Sensor Networks,” ACM Transactions on Embedded Computing Systems, vol. 4, pp. 679-706, August 2005.
[5] R. S. Chang and A. C. Lee, “An Energy Efficient Data Query Architecture for Large Scale Sensor Networks,” IEICE Transactions on Communications, vol. 2, pp. 217-227, February 2007.
[6] J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, and K. Pister, “System architecture directions for networked sensors,” Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IX), 2002.
[7] C. E. Perkins and P. Bhagwat, “Highly dynamic Destination-Sequenced Distance-Vector routing (DSDV) for mobile computers,” ACM SIGCOMM Computer Communication Review, Proceedings of the Conference on Communications Architectures, Protocols and Applications (SIGCOMM '94), vol. 24, pp. 234-244, October 1994.
[8] C. E. Perkins and E. M. Royer, “Ad-hoc On-Demand Distance Vector Routing,” Proceedings of the Second IEEE Workshop on Mobile Computing Systems and Applications (WMCSA '99), 1999.
[9] Q. Wang, M. Hempstead, and W. Yang, “A Realistic Power Consumption Model for Wireless Sensor Network Devices,” in Proceedings of the Third Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON), Reston, VA, September 2006.
[10] MTS/MDA Sensor and Data Acquisition Boards User Manual, May 2003.
