Hierarchical Data Aggregation Model and Group Management for Wireless Sensor Networks
Sheng-Tzong Cheng, Red-Tom Lin
Computer Science Information Engineering, National Cheng Kung University, Tainan, Taiwan
Networks and Multimedia Institute, Institute for Information Industry, Tainan, Taiwan
Abstract—Disseminating data generated by sensors to sinks at different locations is one of the essential functions of wireless sensor networks (WSNs). Recent research has focused largely on how a source node forwards sensing data directly to the sink. In this paper, we propose a novel dissemination scheme that achieves five-layer data aggregation by introducing aggregator layers between sensors and sinks. The five aggregation layers, from bottom to top, are the raw data aggregation, in-node data aggregation, group aggregation, aggregator aggregation, and agent aggregation layers. In addition, the proposed network architecture for data delivery comprises two different transfer tiers: the lower tier is a WSN and the upper tier is a WLAN. The simulation results show that our scheme provides excellent throughput and reduces power consumption in WSNs.

Keywords - WSN; Hierarchical Aggregation; data dissemination

I. INTRODUCTION

Sensor nodes perform distributed sensing: thousands, or even tens of thousands, of small sensors are distributed over a vast field to obtain fine-grained, high-precision sensing data. In most cases, sensor nodes remain stationary at fixed locations after deployment. They collect useful data such as acoustic, light, temperature, and seismic measurements, and then forward the data back to the base station or the mobile sinks. These sensors are typically powered by limited disposable batteries and communicate with each other over wireless channels. Large-scale sensor networks can be deployed in adverse physical environments such as toxic urban locations, battlefields, or remote geographic regions.¹

A user asks for sensor data by sending requests, and the data that matches a request is then “delivered” toward that user. There are usually a large number of users interested in collecting data. Since the users are spread over the sensor field, transferring data from sensors to users is a critical problem.

Motivated by energy-efficiency and scaling requirements, this paper proposes a new data dissemination architecture for WSNs. We introduce HDAM, a Hierarchical Data Aggregation Model, to address this problem. Instead of propagating queries from each user to all the sensors, HDAM exploits a hierarchical framework so that the data are aggregated along the way back to the sink and through different network structures.

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 describes the main design of the heterogeneous network architecture and the five-layer data aggregation. Section 4 analyzes the communication overhead and power consumption. Section 5 presents simulation results that evaluate the performance of our design. Section 6 concludes the paper.

Figure 1: Wireless sensor network architecture.

¹ This research was supported by the Applied Information Services Development & Integration project of Institute for Information Industry and sponsored by MOEA, R.O.C.

II. RELATED WORK

Distributed sensor networks have received much attention in recent years. Energy-efficient data dissemination is one of the significant research issues being addressed. Work related to this paper includes Directed Diffusion [1], SPIN [2], TTDD [3], and DEED [4]. Here we survey TTDD and DEED.

A. TTDD

TTDD is a grid-based data dissemination model. It provides scalable and efficient data delivery to multiple mobile sinks. Each source proactively builds a grid structure and disseminates data to sinks along the grid lines. It does not optimize the path from the source to the sinks. Sinks can receive data on the move by flooding queries within their local cells. Queries from mobile sinks are confined to their local cells only. This localization avoids the unnecessary energy consumption and network overhead of global flooding by multiple sinks.
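The local-cell confinement that TTDD relies on can be illustrated with a minimal sketch. It assumes square grid cells of side `cell_size` on a flat (x, y) coordinate system; the function names and parameters are hypothetical and are not taken from the TTDD specification.

```python
import math


def cell_of(x, y, cell_size):
    """Return the (column, row) index of the square grid cell containing (x, y)."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def query_reaches(sink, node, cell_size):
    """Return True if a query flooded by `sink` may reach `node`.

    Assumption: flooding is confined to the sink's own grid cell, so a node
    only receives the query when it lies in the same cell as the sink.
    """
    return cell_of(*sink, cell_size) == cell_of(*node, cell_size)


if __name__ == "__main__":
    sink = (12.0, 7.0)
    print(query_reaches(sink, (18.0, 3.0), cell_size=20.0))  # same cell
    print(query_reaches(sink, (25.0, 3.0), cell_size=20.0))  # neighbouring cell
```

Because the cell index depends only on a node's own coordinates, each node can decide locally whether to rebroadcast a query, which is what keeps the flooding cost independent of the total network size.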
B. DEED

DEED is a dynamic delay-constrained minimum-energy dissemination protocol for WSNs, which minimizes energy consumption while satisfying delay constraints. DEED uses a greedy multi-forwarding protocol as its routing protocol. The construction of the dissemination tree (d-tree) is the key function in DEED. The d-tree finds an energy-minimizing path that satisfies the delay constraints from one source to multiple mobile sinks. To manage the delay-constrained paths for mobile sinks, DEED refreshes paths when sinks move, just as delay constraints are about to be violated. In other words, the d-tree is updated in a distributed way without regenerating the tree from scratch.

III. NETWORK MODEL

Most recent research works on finding a path from a data source to a sink, focusing on optimal paths between sources and destinations. However, when the number of data sources and sinks becomes large, the cost of building an individual path from each data source to each sink becomes huge because of the extreme power consumption and network overhead. To deal with this problem, we design a novel data dissemination scheme with distributed data aggregation, called HDAM (Hierarchical Data Aggregation Model).

The rest of this section presents the basic design of HDAM, which works with the following network settings:

- A vast field is covered by a large number of homogeneous sensor nodes, which communicate with each other through a short-range radio medium.
- Each sensor is aware of its own location (using GPS or other locating techniques).
- Each sensor performs the sensing task periodically. Once the time period expires, the sensor gets the sensing data and processes it for further reporting. Every sensor has the same period length, but the periods expire at different times because the sensors boot at different times.
- Both sensor nodes and aggregators are stationary at different fixed locations.

The above assumptions are consistent with the models for real sensors being built, such as Berkeley Motes.

A. Heterogeneous network architecture

HDAM consists of five layers of data aggregation. These five layers are mapped to two different network tiers, WLAN and WSN. The mapping and the network structure are shown in Figure 2.

Figure 2: Two-tier network architecture.

The WLAN is deployed over the WSN in our network model. In the upper tier of Figure 2, an aggregator communicates with another aggregator by using a WLAN transmission protocol. The bottom tier is the WSN, in which transmissions happen among sensors, group leaders, and aggregators. The proposed architecture improves the efficiency of data delivery and reduces the probability of collision when a transmission between a sensor and an aggregator and a transmission between that aggregator and another aggregator happen at the same time.

To forward sensing data back to the agent aggregator, a dissemination tree (d-tree) is needed. Routing algorithms designed for WLAN can be divided into two categories: proactive routing protocols (e.g., DSDV [7]) and reactive routing protocols (e.g., AODV [8]). Because the aggregators are immobile, a proactive routing protocol is more suitable for the construction of the d-tree. Once the d-tree is built after the deployment of the aggregators, it is reconstructed only if faulty aggregators block the return path. This d-tree is used for both query and data forwarding. The following section explains how nodes are divided into groups in our network model.

B. Leader election

The essential operation in sensor clustering is to select a set of group leaders among the sensors in the network and to cluster the rest of the nodes around them. Group leaders are responsible for coordinating the nodes within their groups and for communicating with aggregators on behalf of their groups. With clustering, nodes transmit sensor data to their group leaders. A leader aggregates the received sensor data and forwards the result to the aggregator. Network lifetime is prolonged by (1) reducing the number of nodes contending for channel access, and (2) routing through an overlay network among group leaders, which has a relatively small network diameter.

The goal of our approach is to prolong network lifetime. We use a multi-round leader election to decide the leaders. Multi-round leader election includes an initialization phase that pre-elects a portion of the nodes as candidates (the potential competitors for leadership); the other nodes are excluded. In the next round, only candidates take part in the election. Figure 3 demonstrates the operation of the two-round leader election procedure. The left side is the first round, and the right side finds the final leaders. After the first round, some nodes become candidates and take part in the next round. The next round chooses some of the candidates as leaders.

C. Query forwarding

Our query forwarding is based on the d-tree built in the WLAN to ensure efficiency and robustness. When a sink (potentially mobile) wants to get data, it floods a query within an area to discover nearby aggregators. The sink specifies a maximum radius, which is usually its longest radio range. The query flooding stops at points that are about to exceed the maximum radius from the sink. If no aggregator is found to be the agent aggregator after the query forwarding, the
sink should move around until it finds at least one aggregator, so that it can attach itself to the network and get the data it is interested in.

Figure 3: Two-round leader election procedure.

Figure 4: Two users connecting to two aggregators.

When an aggregator becomes the sink's agent upon receiving the query, it is called the "agent aggregator". The agent aggregator then forwards this query to its neighbor aggregators. Each upstream aggregator in turn forwards the query to its neighbors, repeatedly, until all of the aggregators have received the query coming from the agent aggregator. Therefore, when aggregators have to report the sensed data, they know which aggregators are set as destinations. In other words, non-agent aggregators need to know only which aggregators are interested in their sensing task; they need not care about either the number of sinks that the agent aggregators are responsible for or the details of the sinks (for example, their locations). Because all information about a sink is managed only by its agent aggregator, a sink that moves too far away from its agent aggregator should find a new one.

Figure 4 is an example of query forwarding. User 1 has already connected to its agent aggregator in the lower-right corner. Then a new user, User 2, comes into the sensor field. It uses flooding to find its aggregator and sends a query to its agent. The agent aggregator then spreads this query to all other aggregators through the d-tree in the WLAN. The dotted lines show the query's trail. The following two sections discuss the data forwarding from sensor nodes to users.

D. Data forwarding mechanism

Once a sensor has generated sensory data, the data is transmitted from sensors to sinks through our data forwarding mechanism. The mechanism includes links from (1) node to node, (2) node to aggregator, (3) aggregator to aggregator, and (4) aggregator to sink. Figure 5 shows the architecture of our data forwarding mechanism, including the different types of reports. By reports, we mean the processed information about sensed data after aggregation. The sensory data generated by each sensor is sent to the leader. The leader then sends a leader report to the aggregator. The aggregator transmits this report to all “agent aggregators” through the d-tree constructed in the WLAN. The agent collects the reports for each sink connected to it and sends them to that sink.

Figure 5: Data forwarding architecture.

Figure 6: Five-layer data aggregation model.

E. Five-layer data aggregation

Usually, the data aggregation ratio of a given system is the size of the original data divided by the size of the aggregated data. Useful information could be lost if data aggregation is done only at the node level. If data aggregation is done only at a central site (e.g., an aggregator or base station), a sensor network may waste energy in transmitting data and suffer from network congestion and message losses. To balance information loss against energy efficiency, we propose a five-layer data aggregation architecture, as shown in Figure 6. The first layer (T1) is the raw data aggregation layer, which takes the sensor readings from the individual sensors. The second layer is the in-node aggregation layer, at which the sensor readings obtained at T1 are processed. Each group is represented by a group leader at the third layer. The leader aggregates all the reports from the member nodes and sends a leader report, consisting of representative data for the group, to the aggregator. The fourth layer is the aggregator aggregation layer. The aggregator processes the leader reports from each leader according to the queries from sinks. Finally, the fifth layer is present only at the agent aggregators. An agent aggregator aggregates the individual reports to obtain a final report for each sink connected to it.

1) L1: raw data aggregation

The sensor data is the raw input to the sensor network's computation workflow. It provides the basis for processing any event in the network. In the rest of this section, we introduce different types of sensors and sensor data on Tmote, and then discuss possible aggregation methods.

We call the sensor reading at a specific time on a specific node a sample point. When the sensor network starts, each sensor node in the network produces a sequence of sample points. All the sample points produced by all nodes form a
global sample set. The global sample set is the complete information about what happens in the network. If all the nodes report their sample points to the sinks, the sinks can collect the global sample set and perform computation on it. However, because of resource and energy limitations, it is better to deliver the same information with fewer bits. One potential way to reduce this enormous quantity is to keep only a portion of the nodes active. This method is similar to SMAC: only active nodes perform the sensing task, which prolongs system lifetime and reduces the probability of collision and the amount of redundant information. Another, simpler way is to take fewer sample points: within the same time period, the sensor nodes generate data at a lower frequency.

2) L2: in-node aggregation

Data compressed in T1 is still too large if a transmission happens at each sample time. One potential method of aggregating the sensor data is to compute the differences between consecutive samples and send out only the differences. The advantage is that a difference usually lies in a small range and can be represented with fewer bits than the original sensor data. Another way is to merge identical data and send out representative data, for example, a summary or an average. The aggregation method used in this layer depends on the type of sensing task, and application types may be taken into account in the aggregation algorithm.

3) L3: group aggregation

A simple way for sensors to report data is to send the detection results to a centralized base station. In our design, the base stations are the aggregators. However, such a centralized scheme is inefficient in energy consumption and latency. It results in excessive power consumption because multiple results are transmitted to a centralized aggregator.

To avoid this issue, we divide nodes into groups. Each group leader acts as the reporter to the aggregator. Leaders are pre-elected in the initialization phase. Our system uses a static leader election; the election takes place again only when a leader node fails. Data is processed by the group leaders, which send only the aggregation results to the aggregator.

4) L4: aggregator aggregation

After receiving the leader reports from the leader nodes, the aggregator needs to perform further aggregation so that the collected information forms a semantic pattern. The aggregator bridges the WLAN and the WSN. Its aggregation functionality can be summarized as follows. First, the aggregator takes the input from the group leaders through the WSN. Second, according to the queries received from the agent aggregator, the aggregator filters out useless messages. Third, the aggregator uses the remaining messages to provide extra information for the sinks according to the queries. The aggregator sends this information back to the agent aggregators through the d-tree in the WLAN.

5) L5: agent aggregation

The agent aggregator receives messages from the aggregators. The agent disassembles them and classifies them by sink ID. Once a package for one sink is complete, it is sent directly to that sink. In other words, a message from one aggregator may match more than one sink's query. The agent aggregator knows which sinks are interested in this message and buffers one copy for each interested sink. When the number of buffered copies exceeds the threshold, the agent sends the “package” to that sink.

Recall the four types of reports in the data forwarding architecture illustrated in the previous section. First, the member report is generated by sensors after T1 and T2 aggregation. Second, the group leader receives the member reports and generates a leader report using T3 data aggregation. Third, after the aggregator collects the leader reports from the leaders, it packages them to generate an aggregator package using T4 data aggregation. Finally, the agent aggregator periodically sends agent reports to users after processing all packages with T5 data aggregation.

IV. PERFORMANCE ANALYSIS

A. Power consumption model

In this section, we derive a power consumption model for the communication subsystem in a WSN device. The physical communication rate is constant and assumed to be m bits per second in the model. The energy consumed for sending a packet of m bits over a one-hop wireless link can be expressed as

$P(m) = P_T + P_R = P_T(m, d) + P_R(m)$

where $P(m)$ is the total power consumption, and $P_T$ and $P_R$ are the power consumption for transmitting and receiving, respectively. $P_T$ and $P_R$ can be extended to $P_T(m, d)$ and $P_R(m)$. $P_T(m, d)$ is a function of both the transfer range and the number of transmitted bits, while $P_R(m)$ depends only on the number of bits.

Table 1: Power Consumption Parameters.
Symbol        Description
$P_{TB}$      Power consumption in baseband DSP circuit for transmitting
$P_{RB}$      Power consumption in baseband DSP circuit for receiving
$P_{TRF}$     Power consumption in front-end for transmitting
$P_{RRF}$     Power consumption in front-end for receiving
$P_L$         Power consumption of LNA for receiving
$P_{TA}(d)$   Power consumption of the power amplifier

Figure 7 illustrates the internal structure of the communication module found in a typical WSN node and defines the power consumption of each component.

Figure 7: Communication Module Structure.

Based on the structure and the power consumption of each component, $P_T(m, d)$ and $P_R(m)$ can be written as

$P_T(m, d) = P_{TB} + P_{TRF} + P_{TA}(d) = P_{T0} + P_{TA}(d)$

$P_R(m) = P_{RB} + P_{RRF} + P_L = P_{R0}$

where $P_{TA}(d)$ is a function of both the transmission range (d) and the number of transmitted bits. α is the path loss exponent, whose value varies from 2 (for free space) to 4 (for multi-path channel models). Since $P_{TB}$ and $P_{TRF}$ do not depend on the transmission range, their sum can be viewed as a constant, $P_{T0}$. Similarly, $P_{RB}$ and $P_{RRF}$ can be viewed as constants. Since we assume that the LNA is properly designed and biased to provide the sensitivity needed for reliable reception, $P_L$ is also a constant. The sum of $P_{RB}$, $P_{RRF}$, and $P_L$ can therefore be modeled as a constant, $P_{R0}$.
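The power model of this section can be evaluated numerically with a short sketch. The text does not give a closed form for $P_{TA}(d)$, so the code below assumes $P_{TA}(d) = k \cdot d^{\alpha}$ with a hypothetical amplifier coefficient `k_amp`; the class and function names are illustrative, and all numeric values are placeholders rather than measured parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RadioParams:
    """Component powers from Table 1, in milliwatts (placeholder values)."""
    p_tb: float         # baseband DSP circuit, transmitting
    p_trf: float        # front-end, transmitting
    p_rb: float         # baseband DSP circuit, receiving
    p_rrf: float        # front-end, receiving
    p_l: float          # LNA, receiving
    k_amp: float        # ASSUMPTION: amplifier coefficient, P_TA(d) = k_amp * d**alpha
    alpha: float = 2.0  # path loss exponent, 2 (free space) to 4 (multi-path)


def p_ta(params: RadioParams, d: float) -> float:
    """Power amplifier consumption at range d (assumed power-law form)."""
    return params.k_amp * d ** params.alpha


def p_t(params: RadioParams, d: float) -> float:
    """Transmit power: P_T = P_TB + P_TRF + P_TA(d) = P_T0 + P_TA(d)."""
    p_t0 = params.p_tb + params.p_trf
    return p_t0 + p_ta(params, d)


def p_r(params: RadioParams) -> float:
    """Receive power: P_R = P_RB + P_RRF + P_L = P_R0 (independent of range)."""
    return params.p_rb + params.p_rrf + params.p_l


def p_total(params: RadioParams, d: float) -> float:
    """Total power over one hop: P = P_T + P_R."""
    return p_t(params, d) + p_r(params)


if __name__ == "__main__":
    radio = RadioParams(p_tb=1.0, p_trf=2.0, p_rb=1.5, p_rrf=2.5, p_l=0.5, k_amp=0.01)
    print(p_total(radio, d=10.0))  # 3.0 + 0.01 * 10**2 + 4.5 = 8.5
```

Keeping $P_{T0}$ and $P_{R0}$ as range-independent constants makes the dependence on distance explicit: only the amplifier term grows with d, so shortening hops (for example, by reporting to a nearby group leader) is the one lever this model offers for reducing transmit power.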
V. EXPERIMENTAL RESULTS

We use TOSSIM as the simulator for evaluating the performance. In this section, we give the settings of the control parameters and illustrate the performance of group management and data aggregation.

TOSSIM is a discrete event simulator for TinyOS WSNs. Instead of compiling a TinyOS application onto a Tmote, we compile it into TOSSIM modules running on a PC. This allows users to debug, test, and analyze in a controlled and repeatable environment. As TOSSIM runs on a PC, we can examine our TinyOS code using debuggers and other development tools.

B. Multi-Round Leader Election

We evaluate multi-round leader election by comparing the number of leaders produced by a one-round leadership resolution and by a two-round leader election. We implement the one-round and two-round leader election algorithms and vary the number of nodes competing for leadership from 25 to 70.

First, we examine how the maximum back-off time affects the number of leaders. In Figure 8, taking the number of leaders in the one-round election as the baseline, we can see that the two-round algorithm reduces the number of leaders. The average ratio is between 0.5 and 0.8.

If the back-off time is fixed as a constant, the number of leaders does not rise dramatically as the number of nodes increases. Less variation in the number of leaders implies less variation in power consumption for sensor nodes. It also means that simpler network management is required for the whole WSN. Figure 9 shows the number of leaders when the back-off time is set to 25.

Figure 8: Leaders of one and two round leader election.

Figure 9: Leaders (fixed nodes and different back off time).

C. Data Aggregation

The application we adopt for data aggregation is temperature reading and collection. Sensor nodes report temperature once per second, and the sensed data follows a normal distribution $N(\mu, \sigma^2)$, where μ is the mean temperature at a location and σ = 0.5. The setting is consistent with the accuracy of typical thermistors in sensor networks. All nodes are divided into several groups according to the experiments in the previous section. Table 2 lists the notation and values used in the Tmote-based power consumption experiments.

Table 2: Symbols used in data aggregation.
Term        Description                              Unit   Value
$L_{GL}$    The length of Group leader report        Byte   9
$L_{GM}$    The length of Group member report        Byte   2
$L_{ind}$   The length of individual report          Byte   2

In Figure 10, we can observe that using the election results to form groups reduces the traffic load. Moreover, using two-round leader election implies more efficient transmission. In Figure 11, we divide the total number of messages into member reports and leader reports. We observe that using multi-round leader election leads to fewer reports, and fewer reports imply fewer transmissions.

Figure 10: Traffic load of different sensor organizations.

In Figure 12, we observe that the number of leader reports sent by each leader is larger than with one-round leader election. There are fewer leaders, but the total number of nodes is the same, so the load on each leader under two-round leader election is heavier than under one-round leader election. The power consumption in each member report transmission is
set to be 52.2 mW. Figure 13 shows that the power consumed by each node in transmitting member reports is small. Using two-round leader election reduces the power consumption while completing the same sensing task. Moreover, lower power consumption implies a longer system lifetime. Because power consumption is a critical problem in WSNs, dividing sensor nodes into several groups using multi-round leader election is one of the better solutions for prolonging network lifetime.

VI. CONCLUSION

We summarize the differences between TTDD and HDAM in Table 3. In this paper, we proposed a data dissemination system based on WSN and WLAN, including query and sensory data forwarding, by introducing aggregators between sensors and users. We divided sensors into several groups by using multi-round leader election. To achieve more efficient transmission, we integrated a data aggregation model into our data dissemination model. We observed that two-round leader election performs better in terms of not only traffic load but also power consumption. We illustrated that data forwarding using the hierarchical data aggregation model and group management is more efficient than creating a direct path from sensors to sinks in a WSN.
Table 3: The comparisons between HDAM and TTDD.
Aspects                     HDAM                TTDD
Transmission architecture   WLAN and WSN        WSN only
Query forwarding model                          2. Sensor node 3. data source
Group management                                None
Dissemination tree                              grid structure
Data aggregation                                Unknown
Mobile user management      handled by agent    handled by normal

(a) Average number of member reports.
(b) Average number of leader reports.

Figure 11: Average member reports and leader reports.

Figure 12: Average number of leader reports.

Figure 13: Average power consumption for member transmission.

REFERENCES

[1] C. Intanagonwiwat, R. Govindan, D. Estrin and J. Heidemann, “Directed Diffusion for Wireless Sensor Networking,” IEEE/ACM Transactions on Networking, vol. 11, pp. 529-551, February 2003.
[2] W. Heinzelman, J. Kulik and H. Balakrishnan, “Negotiation-based protocols for disseminating information in wireless sensor networks,” Wireless Networks, vol. 8, pp. 169-185, March 2002.
[3] F. Ye, H. Luo, J. Cheng, S. Lu and L. Zhang, “A two-tier data dissemination model for large scale wireless sensor networks,” Wireless Networks, vol. 11, pp. 161-175, January 2005.
[4] H. S. Kim and W. H. Kwon, “Dynamic Delay-Constrained Minimum-Energy Dissemination in Wireless Sensor Networks,” ACM Transactions on Embedded Computing Systems, vol. 4, pp. 679-706, August 2005.
[5] R. S. Chang and A. C. Lee, “An Energy Efficient Data Query Architecture for Large Scale Sensor Networks,” IEICE Transactions on Communications, vol. 2, pp. 217-227,
[6] J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler and K. Pister, “System architecture directions for networked sensors,” Proceedings of International Conference on Architectural Support for Programming Language and Operating Systems (ASPLOS-IX), 2002.
[7] C. E. Perkins and P. Bhagwat, “Highly dynamic Destination-Sequenced Distance-Vector routing (DSDV) for mobile computers,” ACM SIGCOMM Computer Communication Review, Proceedings of the conference on Communications architectures, protocols and applications, SIGCOMM '94, vol. 24, pp. 234-244, October 1994.
[8] C. E. Perkins and E. M. Royer, “Ad-hoc On-Demand Distance Vector Routing,” Proceedings of the Second IEEE Workshop on Mobile Computer Systems and Applications, WMCSA '99, 1999.
[9] Q. Wang, M. Hempstead and W. Yang, “A Realistic Power Consumption Model for Wireless Sensor Network Devices,” in Proceedings of the Third Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON), Reston, VA, September 2006.
[10] MTS/MDA Sensor and Data Acquisition Boards User Manual, May 2003.