Author Topic: Data Aggregation and Query Processing in WSN  (Read 1894 times)

0 Members and 1 Guest are viewing this topic.

IJSER Content Writer

  • Sr. Member
  • ****
  • Posts: 327
  • Karma: +0/-1
    • View Profile
Data Aggregation and Query Processing in WSN
« on: August 21, 2011, 06:31:27 am »
Author : Ayoni Mukherjee, Sanjit Setua
International Journal of Scientific & Engineering Research Volume 2, Issue 6, June-2011
ISSN 2229-5518
Download Full Paper : PDF

Abstract - A wireless sensor network (WSN) has a wide range of important applications such as remote 
environment monitoring , target tracking etc. This has been enabled by the availability of sensors that are smaller, cheaper and  intelligent. These sensor nodes are equipped with wireless interfaces with which they can communicate with one another to form a network. The design of a WSN depends significantly on the application, and it considers factors like the environment, the application’s design objectives, cost, hardware and system constraints. The goal of our work is to present a comprehensive approach of  cluster based data aggregation and  processing of various aggregation queries in WSN. The aggregation operations in WSN  is  an  attractive  research topic  now -a- days. We have considered variety of queries involving aggregate functions and proposed an algorithm to calculate the ’MAX’ aggregate function .Our algorithm successfully eliminates  the limitations of existing algorithms to perform the same operation. We have used temperature sensing application for this purpose.
Index Terms - aggregation, base station(BS), Clusters , Cluster head (CH), MAX, nodes , query ,WSN.
AWSN  typically  has little or no infrastructure. It consists of a number of sensor nodes (few tens to thousand) working together to monitor a region to obtain data about the environment. Data gathering, processing and aggregation belong  to the management and control  services  of  WSN and  aggregation process enhances the overall  efficiency  of  WSN. Data aggregation is necessary for WSN applications which have large amount of data to send across the network. Data is collected from multiple sensor nodes and combined together to transmit to the base station. Here aggregated data is more important than individual readings. Depending on the importance of the data and criticality of the application a trade-off among different methods is necessary. Each of these methods addresses the issue of energy, robustness, scalability, accuracy and efficiency.

The general requirements for sensor network is presented in section 2 . The query processing and data aggregation technique are in section 3  and Section 4 concludes this paper.

Low Power Consumption : Sensor nodes are usually battery powered. Manual replacement of
batteries is often not possible, which makes sensor nodes dependent on their battery life.As a result, minimization of energy consumption becomes critical to achieve a robust system.
Scalability: WSNs with thousands of sensor nodes can become common. Although stationary
in many cases, mobile sensors  may also be used in the military or environmental application.The scalability of the system hence becomes a major concern. Self - Organization Ability: Wireless sensor networks are large in size and work in  the environment that causes the increase in   failures of individual nodes. Mechanisms are needed  for joining the network randomly, as well as reorganizing the network upon failures and hence, self-organization ability is essential. Resilience: Sensor nodes may unpredictably stop operating due to  environmental reasons   or to the battery consumption. Routing  protocols should cope with this eventuality so when a current-in-use sensor node fails, an alternative route could be discovered.
Device Heterogeneity: Although  most of  the civil applications of wireless sensor network rely
on homogenous sensor nodes,the introduction of different kinds of sensor nodes could report significant benefits. The use of sensor nodes with different  processors, transceivers,  power units  or sensing components may  improve  the characteristics of  the network.The scalability  of
the network, the energy drainage or the bandwidth are potential candidates to benefit from the heterogeneity of sensor nodes .
Querying Ability : Due  to the network size, the amount of the aggregated data may be too large for transmitting through the whole network.Because of that, the data collection in a particular region or from certain sensor nodes is needed instead. Certain sensor nodes need to   be
dedicated for collecting the data  from regions,creating a summary and forwarding information.Querying function is used to identify collection sensor nodes and the corresponding regions.

The  basic  aggregation functions[1], [2],[3],[4] are   MIN , MAX , AVERAGE , SUM  and  COUNT.  Among them  only  MIN and MAX  are  duplicate  insensitive and all others  are  duplicate  sensitive. It is considered that each  sensor node  has  an unique  identification number. For numbering the sensor nodes, a global  sequential  counter  is used ,starting from the root node of the  network tree .

3.1  Query processing
We have followed cluster based approach for data aggregation which indicates efficient data collection ,management and aggregation in WSN. Each cluster will consist of a number of small sensor nodes(SN), one cluster head (CH), one beacon node(BN) . There is a base-station (BS) which acts as an user interface for the WSN. Here BS is analogous to the root node. In reality we require different BSs for distributed processing and communication for any WSN.  The sensor nodes deployed in the wide area, form many cluster groups for efficient network organization. The processing of queries is independent within a cluster group. Different  clusters take part in parallel computation and finally  different CHs communicate with each other to produce the result required at the BS. The queries are provided at the base station(BS) through user interface or generated automatically at the BS. The  BS  broadcasts  the query to reach each  CH ( within the transmission range of BS ).Here the processing  of the queries is not same for all sensor nodes in the network. The responsibility is hierarchically distributed among  CH and other regular sensor nodes present in a cluster. Finally the aggregated result is obtained at the BS. We have proposed algorithms for communication and distribution of responsibilities for computation, result transmission , among the CHs  to provide a single valid result of an aggregation query to the BS . The main goal is efficient query processing and reduction of response time. Time required for result calculation  by a cluster group is predicted beforehand  by the depth of each cluster .We have considered  different types of queries:

i.   SELECT   MAX( temperature )
       FROM sensors ;

ii.   SELECT  AVERAGE( temperature )
      FROM sensors ;

iii.   SELECT  MAX( temperature)
FROM sensor
WHERE   room  in  ( SELECT   room
         WHERE    floor = ’3’ ) ;

iv.   SELECT  MAX( temperature)
 FROM sensor

3.2 MAX   function  implementation  in  WSN
It is known  that monotonic and  exemplary digest functions [5]can be computed efficiently by localized information exchanges between one hop neighbors.  MAX   and  MIN  functions are both  monotonic and exemplary  digests.
A digest function is denoted by f(v1, v2,_ _ _ ,vn),   where vi is the value contributed by  ith sensor node. Additionally,  f  is  decomposable by a function g , such that :
f(v1, v2,_ _ _ ,vn), = g(f(v1, v2,_ _ _ ,vk),  f(vk+1, _ _ _ ,vn)) 
A decomposable digest function is one in which the final result can be calculated from partial results. The values v may either be scalars or vectors. The problem of digest computation is that each node i provides a value vi as its contribution to the digest function f, where vi may change over time. The goal of the digest computation mechanism is for each sensor node in the network to contain a continuous estimate for the current value of   f.
A digest function is monotonic if  only if, when two partial results r1 and  r2 are combined by a function r = g(r1, r2), such that the result r satisfies  the relation for  all  i = 1,  2.. , r >= ri for an ordering relationship > =. It is exemplary if the final result can be determined from one single contribution value.
Here  in-network aggregation [4] approach  has been  considered  instead  of  naive centralized  approach. Each sensor node computes a partial result of the digest function and passes that result to other neighboring sensor nodes. In-network aggregation has better energy-efficiency characteristics, communication overhead is less and the computation is evenly distributed.

Read More: Click here...