   Implementation and Research Issues in Query Processing
              for Wireless Sensor Networks


      Wei Hong                       Sam Madden
Intel Research, Berkeley             MIT
whong@intel-research.net             madden@csail.mit.edu

                    ICDE 2004
                                           1
                          Motivation
• Sensor networks (aka sensor webs, emnets) are here
   – Several widely deployed HW/SW platforms
       • Low power radio, small processor, RAM/Flash
   – Variety of (novel) applications: scientific, industrial, commercial
   – Great platform for mobile + ubicomp experimentation


• Real, hard research problems to be solved
   – Networking, systems, languages, databases

• We will summarize:
   – The state of the art
   – Our experiences building TinyDB
   – Current and future research directions

[Image: Berkeley Mote]
                                                                       2
              Sensor Network Apps
• Habitat monitoring: storm petrels on Great Duck Island,
  microclimates on James Reserve.
• Earthquake monitoring in shake-test sites.
• Vehicle detection: sensors along a road collect data
  about passing vehicles.

[Images: deployment photos; traditional monitoring apparatus.]
                                                                3
             Declarative Queries
• Programming Apps is Hard
  –   Limited power budget
  –   Lossy, low bandwidth communication
  –   Require long-lived, zero admin deployments
  –   Distributed Algorithms
  –   Limited tools, debugging interfaces
• Queries abstract away much of the complexity
  – Burden on the database developers
  – Users get:
       • Safe, optimizable programs
       • Freedom to think about apps instead of details

                                                          4
TinyDB: Prototype declarative
query processor
• Platform: Berkeley Motes + TinyOS
• Continuous variant of SQL: TinySQL

• Power- and data-acquisition-based in-network
  optimization framework
• Extensible interface for aggregates, new
  types of sensors

                                        5
                Agenda
• Part 1 : Sensor Networks (50 Minutes)
  – TinyOS
  – NesC
• Short Break
• Part 2: TinyDB (1 Hour)
  – Data Model and Query Language
  – Software Architecture
• Long Break + Hands On
• Part 3: Sensor Network Database Research
  Directions (1 Hour, 10 Minutes)

                                          6
               Part 1
• Sensornet Background
• Motes + Mote Hardware
  – TinyOS
  – Programming Model + NesC
• TinyOS Architecture
  – Major Software Subsystems
  – Networking Services


                                7
    A Brief History of Sensornets
• People have used sensors for a long time
• Recent CS history:
  – (1998) Pottie + Kaiser: radio-based networks of sensors
  – (1998) Pister et al: Smart Dust
     • Initial focus on optical communication
     • By 1999, radio-based networks, COTS Dust, “Motes”
  – (1999) Estrin + Govindan: ad-hoc networks of sensors
  – (2000) Culler/Hill et al: TinyOS + Motes
  – (2002) Hill / Dust: SPEC, mm^3 scale computing
• UCLA / USC / Berkeley continue to lead research
  – Many other players now
  – TinyOS/Motes as most common platform
• Emerging commercial space: Crossbow, Ember, Dust,
  Sensicast, Moteiv, Intel

                                                           8
               Why Now?
• Commoditization of radio hardware
  – Cellular and cordless phones, wireless
    communication


• Low cost -> many/tiny -> new applications!

• Real application for ad-hoc network
  research from the late ’90s

• Coming together of EE + CS communities

                                               9
             Motes

Mica Mote and Mica2Dot:
• 4 MHz, 8-bit Atmel RISC uProc
• 40 kbit radio
• 4 KB RAM, 128 KB program flash, 512 KB data flash
• AA battery pack
• Based on TinyOS

                                       10
                   History of Motes
• Initial research goal wasn’t hardware
  – Has since become more of a priority with emerging
    hardware needs, e.g.:
     • Power consumption
     • (Ultrasonic) ranging + localization
          – MIT Cricket, NEST Project
     • Connectivity with diverse sensors
          – UCLA sensor board
  – Even so, now on the 5th generation of devices
     •   Costs down to ~$50/node (Moteiv, Dust)
     •   Greatly improved radio quality
     •   Multitude of interfaces: USB, Ethernet, CF, etc.
     •   Variety of form factors, packages

                                                            11
       Motes vs. Traditional
           Computing
• Lossy, Ad-hoc Radio Communication
• Sensing Hardware
• Severe Power Constraints




                                     12
           Radio Communication
• Low Bandwidth Shared Radio Channel
    – ~40 kbit/s on motes
    – Much less in practice
       • Encoding, Contention for Media Access (MAC)
• Very lossy: 30% base loss rate
    – Argues against TCP-like end-to-end retransmission
       • And for link-layer retries
• Generally, not well behaved



                                                              13
From Ganesan, et al. “Complex Behavior at Scale.” UCLA/CSD-TR 02-0013
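The retry argument above can be made concrete with a little arithmetic. This is an illustrative sketch only (the hop count and retry limit are invented, not from the tutorial): with a 30% per-hop loss rate, end-to-end delivery with no retries collapses after a few hops, while a handful of link-layer retries per hop keeps it high.

```python
# Illustrative only: why per-hop (link-layer) retries beat end-to-end
# retransmission when every hop loses ~30% of packets.

def end_to_end_success(loss_rate, hops):
    """Probability a packet crosses all hops with no retries anywhere."""
    return (1.0 - loss_rate) ** hops

def per_hop_success(loss_rate, hops, retries):
    """Probability when each hop independently retries up to `retries` extra times."""
    hop_ok = 1.0 - loss_rate ** (retries + 1)
    return hop_ok ** hops

# 30% base loss rate over a 5-hop path:
print(end_to_end_success(0.3, 5))   # ~0.17 -- most packets die
print(per_hop_success(0.3, 5, 3))   # ~0.96 -- with 3 link-layer retries per hop
```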
               Types of Sensors
• Sensors attach via daughtercard
• Weather
  – Temperature
  – Light x 2 (high-intensity PAR; low-intensity, full-spectrum)
  – Air pressure
  – Humidity
• Vibration
  – 2- or 3-axis accelerometers
• Tracking
  – Microphone (for ranging and acoustic signatures)
  – Magnetometer
• GPS

                                                     14
        Power Consumption and
               Lifetime
• Power typically supplied by a small battery
  – 1000-2000 mAh
  – 1 mAh = 1 milliamp of current for 1 hour
     • Assumes typical voltage and current drain rates
  – Power = Watts (W) = Amps (A) * Volts (V)
  – Energy = Joules (J) = W * time

• Lifetime and power consumption vary by application
  – Processor: 5 mA active, 1 mA idle, 5 uA sleeping
  – Radio: 5 mA listening, 10 mA transmit/receive, ~20 ms/packet
  – Sensors: 1 uA to 100s of mA; 1 us to 1 s per sample
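These figures support a quick back-of-the-envelope lifetime estimate. The sketch below is illustrative only; the battery capacity, current draws, and duty cycle are the nominal numbers from this slide, not measurements.

```python
# Back-of-the-envelope mote lifetime from nominal slide numbers.

BATTERY_MAH = 2000.0   # AA pack, upper end of the 1000-2000 mAh range

def lifetime_hours(avg_current_ma):
    """Hours until the battery is drained at a given average current."""
    return BATTERY_MAH / avg_current_ma

# Always-on: processor active (5 mA) + radio listening (5 mA)
always_on = lifetime_hours(5.0 + 5.0)        # ~200 hours (~8 days)

# 1% duty cycle: awake 1% of the time, sleeping (5 uA = 0.005 mA) otherwise
avg = 0.01 * (5.0 + 5.0) + 0.99 * 0.005
duty_cycled = lifetime_hours(avg)            # ~19,000 hours (~2 years)

print(round(always_on), round(duty_cycled))
```

This is why the power-management machinery later in the tutorial matters: the two-orders-of-magnitude lifetime gap comes almost entirely from keeping the radio and processor asleep.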

                                                           15
    Energy Usage in a Typical Data Collection Scenario
• Each mote collects 1 sample of (light, humidity) data
  every 10 seconds, forwards it
• Each mote can “hear” 10 other motes
• Process:
  – Wake up, collect samples (~1 second)
  – Listen to radio for messages to forward (~1 second)
  – Forward data

[Charts: power consumption breakdown by hardware element
 (radio, sensors, processor) and energy breakdown by
 processing phase (idle, waiting for radio, waiting for
 sensors, processing, sending)]
                                                        16
   Sensors: Slow, Power Hungry, Noisy

[Chart: Time of Day vs. Light (Lux), 20:09 through 1:26,
 comparing a calibrated chamber sensor against Sensor 69
 (median of last 10 readings)]
                                                        17
  Programming Sensornets:
         TinyOS
• Component Based Programming Model
• Suite of software components
  –   Timers, clocks, clock synchronization
  –   Single and multi-hop networking
  –   Power management
  –   Non-volatile storage management




                                              18
   Programming Philosophy
• Component Based
  – “Wiring” together components via interfaces,
    configurations
• Split-Phased
  – Nothing blocks, ever.
  – Instead, completion events are signaled.
• Highly Concurrent
  – Single thread of “tasks”, posted and scheduled
    FIFO
  – Events “fired” asynchronously in response to
    interrupts.

                                                 19
                            NesC
• C-like programming language with component model
  support
  – Compiles into GCC-compatible C
• 3 types of files:
  – Interfaces
     • Set of function prototypes; no implementations or variables
  – Modules
     • Provide (implement) zero or more interfaces
     • Require zero or more interfaces
     • May define module variables, scoped to functions in module
  – Configurations
     • Wire (connect) modules according to requires/provides
       relationship

                                                                20
       Component Example: Leds

module LedsC {
  provides interface Leds;
}
implementation
{
  uint8_t ledsOn;

  enum {
    RED_BIT = 1,
    GREEN_BIT = 2,
    YELLOW_BIT = 4
  };

  ….
  async command result_t Leds.redOn() {
    dbg(DBG_LED, "LEDS: Red on.\n");
    atomic {
      TOSH_CLR_RED_LED_PIN();
      ledsOn |= RED_BIT;
    }
    return SUCCESS;
  }
  ….
}
                                                                   21
         Configuration Example
configuration CntToLedsAndRfm {
}
implementation {
  components Main, Counter, IntToLeds, IntToRfm, TimerC;

    Main.StdControl -> Counter.StdControl;
    Main.StdControl -> IntToLeds.StdControl;
    Main.StdControl -> IntToRfm.StdControl;
    Main.StdControl -> TimerC.StdControl;
    Counter.Timer -> TimerC.Timer[unique("Timer")];
    IntToLeds <- Counter.IntOutput;
    Counter.IntOutput -> IntToRfm;
}

                                                           22
                    Split Phase Example
module IntToRfmM { … }
implementation { …
  command result_t IntOutput.output(uint16_t value) {
    IntMsg *message = (IntMsg *)data.data;
    if (!pending) {
      pending = TRUE;
      message->val = value;
      atomic {
        message->src = TOS_LOCAL_ADDRESS;
      }
      if (call Send.send(TOS_BCAST_ADDR,
                         sizeof(IntMsg), &data))
        return SUCCESS;
      pending = FALSE;
    }
    return FAIL;
  }

  event result_t Send.sendDone(TOS_MsgPtr msg,
                               result_t success) {
    if (pending && msg == &data) {
      pending = FALSE;
      signal IntOutput.outputComplete(success);
    }
    return SUCCESS;
  }
}
                                                                                      23
         Major Components

• Timers: Clock, TimerC, LogicalTime
• Networking: Send, GenericComm,
  AMStandard, lib/Route
• Power Management:
  HPLPowerManagement
• Storage Management: EEPROM,
  MatchBox
                                       24
                     Timers

• Clock: Basic abstraction over hardware
  timers; periodic events, single frequency.

• LogicalTime: Fire an event some number of
  H:M:S:ms in the future.

• TimerC: Multiplex multiple periodic timers
  on top of LogicalTime.

                                               25
                      Radio Stack
• Interfaces:
  – Send
     • Broadcast, or to a specific ID
     • Split phase
  – Receive
     • Asynchronous signal
• Implementations:
  – AMStandard
     • Application-specific messages
     • Id-based dispatch
  – GenericComm
     • AMStandard + serial IO
  – Lib/Route
     • Multihop

Send example:
  IntMsg *message = (IntMsg *)data.data;
  …
  message->val = value;
  atomic {
    message->src = TOS_LOCAL_ADDRESS;
  }
  call Send.send(TOS_BCAST_ADDR,
                 sizeof(IntMsg), &data);

Receive example (wiring equates IntMsg to ReceiveIntMsg):
  event TOS_MsgPtr ReceiveIntMsg.receive(TOS_MsgPtr m) {
    IntMsg *message = (IntMsg *)m->data;
    call IntOutput.output(message->val);
    return m;
  }
                                                                        26
                Multihop Networking
• Standard implementation: “tree-based routing”
• Problems:
  – Parent selection
  – Asymmetric links
  – Adaptation vs. stability

[Diagram: routing tree rooted at A; each node advertises a
 route message R:{…} and picks a parent among nodes B–F]

Link-quality neighbor tables:
  Node D            Node C
  Neigh  Qual       Neigh  Qual
  B      .75        A      .5
  C      .66        B      .44
  E      .45        D      .53
  F      .82        F      .35
                                                                              27
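As a toy illustration of the parent-selection problem, the sketch below simply picks the highest-quality neighbor from each node's table (using the example tables on this slide). Real protocols also weigh hop count, link asymmetry, and stability, all of which this ignores.

```python
# Naive parent selection: pick the neighbor with the best estimated
# link quality. Tables are the example values from the slide.

def choose_parent(neighbor_quality):
    """neighbor_quality: dict of neighbor id -> link quality in [0, 1]."""
    return max(neighbor_quality, key=neighbor_quality.get)

node_d = {"B": 0.75, "C": 0.66, "E": 0.45, "F": 0.82}
node_c = {"A": 0.5, "B": 0.44, "D": 0.53, "F": 0.35}

print(choose_parent(node_d))  # F
print(choose_parent(node_c))  # D
```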
                      Geographic Routing
• Any-to-any routing via geographic coordinates
  – See “GPSR”, MOBICOM 2000, Karp + Kung.
• Requires a coordinate system*
• Requires endpoint coordinates
• Hard to route around local minima (“holes”)

[Diagram: a packet routed greedily from A toward B]
                                                                                                        28
*Could be virtual, as in Rao et al “Geographic Routing Without Coordinate Information.” MOBICOM 2003
             Power Management
• HPLPowerManagement
  – TinyOS sleeps processor when possible
  – Observes the radio, sensor, and timer state

• Application managed, for the most part
  – App. must turn off subsystems when not in use
   – Helper utility: ServiceScheduler
      • Periodically calls the “start” and “stop” methods of an app
  – More on power management in TinyDB later
  – Approach works because:
     • single application
     • no interactivity requirements
                                                                    29
           Non-Volatile Storage
• EEPROM
  – 512K off chip, 32K on chip
  – Writes at disk speeds, reads at RAM speeds
  – Interface : random access, read/write 256 byte
    pages
  – Maximum throughput ~10Kbytes / second
• MatchBox Filing System
  – Provides a Unix-like file I/O interface
  – Single, flat directory
  – Only one file being read/written at a time


                                                     30
    TinyOS: Getting Started
• The TinyOS home page:
  – http://webs.cs.berkeley.edu/tinyos
  – Start with the tutorials!
• The CVS repository
  – http://sf.net/projects/tinyos
• The NesC Project Page
  – http://sf.net/projects/nescc
• Crossbow motes (hardware):
  – http://www.xbow.com
• Intel Imote
  – www.intel.com/research/exploratory/motes.htm.
                                                    31
          Part 2

The Design and Implementation
          of TinyDB



                                32
             Part 2 Outline
•   TinyDB Overview
•   Data Model and Query Language
•   TinyDB Java API and Scripting
•   Demo with TinyDB GUI
•   TinyDB Internals
•   Extending TinyDB
•   TinyDB Status and Roadmap




                                    33
               TinyDB Revisited
• High-level abstraction:
   – Data-centric programming
   – Interact with the sensor network as a whole
   – Extensible framework
• Under the hood:
   – Intelligent query processing: query optimization,
     power-efficient execution
   – Fault mitigation: automatically introduce redundancy,
     avoid problem areas

Example query:
   SELECT MAX(mag)
   FROM sensors
   WHERE mag > thresh
   SAMPLE PERIOD 64ms

[Diagram: App sends Query/Trigger to TinyDB; TinyDB returns
 Data from the Sensor Network]
                                                              34
        Feature Overview
• Declarative SQL-like query interface
• Metadata catalog management
• Multiple concurrent queries
• Network monitoring (via queries)
• In-network, distributed query processing
• Extensible framework for attributes,
  commands and aggregates
• In-network, persistent storage



                                             35
                         Architecture

[Diagram:
 PC side:   TinyDB GUI -> TinyDB Client API -> JDBC -> DBMS
 Mote side: multihop sensor network of motes (0-8), each
            running the TinyDB query processor]
                                                                   36
                   Data Model
• Entire sensor network as one single, infinitely-long logical
  table: sensors
• Columns consist of all the attributes defined in the network
• Typical attributes:
   – Sensor readings
   – Meta-data: node id, location, etc.
   – Internal states: routing tree parent, timestamp, queue length,
     etc.
• Nodes return NULL for unknown attributes
• On server, all attributes are defined in catalog.xml
• Discussion: other alternative data models?




                                                                      37
 Query Language (TinySQL)
SELECT <aggregates>, <attributes>
[FROM {sensors | <buffer>}]
[WHERE <predicates>]
[GROUP BY <exprs>]
[SAMPLE PERIOD <const> | ONCE]
[INTO <buffer>]
[TRIGGER ACTION <command>]


                                    38
     Comparison with SQL
• Single table in FROM clause
• Only conjunctive comparison predicates in
  WHERE and HAVING
• No subqueries
• No column alias in SELECT clause
• Arithmetic expressions limited to column
  op constant
• Only fundamental difference: SAMPLE
  PERIOD clause


                                              39
               TinySQL Examples

1. “Find the sensors in bright nests.”

   SELECT nodeid, nestNo, light
   FROM sensors
   WHERE light > 400
   EPOCH DURATION 1s

   Result (sensors):
   Epoch  Nodeid  nestNo  Light
   0      1       17      455
   0      2       25      389
   1      1       17      422
   1      2       25      405
                                                          40
       TinySQL Examples (cont.)

2. SELECT AVG(sound)
   FROM sensors
   EPOCH DURATION 10s

3. “Count the number of occupied nests in each loud
   region of the island.”

   SELECT region, CNT(occupied), AVG(sound)
   FROM sensors
   GROUP BY region
   HAVING AVG(sound) > 200
   EPOCH DURATION 10s

   Result (regions w/ AVG(sound) > 200):
   Epoch  region  CNT(…)  AVG(…)
   0      North   3       360
   0      South   3       520
   1      North   3       370
   1      South   3       520
                                                                   41
      Event-based Queries
• ON event SELECT …
• Run query only when interesting events
  happen
• Event examples
  – Button pushed
  – Message arrival
  – Bird enters nest
• Analogous to triggers but events are user-
  defined


                                           42
      Query over Stored Data
•   Named buffers in Flash memory
•   Store query results in buffers
•   Query over named buffers
•   Analogous to materialized views
•   Example:
    – CREATE BUFFER name SIZE x (field1 type1, field2
      type2, …)
    – SELECT a1, a2 FROM sensors SAMPLE PERIOD d INTO
      name
    – SELECT field1, field2, … FROM name SAMPLE PERIOD d




                                                       43
          Using the Java API
• SensorQueryer
   – translateQuery() converts TinySQL string into
     TinyDBQuery object
   – Static query optimization
• TinyDBNetwork
   – sendQuery() injects query into network
   – abortQuery() stops a running query
   – addResultListener() adds a ResultListener that is invoked
     for every QueryResult received
   – removeResultListener()
• QueryResult
   – A complete result tuple, or
   – A partial aggregate result, call mergeQueryResult() to
     combine partial results
• Key difference from JDBC: push vs. pull
                                                              44
 Writing Scripts with TinyDB
• TinyDB’s text interface
  – java net.tinyos.tinydb.TinyDBMain –run
    “select …”
  – Query results printed out to the console
  – All motes get reset each time new query
    is posed
• Handy for writing scripts with shell,
  perl, etc.


                                           45
     Using the GUI Tools
• Demo time




                           46
                    Inside TinyDB

Queries in (e.g., SELECT AVG(temp) WHERE light > 400),
results out (e.g., T:1, AVG: 225; T:2, AVG: 250), via the
multihop network.

Components:
• Query processor: Agg (avg temp), Filter (light > 400),
  get(‘temp’), Samples
• Schema: catalog of attributes, e.g. for temp:
    Name: temp
    Time to sample: 50 us
    Cost to sample: 90 uJ
    Calibration table: 3
    Units: Deg. F
    Error: ± 5 Deg. F
    Get: getTempFunc()…
• TinyOS

Footprint:
• ~10,000 lines embedded C code
• ~5,000 lines (PC-side) Java
• ~3,200 bytes RAM (w/ 768-byte heap)
• ~58 kB compiled code
  (3x larger than 2nd-largest TinyOS program)
                                                              47
          TinyDB Tree-based Routing
• Tree-based routing
  – Used in:
     • Query delivery
     • Data collection
     • In-network aggregation
  – Relationship to indexing?

[Diagram: query Q (SELECT …) floods down a routing tree
 rooted at A; results R:{…} flow back up from nodes B–F]
                                                                   48
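The point of routing results up a tree is that each node can combine its children's partial results with its own reading and forward a single value. A minimal sketch, assuming an invented topology and readings (shown for MAX, as in the earlier SELECT MAX(mag) query):

```python
# In-network aggregation over a routing tree: every node sends one
# partial result upward instead of forwarding raw readings.
# Topology and magnitude readings are made up for illustration.

children = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"],
            "D": [], "E": [], "F": []}
mag = {"A": 3, "B": 7, "C": 2, "D": 9, "E": 4, "F": 5}

def subtree_max(node):
    """Partial MAX for the subtree rooted at `node`."""
    m = mag[node]
    for child in children[node]:
        m = max(m, subtree_max(child))
    return m

print(subtree_max("A"))  # 9 -- the root holds the network-wide MAX(mag)
```

With n nodes, each epoch costs n messages regardless of depth, versus forwarding every raw reading hop by hop to the root.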
   Power Management Approach
 Coarse-grained, app-controlled communication scheduling

[Diagram: motes 1-5 sleep (“zzz”) for most of each epoch
 (10s-100s of seconds) and all wake together for a 2-4 s
 waking period]
                                                     49
        Time Synchronization
• All messages include a 5 byte time stamp indicating system
  time in ms
   – Synchronize (e.g. set system time to timestamp) with
       • Any message from parent
       • Any new query message (even if not from parent)
   – Punt on multiple queries
   – Timestamps written just after preamble is xmitted
• All nodes agree that the waking period begins when (system
  time % epoch dur = 0)
   – And lasts for WAKING_PERIOD ms

• Adjustment of clock happens by changing duration of sleep
  cycle, not wake cycle.
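The wake/sleep rule above can be sketched in a few lines. The epoch and waking-period lengths here are illustrative, not TinyDB's actual constants:

```python
# Scheduling rule from the slide: all nodes are awake when
# (system time % epoch duration) < WAKING_PERIOD, and a node that
# resynchronizes its clock adjusts its remaining *sleep*, so the
# wake window keeps its full length.

EPOCH_MS = 10_000          # illustrative epoch duration
WAKING_PERIOD_MS = 2_000   # illustrative waking period

def is_awake(system_time_ms):
    return system_time_ms % EPOCH_MS < WAKING_PERIOD_MS

def sleep_until_next_epoch(system_time_ms):
    """ms of sleep remaining; clock adjustment changes this, not the window."""
    return EPOCH_MS - (system_time_ms % EPOCH_MS)

print(is_awake(500), is_awake(2_500))   # True False
print(sleep_until_next_epoch(2_500))    # 7500
```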



                                                               50
          Extending TinyDB
• Why extend TinyDB?
  – New sensors -> attributes
  – New control/actuation -> commands
  – New data processing logic -> aggregates
  – New events
• Analogous to concepts in object-
  relational databases



                                           51
         Adding Attributes
• Types of attributes
  – Sensor attributes: raw or cooked sensor
    readings
  – Introspective attributes: parent,
    voltage, ram usage, etc.
  – Constant attributes: constant values
    that can be statically or dynamically
    assigned to a mote, e.g., nodeid, location,
    etc.


                                              52
    Adding Attributes (cont)
• Interfaces provided by Attr component
  – StdControl: init, start, stop
  – AttrRegister
     •   command registerAttr(name, type, len)
     •   event getAttr(name, resultBuf, errorPtr)
     •   event setAttr(name, val)
     •   command getAttrDone(name, resultBuf, error)
  – AttrUse
     •   command startAttr(attr)
     •   event startAttrDone(attr)
     •   command getAttrValue(name, resultBuf, errorPtr)
     •   event getAttrDone(name, resultBuf, error)
     •   command setAttrValue(name, val)


                                                           53
      Adding Attributes (cont)
•    Steps to adding attributes to TinyDB
    1) Create attribute nesC components
    2) Wire new attribute components to
       TinyDBAttr configuration
    3) Reprogram TinyDB motes
    4) Add new attribute entries to catalog.xml
•    Constant attributes can be added on the
     fly through TinyDB GUI




                                                  54
                Adding Aggregates
• Step 1: wire new nesC components

[Diagram: TinyDB Aggregation Framework. The operator
 (AggOperator.nc) calls the AggregateUse interface
 (AggregateUseM.nc), which dispatches each call
 (init(ID, ...), update(ID, ...), merge(ID, ...),
 hasData(ID, ...), finalize(ID, ...), stateSize(ID, ...),
 getProperties(ID)) by aggregate ID to components providing
 the Aggregate interface (SumM.nc, CountM.nc, AvgM.nc),
 all wired together in AggOperatorConf.nc]
                                                              55
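The Aggregate interface follows the classic init/update/merge/finalize decomposition. Below is a hypothetical Python analogue for AVG; in TinyDB these are nesC commands dispatched by aggregate ID, and the (sum, count) state layout here is the standard trick for making AVG mergeable, not TinyDB's exact record format.

```python
# Partial-state aggregate for AVG: state is (sum, count), so partial
# results from different subtrees can be merged before finalizing.

def init():
    return (0, 0)                       # empty (sum, count) state

def update(state, value):
    s, c = state
    return (s + value, c + 1)           # fold in one local reading

def merge(a, b):
    return (a[0] + b[0], a[1] + b[1])   # combine two partial states

def finalize(state):
    s, c = state
    return s / c if c else None         # final answer at the root

node1 = update(update(init(), 10), 20)  # readings in one subtree
node2 = update(init(), 40)              # readings in another
print(finalize(merge(node1, node2)))    # 23.333...
```

Note that AVG is only mergeable because the partial state carries more than the average itself; shipping per-node averages up the tree and averaging them again would weight nodes incorrectly.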
       Adding Aggregates (cont)
• Step 2: add entry to catalog.xml
   <aggregate>
     <name>AVG</name>
     <id>5</id>
     <temporal>false</temporal>
     <readerClass>net.tinyos.tinydb.AverageClass</readerClass>
   </aggregate>
• Step 3 (optional): implement reader class in Java
   – a reader class interprets and finalizes aggregate state
     received from the mote network, returns final result as a
     string for display.



                                                                 56
                 TinyDB Status
• Latest released with TinyOS 1.1 (9/03)
   – Install the task-tinydb package in TinyOS 1.1 distribution
   – First release in TinyOS 1.0 (9/02)
   – Widely used by research groups as well as industry pilot
     projects
• Successful deployments in Intel Berkeley Lab and
  redwood trees at UC Botanical Garden
   – Largest deployment: ~80 weather station nodes
   – Network longevity: 4-5 months




                                                                  57
       The Redwood Tree Deployment
• Redwood Grove in UC Botanical
  Garden, Berkeley
• Collect dense sensor readings to
  monitor climatic variations
  across
   –   altitudes,
   –   angles,
   –   time,
   –   forest locations, etc.
• Versus sporadic monitoring
  points with 30lb loggers!
• Current focus: study how dense
  sensor data affect predictions
  of conventional tree-growth
  models
                                     58
         Data from Redwoods
[Charts: Relative Humidity (%) and Temperature (C) vs. time,
7/7/03 9:40 through 7/9/03 11:03, for sensors 101, 104, 109, 110,
and 111. Sensors are mounted up the tree: 10m: 101, 102, 103;
20m: 104, 105, 106; 30m: 107, 108, 109; 32m: 110; 33m: 111;
treetop at 36m.]
                                                               59
TinyDB Roadmap (near term)
• Support for high frequency sampling
   – Equipment vibration monitoring, structural monitoring,
     etc.
   – Store and forward
   – Bulk reliable data transfer
   – Scheduling of communications
• Port to Intel Mote
• Deployment in Intel Fab equipment monitoring
  application and the Golden Gate Bridge monitoring
  application



                                                              60
        For more information
• http://berkeley.intel-research.net/tinydb
  or
  http://triplerock.cs.berkeley.edu/tinydb




                                        61
            Part 3



Database Research Issues in Sensor
             Networks




                                     62
  Sensor Network Research
• Very active research area
  – Can't summarize it all
• Focus: database-relevant research topics
  – Some outside of Berkeley
  – Other topics that are itching to be scratched
  – But, some bias towards work that we find
    compelling




                                                    63
                Topics
• In-network aggregation
• Acquisitional Query Processing
• Heterogeneity
• Intermittent Connectivity
• In-network Storage
• Statistics-based summarization and
  sampling
• In-network Joins
• Adaptivity and Sensor Networks
• Multiple Queries
                                       64
             Tiny Aggregation (TAG)
• In-network processing of aggregates
   – Common data analysis operation
        • Aka a gather operation or reduction in parallel programming
   – Communication reducing
        • Operator dependent benefit
   – Across nodes during same epoch

• Exploit query semantics to improve
  efficiency!



 Madden, Franklin, Hellerstein, Hong. Tiny AGgregation (TAG), OSDI 2002.   66
                 Basic Aggregation
• In each epoch:
   – Each node samples local sensors once
   – Generates a partial state record (PSR)
       • local readings
       • readings from children
   – Outputs its PSR during its assigned comm. interval
• At the end of the epoch, the PSR for the whole network is
  output at the root
• New result on each successive epoch
• Extras:
   – Predicate-based partitioning via GROUP BY

[Figure: five-node routing tree rooted at node 1.]
                                                               67
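The per-epoch flow above can be sketched as a toy simulation of COUNT; the topology and the depth-ordered transmission schedule are illustrative, not TinyDB's actual scheduling code:

```python
# One TAG epoch for COUNT over a small routing tree (shape chosen for
# illustration). Deepest nodes transmit first, so every parent hears
# its children's PSRs before its own communication interval.
tree = {2: 1, 3: 1, 4: 3, 5: 4}   # child -> parent; node 1 is the root
nodes = [1, 2, 3, 4, 5]

def count_epoch(tree, nodes, root=1):
    psr = {n: 1 for n in nodes}   # each node samples once; a COUNT PSR starts at 1
    depth = {}
    for n in nodes:
        d, cur = 0, n
        while cur != root:
            cur = tree[cur]
            d += 1
        depth[n] = d
    # deepest nodes transmit first; a parent merges each child's PSR into its own
    for n in sorted(nodes, key=lambda n: -depth[n]):
        if n != root:
            psr[tree[n]] += psr[n]   # merge for COUNT: PSRs just add
    return psr[root]                 # root outputs the whole-network result

print(count_epoch(tree, nodes))  # 5
```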
          Illustration: Aggregation

   SELECT COUNT(*)
   FROM sensors

[Animation, slides 68–72: the COUNT query runs over the five-node
routing tree. The epoch is divided into communication intervals,
shown as a table of interval # vs. sensor #. Each node transmits
its partial count during its assigned interval, starting with the
deepest nodes; parents merge the counts they hear with their own.
By interval 1 the root outputs the network-wide COUNT of 5, and
the schedule repeats in the next epoch.]
                                                           68–72
                  Aggregation Framework

• As in extensible databases, TinyDB supports any
  aggregation function conforming to:
     Aggn = {finit, fmerge, fevaluate}
     finit{a0}           → <a0>            Partial State Record (PSR)
     fmerge{<a1>, <a2>}  → <a12>
     fevaluate{<a1>}     → aggregate value

  Example: Average
  AVGinit{v}                    → <v, 1>
  AVGmerge{<S1, C1>, <S2, C2>}  → <S1 + S2, C1 + C2>
  AVGevaluate{<S, C>}           → S/C

  Restriction: merge must be associative and commutative        73
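The AVERAGE triple from the slide, written out as a runnable sketch:

```python
# The {finit, fmerge, fevaluate} decomposition for AVERAGE: the PSR is
# a <sum, count> pair, and merge is associative and commutative, as
# the restriction requires.
def avg_init(v):
    return (v, 1)

def avg_merge(a, b):
    return (a[0] + b[0], a[1] + b[1])

def avg_evaluate(a):
    return a[0] / a[1]

# Fold four readings' PSRs together, as parents do up the tree:
state = avg_init(10.0)
for v in (20.0, 30.0, 40.0):
    state = avg_merge(state, avg_init(v))
print(state, avg_evaluate(state))  # (100.0, 4) 25.0
```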
           Taxonomy of Aggregates

• TAG insight: classify aggregates according to various
  functional properties
   – Yields a general set of optimizations that can automatically be
     applied
   – Drives an API!

  Property           Examples                       Affects
  Partial State      MEDIAN: unbounded,             Effectiveness of TAG
                     MAX: 1 record
  Monotonicity       COUNT: monotonic,              Hypothesis Testing, Snooping
                     AVG: non-monotonic
  Exemplary vs.      MAX: exemplary,                Applicability of Sampling,
  Summary            COUNT: summary                 Effect of Loss
  Duplicate          MIN: dup. insensitive,         Routing Redundancy
  Sensitivity        AVG: dup. sensitive
                                                                74
                     Use Multiple Parents
• Use graph structure
   – Increase delivery probability with no communication overhead
• For duplicate-insensitive aggregates, or
• Aggs expressible as sum of parts
   – Send (part of) aggregate to all parents        SELECT COUNT(*)
      • In just one message, via multicast
   – Assuming independence, decreases variance

[Figure: node A splits its count c as c/n to each of its n = 2
parents B and C, which forward toward root R.]

  With P(link xmit successful) = p, P(success from A→R) = p²:
  No splitting:  E(cnt) = c · p²
                 Var(cnt) = c² · p² · (1 − p²) = V
  n parents:     E(cnt) = n · (c/n) · p² = c · p²
                 Var(cnt) = n · (c/n)² · p² · (1 − p²) = V/n
                                                                75
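A quick Monte Carlo check of the variance claim, assuming independent links as the slide does (parameters are illustrative):

```python
import random

# Node A's count c reaches the root over a path of success probability
# p^2; splitting it as c/n across n independent parents keeps
# E(cnt) = c * p^2 but cuts the variance by a factor of n.
def simulate(c, p, n, trials=200_000, seed=1):
    rng = random.Random(seed)
    total = sq = 0.0
    for _ in range(trials):
        got = sum(c / n for _ in range(n) if rng.random() < p * p)
        total += got
        sq += got * got
    mean = total / trials
    return mean, sq / trials - mean * mean

c, p = 100, 0.8
mean1, var1 = simulate(c, p, n=1)   # closed form: mean 64, variance 2304
mean2, var2 = simulate(c, p, n=2)   # closed form: mean 64, variance 1152
print(round(mean1, 1), round(mean2, 1), round(var1 / var2, 1))
```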
        Multiple Parents Results

            No previous
• Better than Splitting                        With Splitting
  analysis expected!                               Benefit of Result Splitting
                    Critical
• Losses aren‟t     Link!
                                                       (COUNT query)

  independent!
                                            1400

                                            1200

• Insight: spreads data                     1000                                  Splitting

                               Avg. COUNT
  over many links                           800                                   No Splitting
                                            600

                                            400

                                            200

                                              0     (2500 nodes, lossy radio model, 6 parents per
                                                                        node)


                                                                                              76
             Acquisitional Query
             Processing (ACQP)
• TinyDB acquires AND processes data
  – Could generate an infinite number of samples
• An acquisitional query processor controls
  – when,
  – where,
  – and with what frequency data is collected!
• Versus traditional systems where data is provided
  a priori
              Madden, Franklin, Hellerstein, and Hong. The Design of an
              Acquisitional Query Processor. SIGMOD, 2003.             77
        ACQP: What’s Different?
• How should the query be processed?
  – Sampling as a first class operation
• How does the user control acquisition?
  – Rates or lifetimes
  – Event-based triggers
• Which nodes have relevant data?
  – Index-like data structures
• Which samples should be transmitted?
  – Prioritization, summary, and rate control



                                                78
    Operator Ordering: Interleave Sampling +
                   Selection

SELECT light, mag
FROM sensors
WHERE pred1(mag)
AND pred2(light)
EPOCH DURATION 1s

• E(sampling mag) >> E(sampling light): 1500 uJ vs. 90 uJ
• At 1 sample / sec, total power savings could be as much as
  3.5mW; comparable to the processor!

  Traditional DBMS: apply σ(pred1), then σ(pred2), over
  already-acquired (mag, light) tuples.
  ACQP, costly plan: sample mag, apply σ(pred1), then sample
  light, apply σ(pred2).
  ACQP, correct ordering (unless pred1 is very selective and
  pred2 is not): sample light, apply σ(pred2), and only sample
  mag and apply σ(pred1) when pred2 passes.
                                                               79
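The ordering decision reduces to comparing expected acquisition energy; a sketch using the slide's per-sample costs, with selectivities assumed for illustration:

```python
# Expected acquisition energy of a two-predicate plan: acquire the
# first attribute always, the second only when the first predicate
# passes. Costs (uJ) are from the slide; selectivities are assumed.
def expected_cost(first, second):
    return first["cost"] + first["sel"] * second["cost"]

mag = {"cost": 1500, "sel": 0.5}    # pred1(mag)
light = {"cost": 90, "sel": 0.5}    # pred2(light)

print(expected_cost(mag, light))    # 1545.0 uJ: mag sampled first
print(expected_cost(light, mag))    # 840.0 uJ: light first wins at equal selectivity
```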
             Exemplary Aggregate Pushdown

SELECT WINMAX(light,8s,8s)
FROM sensors
WHERE mag > x
EPOCH DURATION 1s

• Novel, general pushdown technique
• Mag sampling is the most expensive operation!

  Traditional DBMS: WINMAX over σ(mag > x) over already-acquired
  (mag, light) tuples.
  ACQP: sample light, apply σ(light > MAX so far), and only then
  sample mag and apply σ(mag > x); the expensive mag sample is
  skipped for readings that cannot change the window max.
                                                               80
                     Topics
•   In-network aggregation
•   Acquisitional Query Processing
•   Heterogeneity
•   Intermittent Connectivity
•   In-network Storage
•   Statistics-based summarization and sampling
•   In-network Joins
•   Adaptivity and Sensor Networks
•   Multiple Queries



                                                  81
      Heterogeneous Sensor
            Networks
• Leverage small numbers of high-end nodes
  to benefit large numbers of inexpensive
  nodes
• Still must be transparent and ad-hoc
• Key to scalability of sensor networks
• Interesting heterogeneities
  –   Energy: battery vs. outlet power
  –   Link bandwidth: Chipcon vs. 802.11x
  –   Computing and storage: ATMega128 vs. Xscale
  –   Pre-computed results
  –   Sensing nodes vs. QP nodes
                                                    82
       Computing Heterogeneity with
                 TinyDB
• Separate query processing from sensing
     – Provide query processing on a small number of nodes
     – Attract packets to query processors based on “service
       value”
• Compare the total energy consumption of the
  network
          •   No aggregation
          •   All aggregation
          •   Opportunistic aggregation
          •   HSN proactive
              aggregation

Mark Yarvis and York Liu, Intel’s Heterogeneous Sensor
Network Project,                                                     83
ftp://download.intel.com/research/people/HSN_IR_Day_Poster_03.pdf.
        5x7 TinyDB/HSN Mica2 Testbed
                                      84
               Data Packet Saving
• How many aggregators are desired?
• Does placement matter?

[Chart: Data Packet Saving. % change in data packet count vs.
number of aggregators (1-6, and all 35): 11% aggregators achieve
72% of the max data reduction.]

[Chart: Data Packet Saving - Aggregator Placement. % change in
data packet count vs. aggregator location (nodes 25-31, and all
35): optimal placement is about 2/3 of the distance from the sink.]
                                                               85
           Occasionally Connected Sensornets

[Diagram: a TinyDB server on the internet reaches several
disconnected sensornet patches via mobile gateways; each patch
has a GTWY node running a TinyDB QP.]
                                                               86
               Occasionally Connected
               Sensornets Challenges
• Networking support
    – Tradeoff between reliability, power consumption
      and delay
    – Data custody transfer: duplicates?
    – Load shedding
    – Routing of mobile gateways
• Query processing
    – Operation placement: in-network vs. on mobile
      gateways
    – Proactive pre-computation and data movement
• Tight interaction between networking and QP


 Fall, Hong and Madden, Custody Transfer for Reliable Delivery in Delay Tolerant
 Networks, http://www.intel-research.net/Publications/Berkeley/081220030852_157.pdf.
                                                                                         87
Distributed In-network Storage
• Collectively, sensornets have large amounts
  of in-network storage
• Good for in-network consumption or
  caching
• Challenges
  – Distributed indexing for fast query
    dissemination
  – Resilience to node or link failures
  – Graceful adaptation to data skews
  – Minimizing index insertion/maintenance cost


                                                  88
                        Example: DIM
• Functionality
  – Efficient range queries over multidimensional data.
• Approaches
  – Divide sensor field into bins.
  – Locality-preserving mapping from m-d space to
    geographic locations.
  – Use geographic routing such as GPSR.
• Assumptions
  – Nodes know their locations and the network boundary
  – No node mobility

[Figure: events E1 = <0.7, 0.8> and E2 = <0.6, 0.7> stored in
bins; range query Q1 = <.5-.7, .5-1> is routed only to the bins
that can contain matches.]

      Xin Li, Young Jin Kim, Ramesh Govindan and Wei Hong, Distributed Index
      for Multi-dimensional Data (DIM) in Sensor Networks, SenSys 2003.       89
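A locality-preserving mapping can be sketched with bit interleaving (a Z-order curve); DIM's actual scheme uses recursive zone splitting, so this only illustrates the idea that nearby values get nearby codes:

```python
# Stand-in for DIM's locality-preserving map from m-d value space to
# geographic bins: interleave the bits of the two coordinates (a
# Z-order curve). Not DIM's actual zone-splitting scheme.
def zcode(x, y, bits=4):
    """Interleave `bits` bits each of x and y in [0, 1) into one code."""
    xi, yi = int(x * (1 << bits)), int(y * (1 << bits))
    code = 0
    for b in reversed(range(bits)):
        code = (code << 2) | (((xi >> b) & 1) << 1) | ((yi >> b) & 1)
    return code

e1 = zcode(0.7, 0.8)    # event E1 from the slide
e2 = zcode(0.6, 0.7)    # event E2, nearby in value space...
far = zcode(0.1, 0.1)
print(abs(e1 - e2) < abs(e1 - far))  # True: ...and nearby in code space
```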
      Statistical Techniques
• Approximations, summaries, and sampling
  based on statistics and statistical models
• Applications:
  – Limited bandwidth and large number of nodes ->
    data reduction
  – Lossiness -> predictive modeling
  – Uncertainty -> tracking correlations and
    changes over time
  – Physical models -> improved query answering


                                                 90
               Correlated Attributes
• Data in sensor networks is correlated; e.g.,
  –   Temperature and voltage
  –   Temperature and light
  –   Temperature and humidity
  –   Temperature and time of day
  –   etc.




                                           91
                                    IDSQ
• Idea: task sensors in order of best
  improvement to estimate of some value:
   – Choose leader(s)
        • Suppress subordinates
        • Task subordinates, one at a time
            – Until some measure of goodness (error bound) is met
               » E.g. “Mahalanobis Distance” -- Accounts for
                 correlations in axes, tends to favor minimizing
                 principal axis



 See “Scalable Information-Driven Sensor Querying and Routing for ad hoc Heterogeneous
 Sensor Networks.” Chu, Haussecker and Zhao. Xerox TR P2001-10113. May, 2001. 92
         Graphical Representation

Model location estimate as a point with 2-dimensional
Gaussian uncertainty.

[Figure: an uncertainty ellipse with its principal axis and two
candidate sensors S1 and S2. The areas of the two residuals are
equal, but S2 is preferred because it reduces error along the
principal axis.]
                                                               93
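The "measure of goodness" can be made concrete with a minimal Mahalanobis distance sketch; the covariance numbers here are illustrative, not from the paper:

```python
import math

# Mahalanobis distance of a candidate offset (dx, dy) from the current
# estimate, under a 2-d Gaussian with covariance entries (sxx, syy, sxy).
# Unlike Euclidean distance, it weighs deviation by the uncertainty in
# each direction, which is why reducing error along the (high-variance)
# principal axis is favored.
def mahalanobis2(dx, dy, sxx, syy, sxy):
    det = sxx * syy - sxy * sxy              # invert the 2x2 covariance by hand
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det
    return math.sqrt(dx * dx * ixx + 2 * dx * dy * ixy + dy * dy * iyy)

# Uncertainty elongated along x (the principal axis): variance 4 in x, 1 in y.
print(mahalanobis2(2.0, 0.0, 4.0, 1.0, 0.0))  # 1.0: offset along the principal axis
print(mahalanobis2(0.0, 2.0, 4.0, 1.0, 0.0))  # 2.0: the same offset across it counts double
```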
    MQSN: Model-based Probabilistic Querying over
                 Sensor Networks
                         Joint work with Amol Deshpande, Carlos Guestrin,
                         and Joe Hellerstein

[Diagram: a Model feeds a Query Processor, which sits above a
nine-node sensor network.]
                                                               94
    MQSN: Model-based Probabilistic Querying over
                 Sensor Networks

  Probabilistic Query:
    select NodeID, Temp ± 0.1C
    where NodeID in [1..9]
    with conf(0.95)

[Diagram: the Query Processor consults the Model and emits an
Observation Plan: [Temp, 3], [Temp, 9]. Only nodes 3 and 9 are
observed.]
                                                               95
    MQSN: Model-based Probabilistic Querying over
                 Sensor Networks

[Diagram: the observed data ([Temp, 3] = …, [Temp, 9] = …) flows
back to the Query Processor, which updates the Model and returns
Query Results: temperature estimates for every node in [1..9],
not just the two observed.]
                                                               97
                   Challenges
• What kind of models to use?
• Optimization problem:
   – Given a model and a query, find the best set of
     attributes to observe
   – Cost not easy to measure
      • Non-uniform network communication costs
      • Changing network topologies
   – Large plan space
      • Might be cheaper to observe attributes not in query
          – e.g. Voltage instead of Temperature
      • Conditional Plans:
          – Change the observation plan based on observed values




                                                                   98
      MQSN: Current Prototype
• Multi-variate Gaussian Models
   – Kalman Filters to capture correlations across time
• Handles:
   – Range predicate queries
      • sensor value within [x,y], w/ confidence
   – Value queries
      • sensor value = x, w/in epsilon, w/ confidence
   – Simple aggregate queries
       • AVG(sensor value) = n, w/in epsilon, w/ confidence
• Uses a greedy algorithm to choose the observation plan




                                                            99
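The greedy plan choice might be sketched as follows; the variances, costs, and the "observation zeroes the variance" simplification are illustrative, and the prototype's multivariate Gaussian would let one observation shrink several variances through correlations:

```python
# Greedy observation-plan sketch: observe whichever attribute buys the
# most variance reduction per unit cost until every attribute meets
# the error bound. All numbers are made up for illustration; treating
# an observed attribute as known exactly is a simplification.
def greedy_plan(variance, cost, bound):
    var = dict(variance)
    plan = []
    while any(v > bound * bound for v in var.values()):
        cand = [a for a in var if var[a] > bound * bound]
        best = max(cand, key=lambda a: var[a] / cost[a])
        plan.append(best)
        var[best] = 0.0   # simplification: observation removes all uncertainty
    return plan

variance = {"temp3": 0.09, "temp9": 0.25, "temp5": 0.0025}
cost = {"temp3": 1.0, "temp9": 1.0, "temp5": 1.0}
print(greedy_plan(variance, cost, bound=0.1))  # ['temp9', 'temp3']
```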
                 In-Net Regression
• Linear regression: simple way to predict future values,
  identify outliers
• Regression can be across local or remote values, multiple
  dimensions, or with high-degree polynomials
  – E.g., node A's readings vs. node B's
  – Or, location (X,Y) versus temperature
     • E.g., over many nodes

[Chart: X vs Y with curve fit; y = 0.9703x - 0.0067, R² = 0.947]

  Guestrin, Thibaux, Bodik, Paskin, Madden. "Distributed Regression: an Efficient
  Framework for Modeling Sensor Network Data." Under submission.              100
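The core primitive, a least-squares line fit, takes only a few lines of stdlib Python; the readings below are made up:

```python
# Pure-Python least-squares line fit, the building block of in-net
# regression: a node can fit its readings against a neighbor's (or
# against time) and flag outliers by residual.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical node A vs. node B readings, nearly y = x (compare the
# slide's fit of y = 0.9703x - 0.0067):
slope, intercept = fit_line([1.0, 3.0, 5.0, 7.0, 9.0],
                            [1.1, 2.9, 5.2, 6.8, 9.0])
print(round(slope, 3), round(intercept, 3))
```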
   In-Net Regression (Continued)
• Problem: may require data from all sensors to
  build model
• Solution: partition sensors into overlapping
  “kernels” that influence each other
  – Run regression in each kernel
     • Requiring just local communication
  – Blend data between kernels
  – Requires some clever matrix manipulation
• End result: regressed model at every node
  – Useful in failure detection, missing value estimation


                                                        101
          Exploiting Correlations in
             Query Processing
• Simple idea:
   – Given predicate P(A) over expensive attribute A
   – Replace it with P' over cheap attribute A' such that P' evaluates
     to P
   – Problem: unless A and A' are perfectly correlated, P' ≠ P for all
     time
      • So we could incorrectly accept or reject some readings
• Alternative: use correlations to improve selectivity
  estimates in query optimization
   – Construct conditional plans that vary predicate order based on
     prior observations



                                                                    102
Exploiting Correlations (Cont.)
•    Insight: by observing a (cheap and correlated) variable not involved
     in the query, it may be possible to improve query performance
      – Improves estimates of selectivities
•    Use conditional plans
•    Example



   [Figure: a conditional plan that branches on time of day]

   Time in [6pm, 6am]?
     T: Light > 100 Lux (Cost = 100, Selectivity = .1)
        then Temp < 20° C (Cost = 100, Selectivity = .9)
        Expected Cost = 110 (vs. 150 with static selectivities of .5)
     F: Temp < 20° C (Cost = 100, Selectivity = .1)
        then Light > 100 Lux (Cost = 100, Selectivity = .9)
        Expected Cost = 110 (vs. 150 with static selectivities of .5)
                                                                  103
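The arithmetic behind the example can be sketched as follows; the costs and selectivities are the slide's illustrative numbers, and the cost model (short-circuit predicate evaluation) is a standard assumption, not TinyDB's actual implementation:

```python
def expected_cost(predicates):
    """Expected cost of evaluating predicates in order with
    short-circuiting: a later predicate runs only on the fraction
    of tuples that passed the earlier ones.
    Each predicate is a (cost, selectivity) pair."""
    total, pass_fraction = 0.0, 1.0
    for cost, selectivity in predicates:
        total += pass_fraction * cost
        pass_fraction *= selectivity
    return total

# Static plan: the optimizer only knows the overall selectivity (.5 each).
static = expected_cost([(100, 0.5), (100, 0.5)])   # 100 + .5*100 = 150

# Conditional plan: at night, "light > 100 lux" is known to be highly
# selective (.1), so ordering it first cuts the expected cost.
night = expected_cost([(100, 0.1), (100, 0.9)])    # 100 + .1*100 = 110
```

The conditional plan wins precisely because observing a cheap correlated variable (time of day) sharpens the selectivity estimates used to order the predicates.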
  In-Network Join Strategies
• Types of joins:
  – non-sensor -> sensor
  – sensor -> sensor
• Optimization questions:
  – Should the join be pushed down?
  – If so, where should it be placed?
  – What if a join table exceeds the memory
    available on one node?
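A back-of-the-envelope message-count model shows when pushing a join down pays off; the cost model and all numbers here are illustrative assumptions, not from the tutorial:

```python
# Rough per-epoch message-count model for a sensor-sensor join.

def cost_at_root(rate_a, rate_b, hops_a, hops_b):
    """Ship both raw input streams all the way to the base station."""
    return rate_a * hops_a + rate_b * hops_b

def cost_pushed(rate_a, rate_b, hops_a, hops_b, selectivity, hops_to_root):
    """Ship inputs to an in-network join node, then only results onward."""
    result_rate = rate_a * rate_b * selectivity
    return rate_a * hops_a + rate_b * hops_b + result_rate * hops_to_root

# Pushing the join wins when the result stream is much smaller than
# the inputs: 10*1 + 10*1 + (10*10*0.05)*4 ≈ 40 vs. 100 messages/epoch.
root_cost = cost_at_root(10, 10, 5, 5)            # 100 messages/epoch
pushed_cost = cost_pushed(10, 10, 1, 1, 0.05, 4)  # ≈ 40 messages/epoch
```

The same model makes the slide's open questions concrete: placement changes `hops_a`/`hops_b`, and a low-selectivity join favors pushing down, while a join whose state exceeds one node's memory would force partitioning or pulling the join back toward the root.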


                                         104
        Choosing Where to Place
               Operators
• Idea: choose a “join node” to run the operator
• Over time, explore other candidate placements
  – Nodes advertise data rates to their neighbors
  – Neighbors compute expected cost of running the
    join based on these rates
  – Neighbors advertise costs
  – Current join node selects a new, lower cost node
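The exploration loop above can be sketched as follows; the cost model (messages per epoch = advertised rate × hop count, summed over inputs) and the hysteresis threshold are assumptions for illustration, not the exact scheme of Bonfils and Bonnet:

```python
def operator_cost(node, rates, hops):
    """Expected per-epoch message cost of running the join at `node`,
    given advertised input rates and per-source hop counts."""
    return sum(rate * hops[(src, node)] for src, rate in rates.items())

def maybe_migrate(current, neighbors, rates, hops, threshold=0.9):
    """Hand the operator to a neighbor only if its cost is clearly
    lower; the threshold adds hysteresis against oscillation."""
    costs = {n: operator_cost(n, rates, hops) for n in [current] + neighbors}
    best = min(costs, key=costs.get)
    return best if costs[best] < threshold * costs[current] else current

# Hypothetical topology: source A is 3 hops from n1 but only 1 from n2.
rates = {"A": 10, "B": 4}                    # advertised tuples/epoch
hops = {("A", "n1"): 3, ("B", "n1"): 1,      # hop counts, source -> node
        ("A", "n2"): 1, ("B", "n2"): 2}
new_node = maybe_migrate("n1", ["n2"], rates, hops)  # n2: cost 18 vs. 34
```

Because rates and routes drift over a long-running query, the current join node would rerun this comparison periodically rather than once at query start.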




 Bonfils + Bonnet, Adaptive and Decentralized Operator          105
 Placement for In-Network Query Processing, IPSN 2003.
                Topics
• In-network aggregation
• Acquisitional Query Processing
• Heterogeneity
• Intermittent Connectivity
• In-network Storage
• Statistics-based summarization and
  sampling
• In-network Joins
• Adaptivity and Sensor Networks
• Multiple Queries
                                       106
 Adaptivity In Sensor Networks
• Queries are long running
• Selectivities change
  – E.g., night vs. day
• Network load and available energy vary
• All suggest that some adaptivity is needed
  – Of data rates or granularity of aggregation
    when optimizing for lifetimes
  – Of operator orderings or placements when
    selectivities change (c.f., conditional plans for
    correlations)
• As far as we know, this is an open problem!
                                                        107
  Multiple Queries and Work
            Sharing
• As sensornets evolve, users will run many
  queries simultaneously
  – E.g., traffic monitoring
• Likely that queries will be similar
  – But have different end points, parameters, etc
• Would like to share processing, routing as
  much as possible
• But how? Again, an open problem.


                                                 108
            Concluding Remarks
• Sensor networks are an exciting emerging technology,
  with a wide variety of applications

• Many research challenges in all areas of computer science
   – Database community included
   – Some agreement that a declarative interface is right

• TinyDB and other early work are an important first step

• But there’s lots more to be done!



                                                            109

				