Distributed Databases

					              22nd International Conference on Advanced Information Networking and Applications - Workshops

       Distributed Databases in Dynamic R-Tree for Vehicle Information Systems

                 Yusuke Murase∗ , Ailixier Aikebaier∗ , Tomoya Enokido† and Makoto Takizawa∗
                                   Dept of Computers and Systems Engineering
                                           Tokyo Denki University, Japan
                                   Dept of Faculty of Business Administration
                                               Rissho University, Japan
                       E-mail {murase, alixir}, †,

Abstract

Information and communication systems for vehicles, such as ETC (Electronic Toll Collection) and car navigation systems, are becoming increasingly important. In next-generation navigation systems, each vehicle not only receives various types of information like maps and traffic reports but also obtains traffic information around the vehicle by using its sensors and sends it to a navigation center. It is critical to discuss how to store information collected by vehicles in databases and how vehicles access the information in the databases in the presence of a huge number of vehicles on roads. In this paper, we propose an enhanced dynamic R-tree (EDR-tree) scheme to store and retrieve traffic data collected by vehicles in dynamic distributed database systems. In distributed tree-structured indexes like the R-tree and B-tree, the root node and nodes at upper layers easily become performance bottlenecks and points of failure since every query request is transferred from the root to a leaf node. In the EDR-tree scheme, a road is realized as a sequence of road units. A geographical space of roads is separated into area units where road units are stored. An area unit is stored in a leaf node, and there is a tree-structured index on the leaf nodes as in the B+-tree and R-tree. Each vehicle first accesses a leaf node, not the root, which holds information on the road unit where the vehicle is currently moving. Then, a query request is efficiently and reliably delivered to a target node by using not only parent-child links but also enhancing links, i.e. sibling and adjacent links. We evaluate the EDR-tree in terms of search time and insertion time.

978-0-7695-3096-3/08 $25.00 © 2008 IEEE
DOI 10.1109/WAINA.2008.237

1. Introduction

Various types of vehicle information and communication systems are being developed, like ETC (Electronic Toll Collection) [9], car navigation systems [3], and FlexRay networks in cars [7]. Through car navigation systems and mobile communications, we can obtain various types of information required to drive a vehicle to a destination. For example, we can find Japanese restaurants which satisfy our requirements through car navigation and can download music to which we would like to listen in a car. This is based on center-to-vehicle communications, that is, a center is a database server and a vehicle is a client. In next-generation navigation systems, each vehicle can obtain information not only from centers in mobile networks but also by itself, using its sensors, on the road where the vehicle is moving, e.g. traffic jams, car accidents, average speeds of nearby cars, and road construction. Thus, vehicles can upload the obtained traffic information to the centers, and possibly to other vehicles by using car to car (C2C) communications [11]. In Tokyo, more than 2.6 million cars are moving, and about 250 places on roads are under construction every day [1]. In addition, about 200 car accidents occur every day in Tokyo [2]. Thus, a huge number of vehicles obtain road information and send it to a central database system. The states of roads change, and information on the states of roads is sent to the database system. In addition, each vehicle accesses the databases to obtain road information. Thus, a huge volume of data is generated and a huge number of queries on routes and traffic are issued.

In this paper, we take a distributed database approach to storing and manipulating a vehicle space [10] with traffic information obtained by vehicles in order to realize efficient and fault-tolerant access to the traffic information. The database is dynamically distributed over multiple computers with the dynamic R-tree scheme [6]. By using the R-tree index, records on a part of a road can be efficiently found. However, the root node and nodes at higher layers are performance bottlenecks and points of failure since every query request is first issued to the root node and is then transferred down from the root to a leaf node. In this paper, we propose a bottom-up searching strategy named enhanced dynamic R-tree (EDR-tree). First, a vehicle issues a query request to obtain information like traffic congestion on a part of a road. In traditional tree indexes like the B+-tree [5] and R-tree [4, 8], the query request is first sent to the root node. Then, the query request is sent down to nodes in the tree. Finally, the query request arrives at a leaf node where target records of the road are stored. In the EDR-tree, on the other hand, the query request is first sent to a current leaf node, where information on the road unit on which the vehicle is currently moving is stored. The leaf node is usually geographically nearer to the vehicle. Nodes in the EDR-tree are linked not only by parent-child links but also by sibling and adjacent links. Child nodes of a parent node are ordered. Each node has a link to its sibling nodes. Suppose a part r_1 of a road r is stored in a leaf node D1 and another part r_2 adjacent to the part r_1 is stored in another leaf node D2. Here, the pair of nodes D1 and D2 are linked by an adjacent link. If the leaf node does not have information on the destination area, the leaf node forwards the query request to one of its parent, sibling, or adjacent nodes. The leaf node selects one node so that the number of nodes to visit is minimized. We evaluate the EDR-tree in terms of search time and insertion time.

In section 2, we present a vehicle system model. In section 3, we discuss the enhanced dynamic R-tree (EDR-tree). In section 4, we evaluate the EDR-tree.

2 A Model of Vehicle Information System

A vehicle information system (VIS) is composed of three types of objects, vehicle, server, and space objects, as shown in Figure 1. A vehicle space object S shows a map of roads, i.e. a collection of roads r_1, ..., r_n (n ≥ 1). Each road r_i is characterized in terms of terminal points ed_1(r_i) and ed_2(r_i). That is, the road object r_i shows a route from a point ed_1(r_i) to another point ed_2(r_i). Each road object r_i is realized as a sequence r_i1, ..., r_im_i (m_i ≥ 1) of road unit objects [Figure 2]. Here, a pair of road units r_ij and r_i,j+1 are referred to as adjacent. A road unit object r_ij is also characterized in terms of nodes. Thus, the vehicle space object S is hierarchically structured. As discussed in the succeeding section, the vehicle space object is realized in a balanced hierarchical tree like the R-tree [4, 8].

[Figure 1. Objects in vehicle information system.]

[Figure 2. Road objects: a road r_i from ed_1(r_i) to ed_2(r_i) as a sequence of road units r_i1, ..., r_ij, r_i,j+1, ..., r_im_i.]

A vehicle object v shows a vehicle which moves around on roads in a vehicle space object S. Suppose a vehicle object v is moving on a road r_i. The road unit object r_ij where the vehicle object v exists is referred to as the current road unit of the vehicle v. Let c(v) show the current road unit of a vehicle object v. A vehicle object v is equipped with various types of sensors and wireless mobile communication facilities. The sensors gather traffic data like passing time, i.e. how long it takes to pass the current road unit object c(v), and car accidents. A vehicle object v collects traffic data on the current road unit object c(v) by its sensors and sends the sensed traffic data to databases by using radio mobile communications.

A database object D is a database server where information sent by vehicles is stored and to which vehicles send queries to obtain traffic information. Traffic information collected by vehicles is sent to the database and is stored in the database. In addition, on receipt of a query request req(r_ij) on a road unit object r_ij from a vehicle v, the traffic information prop(r_ij) on the road unit object r_ij is retrieved from the database and is sent to the vehicle object v. In this paper,

a database object D is realized in multiple database subobjects D1, ..., Dn, i.e. distributed database servers, from the viewpoint of performance and reliability. Each subserver object holds one or more space nodes.

3. Enhanced Dynamic R-tree

3.1 R-tree

An R-tree [4, 8] is a multi-dimensional tree to store spatial data. An R-tree is a hierarchical data structure based on the B+-tree [5]. The R-tree dynamically organizes a set of d-dimensional geometric objects. Here, objects are represented by their minimum bounding d-dimensional rectangles. The R-tree is composed of nodes, each of which shows a rectangle which bounds its child nodes.

A leaf node of the R-tree includes pointers to the database objects instead of pointers to child nodes. A non-leaf node contains pointers to child nodes. An R-tree T of order (m, M) has the following characteristics.

 1. Each leaf node l which is not the root includes h entry records where m ≤ h ≤ M and m ≤ M/2. Here, l = ⟨e_1, ..., e_h⟩ and each entry record e_j is of the form ⟨node, oid⟩, where node is a d-dimensional rectangle which spatially contains the object whose identifier is oid.

 2. A non-leaf node n contains h entries, n = ⟨n_1, ..., n_h⟩, where m ≤ h ≤ M and m ≤ M/2. Each entry n_v is of the form ⟨node, pt⟩, where pt is a pointer to a child node and node is a d-dimensional rectangle which spatially includes the rectangles contained in the child node denoted by pt.

 3. In the root node, 2 ≤ h ≤ M.

 4. Every leaf node is at the same level, i.e. the R-tree is height-balanced.

Figure 1 shows an example of an R-tree with M = 4 and m = 2.

The R-tree is dynamically organized in the same way as the B+-tree. In order to insert an entry record, the R-tree is traversed to find a leaf node where the entry record is to be located. Then, the entry record is inserted in the leaf node, and the nodes on the path from the leaf node to the root node are updated accordingly. If the leaf node is too full to store the entry record, the leaf node is split into two nodes. There are various ways to split a node: linear, quadratic, and exponential splits.

When an entry record is deleted and a node underflows, nodes are merged into one node.

3.2 Enhanced Dynamic R-tree (EDR-tree)

A vehicle v gathers traffic information on a current road unit r_ij of a road r_i by using sensors and communication with other vehicles while moving on the road r_i. The vehicle v on a road unit object r_ij sends a record ⟨end_1(r_ij), end_2(r_ij), prop(r_ij)⟩ of the road unit object r_ij to servers through mobile communication, possibly car to car (C2C) communications. The pair of attributes end_1(r_ij) and end_2(r_ij) show the end points of the road unit object r_ij. The traffic information prop(r_ij) of the road unit object r_ij is gathered by the vehicle v using its sensors. Records on road unit objects collected by vehicles are stored in a database. As traffic increases, it becomes hard or impossible to store all of the traffic information in one database. Here, a database is distributed over multiple computers. The database in each computer is referred to as a database node. A database node includes records on roads in some geographical area. Nodes are structured in a hierarchical tree as discussed later. The vehicle v also obtains traffic information by communicating with other vehicles through C2C communication. Records sent by vehicles are stored in databases of servers. For each vehicle v, the database node where records on the current road unit object of the vehicle v are stored is referred to as the current database node. A vehicle v sends records on the current road unit object to the current database node. v.current shows the current database node of the vehicle v. Initially, there is one current database node for each vehicle v. As a vehicle v moves, the vehicle v takes different database nodes as its current node.

An enhanced dynamic R-tree (EDR-tree) T is composed of index nodes and leaf nodes. Traffic records are stored in leaf nodes. Leaf nodes are referred to as database nodes, where records on road unit objects in some disjoint geographical area are stored. Index nodes are indexes of child nodes as in the B+-tree. Each non-leaf node c has child nodes c_1, ..., c_n. Each node covers a geographical area denoted by a rectangle whose corners are ⟨x_1, y_1⟩, ⟨x_1, y_2⟩, ⟨x_2, y_1⟩, and ⟨x_2, y_2⟩, where x_1 ≤ x_2 and y_1 ≤ y_2. Each node is thus characterized by ⟨x_1, y_1, x_2, y_2⟩. A query is issued to find a geographical point ⟨x, y⟩, e.g. the destination point. A node c⟨x_1, y_1, x_2, y_2⟩ is referred to as satisfying a query Q⟨x, y⟩ if x_1 ≤ x ≤ x_2 and y_1 ≤ y ≤ y_2. That is, a record on the point ⟨x, y⟩ is included in a leaf node of the subtree of the node c. If a query Q is satisfied by a node c, the query Q⟨x, y⟩ is issued to a child node c_i whose rectangle ⟨x_ci1, y_ci1, x_ci2, y_ci2⟩ satisfies x_ci1 ≤ x ≤ x_ci2 and y_ci1 ≤ y ≤ y_ci2. Thus, a non-leaf node c points to child nodes c_1, ..., c_n. The child nodes c_1, ..., c_n are ordered. Here, child nodes c_i−1 and c_i+1 are the predecessor and successor nodes of a node c_i, respectively. If a pair of child nodes c_i and c_j include adjacent road units, c_i and c_j are referred to as adjacent.
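
The "satisfy" test described above is a point-in-rectangle check. The following Python fragment is a minimal sketch of it, not the authors' implementation; the names `Rect`, `satisfies`, and `choose_child` and the example coordinates are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Rect(NamedTuple):
    """Area <x1, y1, x2, y2> covered by a node, with x1 <= x2 and y1 <= y2."""
    x1: float
    y1: float
    x2: float
    y2: float


def satisfies(area: Rect, x: float, y: float) -> bool:
    """True if the query point <x, y> lies inside the node's area."""
    return area.x1 <= x <= area.x2 and area.y1 <= y <= area.y2


def choose_child(children: Sequence[Rect], x: float, y: float) -> Optional[int]:
    """Index of the first child (in child order) whose area satisfies the query."""
    for i, area in enumerate(children):
        if satisfies(area, x, y):
            return i
    return None


# Hypothetical example: a node covering [0, 10] x [0, 10] with two children.
parent = Rect(0, 0, 10, 10)
children = [Rect(0, 0, 5, 10), Rect(5, 0, 10, 10)]
print(satisfies(parent, 7, 3))        # True
print(choose_child(children, 7, 3))   # 1
print(choose_child(children, 12, 3))  # None
```

A point on a shared boundary (x = 5 here) satisfies both children; this sketch simply takes the first one in child order, which is an assumption rather than something the text specifies.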
   Each node c has the following pointers as shown in Figure 3.
  c.parent        = parent node of c.
  c.child[i]      = the ith child node of c.
  c.pred          = predecessor node of c.
  c.succ          = successor node of c.
  c.adjacent      = adjacent nodes of c.
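
The pointer set above can be pictured as a small record type. The sketch below is only an illustration under stated assumptions: the class `EDRNode` and the helpers `attach_children` and `link_adjacent` are hypothetical names, and predecessor and successor links are set only among children of the same parent.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity; the links form cycles
class EDRNode:
    name: str
    parent: Optional["EDRNode"] = None
    child: List["EDRNode"] = field(default_factory=list)
    pred: Optional["EDRNode"] = None
    succ: Optional["EDRNode"] = None
    adjacent: List["EDRNode"] = field(default_factory=list)


def attach_children(parent: EDRNode, children: List[EDRNode]) -> None:
    """Set parent, child, pred, and succ pointers for an ordered child list."""
    parent.child = list(children)
    for i, node in enumerate(parent.child):
        node.parent = parent
        node.pred = parent.child[i - 1] if i > 0 else None
        node.succ = parent.child[i + 1] if i + 1 < len(parent.child) else None


def link_adjacent(a: EDRNode, b: EDRNode) -> None:
    """Record that a and b hold adjacent road units (symmetric link)."""
    if b not in a.adjacent:
        a.adjacent.append(b)
    if a not in b.adjacent:
        b.adjacent.append(a)


# Hypothetical usage with node names taken from Figure 3.
d01 = EDRNode("D01")
leaves = [EDRNode("D011"), EDRNode("D012")]
attach_children(d01, leaves)
d021 = EDRNode("D021")
link_adjacent(leaves[0], d021)
print(leaves[0].succ.name)                   # D012
print([n.name for n in leaves[0].adjacent])  # ['D021']
```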
[Figure 3. EDR-tree. The root D0 has index nodes D01, D02, and D03 as child nodes, and each index node D0i has leaf nodes D0i1 and D0i2; the spatial data is stored in the database nodes. Nodes are linked by child, parent, predecessor, successor, and adjacent links.]

[Figure 4. Separation of node DI: when DI is full, the space stored in DI is separated into DI1 and DI2.]

Each vehicle collects properties of its current road unit r_ij and stores them in the EDR-tree as follows, for every vehicle v:

 1. A vehicle v on a road unit object r_ij collects a traffic information record a(r_ij) and sends the record a(r_ij) to the current database node DI.

 2. If the database node DI is not full, the record is stored in the node DI.

 3. If the database node DI is full, the database node DI is split into two nodes DI1 and DI2 [Figure 4]. Each node DIk shows a subspace. DI becomes an index node for the leaf nodes DI1 and DI2.

Figure 3 shows an EDR-tree for a space D. First, the node D0 covers the whole vehicle space. The node D0 is separated into the nodes D01, D02, and D03. Then, each child node D0i is separated into D0i1 and D0i2 (i = 1, 2, 3). In general, a node D0i is further separated into D0i1, ..., D0il0i. Here, Di is a parent node and Di1, ..., Dili are child nodes of Di. The node Di is an index node which points to the child nodes Di1, ..., Dili. The nodes D0i and D0i1 are stored in the same computer, i.e. the child node D0i1 is stored in the same computer as the parent node D0i.

Let us consider a pair of leaf nodes Di and Dj. Di and Dj are referred to as adjacent iff a pair of adjacent road units r_ie and r_ju are included in the nodes Di and Dj, respectively. In addition, the child nodes are totally ordered.

Next, let D1, ..., Dn be child nodes of a node D. The child nodes are ordered. Here, Di−1 and Di+1 are referred to as the predecessor and successor nodes of a node Di. Let Sub(Di) show the subtree whose root is a node Di in the EDR-tree T. Let Q(destination) be a query to find a destination road unit object. If the subtree Sub(Di) includes the destination road unit object, the subtree Sub(Di) is referred to as satisfying the query Q(destination).

A query Q(destination) issued by a vehicle v is processed in the EDR-tree as follows:

 1. A vehicle v issues a query Q(destination) to the current database node D. Here, the node D is the initial node of the query Q.

 2. On receipt of a query Q(destination) from a child node C, a node D forwards the query Q in one of the following ways:

      (a) Q is sent down to a child node C' (≠ C) if the subtree Sub(C') satisfies the query Q.
      (b) Q is sent up to a parent p if the subtree Sub(D) does not satisfy the query Q.
      (c) Q is sent to a sibling node C'' if the subtree Sub(D) does not satisfy the query Q but Sub(C'') satisfies the query Q.
      (d) Q is sent to an adjacent node C''' if the subtree Sub(D) does not satisfy the query Q but Sub(C''') satisfies the query Q.

 3. On receipt of a query Q from a parent node C, a node D forwards the query Q in one of the following ways:

      (a) Q is sent to a child node C' if the subtree Sub(C') satisfies the query Q.
      (b) Q is sent to a sibling node C'' if the subtree Sub(D) does not satisfy the query Q but Sub(C'') satisfies the query Q.
      (c) Q is sent to an adjacent node C''' if the subtree Sub(D) does not satisfy the query Q but Sub(C''') satisfies the query Q.

 4. On receipt of the query Q(destination) at a database node D, if the node D is a destination node of Q, the node D directly sends an ACK reply to the initial node of Q; otherwise, the node D sends a NAK reply to the initial node.

 5. On receipt of a reply, if a node D is the initial node, the node D sends the reply to the vehicle v. If the vehicle v has moved to another current node D', the reply is forwarded to the current node D'.

Suppose D011 is the current database node of a vehicle v in Figure 5. Suppose the vehicle v would like to find a database node D021. In the traditional way, the vehicle v first sends a query Q to the root D0. Then, the query Q is sent to D02 and then to D021. In the EDR-tree, on the other hand, the query Q is forwarded from D011 to the database node D021 through the adjacent link.

4 Evaluation

We evaluate the EDR-tree in terms of delay time, i.e. how long it takes to deliver a query request to a destination node. Each vehicle v first issues a query Q to its current database node. Here, let v.current denote the current node of a vehicle v.

In this paper, we consider the following environment for the evaluation.

 1. The EDR-tree T includes ten nodes which are tree-structured as shown in Figure 5. D0, D01, D02, and D03 are index nodes. D011, D012, D021, D022, D031, and D032 are leaf database nodes.

 2. The throughput of every current node is the same. That is, it takes the same time to process a query at each node.

 3. Each database node has a part of the road information.

In the dynamic R-tree (DR-tree), every query Q is sent to the root node. Then, the query Q goes down to a leaf node which includes the destination. In the enhanced DR-tree (EDR-tree), on the other hand, a query Q is first sent to the current node v.current of a vehicle v. On receipt of the query Q, an index node takes one of the following ways to forward the query Q to the destination node as shown in Figure 5:

 1. Q is sent up to a parent node.
 2. Q is sent down to a child node.
 3. Q is sent to a sibling node, i.e. a successor or predecessor.
 4. Q is sent to an adjacent node.

[Figure 5. Search. The ten-node tree with root D0, index nodes D01, D02, D03, and leaf nodes D011, D012, D021, D022, D031, D032, showing DR-tree links, sibling links, and adjacent links, the current node and the destination node, and the DR-tree and EDR-tree access paths.]

The nodes are realized as processes on the blade server HP ProLiant BL 10e G2 with Linux. Each blade is equipped with a Pentium M (1 GHz) CPU and 512 MB of memory. The blades are interconnected with Fast Ethernet. The spatial data of roads is stored in the PostgreSQL 8.2.5 database management system. A query is randomly initiated by a vehicle at each leaf node. The recurrence interval of queries is 10 milliseconds.

Figure 6 shows the delay time for the number of queries initiated. The delay time is reduced in the EDR-tree. The more queries are issued, the smaller the reduction rate the EDR-tree obtains. One idea to improve the performance is to separate the leaf nodes where more queries are issued.

Figure 7 shows how many nodes each query traverses in the tree. The EDR-tree accesses fewer nodes than the DR-tree.
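
The difference in the number of nodes visited can be illustrated on the ten-node tree of Figure 5. The following Python fragment is a minimal sketch, not the evaluated implementation. It assumes that D011 and D021 are the only adjacent pair (as in the example of D011 and D021 given in the text), that siblings are the immediate predecessor and successor under the same parent, and that a node tries adjacent nodes first, then siblings, then its parent. The text only says that the next node is chosen so that the number of nodes visited is minimized, so this preference order and the names `dr_path` and `edr_path` are hypothetical. The destination is assumed to be a node name in the tree.

```python
from typing import Dict, List, Set

# Ten-node tree of Figure 5: one root, three index nodes, six leaf nodes.
CHILDREN: Dict[str, List[str]] = {
    "D0": ["D01", "D02", "D03"],
    "D01": ["D011", "D012"],
    "D02": ["D021", "D022"],
    "D03": ["D031", "D032"],
}
PARENT: Dict[str, str] = {c: p for p, cs in CHILDREN.items() for c in cs}
# Assumption: D011 and D021 hold adjacent road units.
ADJACENT: Dict[str, List[str]] = {"D011": ["D021"], "D021": ["D011"]}


def subtree(node: str) -> Set[str]:
    """All nodes in Sub(node), including node itself."""
    nodes = {node}
    for child in CHILDREN.get(node, []):
        nodes |= subtree(child)
    return nodes


def satisfies(node: str, dest: str) -> bool:
    """Sub(node) satisfies the query if it contains the destination node."""
    return dest in subtree(node)


def siblings(node: str) -> List[str]:
    """Predecessor and successor of node among its parent's ordered children."""
    if node not in PARENT:
        return []
    order = CHILDREN[PARENT[node]]
    i = order.index(node)
    return [order[j] for j in (i - 1, i + 1) if 0 <= j < len(order)]


def dr_path(dest: str) -> List[str]:
    """Root-first search: start at D0 and descend to the destination."""
    path = ["D0"]
    while path[-1] != dest:
        path.append(next(c for c in CHILDREN[path[-1]] if satisfies(c, dest)))
    return path


def edr_path(current: str, dest: str) -> List[str]:
    """Bottom-up search starting at the vehicle's current database node."""
    path = [current]
    while path[-1] != dest:
        node = path[-1]
        if satisfies(node, dest):
            candidates = CHILDREN[node]  # go down
        else:
            candidates = ADJACENT.get(node, []) + siblings(node)
        hit = next((c for c in candidates if satisfies(c, dest)), None)
        path.append(hit if hit is not None else PARENT[node])  # else go up
    return path


print(dr_path("D021"))           # ['D0', 'D02', 'D021']
print(edr_path("D011", "D021"))  # ['D011', 'D021']
print(edr_path("D011", "D022"))  # ['D011', 'D01', 'D02', 'D022']
```

In the last call the query reaches D022 through the sibling link from D01 to D02 without visiting the root D0.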

[Figure 6. Delay time.]

[Figure 7. Number of nodes accessed.]

5 Concluding Remarks

In this paper, we proposed the enhanced dynamic R-tree (EDR-tree) to store and manipulate traffic data collected by vehicles. In the EDR-tree, a vehicle first issues a query to its current leaf node, which includes data on the current road unit object where the vehicle is moving. Then, the query goes up to a parent node or goes to a sibling or adjacent node. Finally, the query is delivered to the leaf node which includes the properties of the destination. In traditional tree-structured search methods like the DR-tree and B+-tree, every query is first sent to the root and goes down to leaf nodes. Hence, the root node is congested and is a single point of failure. In the EDR-tree, overheads for query processing can be distributed among the nodes. Even if a node in the upper layers is faulty, queries can still be delivered to the destination node as long as the nodes are in the subtrees of the faulty node.

Acknowledgment

This research is partially supported by Research Institute for Science and Technology [Q06J-07] and Academic Frontier Research and Development Center [18-J-6], Tokyo Denki University.

References

 [1] KANTO INFRASTRUCTURE PORTALSITE.
 [2] Metropolitan Police Department.
 [3] VICS.
 [4] A. Guttman. R-trees: a Dynamic Index Structure for Spatial Searching. In Proc. of ACM SIGMOD Conf. on Management of Data, pages 47–57, 1984.
 [5] R. Bayer and E. McCreight. Organization and Maintenance of Large Ordered Indexes. In Acta Informatica, Vol. 1, Fasc. 3, pages 173–189, 1972.
 [6] S. Bianchi, A. K. Datta, P. Felber, and M. Gradinariu. Stabilizing Peer-to-Peer Spatial Filters. In Proc. of IEEE ICDCS-2007, pages 27–35, 2007.
 [7] FlexRay Consortium. FlexRay Communications System Protocol Specification Version 2.1 Revision A, 2005.
 [8] Y. Manolopoulos, A. Nanopoulos, A. N. Papadopoulos, and Y. Theodoridis. R-Trees: Theory and Applications. Springer-Verlag, 2005.
 [9] J. Ohya, Y. Seki, and K. Suzuki. A Study on Operation of ETC (Electronic Toll Collection). In Proc. of IEEE ITSC-1999, pages 581–584, 1999.
[10] M. Takizawa, S. Hamada, and S. M. Deen. Vehicle Transactions. In Proc. of the 4th Int'l Conf. on Database and Expert Systems Applications (DEXA'93), pages 611–614, 1993.
[11] S. Tsugawa. Inter-Vehicle Communications and their Applications to Intelligent Vehicles: An Overview. In Intelligent Vehicle Symposium, June, pages 17–22, 2004.
