					                      Continuous Reverse Nearest Neighbor Search

                                     Lien-Fa Lin*, Chao-Chun Chen
                              Department of Computer Science and Information
                    Engineering National Cheng-Kung University, Tainan, Taiwan, R.O.C.
                        Department of Information Communication Southern Taiwan
                             University of Technology, Tainan, Taiwan, R.O.C.

                                lienfa@cc.kyu.edu.tw,chencc@mail.stut.edu.tw


                                                 ABSTRACT
               Services that answer queries based on the location of an object are called Location
               Based Services (LBSs), and Reverse Nearest Neighbor (RNN) queries are one of them.
               RNN queries have diverse applications, such as decision support systems, market
               decisions, document database queries, and biological information. Past studies of
               RNN, however, focused on static inquirers and did not consider the continuous
               demand for RNN queries while the inquirer is moving. In a wireless network
               environment, users are often on the move, and issuing a query while moving is
               natural behavior. Supporting such a service is therefore very important; we refer
               to this type of query as a Continuous Reverse Nearest Neighbor (CRNN) query.
               Because an inquirer’s location changes over time, an RNN query returns different
               results at different locations. For a CRNN query, executing an RNN search for every
               point of time during a continuous query period is prohibitively expensive. In this
               work, an efficient algorithm is designed to provide precise results of a CRNN query
               in just one execution. In addition, a large number of experiments were conducted to
               verify the proposed method, and the results showed a significant improvement in
               efficiency.

               Keywords: Location Based Services, Location-Dependent Query, Continuous
               Query, Reverse Nearest Neighbor Query, Continuous Reverse Nearest Neighbor
               Query

1   INTRODUCTION

     As wireless network communications and mobile device technology develop vigorously and positioning technology gradually matures, LBS is becoming a key development in industry as well as in academic circles [2, 5, 13, 21, 26, 27]. According to the report “IT Roadmap to a Geospatial Future” [6], LBSs will embrace pervasive computing and transform mass advertising media, marketing, and different societal facets in the upcoming decade. Although LBSs have existed in the traditional computing environment (such as Yahoo! Local), their greatest development potential lies in the domain of mobile computing, which provides freedom of mobility and access to information anywhere.

     LBSs will become an indispensable application in mobile networks, as the required technology has matured and 3G wireless communication infrastructure is expected to be deployed everywhere. A query that serves LBSs is referred to as a Location-Dependent Query (LDQ); its applications include Range Query, Nearest Neighbor (NN) query, K-Nearest Neighbor (KNN) query, and Reverse Nearest Neighbor (RNN) query.

     There are plenty of studies about NN [14, 22, 26], KNN [4, 9, 14, 23, 25], CNN [17, 3, 12, 20], and CKNN [17, 20] queries, and issues pertaining to the Reverse Nearest Neighbor (RNN) query [10, 11, 16, 18, 19, 22, 24] have been receiving attention in recent years. Given S, a collection of objects, and q, a query object, an RNN query finds the objects in S whose nearest neighbor is q. Practical examples of RNN queries are provided in [10]. If a bank is planning to open a new branch, and its clients prefer the nearest possible branch, then the new branch should be established at a location that is closer to the majority of its clients than any other bank. Taxi cabs selecting passengers is another good example. If a taxi cab uses wireless devices to find out the location of its customers, then RNN queries will be far more advantageous than NN queries from the aspect of


                    Ubiquitous Computing and Communication Journal
competition. Figure 1 illustrates that Customer c is the nearest neighbor of Taxi a, but that does not necessarily mean Taxi a can capture Customer c, because Taxi b is even closer to Customer c. Instead, the best option for Taxi a is Customer d, because Taxi a is the nearest neighbor of Customer d. That is, d is the RNN of a, and a may reach d faster than any other taxi. This is an example of a CRNN query, because the query object, the taxi, changes location over time. Users in a wireless environment are typically on the move, which is why the continuous query is an important issue in that environment.

     To the best of our knowledge, no prior work has addressed this issue. Because an inquirer changes location constantly over time, RNN queries return different results as the location changes. For a CRNN query, executing an RNN search for every point of time during a continuous query period is prohibitively expensive. The larger the number of query objects and the shorter the time segment, the longer the calculation time will be.

     In addition, because time is continuous, choosing an appropriate sampling interval for RNN search is a concern. If the interval between RNN searches is too short, then more RNN searches need to be executed to complete the CRNN query. If the RNN search is repeated at longer intervals to reduce the number of executions, the RNN query result for the whole time segment loses accuracy because the sampling frequency is insufficient.

     In this paper, a more efficient algorithm is designed to replace the processing of each and every point of time for RNN search. A single execution of the CRNN query is all it takes to divide the query time that a user is interested in into segments that share the same answer and to find the RNN for each of these segments.

     In addition, an index is used to filter out unnecessary objects, reducing the search space and improving CRNN search efficiency. The experimental results suggest that using the index is 20 times more efficient than not using it when the number of objects is 1000.

     This study makes three major contributions:
          This study pioneers continuous query processing methods, as opposed to static queries, for RNN issues.
          A CRNN search algorithm is proposed; just one execution returns all CRNN results.
          The proposed method allows an index that was previously applicable only to finding the RNN of a single query point to support CRNN queries, improving CRNN search efficiency.

     The rest of this work is structured as follows. Related works about RNN search are introduced in Section 2. The problem is defined and the assumptions are described in Section 3. The proposed CRNN search algorithm is introduced in Section 4. The experiment environment and the evaluation parameters are described in Section 5. Finally, a conclusion and future study directions are provided in Section 6.

                    Figure 1: Example of RNN query.

2     RELATED WORK

     RNN search is concerned with finding the objects for which q, a query, is the NN. Related works about RNN search are introduced and summarized in this section under two topics:
          Index methods that support RNN search: The number of objects can be very large; if one must first compute the distance from query q to each object to identify the RNN of query q, then the efficiency may be unacceptably low due to the large computation cost. To accelerate processing, most studies adopt index methods. Major index methods are introduced in this section.
          RNN search of different types: RNN searches in different scenarios are described and categorized according to the static and moving situations of query q and the objects.

2.1    Index Methods for RNN Query

     RNN search is concerned with finding the objects for which q, a query, is the NN, and it requires the distance between query q and each object, that is, the distance from the coordinate of query q to the coordinate of an object. For a given q, not every object is its RNN, and the objects that cannot be RNNs may be left out of consideration to reduce the number of objects to be processed and accelerate the RNN search. Many studies were dedicated to designing an effective indexing structure for the coordinates of objects. The most famous ones are R-Tree proposed by [8] and Rdnn-Tree proposed by



[10]. These two index methods are described below.

2.1.1 R-Tree

     R-Tree is an index structure developed early on for spatial databases and was used by [10] to accelerate RNN search processing. All objects are grouped and then placed on leaf nodes according to the closeness of their coordinates. That is, objects at similar coordinates are put in one group. Next, each group of objects is enclosed in the smallest possible rectangle, which is called a Minimum Bounding Rectangle (MBR). MBRs are then grouped in clusters, which are enclosed in a larger MBR, until all objects are contained in the same MBR. An internal node of an R-Tree stores an MBR that contains all nodes underneath it, and the root of the R-Tree contains all objects. The size and range of an MBR are defined by its lower left coordinate (Ml, Md) and upper right coordinate (Mr, Mu). Figure 2 is an example of an R-Tree. There are 12 objects in total, from a to l; (a, b, c, d) belong to MBR b1, and (e, f, g) belong to MBR b2. MBR b1 and b2 belong to MBR B1, and MBR R contains all objects.

       Figure 2: Example of R-tree Indexing

2.1.2 Rdnn-Tree

     Rdnn-tree (R-tree containing Distance of Nearest Neighbors) [22] improves the method of [10]. The author proposes a single index structure (Rdnn-tree) to answer NN queries and RNN queries at the same time. Rdnn-tree differs from the standard R-tree structure by storing extra information about the nearest neighbor of the points in each node. A leaf node of the Rdnn-tree stores entries of the form (ptid, dnn), as shown in Figure 3, where ptid refers to an object, that is, a d-dimensional point in the data set, and dnn is the distance from that object to its NN. A non-leaf node stores entries of the form (ptr, Rect, MaxDnn), where ptr points to the address of a child node, Rect is the MBR of all child nodes subordinate to this node, and MaxDnn is the maximum value of dnn of all objects in the child trees subordinate to this node. That is, the distance from any object contained in these child trees to its NN will not exceed MaxDnn.

       Figure 3. Data structure of Rdnn-tree

2.2 Categories of RNN queries

     Depending on the static or moving status of query q and the query objects, related studies can be summarized into 4 categories.
1. If query q and the query objects are both static, then this category is called static query vs. static objects.
2. If query q is moving and the query objects are static, then this category is called moving query vs. static objects.
3. If query q is static and the query objects are moving, then this category is called static query vs. moving objects.
4. If both query q and the query objects are moving, then this category is called moving query vs. moving objects.

2.2.1 Static query vs. static objects

     The scenario in which both query q and the query objects are static is discussed first, because immobile queries and objects are easier to process than the other scenarios. The method proposed in [10] is introduced here. For a static database, the author adopts a special R-tree, called RNN-tree, for answering RNN queries. For a database that requires frequent updates, the author proposes a combined use of NN-tree and RNN-tree. The NN of every object is stored in the RNN-tree, and the NN-tree stores the objects themselves and their respective collections of NN. The author draws a circle around every object, with the object as the center and the distance from the object to its NN as the radius, and then examines every circle that contains query q to find the answers of the RNN query. Such a method, however, is very inefficient for a dynamic database, because the structures of the NN-tree and RNN-tree must be changed whenever the database is updated. In [22], the method proposed by [10] is therefore improved. The author proposes a single index structure, Rdnn-tree, for answering NN queries and RNN queries at the same



time. It differs from a normal R-tree in that it additionally stores NN information for every object (i.e. the Distance of Nearest Neighbor), and the NN of every object must be calculated in advance.

2.2.2 Static query vs. moving objects

     The studies mentioned above primarily assume a monochromatic situation, in which all objects, including query q and the query objects, are of the same type. In [18], the researchers address this type of issue in a bichromatic situation, in which objects are divided into two different types: one is the inquirer, and the other is the query object. NN and range query techniques are used in that paper to handle RNN issues.

2.2.3 Moving query

     This subsection discusses the situation in which an inquirer is no longer static but changes location over time, and the query objects can be either static or moving. That is, two categories of query are involved: moving query with static objects and moving query with moving objects. Because the inquirer is moving, these two categories of query return different results for the identical RNN search at different points of time. This type of issue is clearly more complicated than the issues previously discussed. To the best of our knowledge, no related study has discussed these two categories. In this study, solutions for a moving query with static objects are pursued.

3 Problem Formulation

     A CRNN query concerns a period of continuous time in which adjacent points of time may have the identical RNN. That is, the RNN stays the same over a period of time until query q moves beyond this period. Please refer to Figure 4. When a user executes CRNN query q, the time segment of the continuous query is [Ps, Pe], and the query objects are {a, b, c, d}. If a time point P1 can be identified such that any given point of time in the time segment from Ps to P1, or [Ps, P1], has the same RNN result, then a one-time execution of RNN search is all that is needed for the time segment [Ps, P1]. If points of time P2, P3, and P4 are also identified such that any given point of time within [P1, P2] has the identical RNN result, and likewise within [P2, P3] and within [P3, P4], then the entire CRNN query needs only a one-time execution of RNN search for each time segment.

     For processing a CRNN query, it is not necessary to execute an RNN search for every point of time and return the RNN of every point of time to query q; instead, i segments of time, segment1, segment2, …, segmenti, each of which has a constant result, are first identified within the entire CRNN query time period. The RNN result of each segment of time, RNN1, RNN2, …, RNNi, is calculated separately, and the result is returned to the inquirer in the format (q, [segment1]) = {RNN1 result}, …, (q, [segmenti]) = {RNNi result}.

       Figure 4. Example of a CRNN query

     Based on the description above, the CRNN query problem that this study addresses may be stated as follows:
Given:
A collection of static objects S = {O1, O2, …, On}
A query point q, its current position (q.x, q.y), and moving velocity (v.x, v.y)
A continuous query time [Ps, Pe], where Ps represents the coordinate of the point of time when the query begins, and Pe represents the coordinate of the point of time when the query ends
Find:
The points of time {P1, P2, …, P(i-1)} within [Ps, Pe] such that the RNNs of q remain constant between any two adjacent points of time.
Such that:
RNN(q, [Ps, P1]) = {RNN1}, RNN(q, [P1, P2]) = {RNN2}, …, RNN(q, [P(i-1), Pe]) = {RNNi},
where every segment is contained in [Ps, Pe] and every {RNNj} ⊆ {O1, O2, …, On}.
Under the assumptions:
1. The moving direction of query q is fixed.
2. All query objects are static.

     As described above, adjacent points of time may share the same RNN; that is, a segment of time has a single RNN result until query q moves into another segment of time that has a different RNN. The CRNN search algorithm proposed in this study uses exactly this concept. First, the points of time at which the RNN result changes within the query time are identified. These points of time divide the query time into several segments that have different RNN results, and then the RNN results are identified for each of the segments. The detailed algorithm of CRNN Search is explained in the next section.

4. CRNN Search Algorithm

     The detailed procedure of the CRNN Search algorithm is introduced in this section. CRNN Search



algorithm is divided into two steps.
Step 1: Finding segment points of CRNNq
     The points of time at which the RNN result changes are identified. Based on these points of time, the CRNN query is divided into several time segments, each of which requires one execution of RNN search. The RNN result for any given point of time within one segment remains constant, and different segments have different RNN results.
Step 2: Calculating RNN result of each segment
     The RNN results are calculated separately for each of the segments produced in the previous step.

     The entire procedure for processing CRNN Search is illustrated in Figure 5. Given the query objects and the continuous query (query path) as input, it is divided into two steps: finding segment points of CRNNq and calculating the RNN result of each segment. Each of the steps is described below.

         Figure 5: Flow chart of CRNN query processing

4.1 Finding segment points of CRNN

     A CRNN query covers a period of continuous time; the distance that the query moves between some adjacent points of time is very short, so these points may have the same RNN result. That is, the entire period of the continuous query is divided into several segments, and the RNN result within each segment is the same. If these points of time share the same RNN result, then it is not necessary to execute an RNN search for each of the points of time; a one-time calculation is enough. Therefore, a CRNN query does not require executing an RNN search for all points of time. Instead, points of time that share the same RNN result are grouped into time segments, and a one-time RNN search is executed for each of the segments. The RNN of query q is the collection of the objects whose NN is query q. If the distance from each object to its NN is known in advance, then an object is an RNN of query q when the distance from query q to the object is shorter than the distance from the object to its NN.

     As illustrated in Figure 6, if the NN of object a is b, and a circle is drawn with a as the center point and the distance ab as the radius, then the distance from query q to a must be shorter than the distance from a to its NN, object b, as long as query q falls within this circle. Therefore, during the period of time when query q remains within this circle, the RNNs of query q must include a, until query q moves out of this circle. Because the moving direction of query q is assumed to be fixed, the CRNN query forms a query line (qline) from its beginning to its end. The point at which the CRNN query leaves this circle is the intersection S of this circle and the query line. Before intersection S, the result of the RNN query must include object a; beyond intersection S, the result of the RNN query does not include object a, so the RNN results differ. This intersection is referred to as a segment point.

     This explains why an intersection of the query line and a circle whose radius is the distance to the NN is a point of time at which the RNN query result changes. Drawing a circle for every object, with the object itself as the center and the distance to its NN as the radius, yields intersections with the query line that together cut the CRNN query into several time segments that have different RNN query results.

       Figure 6. Finding segment point of CRNN search

     Figure 7 illustrates the time segmentation process described above. For objects a, b, and c, their respective NNs are identified first: NN(a)=b, NN(b)=a, and NN(c)=b. Next, each object is used as the center of a circle, with the distance to its respective NN as the radius, to draw the circles of a, b, and c. Then, the endpoints Ps and Pe together with the intersections of the circles and the qline, P1, P2, P3, and P4, are sorted according to time, and every two adjacent points define a time segment. The entire CRNN query is cut into five time segments, [Ps, P1], [P1, P2], [P2, P3], [P3, P4], and [P4, Pe]. Every segment has a unique RNN query result.
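The circle construction and the RNN membership test described in this section can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (nn_distances, segment_points, crnn), the representation of the qline by a parameter t in [0, 1] running from Ps to Pe (proportional to time because the velocity is fixed), and the sample coordinates are all hypothetical, and the NN distances are computed by brute force rather than read from an index.

```python
import math

def nn_distances(objects):
    """Map each object name to the distance from it to its nearest neighbor.

    Assumes at least two objects; computed by brute force for illustration.
    """
    return {
        name: min(math.dist(p, o) for other, o in objects.items() if other != name)
        for name, p in objects.items()
    }

def segment_points(ps, pe, objects, dnn):
    """Sorted parameters t in (0, 1) at which the qline ps -> pe crosses a
    circle centered on an object with radius = distance to that object's NN."""
    dx, dy = pe[0] - ps[0], pe[1] - ps[1]
    a = dx * dx + dy * dy              # assumes ps != pe
    points = set()
    for name, (ox, oy) in objects.items():
        fx, fy = ps[0] - ox, ps[1] - oy
        b = 2 * (fx * dx + fy * dy)
        c = fx * fx + fy * fy - dnn[name] ** 2
        disc = b * b - 4 * a * c       # discriminant of |ps + t*(pe-ps) - o|^2 = r^2
        if disc <= 0:                  # qline misses the circle or only touches it
            continue
        root = math.sqrt(disc)
        for t in ((-b - root) / (2 * a), (-b + root) / (2 * a)):
            if 0 < t < 1:              # keep crossings inside the query period
                points.add(t)
    return sorted(points)

def crnn(ps, pe, objects):
    """Return a list of (t_start, t_end, rnn_set) covering the query period."""
    dnn = nn_distances(objects)
    cuts = [0.0] + segment_points(ps, pe, objects, dnn) + [1.0]
    segments = []
    for t0, t1 in zip(cuts, cuts[1:]):
        tm = (t0 + t1) / 2             # any interior point represents the segment
        q = (ps[0] + tm * (pe[0] - ps[0]), ps[1] + tm * (pe[1] - ps[1]))
        # An object is an RNN of q when q is closer to it than its own NN is.
        rnn = {name for name, p in objects.items() if math.dist(q, p) < dnn[name]}
        segments.append((t0, t1, rnn))
    return segments

# Hypothetical data: two objects on the x-axis, query moving from x=0 to x=10.
objects = {"a": (3.0, 0.0), "b": (7.0, 0.0)}
for segment in crnn((0.0, 0.0), (10.0, 0.0), objects):
    print(segment)
```

With these two sample objects, each circle has radius 4, so the query period splits at t = 0.3 and t = 0.7, and the three segments have the RNN sets {a}, {a, b}, and {b}. Evaluating the membership test once at the midpoint of each segment is one simple way to obtain the per-segment result, since the result cannot change between two adjacent segment points.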



    Figure 7: Segmenting of the CRNN query

4.2 Calculating RNN result of each segment

     In the previous section, the intersections of the qline and the circles whose radii are the distances between the objects and their respective NNs were defined. With these intersections, the CRNN query is cut into several time segments. The next step is to find the RNNs for each of the time segments. Because each circle is drawn with the distance from an object to its NN as the radius and is labeled with that object's identifier, if a segment falls within certain circles, then the RNN of this time segment for the CRNN query is the collection of objects represented by those circles. This is illustrated in Figure 8. First, the intersections of the qline that represents the CRNN query and the circles of the objects are sorted by time; every two adjacent intersection points define a time segment, and there are five segments, [Ps, P1], [P1, P2], [P2, P3], [P3, P4], and [P4, Pe]. Segment [Ps, P1] is contained only by circle a; therefore, RNN(q, [Ps, P1]) = {a}. Next, examine segment [P1, P2]; this segment is contained by circle a and circle b. Therefore, RNN(q, [P1, P2]) = {a, b}. If this process is repeated, the obtained results are RNN(q, [P2, P3]) = {a, b, c}, RNN(q, [P3, P4]) = {b, c}, and RNN(q, [P4, Pe]) = {c}.

    Figure 8: Calculating RNN result of each segment.

4.3 CRNN Algorithm with Index

     Not every object will be an answer in the processing of a CRNN query. To improve query efficiency, the objects that cannot be answers should be filtered out in advance to greatly reduce the search space of the CRNN query, the amount of data that must be processed, and consequently the computation cost. This filtering step, which improves CRNN query efficiency dramatically, is referred to as the pruning process. Figure 9 illustrates the flowchart of CRNN query processing with the pruning process added.

    Figure 9. Flow chart of CRNN query with index

     Steps 2 and 3 are identical to Steps 1 and 2 of the CRNN search algorithm, which were described in the previous sections and are not repeated here. For Step 1, the pruning process, an index structure based on the Rdnn-tree is used to execute the pruning effectively. The three steps of CRNN query with index are illustrated in Figure 9. The pruning process is described below. For every internal node of the Rdnn-tree, the distance from query q to the Rect of the node is computed for every branch, and the distance is denoted as D(q, Rect).
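The text does not spell out how D(q, Rect) is measured. A common choice for R-tree style indexes is the minimum distance from the point to the rectangle, which is zero when the point lies inside it. The sketch below assumes that interpretation and reuses the corner convention of Section 2.1.1, lower left (Ml, Md) and upper right (Mr, Mu); the function name min_dist_point_rect and the sample coordinates are hypothetical.

```python
import math

def min_dist_point_rect(q, lower_left, upper_right):
    """Minimum Euclidean distance from point q to an axis-aligned MBR.

    lower_left = (Ml, Md), upper_right = (Mr, Mu).
    Returns 0.0 when q lies inside or on the border of the rectangle.
    """
    qx, qy = q
    ml, md = lower_left
    mr, mu = upper_right
    dx = max(ml - qx, 0.0, qx - mr)   # horizontal gap, 0 if Ml <= qx <= Mr
    dy = max(md - qy, 0.0, qy - mu)   # vertical gap, 0 if Md <= qy <= Mu
    return math.hypot(dx, dy)

print(min_dist_point_rect((0.0, 0.0), (3.0, 4.0), (6.0, 8.0)))  # 5.0 (nearest corner)
print(min_dist_point_rect((4.0, 5.0), (3.0, 4.0), (6.0, 8.0)))  # 0.0 (q inside)
```

Because no object stored under a node can be closer to q than this value, it serves as a lower bound on the distance from q to every object in the subtree.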
     If D(q, Rect) of a node is larger than the MaxDnn of the node, then none of the objects beneath it needs to be considered, because the distance from query q to any object underneath the Rect node is equal to or larger than D(q, Rect). When the distance from query q to the Rect node is longer than MaxDnn, it is impossible for query q to be closer to any object underneath the Rect node than that object's own NN, and no object underneath can be an RNN result for query q. Conversely, if D(q, Rect) is equal to or smaller than the MaxDnn of the node, then some objects underneath the Rect node may be farther from their respective NNs than from query q. That is, some objects may be RNN results for query q. The examination continues along



                     Ubiquitous Computing and Communication Journal
the branch all the way to the leaf node. All entries underneath such a leaf node are recorded as candidate objects for the RNN query result. The collection of these candidate objects is referred to as RNNCanSet: every possible result of the RNN query must exist within this collection, and objects outside of RNNCanSet cannot be RNN query results. Only the objects inside RNNCanSet need to be considered when the CRNN search algorithm finds the segment points of CRNNq. This greatly reduces the number of objects to be handled and enhances the efficiency of the CRNN search algorithm.

      Figure 3 explains the pruning process. It begins with root node R. Because D(q, R) ≤ MaxDnn of R, child nodes B1 and B2 must be examined. Because the MaxDnn of MBR B1 is smaller than D(q, B1), all child nodes underneath B1 can be pruned. Next, D(q, B2) ≤ MaxDnn of B2, so child nodes b3 and b4 of B2 must be examined. D(q, b3) is equal to or smaller than the MaxDnn of b3, which is also a leaf node; therefore, h and i are placed inside RNNCanSet. Next, b4 is examined. MaxDnn of b4 is smaller than D(q, b4); therefore, b4 can be pruned. The entire pruning process then ends.

      However, the CRNN query to be processed is not an RNN query of a single query point; therefore, the pruning process in [22] cannot be used directly. To ensure that no possible RNN result is deleted, the pruning criterion is changed from D(q, Rect) > MaxDnn, where D(q, Rect) is the distance from the query point to Rect, to MinD(qline, Rect) > MaxDnn, where MinD(qline, Rect) represents the minimum distance from qline to the Rect node. The minimum distance is chosen because if the minimum distance from the entire qline to the Rect node is larger than MaxDnn, then the distance from the query position at any point of time on qline to the Rect node must also be larger than MaxDnn. Therefore, none of the objects underneath the Rect node can be an RNN for qline, and they can be pruned from consideration. Details of the pruning algorithm are exhibited in Algorithm 1:




                                       Algorithm 1: Pruning Algorithm.
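Only the caption of Algorithm 1 is available here, so the pruning rule described above can be illustrated with a minimal Python sketch. It is not the authors' listing and rests on several assumptions: the Node class and its field names (rect, max_dnn, children, entries) are hypothetical, rectangles are axis-aligned tuples (xmin, ymin, xmax, ymax), qline is a straight segment between two points a and b, and the helper functions are introduced only for illustration. A subtree is discarded when MinD(qline, Rect) > MaxDnn; otherwise the search descends, and all entries of a reached leaf join RNNCanSet.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical Rdnn-tree node: an MBR plus the largest NN distance beneath it."""
    rect: tuple                                    # (xmin, ymin, xmax, ymax)
    max_dnn: float                                 # MaxDnn of the subtree
    children: list = field(default_factory=list)   # internal node: child nodes
    entries: list = field(default_factory=list)    # leaf node: object identifiers

def point_rect_dist(p, rect):
    """Distance from point p to the nearest point of the rectangle (0 if inside)."""
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - p[0], 0.0, p[0] - xmax)
    dy = max(ymin - p[1], 0.0, p[1] - ymax)
    return math.hypot(dx, dy)

def point_segment_dist(p, a, b):
    """Distance from point p to the segment a-b."""
    vx, vy = b[0] - a[0], b[1] - a[1]
    len2 = vx * vx + vy * vy
    t = 0.0
    if len2 > 0.0:
        t = max(0.0, min(1.0, ((p[0] - a[0]) * vx + (p[1] - a[1]) * vy) / len2))
    return math.hypot(p[0] - (a[0] + t * vx), p[1] - (a[1] + t * vy))

def segment_hits_rect(a, b, rect):
    """Liang-Barsky clipping test: does segment a-b touch the rectangle?"""
    xmin, ymin, xmax, ymax = rect
    dx, dy = b[0] - a[0], b[1] - a[1]
    t0, t1 = 0.0, 1.0
    for p, q in ((-dx, a[0] - xmin), (dx, xmax - a[0]),
                 (-dy, a[1] - ymin), (dy, ymax - a[1])):
        if p == 0:
            if q < 0:
                return False          # parallel to this edge and outside it
        else:
            t = q / p
            if p < 0:
                if t > t1:
                    return False
                t0 = max(t0, t)
            else:
                if t < t0:
                    return False
                t1 = min(t1, t)
    return True

def min_dist(a, b, rect):
    """MinD(qline, Rect): minimum distance between segment a-b and the rectangle."""
    if segment_hits_rect(a, b, rect):
        return 0.0
    xmin, ymin, xmax, ymax = rect
    corners = [(xmin, ymin), (xmin, ymax), (xmax, ymin), (xmax, ymax)]
    dists = [point_rect_dist(a, rect), point_rect_dist(b, rect)]
    dists += [point_segment_dist(c, a, b) for c in corners]
    return min(dists)

def prune(node, a, b, candidates):
    """Collect RNNCanSet for qline a-b; skip subtrees with MinD > MaxDnn."""
    if min_dist(a, b, node.rect) > node.max_dnn:
        return candidates             # no object below can be an RNN of qline
    if node.children:
        for child in node.children:
            prune(child, a, b, candidates)
    else:
        candidates.extend(node.entries)
    return candidates

# Toy tree (made-up values): two leaves under one root.
leaf_a = Node(rect=(0.0, 0.0, 2.0, 2.0), max_dnn=1.0, entries=["a", "b"])
leaf_b = Node(rect=(8.0, 8.0, 10.0, 10.0), max_dnn=1.0, entries=["c"])
root = Node(rect=(0.0, 0.0, 10.0, 10.0), max_dnn=1.0, children=[leaf_a, leaf_b])
candidates = prune(root, (0.0, 3.0), (3.0, 3.0), [])   # candidates == ["a", "b"]
```

In the toy tree, the query segment lies at distance 1 from the first leaf, which does not exceed its MaxDnn of 1, so its entries are kept; the second leaf is about 7.07 away and is pruned. Testing the whole segment once replaces a separate single-point test for every position on qline.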
5. Performance Study

      To evaluate the improvement that the method proposed in this Study has made in CRNN query efficiency, some experiments are designed. This section describes the experiment environment, the experimental parameters and settings, and the comparison of experiment results.

5.1 Experiment Settings

      The coordinates of the objects are dispersed over a [0, 1]×[0, 1] plane. Because the distribution density of the objects may influence efficiency, it should be taken into



consideration in the experiment. Three different types of distribution are used to generate the objects' coordinates: Uniform distribution, Gaussian distribution, and Zipf distribution. In Uniform distribution, the objects are evenly distributed on the plane, as shown in Figure 10(a). In Gaussian distribution, most of the objects concentrate on the center of the plane, as shown in Figure 10(b). In Zipf distribution, most of the objects are located at the extreme left and extreme bottom of the plane; the skew factor is set at 0.8, as shown in Figure 10(c). In addition, 30 queries are generated randomly in a [0.4, 0.6]×[0.4, 0.6] plane by referring to [15], and the velocity vector of each query falls within [-0.01, 0.01]. Because this experiment is concerned with the influence of different types of object distribution on efficiency, the queries are generated as close to the center of the plane as possible. After the 30 queries are executed, the average cost of executing one CRNN search is used to determine which method is more favorable. As to the implementation of Rdnn-tree in the CRNN search algorithm, the R*-tree code of GIST [7] is adapted into an Rdnn-tree to meet the requirements of this experiment.
Figure 10: Data sets of experiment evaluation: (a) Uniform, (b) Gaussian, (c) Zipf. Each panel plots the objects on the plane, with the X axis and the Y axis both ranging from 0 to 1.
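The data sets and queries described above could be generated along the following lines. This is a minimal Python sketch, not the authors' generator. The function names, the Gaussian standard deviation (0.15), the clamping of Gaussian samples to the plane, and the power transform used to skew the Zipf-like coordinates are all assumptions. Only the [0, 1]×[0, 1] plane, the skew factor 0.8, the query region [0.4, 0.6]×[0.4, 0.6], the velocity range [-0.01, 0.01], and the count of 30 queries come from the text.

```python
import random

def clamp01(v):
    """Keep a coordinate inside the [0, 1] plane."""
    return min(1.0, max(0.0, v))

def uniform_points(n, rng):
    """Objects spread evenly over the plane."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def gaussian_points(n, rng, sigma=0.15):
    """Objects concentrated around the center (0.5, 0.5); sigma is an assumed value."""
    return [(clamp01(rng.gauss(0.5, sigma)), clamp01(rng.gauss(0.5, sigma)))
            for _ in range(n)]

def zipf_points(n, rng, skew=0.8):
    """Objects skewed toward the left and bottom edges.

    Assumed transform: u ** (1 / (1 - skew)) pushes uniform samples toward 0.
    """
    exponent = 1.0 / (1.0 - skew)
    return [(rng.random() ** exponent, rng.random() ** exponent) for _ in range(n)]

def make_queries(count, interval, rng):
    """Each query is a segment (qline) from a start point moved by velocity * interval."""
    queries = []
    for _ in range(count):
        start = (rng.uniform(0.4, 0.6), rng.uniform(0.4, 0.6))
        velocity = (rng.uniform(-0.01, 0.01), rng.uniform(-0.01, 0.01))
        end = (start[0] + velocity[0] * interval, start[1] + velocity[1] * interval)
        queries.append((start, end))
    return queries

rng = random.Random(42)          # fixed seed so the sketch is repeatable
data_sets = {
    "Uniform": uniform_points(1000, rng),
    "Gaussian": gaussian_points(1000, rng),
    "Zipf": zipf_points(1000, rng),
}
queries = make_queries(30, 5, rng)
```

With an interval of 5 and velocities of at most 0.01 per time unit, every query segment stays within 0.05 of the central region, which keeps the queries near the center of the plane as the text requires.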

      In addition to the object distribution described above, the influences that the length of the query time (qline) and the number of objects may impose on efficiency are also considered. Three data sets, Uniform, Gaussian, and Zipf, are considered in object distribution. The length of the query time (qline) changes from query length 1 to query length 10. The number of objects changes from 1K to 100K. Parameters and settings used in the experiment are listed in Table 1.

       Table 1: Parameter settings of experiment

Parameter      Description               Settings
distribution   Data distribution         Uniform, Gaussian, Zipf
interval       Time interval of query    1, 2, 5, 8, 10
object-no      Number of data objects    1, 10, 30, 50, 100 (K)

5.2 Compared Algorithms and Performance Metrics

      The most intuitive method for finding RNN is to look for the NN of every object. If the number of objects is n, then the time complexity is O(n^2). Next, determine which objects' NNs are the query point. If the NN of an object is the query point, then the object is an RNN of the query point. The required time complexity for this RNN algorithm is O(n^3).
      However, the most intuitive method for a CRNN query is to execute the RNN algorithm for every point of time in the query period. Because time is continuous, it is impossible to calculate the required count of executions. Therefore, the CRNN query time must be segmented before the total execution time required for the CRNN query can be calculated. The more finely the time is segmented, the more executions of RNN are required. If a period of time is segmented into m segments, then the time complexity will be O(m×n^3), and if time is not segmented finely enough, then the RNN result may be erroneous. These drawbacks make it an inefficient CRNN search algorithm, and it is not compared in this experiment. The efficiency of two methods is compared in this experiment: one uses Rdnn-tree as the index, and the other uses no index. To evaluate these two methods, the time required for one CRNN search execution is compared, and this measure is referred to as total cost in this Study.

5.4 Performance Results and Discussion

      Based on the changes of parameters (distribution, interval, and object-no), different types of experiments have been conducted. Results are summarized by object-no and query interval in the following subsections.

5.4.1 The effect of object-no parameter

      First, the query interval is fixed at 5. The influence imposed on efficiency by object-no



parameter, which is the number of objects, under different types of object distribution is discussed. The experiment result is shown in Figure 11. The X axis represents the number of objects, and the Y axis represents the total cost, that is, the time required for executing one CRNN search. Total cost increases as the number of objects increases.
      In addition, when the number of objects is 1K, the CRNN search that uses no Rdnn-tree index takes about 300 seconds, and the CRNN search that uses Rdnn-tree takes about 15 seconds, which is about 20 times faster. Pruning unnecessary objects by adopting Rdnn-tree as the index reduces the CRNN search space and provides much higher efficiency than not adopting Rdnn-tree. The comparison of the influence of object distribution on efficiency suggests that Zipf distribution offers the best efficiency, followed by Uniform distribution, and Gaussian distribution offers the worst. This result can be explained as follows: because the data in Zipf distribution are located at the far left and the bottom of the plane, most of the data will be pruned, and the number of objects that remain in RNNCanSet is very small, offering the lowest total cost. In contrast, data in Gaussian distribution concentrate in the center of the plane, where the queries are also generated, so very few data can be pruned; many objects remain, causing a large RNNCanSet and thus the highest total cost.
Figure 11: Influences on different types of data distribution from changing object-no. Each of the three panels plots total time (sec., logarithmic scale from 1 to 10^7) against object-no (1K, 10K, 30K, 50K, 100K) for CRNN without index and CRNN with index.

5.4.2 The effect of query interval parameter

      This section focuses on the influence of the length of the query interval on each method under different types of object distribution. Results of the experiment are shown in Figure 12. Generally speaking, when the query interval is lengthened, the distance MinD(qline, Rect) decreases, reducing pruning efficiency. In addition, when the interval increases, the number of time segments produced by the CRNN query increases. Consequently, the number of RNN searches over the segments increases, and the total cost of the CRNN query increases as well.
Figure 12: Effect of query interval parameter for different data distribution. Each of the three panels plots total time (sec., logarithmic scale from 1 to 10^5) against query interval (1, 2, 5, 8, 10) for CRNN without index and CRNN with index.

6. Conclusions and Future Works

    An efficient CRNN search algorithm is proposed in this Study. The algorithm requires only one execution to find the RNN results of all the continuous RNN searches. The diversified experiments in this Study also demonstrate the efficiency of the proposed method. As wireless communication and mobile device technology mature, more and more users access information from wireless information systems through mobile devices. To process requests from more and more mobile users, data dissemination




through broadcast is an effective solution for scalability. The future goal of this Study is to extend the issues of CRNN search to the wireless broadcasting environment.

7. References

[1] AmitSingh, H.F. and Tosun, A.S. (2003) High dimensional reverse nearest neighbor queries. Proceedings of the 20th International Conference on Information and Knowledge Management (CIKM'03), New Orleans, LA, USA, pp. 91–98.
[2] Barbara, D. (1999) Mobile computing and databases - a survey. IEEE Transactions on Knowledge and Data Engineering, 11, 108–117.
[3] Benetis, R., Jensen, C.S., Karciauskas, G., and Saltenis, S. (2002) Nearest neighbor and reverse nearest neighbor queries for moving objects. International Database Engineering and Applications Symposium, Canada, July 17-19, pp. 44–53.
[4] Chaudhuri, S. and Gravano, L. (1999) Evaluating top-k selection queries. Proceedings of the 25th IEEE International Conference on Very Large Data Bases, pp. 397–410.
[5] Civilis, A., Jensen, C.S., and Pakalnis, S. (2005) Techniques for efficient road-network-based tracking of moving objects. IEEE Transactions on Knowledge and Data Engineering, 17, 698–712.
[6] Computer Science and Telecommunication Board. IT Roadmap to a Geospatial Future, The National Academies Press, 2003.
[7] http://gist.cs.berkeley.edu/.
[8] Guttman, A. (1984) R-trees: A dynamic index structure for spatial searching. Proceedings of the 1984 ACM SIGMOD international conference on Management of data, pp. 47–57.
[9] Hjaltason, G.R. and Samet, H. (1999) Distance browsing in spatial databases. ACM Transactions on Database Systems (TODS), 24, 265–318.
[10] Korn, F. and Muthukrishnan, S. (2000) Influence sets based on reverse nearest neighbor queries. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA, May 16-18, pp. 201–212.
[11] Korn, F., Muthukrishnan, S., and Srivastava, D. (2002) Reverse nearest neighbor aggregates over data streams. Proceedings of the International Conference on Very Large Data Bases (VLDB'02), Hong Kong, China, August, pp. 91–98.
[12] Korn, F., Sidiropoulos, N., Faloutsos, C., Siegel, E., and Protopapas, Z. (1996) Fast nearest neighbor search in medical image database. In Proceedings of the 22nd International Conference on Very Large Data Bases (VLDB'96), pp. 215–226.
[13] Lee, D.L., Lee, W.-C., Xu, J., and Zheng, B. (2002) Data management in location dependent services. IEEE Pervasive Computing, 1, 65–72.
[14] Roussopoulos, N., Kelley, S., and Vincent, F. (1995) Nearest neighbor queries. Proceedings of ACM Sigmod International Conference on Management of Data, Illinois, USA, June, pp. 71–79.
[15] Ross, S. (2000) Introduction to Probability and Statistics for Engineers and Scientists.
[16] Stanoi, I., Agrawal, D., and Abbadi, A.E. (2000) Reverse nearest neighbor queries for dynamic databases. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 44–53.
[17] Song, Z. and Roussopoulos, N. (2001) K-nearest neighbor search for moving query point. Proceedings of 7th International Symposium on Advances in Spatial and Temporal Databases, LNCS 2121, Redondo Beach, CA, USA, July 12-15, pp. 79–96.
[18] Stanoi, I., Riedewald, M., Agrawal, D., and Abbadi, A.E. (2001) Discovery of influence sets in frequently updated databases. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 99–108.
[19] Tao, Y., Papadias, D., and Lian, X. (2004) Reverse knn search in arbitrary dimensionality. Proceedings of 30th Very Large Data Bases, Toronto, Canada, August 29-September 3, pp. 279–290.
[20] Tao, Y., Papadias, D., and Shen, Q. (2002) Continuous nearest neighbor search. International Conference on Very Large Data Bases, Hong Kong, China, August 20-23, pp. 279–290.
[21] Xu, J., Zheng, B., Lee, W.-C., and Lee, D.L. (2003) Energy efficient index for energy query location-dependent data in mobile environments. In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, March, pp. 239–250.
[22] Yang, C. and Lin, K.-I. (2001) An index structure for efficient reverse nearest neighbor queries. Proceedings of the 17th International Conference on Data Engineering, pp. 485–492.
[23] Yiu, M.L., Papadias, D., Mamoulis, N., and Tao, Y. (2005) Reverse nearest neighbors in large graphs. Proceedings of 21st IEEE International Conference on Data Engineering (ICDE), Tokyo, Japan, April 5-8, pp. 186–187.
[24] Yu, C., Ooi, B.C., Tan, K.-L., and Jagadish, H.V. (2001) Indexing the distance: An efficient method to knn processing. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 421–430.
[25] Zheng, B., Lee, W.-C., and Lee, D.L. (2003) Search k nearest neighbors on air. Proceedings of the 4th International Conference on Mobile Data Management, Melbourne, Australia, January, pp. 181–195.
[26] Zheng, B., Xu, J., Lee, W.-C., and Lee, D.L. (2004) Energy conserving air indexes for nearest neighbor search. Proceedings of the 9th International Conference on Extending Database Technology (EDBT'04), Heraklion, Crete, Greece, March, pp. 48–66.
[27] Zhang, J., Zhu, M., Papadias, D., Tao, Y., and Lee, D.L. (2003) Location-based spatial queries. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, San Diego, California, USA, June 9-12, pp. 443–454.



