									The Skyline Query in Databases
   Which Objects are the Most Important?


    Donghui Zhang
    College of Computer and Information Science
    Northeastern University
The Skyline of Boston




Skyline: the buildings that are not “dominated”, i.e. not both shorter and farther away than another building.

2009-4-23                  Tian Xia - Northeastern University            2
The Skyline of NBA Players
    NBA statistics data*: 19,112 records, 1946-2004, 17 attributes.
    A sample of the data from 2004.
    Who are the best players?
    Best = Not dominated by any other player.
    Name               Points   Rebounds   Assists   Steals   ……
    Tracy McGrady        2003        484       448      135   ……
    Kobe Bryant          1819        392       398       86   ……
    Shaquille O'Neal     1669        760       200       36   ……
    Yao Ming             1465        669        61       34   ……
    Dwyane Wade          1854        397       520      121   ……
    Steve Nash           1165        249       861       74   ……
    ……                     ……         ……        ……       ……   ……
    * www.databaseBasketball.com
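
The "not dominated" test can be made concrete with a small sketch. It uses only the six rows and the four columns visible in the table (the elided rows and attributes are ignored), and it assumes that a larger value is better in every column. The names `players`, `dominates`, `dominated_by` and `best` are illustrative only.

```python
# Rows copied from the table; only the four visible columns are used:
# (points, rebounds, assists, steals). Larger is assumed to be better.
players = {
    "Tracy McGrady":    (2003, 484, 448, 135),
    "Kobe Bryant":      (1819, 392, 398, 86),
    "Shaquille O'Neal": (1669, 760, 200, 36),
    "Yao Ming":         (1465, 669, 61, 34),
    "Dwyane Wade":      (1854, 397, 520, 121),
    "Steve Nash":       (1165, 249, 861, 74),
}


def dominates(p, q):
    """p dominates q: p is at least as good everywhere and strictly better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


# For every player, collect the players that dominate them.
dominated_by = {
    name: [other for other, o in players.items() if dominates(o, stats)]
    for name, stats in players.items()
}

# "Best" = not dominated by any other player.
best = [name for name, doms in dominated_by.items() if not doms]
print(best)
```

On this small sample, Kobe Bryant is dominated by Tracy McGrady and Dwyane Wade, and Yao Ming by Shaquille O'Neal; the other four players are not dominated. With the full 17 attributes the result could differ.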
The Skyline of Hotels

    Example: suppose a student wants to find a hotel
     near the ICFP’07 conference hotel.
    Which are the best choices?
    Hotels in Germany
             Price   Distance
     t1       3         2
     t2       4         7
     t3       9         5
     t4       4         6
     t5       2         3
     t6       6         1
     t7       1         4

    [Figure: the hotels plotted with Price (1 to 9) on the x axis and Distance (1 to 8) on the y axis. The skyline objects are t1, t5, t6 and t7; t2, t3 and t4 are dominated.]

    The smaller, the better!

Skyline Query Applications

    Find best NBA players: (#points, #rebounds),
     or any other subset of the 17 dimensions.
    Find best hotels: (price, distance to
     conference hotel).
    Find best researchers: (#pubs in POPL, PLDI,
     ICFP, SIGCOMM, SIGMOD)

    Any table in an RDBMS is a list of records
     with multiple attributes, so ……

How to Find A Skyline?

    For every object o
           Compare with all other objects.
           Report o if it is not dominated by any.


    Complexity: O(n^2)
    Problem: in a large database, O(n^2) is
     inefficient!
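
The loop above can be sketched in a few lines. This is a minimal in-memory illustration, not a disk-based implementation; it uses the (price, distance) values of the hotel example, where smaller is better. The function names and the comparison counter are illustrative additions.

```python
def dominates(p, q):
    """p dominates q: no worse in every dimension, better in at least one (smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def naive_skyline(objects):
    """Compare every object with all other objects; O(n^2) dominance tests in the worst case."""
    skyline = []
    comparisons = 0
    for name, obj in objects.items():
        dominated = False
        for other_name, other in objects.items():
            if other_name == name:
                continue
            comparisons += 1
            if dominates(other, obj):
                dominated = True
                break  # one dominating object is enough
        if not dominated:
            skyline.append(name)
    return skyline, comparisons


# (price, distance) pairs from the hotel example
hotels = {"t1": (3, 2), "t2": (4, 7), "t3": (9, 5), "t4": (4, 6),
          "t5": (2, 3), "t6": (6, 1), "t7": (1, 4)}

skyline, comparisons = naive_skyline(hotels)
print(skyline)  # ['t1', 't5', 't6', 't7']
```

Every skyline object is compared with all n - 1 others before it can be reported, which is where the quadratic cost comes from.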



Why is an O(n^2) Algorithm Inefficient?
   Data size is large → stored on disk.




[Figure: anatomy of a disk drive, showing the spindle, platters, tracks, sectors, disk head, arm assembly and arm movement.]
Each disk access: seek time + rotational delay + transfer time
Why is an O(n^2) Algorithm Inefficient?
    Data size is large → stored on disk.
    Sample scenario:
           Disk page size: 8KB.
           Database size: 1GB = 131,072 disk pages.
           Let each disk I/O be 1 ms.
    O(n): 131 seconds ≈ 2 minutes. (Not efficient!)
           Find the nearest hospital…
    O(n^2): ≈ 200 days! (Out of the question!)
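
The two figures can be checked with a short calculation. It assumes, as the scenario appears to, that n counts disk pages, that an O(n) algorithm performs one I/O per page, and that an O(n^2) algorithm performs one I/O per pair of pages; constant factors are ignored.

```python
PAGE_SIZE = 8 * 1024      # 8 KB disk page, in bytes
DB_SIZE = 1024 ** 3       # 1 GB database, in bytes
IO_TIME_S = 0.001         # 1 ms per disk I/O
SECONDS_PER_DAY = 86_400

pages = DB_SIZE // PAGE_SIZE                                    # n = 131,072 pages
linear_seconds = pages * IO_TIME_S                              # one I/O per page
quadratic_days = pages ** 2 * IO_TIME_S / SECONDS_PER_DAY       # one I/O per page pair

print(pages, round(linear_seconds), round(quadratic_days))      # 131072 131 199
```

The quadratic estimate comes out at just under 199 days, which the slide rounds to 200.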

Content

    Motivation and definition of skyline
    Branch-and-Bound Skyline (BBS)
    Compressed Skycube (CSC)




    BBS: [PTFS03], SIGMOD
    CSC: [XZ06], SIGMOD

Branch-and-Bound Skyline (BBS)

    Use an R-tree to index the objects.
    Find the nearest neighbor (NN) to the origin. It is a skyline object.
    Prune the part of the search space that it dominates.
    Repeat: find the NN in the unpruned space.
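
The four steps can be sketched with a priority queue ordered by the L1 distance of each entry's lower-left corner to the origin. This is a simplified in-memory illustration under several assumptions: the tree is built by hand rather than by a real R-tree library, smaller values are better, and the point coordinates are assumed for illustration (they are not taken from this deck). `Entry`, `point`, `node` and `bbs` are hypothetical names.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Entry:
    name: str
    corner: tuple                                  # lower-left corner (the point itself for data points)
    children: list = field(default_factory=list)   # empty for data points


def point(name, x, y):
    return Entry(name, (x, y))


def node(name, *children):
    dims = range(len(children[0].corner))
    corner = tuple(min(c.corner[d] for c in children) for d in dims)
    return Entry(name, corner, list(children))


def dominates(p, q):
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def bbs(root):
    """Pop entries in ascending mindist; skip dominated ones; report undominated points."""
    skyline, tie = [], count()   # tie-breaker keeps heap tuples comparable
    heap = [(sum(c.corner), next(tie), c) for c in root.children]
    heapq.heapify(heap)
    while heap:
        _, _, entry = heapq.heappop(heap)
        if any(dominates(s.corner, entry.corner) for s in skyline):
            continue                      # everything inside is dominated: prune
        if not entry.children:
            skyline.append(entry)         # nearest undominated point is a skyline object
            continue
        for child in entry.children:      # expand the node, keeping undominated children
            if not any(dominates(s.corner, child.corner) for s in skyline):
                heapq.heappush(heap, (sum(child.corner), next(tie), child))
    return [e.name for e in skyline]


# Assumed coordinates, for illustration only (non-negative, so mindist = x + y).
n1 = node("N1", point("a", 1, 9), point("b", 2, 10), point("c", 4, 8))
n2 = node("N2", point("d", 6, 7), point("e", 9, 10), point("f", 7, 5))
n3 = node("N3", point("g", 5, 6), point("h", 4, 3), point("i", 3, 2))
n4 = node("N4", point("l", 10, 4), point("k", 9, 1))
n5 = node("N5", point("m", 6, 2), point("n", 8, 3))
root = node("R", node("N6", n1, n2), node("N7", n3, n4, n5))

result = bbs(root)
print(result)  # ['i', 'a', 'k']
```

Pruning by the lower-left corner is safe because every point inside an MBR is at least as large as that corner in each dimension, so a skyline point that dominates the corner dominates the whole rectangle.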




R-Tree Motivation
[Figure: objects a to m plotted as points, with the x axis and y axis each ranging from 0 to 10.]



    Range query: find the objects in a given range.
    E.g. find all hotels in Boston.

    No index: scan through all objects. NOT EFFICIENT!
     R-Tree: Clustering by Proximity
[Figure: points a to m on the x and y axes (0 to 10); the nearby points a, b and c are enclosed by a Minimum Bounding Rectangle (MBR), E3.]
     R-tree:
       Root: E1, E2
       E1:   E3, E4, E5
       E2:   E6, E7
       E3: a, b, c    E4: d, e, f    E5: g, h    E6: i, j    E7: k, l, m
     R-Tree
[Figure: points a to m with the leaf-level MBRs E3, E4, E5, E6 and E7 drawn around them; the R-tree (Root, E1 and E2, E3 to E7, points) is shown below the plot.]
     R-Tree
[Figure: points a to m with the upper-level MBRs E1 and E2 drawn around them; the R-tree (Root, E1 and E2, E3 to E7, points) is shown below the plot.]
     Range Query
[Figure: a range query over points a to m and the MBRs E1 and E2; the R-tree (Root, E1 and E2, E3 to E7, points) is shown below the plot.]
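
A range query over such a tree can be sketched as a recursive descent that skips every subtree whose MBR does not intersect the query rectangle. The tree shape (Root, E1 and E2, E3 to E7) follows the slides, but the point coordinates and the query rectangle are hypothetical, and the tree is built by hand rather than by a real R-tree insertion algorithm. `RNode`, `point`, `node`, `intersects` and `range_query` are illustrative names.

```python
from dataclasses import dataclass, field


@dataclass
class RNode:
    name: str
    mbr: tuple                                     # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)   # empty for data points


def point(name, x, y):
    return RNode(name, (x, y, x, y))


def node(name, *children):
    xmins, ymins, xmaxs, ymaxs = zip(*(c.mbr for c in children))
    return RNode(name, (min(xmins), min(ymins), max(xmaxs), max(ymaxs)), list(children))


def intersects(a, b):
    """True if rectangles a and b overlap (boundaries included)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_query(n, query, visited):
    """Return the names of the points inside query; record the internal nodes that were read."""
    if not n.children:
        return [n.name]        # a data point that passed the intersection test
    visited.append(n.name)
    return [hit
            for child in n.children if intersects(child.mbr, query)
            for hit in range_query(child, query, visited)]


# Hypothetical coordinates, chosen only to resemble the layout of the figure.
e3 = node("E3", point("a", 5, 3), point("b", 2, 3), point("c", 2, 2))
e4 = node("E4", point("d", 3, 5), point("e", 1, 7), point("f", 2, 7))
e5 = node("E5", point("g", 3, 9), point("h", 4, 9))
e6 = node("E6", point("i", 6, 5), point("j", 8, 5))
e7 = node("E7", point("k", 8, 7), point("l", 9, 8), point("m", 7, 10))
root = node("Root", node("E1", e3, e4, e5), node("E2", e6, e7))

visited = []
hits = range_query(root, (7, 4, 10, 8), visited)   # x in [7, 10], y in [4, 8]
print(hits, visited)
```

With these assumed coordinates the query returns j, k and l while reading only Root, E2, E6 and E7; the whole E1 subtree is skipped because its MBR lies to the left of the query range.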
Branch-and-Bound Skyline (BBS)
    Assume all points are indexed in an R-tree.
    mindist(MBR) = the L1 distance between its lower-left corner and the
     origin.
[Figure: points a to n on x and y axes from 1 to 10, with the origin o, grouped into leaf nodes N1 to N5 and intermediate nodes N6 and N7.]
    R-tree:
      R:  e6, e7
      N6: e1, e2
      N7: e3, e4, e5
      N1: a, b, c    N2: d, e, f    N3: g, h, i    N4: l, k    N5: m, n
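
As a worked example of the mindist definition, the sketch below computes the lower-left corner of each node's MBR and its L1 distance to the origin. The coordinates are assumptions: the figure's positions could not be read from the text, so they were chosen to agree with the mindist values that appear in the heap trace of this example (e7 = 4, e6 = 6, e3 = 5, e5 = 8, e1 = 9, e4 = 10). The helper names are illustrative.

```python
def lower_left(points):
    """Lower-left corner of the MBR enclosing the given points (or child corners)."""
    return tuple(min(p[d] for p in points) for d in range(len(points[0])))


def mindist(corner):
    """L1 distance between a corner and the origin."""
    return sum(abs(v) for v in corner)


# Assumed coordinates, for illustration only.
leaves = {
    "N1": [(1, 9), (2, 10), (4, 8)],    # a, b, c
    "N2": [(6, 7), (9, 10), (7, 5)],    # d, e, f
    "N3": [(5, 6), (4, 3), (3, 2)],     # g, h, i
    "N4": [(10, 4), (9, 1)],            # l, k
    "N5": [(6, 2), (8, 3)],             # m, n
}

corners = {name: lower_left(pts) for name, pts in leaves.items()}
# An intermediate node's corner is the minimum over its children's corners.
corners["N6"] = lower_left([corners["N1"], corners["N2"]])
corners["N7"] = lower_left([corners["N3"], corners["N4"], corners["N5"]])

dists = {name: mindist(c) for name, c in corners.items()}
print(dists)
```

For instance, N7 has lower-left corner (3, 1) under these assumptions, so its mindist is 3 + 1 = 4. Note that the corner of an MBR need not coincide with any data point.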
Branch-and-Bound Skyline (BBS)
    Each heap entry keeps the mindist of the MBR.

    action          heap contents                  S
    access root     <e7,4><e6,6>                   ∅
    Example of BBS
    Process entries in ascending order of their mindists.

    action          heap contents                  S
    access root     <e7,4><e6,6>                   ∅
    expand e7       <e3,5><e6,6><e5,8><e4,10>      ∅
    Example of BBS
    action          heap contents                  S
    access root     <e7,4><e6,6>                   ∅
    expand e7       <e3,5><e6,6><e5,8><e4,10>      ∅
    expand e3       <i,5><e6,6><e5,8><e4,10>       {i}
    Example of BBS
    action          heap contents                  S
    access root     <e7,4><e6,6>                   ∅
    expand e7       <e3,5><e6,6><e5,8><e4,10>      ∅
    expand e3       <i,5><e6,6><e5,8><e4,10>       {i}
    expand e6       <e5,8><e1,9><e4,10>            {i}
    Example of BBS

    [Figure: points a to n plotted on x and y axes (1 to 10), grouped into R-tree nodes N1 to N7.]
    R-tree: root R = {e6, e7}; N6 = {e1, e2}; N7 = {e3, e4, e5};
    leaf nodes N1 to N5 hold the points a, b, c, d, e, f, g, h, i, l, k, m, n.

    action          heap contents                     S
    access root     <e7,4><e6,6>                      ∅
    expand e7       <e3,5><e6,6><e5,8><e4,10>         ∅
    expand e3       <i,5><e6,6><e5,8><e4,10>          {i}
    expand e6       <e5,8><e1,9><e4,10>               {i}
    remove e5       <e1,9><e4,10>                     {i}
    Example of BBS

    [Figure: points a to n plotted on x and y axes (1 to 10), grouped into R-tree nodes N1 to N7.]
    R-tree: root R = {e6, e7}; N6 = {e1, e2}; N7 = {e3, e4, e5};
    leaf nodes N1 to N5 hold the points a, b, c, d, e, f, g, h, i, l, k, m, n.

    action          heap contents                     S
    access root     <e7,4><e6,6>                      ∅
    expand e7       <e3,5><e6,6><e5,8><e4,10>         ∅
    expand e3       <i,5><e6,6><e5,8><e4,10>          {i}
    expand e6       <e5,8><e1,9><e4,10>               {i}
    remove e5       <e1,9><e4,10>                     {i}
    expand e1       <a,10><e4,10>                     {i,a}
    Example of BBS

    [Figure: points a to n plotted on x and y axes (1 to 10), grouped into R-tree nodes N1 to N7.]
    R-tree: root R = {e6, e7}; N6 = {e1, e2}; N7 = {e3, e4, e5};
    leaf nodes N1 to N5 hold the points a, b, c, d, e, f, g, h, i, l, k, m, n.

    action          heap contents                     S
    access root     <e7,4><e6,6>                      ∅
    expand e7       <e3,5><e6,6><e5,8><e4,10>         ∅
    expand e3       <i,5><e6,6><e5,8><e4,10>          {i}
    expand e6       <e5,8><e1,9><e4,10>               {i}
    remove e5       <e1,9><e4,10>                     {i}
    expand e1       <a,10><e4,10>                     {i,a}
    expand e4       <k,10>                            {i,a,k}
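
The heap-driven trace in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree `ROOT`, the point names p1 to p9 and their coordinates are hypothetical (the slide's coordinates are not recoverable from the figure), smaller values are assumed to be better on both axes, and the "R-tree" is a plain nested dict rather than a real index. Entries are ordered by mindist, taken here as the sum of the coordinates of an entry's lower-left corner.

```python
import heapq
from itertools import count

# Hypothetical two-level tree (not the slide's data): each node maps a name to
# either a point (x, y) or a child node. Smaller coordinates are better.
ROOT = {
    "e1": {"p1": (1, 9), "p2": (2, 10), "p3": (4, 8)},
    "e2": {"p4": (3, 2), "p5": (4, 3), "p6": (5, 6)},
    "e3": {"p7": (9, 1), "p8": (10, 4), "p9": (7, 5)},
}

def lower_left(entry):
    """Lower-left corner of a point (itself) or of a node's bounding box."""
    if isinstance(entry, tuple):
        return entry
    corners = [lower_left(child) for child in entry.values()]
    return (min(c[0] for c in corners), min(c[1] for c in corners))

def mindist(entry):
    """L1 distance from the origin to the entry's lower-left corner."""
    return sum(lower_left(entry))

def dominated(corner, skyline):
    """True if some skyline point is no worse on both axes and differs from corner."""
    return any(s[0] <= corner[0] and s[1] <= corner[1] and s != corner
               for s in skyline.values())

def bbs(root):
    """Return the skyline as {name: point}, in order of discovery."""
    skyline = {}
    tie = count()  # tie-breaker so the heap never compares dicts or names
    heap = [(mindist(e), next(tie), name, e) for name, e in root.items()]
    heapq.heapify(heap)
    while heap:
        _, _, name, entry = heapq.heappop(heap)
        if dominated(lower_left(entry), skyline):
            continue  # pruned: everything inside is dominated
        if isinstance(entry, tuple):
            skyline[name] = entry  # a point reaching the top of the heap is a skyline point
        else:
            for child_name, child in entry.items():  # expand the node
                if not dominated(lower_left(child), skyline):
                    heapq.heappush(heap, (mindist(child), next(tie), child_name, child))
    return skyline

print(bbs(ROOT))
```

Because a dominating point always has a strictly smaller mindist than the point it dominates, a point that is popped without being dominated by the current set S can be reported immediately, which is why S only ever grows in the trace.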
Content

    Motivation and definition of skyline
    Branch-and-Bound Skyline (BBS)
    Compressed Skycube (CSC)




    BBS: [PTFS03], SIGMOD
    CSC: [XZ06], SIGMOD

   Subspace Skyline Queries
      • Hotels have many attributes, e.g. price, distance, star rating, food
        quality, facilities, location, transportation, …
      • Users may ask skyline queries on any subset of the attributes,
        depending on their interests.
      • Subspace skylines can be very different!
   Objects of 4 dimensions:

          u1   u2   u3   u4
   t1      3    4    2    5
   t2      4    6    7    2
   t3      9    7    5    6
   t4      4    3    6    1
   t5      2    2    3    1
   t6      6    1    1    3
   t7      1    3    4    1

   [Figure: two scatter plots of t1 to t7, "Skyline in u1, u3" (axes u1 and u3) and "Skyline in u2, u3" (axes u2 and u3).]
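
To make the difference between subspace skylines concrete, here is a minimal brute-force sketch in Python over the seven objects listed on this slide. It assumes that smaller values are preferred on every attribute (the slide does not state the preference direction here), and the names `dominates` and `subspace_skyline` are illustrative, not taken from the talk.

```python
from typing import Dict, List, Sequence, Tuple

# Objects t1..t7 from the slide; tuple positions 0..3 stand for u1..u4.
OBJECTS: Dict[str, Tuple[int, ...]] = {
    "t1": (3, 4, 2, 5),
    "t2": (4, 6, 7, 2),
    "t3": (9, 7, 5, 6),
    "t4": (4, 3, 6, 1),
    "t5": (2, 2, 3, 1),
    "t6": (6, 1, 1, 3),
    "t7": (1, 3, 4, 1),
}

def dominates(a: Sequence[int], b: Sequence[int], dims: Sequence[int]) -> bool:
    """a dominates b on dims: no worse everywhere, strictly better somewhere.

    Assumption: smaller is better on every dimension.
    """
    return all(a[d] <= b[d] for d in dims) and any(a[d] < b[d] for d in dims)

def subspace_skyline(objects: Dict[str, Tuple[int, ...]],
                     dims: Sequence[int]) -> List[str]:
    """Names of the objects not dominated by any other object on dims."""
    return sorted(
        name for name, vals in objects.items()
        if not any(dominates(other, vals, dims) for other in objects.values())
    )

print("u1, u3:", subspace_skyline(OBJECTS, (0, 2)))
print("u2, u3:", subspace_skyline(OBJECTS, (1, 2)))
```

Every query scans all pairs of objects, which is the cost the rest of the talk tries to avoid.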

Our Problem

    Our problem: how to support arbitrary subspace skyline
     queries in dynamic, frequently updated databases?

Problem settings:
 • Online systems: the database server receives multiple
   concurrent skyline queries on arbitrary, unpredictable
   subspaces.
 • Frequently updated databases: the data are also
   constantly changing. E.g., in an online hotel-booking
   system, room prices change with availability.


Straightforward Solutions

    On-the-fly computation: slow in query processing
           Compute the results from scratch
           Process the whole dataset for each query

    Pre-compute and store all subspace skylines:
     high update costs
           Expensive to correctly maintain all results
           Waste of storage


The complete pre-computation
    Objects:

            u1   u2   u3   u4
    t1      3    4    2    5
    t2      4    6    7    2
    t3      9    7    5    6
    t4      4    3    6    1
    t5      2    2    3    1
    t6      6    1    1    3
    t7      1    3    4    1
    t8      6    5    3    8
    t9      2    2    3    7

    Skycube:

    Subspace              Skyline
    u1                    t7
    u2                    t6
    u3                    t6
    u4                    t4, t5, t7
    u1, u2                t5, t6, t7, t9
    u1, u3                t1, t5, t6, t7, t9
    u1, u4                t7
    u2, u3                t6
    u2, u4                t5, t6
    u3, u4                t5, t6
    u1, u2, u3            t1, t5, t6, t7, t9
    u1, u2, u4            t5, t6, t7
    u1, u3, u4            t1, t5, t6, t7
    u2, u3, u4            t5, t6
    u1, u2, u3, u4        t1, t5, t6, t7

    The skycube contains many duplicates, e.g. t6 appears 12 times.
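
The full pre-computation can be sketched as a loop over all non-empty subsets of the dimensions. This is a brute-force illustration, not the construction used in the talk; it assumes smaller values are better, and `skyline` and `skycube` are hypothetical helper names.

```python
from itertools import combinations

# Objects t1..t9 from the slide; tuple positions 0..3 stand for u1..u4.
OBJECTS = {
    "t1": (3, 4, 2, 5), "t2": (4, 6, 7, 2), "t3": (9, 7, 5, 6),
    "t4": (4, 3, 6, 1), "t5": (2, 2, 3, 1), "t6": (6, 1, 1, 3),
    "t7": (1, 3, 4, 1), "t8": (6, 5, 3, 8), "t9": (2, 2, 3, 7),
}
DIMS = ("u1", "u2", "u3", "u4")

def dominates(a, b, dims):
    """Assumption: smaller is better on every dimension."""
    return all(a[d] <= b[d] for d in dims) and any(a[d] < b[d] for d in dims)

def skyline(objects, dims):
    """Sorted names of the objects not dominated on dims."""
    return sorted(
        name for name, vals in objects.items()
        if not any(dominates(other, vals, dims) for other in objects.values())
    )

def skycube(objects, n_dims):
    """Map every non-empty subspace (tuple of dimension indices) to its skyline."""
    return {
        dims: skyline(objects, dims)
        for k in range(1, n_dims + 1)
        for dims in combinations(range(n_dims), k)
    }

cube = skycube(OBJECTS, len(DIMS))
for dims, sky in cube.items():
    print(", ".join(DIMS[d] for d in dims), "->", ", ".join(sky))
print("t6 appears", sum("t6" in sky for sky in cube.values()), "times")
print("stored entries:", sum(len(sky) for sky in cube.values()))
```

With d dimensions there are 2^d - 1 subspaces to store and to maintain on every update, which is the storage and update cost criticised on the previous slide.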


Our Solution: the Compressed Skycube

    The Compressed Skycube achieves both fast query
     response and efficient update support.

    New storage model
           Represent all skylines in a very concise way, by preserving
            only essential information of subspace skylines.
    New query processing algorithm
           Efficiently answer arbitrary subspace skyline queries
            without accessing the whole dataset.
    New object-aware update scheme
           Avoid unnecessary data access and subspace skyline
            computation upon updates.

Minimum Subspace (mss)
    Object t6 appears in the skylines of 12 subspaces.
    The number of minimum subspaces of t6 is only 2.

    Minimum Subspaces
    t1   (u1, u3)
    t4   (u4)
    t5   (u4), (u1, u2), (u1, u3)
    t6   (u2), (u3)
    t7   (u1), (u4)
    t9   (u1, u2), (u1, u3)

    Subspace              Skyline
    u1                    t7
    u2                    t6
    u3                    t6
    u4                    t4, t5, t7
    u1, u2                t5, t6, t7, t9
    u1, u3                t1, t5, t6, t7, t9
    u1, u4                t7
    u2, u3                t6
    u2, u4                t5, t6
    u3, u4                t5, t6
    u1, u2, u3            t1, t5, t6, t7, t9
    u1, u2, u4            t5, t6, t7
    u1, u3, u4            t1, t5, t6, t7
    u2, u3, u4            t5, t6
    u1, u2, u3, u4        t1, t5, t6, t7
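
One way to read the table: a subspace U is a minimum subspace of t if t is in the skyline of U but in the skyline of no proper subset of U. The sketch below derives mss(t) by brute force from the nine objects of the running example. It assumes smaller values are better, and `minimum_subspaces` is an illustrative name, not the construction algorithm of the talk.

```python
from itertools import combinations

# Objects t1..t9 of the running example; positions 0..3 stand for u1..u4.
OBJECTS = {
    "t1": (3, 4, 2, 5), "t2": (4, 6, 7, 2), "t3": (9, 7, 5, 6),
    "t4": (4, 3, 6, 1), "t5": (2, 2, 3, 1), "t6": (6, 1, 1, 3),
    "t7": (1, 3, 4, 1), "t8": (6, 5, 3, 8), "t9": (2, 2, 3, 7),
}

def dominates(a, b, dims):
    """Assumption: smaller is better on every dimension."""
    return all(a[d] <= b[d] for d in dims) and any(a[d] < b[d] for d in dims)

def skyline(objects, dims):
    return [name for name, vals in objects.items()
            if not any(dominates(other, vals, dims) for other in objects.values())]

def minimum_subspaces(objects, n_dims):
    """Map each skyline object to the inclusion-minimal subspaces containing it."""
    member = {name: [] for name in objects}
    for k in range(1, n_dims + 1):
        for dims in combinations(range(n_dims), k):
            for name in skyline(objects, dims):
                member[name].append(frozenset(dims))
    return {
        name: [s for s in spaces if not any(other < s for other in spaces)]
        for name, spaces in member.items() if spaces
    }

MSS = minimum_subspaces(OBJECTS, 4)
for name, spaces in MSS.items():
    print(name, [tuple(f"u{d + 1}" for d in sorted(s)) for s in spaces])
```

Objects such as t2, t3 and t8, which are in no subspace skyline, have no minimum subspace and do not appear in the result.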

The Compressed Skycube (CSC)

    Definition: The Compressed Skycube (CSC)
     consists of the non-empty subspaces U, such that an
     object t is stored in a subspace U if and only if U is a
     minimum subspace of t, i.e. U ∈ mss(t).

    Minimum Subspaces
    t1   (u1, u3)
    t4   (u4)
    t5   (u4), (u1, u2), (u1, u3)
    t6   (u2), (u3)
    t7   (u1), (u4)
    t9   (u1, u2), (u1, u3)

    CSC
    Subspace              Skyline
    u1                    t7
    u2                    t6
    u3                    t6
    u4                    t4, t5, t7
    u1, u2                t5, t9
    u1, u3                t1, t5, t9


 Querying CSC
     Find the skyline in subspace u2, u3, u4.
     Only visit the CSC, not the whole dataset. Output is non-blocking!

            u1   u2   u3   u4
     t1      3    4    2    5
     t4      4    3    6    1
     t5      2    2    3    1
     t6      6    1    1    3
     t7      1    3    4    1
     t9      2    2    3    7

     CSC
     Subspace      Skyline          Output
     u1            t7
     u2            t6               t6
     u3            t6
     u4            t4, t5, t7       t5
     u1, u2        t5, t9
     u1, u3        t1, t5, t9

     Theorem 1: Given a query space Uq and an object t, if for every subspace
      Ui in mss(t), Ui ⊈ Uq, then t is not in the skyline of Uq.
      ⇒ Search only the subspaces which are subsets of the query space.

     Theorem 2 (Local Comparison): To check a candidate t in a subspace V
      ⊆ Uq, we only need to compare t with the objects within the same
      subspace.
      ⇒ Compare candidates within their own subspaces.
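
The two theorems suggest a simple query loop, sketched below in Python under stated assumptions: smaller values are better, the CSC contents and object values are copied from this slide, and `query_csc` is an illustrative name rather than the talk's algorithm. Theorem 1 selects the subspaces to visit; Theorem 2 limits each dominance check to the objects stored in the same subspace, with the comparison itself made on the query dimensions.

```python
# Objects stored in the CSC, with their values on u1..u4 (from the slide).
OBJECTS = {
    "t1": (3, 4, 2, 5), "t4": (4, 3, 6, 1), "t5": (2, 2, 3, 1),
    "t6": (6, 1, 1, 3), "t7": (1, 3, 4, 1), "t9": (2, 2, 3, 7),
}
DIM_INDEX = {"u1": 0, "u2": 1, "u3": 2, "u4": 3}

# CSC contents from the slide: subspace -> objects stored there.
CSC = {
    frozenset({"u1"}): ["t7"],
    frozenset({"u2"}): ["t6"],
    frozenset({"u3"}): ["t6"],
    frozenset({"u4"}): ["t4", "t5", "t7"],
    frozenset({"u1", "u2"}): ["t5", "t9"],
    frozenset({"u1", "u3"}): ["t1", "t5", "t9"],
}

def dominates(a, b, dims):
    """Assumption: smaller is better on every dimension."""
    return all(a[d] <= b[d] for d in dims) and any(a[d] < b[d] for d in dims)

def query_csc(csc, objects, query):
    """Skyline of the query subspace, computed from the CSC alone."""
    q = frozenset(query)
    dims = [DIM_INDEX[u] for u in q]
    result = set()
    for subspace, members in csc.items():
        if not subspace <= q:
            continue  # Theorem 1: only subsets of the query space can contribute
        for name in members:
            # Theorem 2: compare only with objects stored in the same subspace
            if not any(dominates(objects[o], objects[name], dims)
                       for o in members if o != name):
                result.add(name)  # could be yielded right away (non-blocking output)
    return sorted(result)

print(query_csc(CSC, OBJECTS, ("u2", "u3", "u4")))
```

For the query u2, u3, u4, only the subspaces u2, u3 and u4 are visited; t4 and t7 are eliminated by t5 inside u4, leaving t5 and t6.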


  Updating CSC
      Crucial questions:
           • When do we access the whole dataset to retrieve new skyline objects?
           • When do we re-compute the skylines of certain subspaces?
      Full-space: the subspace containing all dimensions, denoted D
      Skyline objects in the full space: sky(D)

      t: object before update; tnew: object after update

      Case 1: t ∉ sky(D). No need to access the dataset.
           tnew ∉ sky(D): no skyline computation; existing CSC objects are not changed.
           tnew ∈ sky(D): may update subspace skylines.

      Case 2: t ∈ sky(D). May access the dataset to retrieve new skyline objects.
           The number of full-space skyline objects is small compared to
           the whole dataset!

Performance

    (Full-space) Dimensionality: 6
    Object cardinality: [100K, 500K].
    Distribution: Uniform




    [Figures: storage efficiency, query efficiency, update efficiency]

Thank you!


   Questions?

								