Embed
Email

BizgresOnDiskBitmapI ndex - Do something, Learn something, Share

Document Sample
BizgresOnDiskBitmapI ndex - Do something, Learn something, Share
Shared by: BrandalJaclson
Stats
views:
3
posted:
9/2/2009
language:
English
pages:
18
On-Disk Bitmap Index In Bizgres





Ayush Parashar

aparashar@greenplum.com and



Jie Zhang

jzhang@greenplum.com









1

Agenda



 Introduction to On-Disk Bitmap Index

 Bitmap index creation

 Bitmap index creation performance - index size and

creation time

 Performance with varying cardinality

 Query performance

 Summary









2

Introduction



 Access method specially efficient for low-

cardinality column of high-dimensional fact table

 Takes a fraction of space as compared to B-tree

 Very less index creation time

 Very effective on queries with multiple-conditions

in where clause









3

Introduction: continued..



 Consider the following data:









Qu i ckTi me ™ an d a

TIFF (LZW) de co mp re ss or

a re ne ed ed to se e thi s p i ctu re .









4

Introduction: B-Tree representation









QuickTime™ and a

TIFF (Uncompressed) decompre ssor

are neede d to see this picture.









5

Introduction: Bitmap representation









QuickTime™ and a

TIFF (LZW) decompressor

are neede d to see this picture.









6

Bitmap index creation

 Steps in bitmap-index creation

 Build bin indexes

 Apply encoding schemes: Equality encoding scheme

used in the present implementation

 Apply compression schemes

 Contributors: Jie Zhang, Mark Kirkwood, Gavin

Sherry









7

Bitmap index creation: continued..









QuickTime™ and a

TIFF (LZW) decompressor

are neede d to see this picture.









8

On-disk Bitmap Index - Performance

 Table definition from OSDL - DBT 3 benchmark

 Size of table is corresponding to scale factor 10 for DBT 3



Table Name Indexed Column Data Type Cardinality



LINEITEM L_SHIPMODE character(10) 7



LINEITEM L_QUANTITY numeric(15,2) 50



LINEITEM L_LINENUMBER integer 7



LINEITEM L_SHIPMODE, L_QUANTITY character(10), 350

numeric(15,2)

* multi-column index







ORDERS O_ORDERSTATUS character(1) 3



ORDERS O_ORDERPRIORITY character(15) 5



CUSTOMER C_MKTSEGMENT character(10) 5



CUSTOMER C_NATIONKEY integer 25









9

On-disk Bitmap Index - Index Size Performance



 Index size - Size of bitmap index is a fraction of B-tree index



3000 2845







2500

Size of the index (in MBytes)









2000 1804 1804





1500 1285





1000



580



500 321

117 176

58 59 5 11 1 45 2 32

0

l_s hipmode l_quantity l_linenum ber l_s m & l_q o_orders tatus o_orderpriority c_mktsegment c_nationkey





Bitmap B-tree









10

On-disk Bitmap Index - Creation Time Performance



 Index creation time: Up to 7 times faster index creation



3500

Time take n to c rea te the index (sec onds )









2933.4

3000





2500

2217.1



2000





1500



937.8 948.7

1000

679.2

547.2

454.9 412.4

500 374.6

241.4

83.5 108.5

10.9 51.4 8.3 9.3

0

l_s hipmode l_quantity l_linenum ber l_s m & l_q o_orders tatus o_orderpriority c_mktsegment c_nationkey





Bitmap B-Tree









11

On-disk Bitmap Index - Performance with varying cardinality





 Index size with varying cardinality: Total rows 250 million



Variation in index size with change in column cardinality





7000







6000







5000

Index size in MB









4000

Bitmap Index

B-tree Index

3000







2000







1000







0

0 2000 4000 6000 8000 10000 12000

Column Cardinality 50, 100, 500, 1000, 2000, 4000, 10000 (total rows 250 million)









12

On-disk Bitmap Index - Performance with varying cardinality





 Index creation time with varying cardinality: Total rows 250

million

Variation in index creation time with varying cardinality





1600







1400







1200

Index creation time in seconds









1000





Bitmap Index

800

B-tree Index





600







400







200







0

0 2000 4000 6000 8000 10000 12000

Column Cardinality: 50, 100, 500, 1000, 2000, 4000, 10000 (total rows 250 million)









13

On-disk Bitmap Index - Query Performance



Query 1 Query 2

SELECT sum(lineitem.l_discount)

SELECT avg(lineitem.l_tax)

FROM

FROM

lineitem, orders, customer, nation

lineitem, orders

WHERE

WHERE

nation.n_name='UNITED STATES' AND

orders.o_orderstatus='F' AND

customer.c_mktsegment='AUTOMOBILE' AND

orders.o_orderpriority='4-NOT SPECIFIED' AND

orders.o_orderstatus='P' AND

lineitem.l_linenumber=5 AND

orders.o_orderpriority='2-HIGH' AND

lineitem.l_shipmode='TRUCK' AND

lineitem.l_quantity=5 AND

lineitem.l_quantity=2 AND

lineitem.l_shipmode='AIR' AND

orders.o_orderkey=lineitem.l_orderkey;

lineitem.l_linenumber=5 AND

customer.c_custkey=orders.o_custkey AND

orders.o_orderkey=lineitem.l_orderkey AND

nation.n_nationkey=customer.c_nationkey;









Query 3

SELECT count(*) FROM lineitem WHERE l_linenumber=1;



Query 4

SELECT count(*) FROM lineitem WHERE l_linenumber in (1,2) AND l_shipmode IN ('RAIL','TRUCK');



Query 5

SELECT count(*) FROM lineitem WHERE l_linenumber=5 AND l_shipmode='RAIL' AND l_quantity=18;









14

On-disk Bitmap Index - Query Performance



 Query Performance:

 Run1, Run2 and Run3 indicate that the same query has been run consecutively three-times



450

Time taken for query completion (seconds)









400

39 39

38 6.5 2.8 5.1 38 38

38 6.6 6.7 8.7



34 35

35 0.8 6.7 2.2

350



300 28 28

28 3.4 3.3 3.3





250



200 18 1.8 1 0.5

18 18



150 13 6.3

10 1.4 .4 .7

99 98 10 10

10 8.6 9.1 9.3

100

60 68

60 .4 .6 .3 53 .9

50

1.7 1.7 0.8 0.6

0

Query 1 Query 2 Query 3 Query 4 Query 5



Bitmap - Ru n1 Bitmap - Ru n2 Bitmap - Ru n3 B-tree - Run1

B-tree - Run2 B-tree - Run3





15

On-disk Bitmap Index - Query Performance



 Example: Query I explain plan output

QUERY PLAN

-----------------------------------------------------------------------------

Aggregate (cost=713682.71..713682.72 rows=1 width=9)

-> Merge Join (cost=706473.11..713678.58 rows=1649 width=9)

Merge Cond: ("outer".l_orderkey = "inner".o_orderkey)

-> Sort (cost=122798.17..122841.70 rows=17412 width=13)

Sort Key: lineitem.l_orderkey

-> Bitmap Heap Scan on lineitem (cost=57440.64..121571.69 rows=17412 width=13)

Recheck Cond: ((l_quantity = 2::numeric) AND (l_linenumber = 5) AND (l_shipmode = 'TRUCK'::bpchar))

-> BitmapAnd (cost=57440.64..57440.64 rows=17412 width=0)

-> Bitmap Index Scan on l_quantity_bm_idx(on-disk bitmap index) (cost=0.00..4500.02 rows=1199721 width=0)

Index Cond: (l_quantity = 2::numeric)

-> Bitmap Index Scan on l_linenumber_bm_idx(on-disk bitmap index) (cost=0.00..22779.89 rows=6278540 width=0)

Index Cond: (l_linenumber = 5)

-> Bitmap Index Scan on l_shipmode_bm_idx(on-disk bitmap index) (cost=0.00..30160.23 rows=8318065 width=0)

Index Cond: (l_shipmode = 'TRUCK'::bpchar)

-> Sort (cost=583674.94..587225.94 rows=1420397 width=4)

Sort Key: orders.o_orderkey

-> Bitmap Heap Scan on orders (cost=10318.21..372165.55 rows=1420397 width=4)

Recheck Cond: (o_orderpriority = '4-NOT SPECIFIED'::bpchar)

Filter: (o_orderstatus = 'F'::bpchar)

-> Bitmap Index Scan on o_orderpriority_bm_idx(on-disk bitmap index) (cost=0.00..10318.21 rows=2869489 width=0)

Index Cond: (o_orderpriority = '4-NOT SPECIFIED'::bpchar)

(21 rows)



16

Summary



 On-disk Bitmap Index: New feature in Bizgres 0.9

 Provides dramatic improvements in index creation

time and space used by the index

 Dramatically improves response time for large

classes of ad hoc data-warehousing queries









17

Thanks



Questions..?



References



 Bitmap Index Design and Evaluation: Chee-Yong Chan and Yannis E. Ioannidis

 Compressed bitmap indices for efficient query processing: Kesheng Wu Ekow J. Otoo

Arie Shoshani

 On-Disk Bitmap Index Performance in Bizgres 0.9 - A Greenplum Whitepaper:

http://bgn.greenplum.com

 http://www.bizgres.org/home.php









18


Related docs
Other docs by BrandalJaclson
LICIO GELLI
Views: 61  |  Downloads: 0
DRILLS
Views: 12  |  Downloads: 0
Traditional Japanese color
Views: 266  |  Downloads: 0
Patterns of Three by
Views: 11  |  Downloads: 0
CHISME
Views: 2  |  Downloads: 0
Free Summer Jazz Concert Schedule
Views: 16  |  Downloads: 0
By registering with docstoc.com you agree to our
privacy policy

You are almost ready to download!

You are almost ready to download!