
DBFarm: A Scalable Cluster for
    Multiple Databases

Christian Plattner, Gustavo Alonso (ETH Zurich)
M. Tamer Özsu (University of Waterloo)

Presented by: Kush Patel
CS848, Winter 2009

Outline
• Motivation
• Goal
• Architecture
  – Architectural Overview
  – Update and Read sequence
  – Adapters
• Experiments
• Conclusion
Motivation
• Data grids, large-scale web applications, etc.
  overload database engines
• Optimizing a single DB engine cannot solve the
  problem
• Replication is the solution
• What about replicating entire DB instances?
Goals
• Replicating DBs should provide
  – A single image of database for clients
  – Scalability
  – Consistency
  – Resource Optimization
  – Load Balancing
• We assume writes << reads
Architectural Overview
• [Figure: a DBFarm with 3 masters and 6 satellites]
• Each database is installed on exactly one master
  and may be replicated on one or more satellites
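The placement rule above can be sketched as a simple mapping; the database and host names here are illustrative, not from the paper's setup:

```python
# Sketch of the DBFarm placement rule: each database lives on exactly
# one master and may be replicated on zero or more satellites.
# All names below are hypothetical examples.
placement = {
    "db1": {"master": "master-A", "satellites": ["sat-1", "sat-2"]},
    "db2": {"master": "master-A", "satellites": []},  # not replicated
    "db3": {"master": "master-B", "satellites": ["sat-3"]},
}

def master_of(db):
    """Every database has exactly one master."""
    return placement[db]["master"]

def replicas_of(db):
    """Satellites holding a copy of the database (possibly none)."""
    return placement[db]["satellites"]

print(master_of("db1"), replicas_of("db1"))  # -> master-A ['sat-1', 'sat-2']
```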
Update Propagation
• An UPDATE is executed on the master
• The master generates:
  – a change number (CN)
  – a writeset
• Satellites put incoming writesets in a FIFO queue
  and apply them in order
• The change number is used to track the order of
  updates
• A writeset contains the modified tuples, identified
  by their IDs in the database
• Each satellite keeps the latest applied change
  number for managing reads
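The satellite side of this protocol can be sketched as follows; the class and method names are illustrative, not the paper's API, and a toy dict stands in for the database:

```python
from collections import deque

class Satellite:
    """Sketch of a satellite applying writesets in FIFO order.

    A writeset here is a (change_number, rows) pair; the real system
    ships the modified tuples extracted on the master.
    """

    def __init__(self):
        self.queue = deque()     # FIFO queue of incoming writesets
        self.change_number = 0   # latest change number applied locally
        self.data = {}           # toy stand-in for the local database

    def enqueue(self, change_number, rows):
        """Called when the master propagates a committed update."""
        self.queue.append((change_number, rows))

    def apply_pending(self):
        """Apply queued writesets strictly in change-number order."""
        while self.queue:
            cn, rows = self.queue.popleft()
            self.data.update(rows)   # replay the modified tuples
            self.change_number = cn  # remember how current we are

sat = Satellite()
sat.enqueue(1, {"item:42": "sold"})
sat.enqueue(2, {"item:43": "in stock"})
sat.apply_pending()
print(sat.change_number)  # -> 2
```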
Read Sequence
• The master attaches its latest change number (CN)
  to each read and routes it to a satellite
• At the satellite:
  – if CN(read) <= CN(satellite), the read is
    performed immediately
  – otherwise the read waits until the pending
    writesets are applied
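The CN(read) <= CN(satellite) rule can be sketched with a condition variable: a read tagged with the master's latest change number blocks until the satellite has caught up. The names are illustrative; threading is used only to model the wait:

```python
import threading

class SatelliteReader:
    """Sketch of the read rule: a read tagged with the master's latest
    change number (CN) may only run once the satellite has applied at
    least that many updates."""

    def __init__(self):
        self.change_number = 0
        self.cond = threading.Condition()

    def writesets_applied(self, cn):
        """Called after the satellite applies writesets up to cn."""
        with self.cond:
            self.change_number = cn
            self.cond.notify_all()  # wake reads waiting for this CN

    def read(self, required_cn, query):
        """Block until CN(read) <= CN(satellite), then run the query."""
        with self.cond:
            self.cond.wait_for(lambda: required_cn <= self.change_number)
        return query()

sat = SatelliteReader()
sat.writesets_applied(5)   # satellite is current up to CN 5
result = sat.read(required_cn=3, query=lambda: "consistent answer")
print(result)  # -> consistent answer
```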
Writeset Extraction
• Generic approach
  – Collect the client's DML statements as the
    writeset
  – Misses updates made through triggers
• Trigger-based writeset extraction
  – Implemented as a C library for PostgreSQL
  – Captures the changes and keeps the writesets in
    memory
  – Fast, since no disk access is needed
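Why trigger-based capture sees more than statement collection can be illustrated with a toy table that fires a callback on every row change; this is a Python analogy for the paper's PostgreSQL C library, not its actual interface:

```python
class TriggeredTable:
    """Toy table whose every row change fires capture triggers, so the
    in-memory writeset also records changes made *by* other triggers.
    A Python analogy for the paper's PostgreSQL C library."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.triggers = []  # callbacks fired on each row change

    def on_change(self, fn):
        self.triggers.append(fn)

    def put(self, key, value):
        self.rows[key] = value
        for fn in self.triggers:
            fn(self.name, key, value)

writeset = []  # kept in memory: no disk access on the capture path
def capture(table, key, value):
    writeset.append((table, key, value))

orders = TriggeredTable("orders")
audit = TriggeredTable("audit")
orders.on_change(capture)
audit.on_change(capture)

# An application-level trigger: every order also writes an audit row.
orders.on_change(lambda t, k, v: audit.put("log:" + k, v))

orders.put("o1", "shipped")
print(writeset)  # captures BOTH the order and the audit row
```

Collecting only the client's DML statement would have seen just the write to `orders`; the capture triggers also record the row the audit trigger produced.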
Adapters as Middleware
• [Figure: adapters sit between the clients and the
  master/satellite databases]
Experiments
• TPC-W benchmark x 300 DBs (497 MB per DB)
• RUBBoS x 60 DBs (2440 MB per DB)
• Masters
  – Dual-core Xeon CPU, 3 GHz, 4 GB RAM
  – 250 x 5 GB HDD
  – OS: Fedora Core 4
• Satellites
  – Dual AMD Opteron, 2.44 GHz, 4 GB RAM
  – 120 GB HDD
  – OS: Red Hat Linux AS Release 4
Results for TPC-W Benchmark
• 100 DBs on a single PostgreSQL master (baseline)
• 100 DBs with 10 attached satellites (DBFarm)
• DBFarm scales and reduces response time
Results for RUBBoS
• Each RUBBoS database holds more than 2 GB of
  data, which makes the case more difficult
• The single master is again beaten by the DBFarm
  results
• DBFarm also scales with RUBBoS
Comparing DBFarms

• The number of copies on each satellite is also a
  crucial decision
• Mean response time for 800 clients:
  – 10 x 1 copies: ~1000 ms
  – 2 x 5 copies: ~2250 ms
  – single database: ~3000 ms
• Adding 10x the CPU power of the original setup
  yields only a 300% improvement in the single-DB
  configuration
Scaleout for Selected Databases
• RUBBoS combined with 200 TPC-W databases
• 20 satellites for the 200 TPC-W databases
• 3 satellites for RUBBoS
• DBFarm performs better than a single server
Scaleout (continued)

• Scaleout experiment comparing a single PostgreSQL
  server with DBFarm
Conclusion
• Scalability and consistency are guaranteed
• No changes required in client code
• A good choice for database service providers
• Load balancing is achieved through the adapters
• Thank You

• Questions?