									EM419
MobiLink Advanced
Scalability and Reliability



Reg Domaratzki
Sustaining Engineering
iAnywhere Solutions
Reg.Domaratzki@sybase.com
Do you use MobiLink yet?

Which version of MobiLink?
Which type of clients?
  Adaptive Server Anywhere (ASA) or UltraLite
Which type of consolidated database?
  ASA, ASE, MS SQL Server, Oracle8, IBM DB2 UDB
How many remote databases?
Common questions


We are often asked the following:
What is the maximum number of remote
 users?
How scalable is MobiLink?
An example of scalability
Goals of this presentation


I will try to convince you that MobiLink:
  Scales ideally with increasing remote databases
  Makes efficient use of hardware
  Has modest hardware requirements
I want you to:
  Use MobiLink for a large number of remote databases
  Get the best performance
Benefits


You can:
Support a large number of remote
  databases
Predict performance for a large number
  of remote databases from tests with a
  small number
Maximize throughput by following
  performance tips
Performance of MobiLink

MobiLink overview
What takes time in a MobiLink synchronization?
How performance was measured
Results of performance testing
  Optimum number of worker threads
  Number of clients
  Size of synchronizations
  Parallel efficiency
Recommendations and next steps
What is MobiLink?


A two-way synchronization technology for
  large-scale mobile database deployment, between:
  a remote database (mobile, embedded, or workgroup server
    database)
  a consolidated database (enterprise, workgroup, or desktop
    database)
A server that processes synchronization
  requests from mobile databases
What is MobiLink?

[Diagram: the consolidated database connects to MobiLink, which exchanges data over the communication infrastructure (Internet / Dial-up / Wireless) with the mobile or remote databases.]
MobiLink design goals


Heterogeneous consolidated database
Scalable and robust
  (tens of thousands of remote
  databases)
Manageable in large deployments
Support handheld and wireless devices
Flexible
Designed for scalability


Connection pooling
Worker threads
Little or no disk access
Almost no contention in MobiLink
What takes time in a
synchronization?

Connections
Upload
Download
Connections


Remote database (client) to MobiLink
  Overhead of creating network connection
  Client may have to wait for available MobiLink worker
    thread.
MobiLink to consolidated database
  Worker thread uses database connection from pool
  Each database connection is tied to a sync version
  Reconnection on error or change in sync version
  Tip: # db connections ≥ # versions × # workers
Upload


  1. Client to MobiLink
  2. MobiLink to consolidated
[Diagram: Remote Database → MobiLink → Consolidated DB]
Upload: client to MobiLink

Data transfer from client to MobiLink worker
  thread
  time ≈ upload size ÷ bandwidth
  packing reduces transfer of zero-valued bytes
  some client processing with UltraLite clients
  worker does character set translation to Unicode
  all in memory, unless upload or BLOB cache overflow to disk
  Tip: upload cache (-u) ≥ largest upload × # workers
  Tip: BLOB cache (-bc) ≥ 2 × largest BLOB data in a row × # workers
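
The sizing tips amount to simple products. A minimal sketch, assuming the tips are read as lower bounds (connections for every version on every worker, an upload cache holding the largest upload for every worker, and a BLOB cache of twice the largest BLOB row data for every worker). The function and parameter names are hypothetical, and the resulting values would still have to be passed to MobiLink through its own options such as -u and -bc:

```python
def size_mobilink_resources(versions, workers,
                            largest_upload_bytes, largest_blob_row_bytes):
    """Return lower bounds derived from the tips (illustrative names only)."""
    return {
        # one pooled connection per script version per worker thread
        "db_connections": versions * workers,
        # every worker may hold the largest upload in memory at once
        "upload_cache_bytes": largest_upload_bytes * workers,
        # the tip allows twice the largest BLOB row data per worker
        "blob_cache_bytes": 2 * largest_blob_row_bytes * workers,
    }


sizes = size_mobilink_resources(versions=2, workers=5,
                                largest_upload_bytes=92_000,
                                largest_blob_row_bytes=64_000)
print(sizes)
```

The example inputs are made up; substitute the largest upload and BLOB sizes observed in your own deployment.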
Upload: MobiLink to consolidated


MobiLink worker thread applies upload to
 consolidated database
  via your upload synchronization scripts
  time dictated by consolidated database performance
      • simultaneous connections
      • concurrency
      • size of transactions
      • network bandwidth
Download


  1. Consolidated to MobiLink
  2. MobiLink to client
[Diagram: Consolidated DB → MobiLink → Remote Database]
Download: consolidated to
MobiLink

MobiLink worker thread fetches data to
 be downloaded
  via your download synchronization scripts
  time dictated by consolidated database performance
  MobiLink uses same BLOB cache as for upload
Download: MobiLink to client

Data transferred from MobiLink worker thread to
  client
  worker does character set translation from Unicode
  more client processing than in upload
  time depends on download size, bandwidth, and client processing
MobiLink worker thread waits for client
 acknowledgement
  This is optional in v8
  We’ve found that with very slow clients, a MobiLink worker
    thread can spend the majority of its time waiting for an
    acknowledgement of the download stream
 Scaling up to more clients

More worker threads allow more simultaneous
   syncs
Ideally:
    total time ≈ single sync time × # clients ÷ # workers
       (assuming # clients ≥ # workers)
Neglects contention and multitasking overhead
In practice, should hit limit where increasing
   worker threads does not reduce total time
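
The ideal scaling rule can be written as a one-line estimate: total time is roughly the single-sync time multiplied by the number of clients and divided by the number of workers. A minimal sketch, with a hypothetical function name, that ignores contention and multitasking overhead exactly as the slide does:

```python
def ideal_total_time(single_sync_time, clients, workers):
    """Ideal total time for one set of simultaneous synchronizations."""
    # With fewer clients than workers, the extra workers simply sit idle.
    busy_workers = min(clients, workers)
    return single_sync_time * clients / busy_workers


print(ideal_total_time(single_sync_time=2.0, clients=1000, workers=5))  # 400.0
```

In practice the measured total will exceed this figure once the server saturates or the consolidated database starts to block.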
Potential bottlenecks


Throughput may be limited by:
  client processing speed
  bandwidth for client-to-MobiLink communications
  speed of the computer running MobiLink
  number of MobiLink worker threads
  bandwidth for communication between MobiLink and the
     consolidated database
  performance of the consolidated database
  contention in your synchronization scripts
Performance tests

Determine performance characteristics
 of MobiLink
  optimal number of worker threads for many clients
  differing number of clients
  synchronization size
  parallelism
Testing methodology
  vary one thing at a time
  stress MobiLink and/or consolidated database
  keep it simple
Schema

Single table
Two-column primary key to avoid primary
  key pool
Representative data types
  CREATE TABLE Purchase (
      emp_id      INT       NOT NULL,
      purch_id    INT       NOT NULL,
      cust_id     INT       NOT NULL,
      cost        NUMERIC   NOT NULL,
      order_date  TIMESTAMP NOT NULL,
      notes       VARCHAR(64),
      PRIMARY KEY ( emp_id, purch_id )
  )
Values


Emp_id maps to remote client via
 employee table (which is not
 synchronized)
  Mutually exclusive partitioning of data between clients (to
   avoid contention and conflicts)
Large values chosen for integer data
  (so packing would not shrink data
  transferred)
  Each row is 92 bytes when transferred
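
With a fixed 92 bytes per row, the payload of a synchronization and a lower bound on its transfer time follow directly from the row count and the link speed. A rough sketch, assuming transfer time is at least size divided by bandwidth and ignoring protocol overhead; the helper names are hypothetical:

```python
ROW_BYTES = 92  # bytes per row when transferred (figure from the slide)


def payload_bytes(rows, row_bytes=ROW_BYTES):
    """Size of one synchronization's row data."""
    return rows * row_bytes


def min_transfer_seconds(num_bytes, bits_per_second):
    """Lower bound on transfer time: size / bandwidth, no protocol overhead."""
    return num_bytes * 8 / bits_per_second


size = payload_bytes(1000)
print(size)  # 92000
print(min_transfer_seconds(size, 100e6))  # 100 Mbps Ethernet
print(min_transfer_seconds(size, 10e6))   # 10 Mbps Ethernet
```

Real transfers take longer than this bound because of packing, character set translation, and client processing.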
Timing framework


Extra tables in consolidated database
MobiLink synchronization scripts
Small, efficient client application
  Win32 console application
  Spawns multiple child processes that act as clients
  UltraLite with no file-based persistent storage
Supervisor program
  Coordinates clients on different computers
Ensuring simultaneous
synchronizations

Clients kept in step via gates
  At a gate, each client waits for all the others
  Win32 event objects for clients on one computer
  Named pipes to supervisor for multiple computers
  Efficient (1 to 2 seconds for 1000 clients on 10 PCs)
Gates before and after each
  synchronization
Times recorded between gates and
  synchronization on both client and
  server
Timing a synchronization

   1. Client: prepare for synchronization
   2. Client: wait for all other clients (“gate”)
   3. Client: record client start time
   4. Client: start synchronization, via ULSynchronize()
   5. ML: record start (begin_synchronization script)
   6. Perform synchronization
   7. ML: record end (end_synchronization script)
   8. Client: record client end time
   9. Client: wait for all other clients (“gate”)
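
The client side of these steps can be sketched with a reusable barrier acting as the gate. This is only an illustration: the real framework used Win32 event objects and named pipes, whereas this sketch uses Python's `threading.Barrier`, and the `synchronize` callable is a hypothetical stand-in for the call to ULSynchronize():

```python
import threading
import time


def run_clients(num_clients, synchronize):
    """Run clients in step, with a gate before and after the synchronization."""
    gate = threading.Barrier(num_clients)
    timings = [None] * num_clients

    def client(idx):
        gate.wait()                    # step 2: wait for all other clients
        start = time.perf_counter()    # step 3: record client start time
        synchronize(idx)               # steps 4-7: stand-in for the real sync
        end = time.perf_counter()      # step 8: record client end time
        timings[idx] = (start, end)
        gate.wait()                    # step 9: wait for all other clients

    threads = [threading.Thread(target=client, args=(i,))
               for i in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timings


timings = run_clients(4, lambda idx: time.sleep(0.01))
for start, end in timings:
    print(f"client time: {end - start:.3f} s")
```

Because every client leaves the first gate together, the recorded start times are nearly simultaneous, which is what makes the total server time meaningful.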
 Times and throughput definitions

Client-measured time (for a single
  synchronization):
      t_client_end - t_client_start
Server-measured time (for a single
  synchronization):
      t_server_end - t_server_start
Total server time (for a set of simultaneous
  syncs):
      max(t_server_end) - min(t_server_start)
Throughput:
      total # rows ÷ total server time
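
These definitions translate directly into a small summary routine, with throughput taken as total rows divided by total server time. A minimal sketch, assuming each synchronization is recorded as a dictionary of four timestamps in seconds; the record layout and function name are hypothetical:

```python
def summarize(syncs, total_rows):
    """Compute per-sync times, total server time, and throughput."""
    client_times = [s["client_end"] - s["client_start"] for s in syncs]
    server_times = [s["server_end"] - s["server_start"] for s in syncs]
    # Total server time spans the earliest start to the latest end.
    total_server_time = (max(s["server_end"] for s in syncs)
                         - min(s["server_start"] for s in syncs))
    return {
        "client_times": client_times,
        "server_times": server_times,
        "total_server_time": total_server_time,
        "throughput_rows_per_s": total_rows / total_server_time,
    }


# Two made-up synchronizations that start at the same gate.
syncs = [
    {"client_start": 0.0, "client_end": 5.0, "server_start": 1.0, "server_end": 4.0},
    {"client_start": 0.0, "client_end": 9.0, "server_start": 4.0, "server_end": 8.0},
]
result = summarize(syncs, total_rows=2000)
print(result)
```

Note that the client-measured time includes any wait for a free worker thread, while the server-measured time does not.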
Test environment

Sybase SQL Anywhere Studio 7.0.1 and 8.0.0
Isolated test rack
MobiLink and ASA on Dell PowerEdge 6300/550
  (4P3-550, 512 MB, database file on array
  drive, database log file on separate drive)
Clients on 10 Dell Optiplex GXa 266Mbr
  (P2-266, 64 MB)
100 Mbps Fast Ethernet hub (with utilization
  gauge)
Results of performance testing

Four main tests:
Number of worker threads
  Fast clients
  Slower Clients
  Slowest Clients
  Upload Cap
Number of clients
Size of synchronizations
Number of server processors
Test 1-A: Varying worker threads


Constants:
  1,000 clients
  1,000 rows per client synchronization
     (92 bytes per row)
  total of 1,000 synchronizations
Varied ML worker threads
  2, 4, 5, 10, 20, 50
[Chart: Throughput vs. worker threads for fast clients. X axis: MobiLink worker threads; Y axis: throughput (rows/s); series: Downloads, Inserts, Deletes, Updates.]
Optimal number of worker threads


Throughput rises then drops with increasing
  workers
Two likely causes for drop:
  Hardware contention due to CPU or disk access saturation on server
    computer
  Software contention between connections in the consolidated
    database (blocking)
In this case, 100% CPU utilization reached with 5
   worker threads
Clients fast enough to saturate ML/ASA (no
   difference increasing from 10 to 12 computers
   running clients)
Client perspective


0.5% of syncs active at any time with 5
  worker threads
Rest are either queued waiting, or
  already finished
Client times:
Longest client time ≈ total server time
Average client time ≈ ½ total server time
Maximizing throughput also minimizes
  average and longest client sync times
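
The relationship between client times and total server time can be checked with a small queue simulation. This is an idealized sketch, assuming all clients arrive at the same instant, every synchronization takes the same time, and each worker handles one synchronization at a time; the function name is hypothetical:

```python
import heapq


def simulate(clients, workers, sync_time):
    """Return (total server time, longest client time, average client time)."""
    free_at = [0.0] * workers       # time at which each worker becomes free
    heapq.heapify(free_at)
    finish_times = []
    for _ in range(clients):
        start = heapq.heappop(free_at)   # next available worker
        end = start + sync_time
        finish_times.append(end)         # client waited since t = 0
        heapq.heappush(free_at, end)
    total = max(finish_times)            # first sync starts at t = 0
    return total, max(finish_times), sum(finish_times) / clients


total, longest, average = simulate(clients=1000, workers=5, sync_time=1.0)
print(total, longest, average)  # 200.0 200.0 100.5
```

With 5 workers and 1,000 clients, only 5 synchronizations are active at any moment, which matches the 0.5% figure, and the average client time comes out at about half the total.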
Test 1-B : Varying worker
threads with slower clients

Constants as before, except client
 hardware and network
  clients now run on 15 P-75 computers
  10 Mbps Ethernet hub
Varied ML worker threads
  5, 10, 20, 50, 100
[Chart: Throughput vs. worker threads for slower clients. X axis: MobiLink worker threads; Y axis: throughput (rows/s); series: Downloads, Inserts, Deletes, Updates.]
Effects of slower clients

All types of synchronization slowed
Downloads depend more on client speed
  than uploads
  With 5 MobiLink worker threads, downloads slowed by
   46%, deletes slowed by 18%, updates and inserts
   slowed by 10%
Adding worker threads reduces shortfall
  Uploads best with 10, download best with 50
High variability for downloads
   25-30% instead of usual ±2%
   Timings vs. worker threads
   for slower clients


  You may not want to optimize for download
     add ~400 s to upload to save 20 s in download!

Action     Time with 10 workers (s)   Time with 50 workers (s)   Difference (s)   % difference
download   128                        109                        -19              -15%
insert     490                        836                        346              71%
delete     649                        1061                       412              64%
update     1074                       1589                       515              48%
Test 1-C : Simulating very slow
clients

Wanted to simulate 1000 Palm devices on
 wireless WAN network
  Actual timings with Palm IIIx connected at 4800 baud
  Single Win32 client slowed to match or exceed Palm
    timings (using special UL runtime with optional delays)
  Use same delays for 1000 Win32 clients to simulate 1000
    Palm devices connecting at 4800 baud
Varying worker threads with very
slow clients

Constants:
  1,000 clients (delayed to match Palm timings)
  1,000 rows per client synchronization
     (92 bytes per row)
  total of 1,000 synchronizations
Varied ML worker threads
  5, 10, 20, 50, 100, 200, 500
[Chart: Throughput vs. worker threads for very slow clients. X axis: MobiLink worker threads; Y axis: throughput (rows/s); series: Downloads, Inserts, Deletes, Updates.]
Optimal number of worker threads
for very slow clients

Download improves almost linearly
  Long times to apply downloads are overlapped more with
    more workers
Uploads best at 100 or 200 worker
 threads
Optimal # of workers very different for
 upload and download!
Upload cap


Limits number of worker threads that can
  apply uploads simultaneously
  Referred to as “uploaders”
Other worker threads can still download
  or receive upload
Allows independent optimization of
  worker threads for upload and
  download throughput
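
The idea of an upload cap can be illustrated with a counting semaphore: many worker threads run, but only a limited number may apply uploads at the same time. This is a conceptual sketch only, not MobiLink's implementation; the function names and the `apply_upload` and `download` callables are hypothetical:

```python
import threading
import time


def run_workers(num_workers, upload_cap, apply_upload, download):
    """Run workers with a cap on simultaneous uploaders; return peak uploaders."""
    uploaders = threading.BoundedSemaphore(upload_cap)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def worker(idx):
        with uploaders:                      # at most upload_cap threads here
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            apply_upload(idx)
            with lock:
                state["active"] -= 1
        download(idx)                        # downloads are not capped

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


peak = run_workers(20, 5,
                   apply_upload=lambda i: time.sleep(0.005),
                   download=lambda i: time.sleep(0.005))
print("peak simultaneous uploaders:", peak)
```

Separating the two limits is what lets the number of worker threads be tuned for download while the cap is tuned for upload.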
Test 1-D : Varying uploaders with
very slow clients

Constants:
  1,000 clients (delayed to match Palm timings)
  1,000 rows per client synchronization
     (92 bytes per row)
  total of 1,000 synchronizations
  500 ML worker threads
Varied ML upload cap
  2, 5, 10, 20, 50, 100
[Chart: Upload throughput vs. uploaders for very slow clients. X axis: max number of simultaneously uploading worker threads (out of 500); Y axis: throughput (rows/s); series: Inserts, Deletes, Updates.]
Test 1-E : Varying worker
threads with upload cap and very
slow clients
Constants:
  1,000 clients (delayed to match Palm timings)
  1,000 rows per client synchronization
     (92 bytes per row)
  total of 1,000 synchronizations
  5 for upload cap (i.e. 5 uploaders)
Varied ML worker threads
  50, 100, 200, 334, 500
[Chart: Throughput vs. worker threads for upload cap and very slow clients. X axis: MobiLink worker threads (with upload cap of 5); Y axis: throughput (rows/s); series: Downloads, Inserts, Deletes, Updates.]
Upload cap improves upload
throughput with very slow
clients

Upload type   No cap   With cap % Difference
  Inserts      589      1419       141%
  Deletes      551      1118       103%
 Updates       309       616        99%
Optimum number of worker
threads and uploaders

Best throughput with relatively small number of
  worker threads for upload
  For fast clients, small number of worker threads (5 is best here)
  For slower clients, need more worker threads and upload cap
       • Higher number of total worker threads for slower clients
       • Small upload cap to maximize upload throughput (depends on
         consolidated speed, around 5 best here)
  May need more worker threads and uploaders when ML and
    consolidated on different computers
Tip: Add workers or uploaders until server
  saturated or contention in the consolidated
  limits throughput
Test 2: Varying number of clients

Constants:
  5 MobiLink worker threads
  1,000 rows per client synchronization
     (92 bytes per row)
  # clients × # synchronizations per client
  total of 1,000 synchronizations
Varied:
  number of clients (20, 50, 100, 200, 500, 1000)
  number of synchronizations per client adjusted to fix total
    number of synchronizations
   Constant amount of data
1000 Clients:


                       Total server time
   • Each client synchronizes once
   • One set of simultaneous synchronizations
500 Clients:


                      Total server time
  • Each client synchronizes twice
  • Two sets of simultaneous synchronizations
[Chart: Throughput vs. number of clients. X axis: clients; Y axis: throughput (rows/s); series: Downloads, Inserts, Deletes, Updates.]
Client scalability

         total time ≈ single sync time × # clients ÷ # workers

Test                     Action     Sum of server times   Estimated total time   Actual time   Difference
1000 clients × 1 sync    delete     2732                  546                    570           4%
1000 clients × 1 sync    download   371                   74                     77            3%
1000 clients × 1 sync    insert     2192                  438                    450           3%
1000 clients × 1 sync    update     4637                  927                    1044          13%
20 clients × 50 syncs    delete     2843                  569                    570           0%
20 clients × 50 syncs    download   422                   84                     77            -9%
20 clients × 50 syncs    insert     2053                  411                    450           10%
20 clients × 50 syncs    update     4893                  979                    1044          7%
20 clients × 1 sync      delete     57                    569                    570           0%
20 clients × 1 sync      download   9                     90                     77            -15%
20 clients × 1 sync      insert     40                    396                    450           14%
20 clients × 1 sync      update     98                    981                    1044          6%
Client scalability


MobiLink scales linearly with additional
 clients
Tests with a small number of clients can
 effectively predict performance with a
 much larger number of clients
Test 3:
Varying size of synchronizations

Constants:
  200 clients
  5 MobiLink worker threads
  # rows × # synchronizations per client
  total of 1,000 synchronizations
Varied:
  number of rows per sync (100, 200, 500, 1000, 2500,
    5000)
  number of synchronizations per client adjusted to fix total
    number of rows synchronized
[Chart: Effect of synchronization size. X axis: rows per synchronization; Y axis: throughput (rows/s); series: Downloads, Inserts, Deletes, Updates.]
Effect of synchronization size


Smallest synchronizations slowest
Levels out with larger synchronizations
Greatest effect on downloads; almost no
  effect on updates
Suggests per-synchronization overhead
MobiLink has some per-synchronization
  overhead
Timing framework adds more
Effect of synchronization size


Above 2500 rows, some performance
 drop
  Download reduced 12%
  Uploads reduced around 2%
Why?
  Increased contention in consolidated database
    (0.5 MB per sync with 5000 rows)
  Disk access becoming bottleneck
    (lower CPU utilization observed with 5000 rows)
Synchronization size


Some per-synchronization overhead, so
 throughput lower for smaller-sized
 synchronizations
Throughput might be reduced with larger
 synchronizations, particularly for
 downloads
  Likely to depend on consolidated database
Test 4:
Varying number of server CPUs

Constants:
  200 clients
  5 MobiLink worker threads
  5 synchronizations per client
  total of 1,000 synchronizations
  1,000 rows per synchronization (92 bytes each)
Varied:
  CPUs in use on ML/ASA server (1, 2, 3, 4)
[Chart: Parallel scalability. X axis: number of server processors; Y axis: throughput (rows/s); series: Downloads, Inserts, Deletes, Updates.]
[Chart: Parallel scalability (uploads only). X axis: number of server processors; Y axis: throughput (rows/s); series: Inserts, Deletes, Updates.]
Parallel speedups

  CPUs       1      2      3      4
Downloads   1.00   1.59   2.11   2.34
 Inserts    1.00   1.49   1.82   1.93
 Deletes    1.00   1.50   1.90   2.01
 Updates    1.00   1.50   1.95   2.05
Parallel efficiency

Improved performance with additional
  processors
Speedups less than ideal, especially going from 3
  to 4
Best speedup for downloads, least for inserts

Note: Results are for MobiLink and ASA on same
  server computer, not for MobiLink on its own
  Contention is much more likely in consolidated database than in
    MobiLink
Hardware requirements

CPU utilization usually 98% to 100%, except for:
  Slower clients (especially for downloads)
  Downloads with few clients
  Few worker threads
  Points where consolidated database was committing data to disk
     (checkpoints)
MobiLink fluctuates from 25 to 35%, ASA 65 to
 75%
  MobiLink needs less processing power than consolidated database
   (less than ½ with ASA).
Recommendations


Summary of performance tips
Applicability to your situation
Large deployments
Summary of performance tips

Avoid contention in your scripts
Use smallest number of worker threads
 that gives you optimum throughput
Set connection pool size if using multiple
 versions
Set upload and BLOB cache sizes to
 avoid disk access
Dedicate enough processing power to
 MobiLink so that it can saturate your
 consolidated database
Applicability


YMMV (your mileage may vary)
You should test with your
  schema
  data
  consolidated database
  synchronization scripts
  clients
Test with relatively small number of
 clients
 Suggested test procedure

1. Determine your synchronization needs
2. Set up a pilot implementation with a few
   clients
   (e.g. 20 clients and test with 5, 10, 20 worker
   threads)
3. Could run ML on same server as consolidated
4. Enable minimal verbose logging (-v)
5. Perform test synchronizations
6. From times in log, estimate times for more
   clients
   total server time, maximum and average client times
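
Step 6 can be sketched as a simple extrapolation: average the server-measured times from the pilot, scale to the target number of synchronizations, and divide by the number of worker threads. This assumes the ideal scaling rule holds for your setup; the function name is hypothetical, and the example uses the delete figures from the 20-client, single-sync test reported earlier:

```python
def estimate_total_time(pilot_server_times, workers, target_syncs):
    """Extrapolate total server time from a small pilot run."""
    per_sync = sum(pilot_server_times) / len(pilot_server_times)
    return per_sync * target_syncs / workers


# Only the sum (57) is reported for the pilot, so it is spread evenly
# over the 20 synchronizations here.
pilot = [57 / 20] * 20
estimate = estimate_total_time(pilot, workers=5, target_syncs=1000)
print(round(estimate))  # 570
```

The result lands close to the estimated 569 and measured 570 in the client scalability results, which is the basis for predicting large deployments from small tests.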
If you want to use our timing
framework

To use it as-is, or modify it yourself, we
 intend to make it available through
 Sybase Developer Network:
  http://www.sybase.com/developer/mobile
For help in adapting it to your needs, or
  for help in setting up efficient
  deployments with MobiLink, email our
  Solution Services group at:
  contact.us@ianywhere.com
 Large deployments

A single MobiLink server can handle tens
  of thousands or hundreds of thousands
  of remote databases
  These tests are equivalent to 80,000 to 1.4 million synchronizations per day
For higher scalability or availability, you can
  use multiple MobiLink servers with a
  single consolidated database
  Use load balancer to make them appear as one, and provide
    failover and load balancing
Can also use multiple MobiLink servers in
 a synchronization hierarchy
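
A per-day capacity figure can be estimated by scaling a measured run to 24 hours. A rough sketch, assuming the server sustains the measured rate all day; the function name is hypothetical, and the example uses the slowest case reported earlier (1,000 update synchronizations in 1044 s of total server time):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def syncs_per_day(syncs_in_test, total_server_time_s):
    """Scale a measured batch of synchronizations to a full day."""
    return syncs_in_test * SECONDS_PER_DAY / total_server_time_s


print(round(syncs_per_day(1000, 1044)))  # 82759
```

That is in line with the lower end of the range quoted above; faster operations such as downloads give correspondingly higher daily figures.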
Large deployment with load
balancing and failover

[Diagram: the WAN/Internet connects through a router to two load balancers, which sit in front of four MobiLink servers; all MobiLink servers connect to a single consolidated DB.]
Other consolidated types?

Have done some testing with Oracle (8
 and 8i), ASE (11.9.2 and 12), and MS
 SQL Server (7 and 2000).
  All require considerable tuning
  MobiLink scalability unchanged
  Similar download throughput to ASA
  Slower uploads, best with fewer worker threads and
    uploaders
      • Upload performance better for non-ASA
        consolidateds in Vail; Oracle on par with ASA
MobiLink is scalable!


These tests show that MobiLink:
  Scales ideally with increasing remote databases
  Makes efficient use of hardware
  Has modest hardware requirements
I want you to:
  Use MobiLink for a large number of remote databases
  Get the best performance
Q&A

White papers of interest at
 http://my.sybase.com/detail?id=XXX

  1009664 – MobiLink Performance
  1011880 – Recommended ODBC Drivers for MobiLink
  1010502 – Synchronizing with Oracle and ASA
  1012973 – Using Different Script Versions in MobiLink
  1009621 – MobiLink Transport Layer Security
  1013181 – Mobilink and TCP/IP Keep-Alive
  1012332 – Details on how the non-DMC Conduit Works
  1002288 – ASA Supported Platforms and Support Status
Q&A


Award-winning newsgroup support at
 forums.sybase.com
  sybase.public.sqlanywhere.mobilink
  sybase.public.sqlanywhere.ultralite
  sybase.public.sqlanywhere.general
  iAnywhere Solutions Highlights

• Ask the Experts - about Mobile & Wireless Solutions
-Mezzanine Level Room 15B
Mon./Tues. 11:30 am - 3:30 pm; Wed. 11:30 - 1:30;
Thurs. 9 am - 12 noon
-Exhibit Hall - Demo Center (truck) exhibit hall hours
• SIG (Special Interest Group)
- Tuesday 5:30pm Mobile & Wireless SDCC, Upper level, Room 11
• Keynote - Enabling m-Business Solutions
Wednesday 1:30 pm - 3:00 pm
• iAnywhere Solutions Developer Community
-Excellent resource for commonly asked questions, newsgroups, bug
fixes, newsletters, event listings - visit www.ianywhere.com/developer
Q&A


  Scalable fast-food dining available at:




      http://www.webersrestaurants.com/

								