Databases for Embedded Systems
Tarun Bajaj (200101105)
Rahul Gupta (200101030)
Y. Narendra (200101003)
                      Outline
   The need for embedded databases
   How they differ from conventional enterprise
    databases
   Features of Embedded DB
   Application Design
   Performance Tuning
   Available Embedded DBMSs
   A case study on Berkeley DB
Need for Databases in Embedded
           Systems?
   Devices such as set-top boxes, network switches,
    mobile phones and consumer electronics are
    becoming “smarter”.
   Expanding feature sets mean managing larger
    volumes of complex data.
   Self-developed data management solutions are
    difficult to maintain and extend.
            How is it different?
   Only minimal functionality is required
   Configurable
   Small footprint
   No separate server process (avoids the overhead of
    client-server interaction)
   Database administration is done by the application
    (not by a DBA)
   In-memory databases need no separate cache
   No standard performance benchmarks
Features
            Available Services
   Database vendors want to reach as many customers
    as possible, so they address as wide a range of
    requirements as they can.
   Developers identify their system requirements
    and choose only the services they need from the
    available options.
   Is concurrent access required?
   Does recovery from failures matter?
                   Footprint
   The resource load the database imposes on your
    embedded system.
   Does memory footprint make a difference to the
    application?
   Database architecture: client-server or
    embedded library?
              Platform Support
   Platform support determines many subsequent
    options.

   Some database vendors distribute their product
    in source code form, so developers can port it to
    new platforms and new processors themselves.
                 Performance
   Concurrency and scalability are major
    considerations. Will the application have multiple
    control threads working with the database at the
    same time? How big will the database get?

   Evaluating the actual application’s performance
    is critical, because it is the only
    embedded-system benchmark that matters.
Application Design
                         Speed
   Data Representation
     A different representation means translation at every
      fetch and store operation
     A few database systems—generally, those that are
      libraries—let programs store data in program-native
      format, rather than translating it to a database format.
      In this case, the database requires no data
      translation.
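
The cost of translation can be sketched with a small example. It assumes a hypothetical sensor record with three fields; the record layout, field names and function names are illustrative and do not belong to any particular database's API. The first pair of functions stores the record in a fixed binary layout close to what the program already holds in memory, while the second pair translates to and from a text format on every store and fetch.

```python
import json
import struct

# Hypothetical record: sensor id (uint16), timestamp (uint32), reading (float32).
# "<HIf" is an assumed little-endian layout with no padding: 2 + 4 + 4 = 10 bytes.
RECORD = struct.Struct("<HIf")


def store_native(sensor_id, timestamp, reading):
    """Pack the fields into the fixed binary layout the program works with."""
    return RECORD.pack(sensor_id, timestamp, reading)


def fetch_native(raw):
    """Unpack the fixed layout; no text parsing is involved."""
    return RECORD.unpack(raw)


def store_translated(sensor_id, timestamp, reading):
    """Translate the fields to a text format on every store."""
    doc = {"id": sensor_id, "ts": timestamp, "val": reading}
    return json.dumps(doc).encode("utf-8")


def fetch_translated(raw):
    """Parse the text format back into typed fields on every fetch."""
    doc = json.loads(raw.decode("utf-8"))
    return doc["id"], doc["ts"], doc["val"]


native = store_native(7, 1_000_000, 21.5)
translated = store_translated(7, 1_000_000, 21.5)
print(len(native), len(translated))
print(fetch_native(native) == fetch_translated(translated))
```

Both paths return the same values, but the translated form is larger and needs a parse step on every fetch, which is the overhead the slide refers to.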
                       Speed
   Access Patterns
     Data should always be laid out with a view to the
      queries the application will execute.
     Most database engines support B+ tree storage, and
      some support other storage structures such as hash
      tables.
     Consider the searches and updates the application
      will perform before choosing a structure.
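
Why the choice of structure depends on the queries can be illustrated with a rough sketch. It uses a Python `dict` as a stand-in for a hash table and a sorted list searched with `bisect` as a stand-in for the ordered leaf level of a B+ tree; the data and function names are hypothetical, and a real engine's structures are more involved.

```python
import bisect

# Hypothetical data set: (key, value) pairs keyed by an integer timestamp.
records = [(ts, f"event-{ts}") for ts in (50, 10, 40, 20, 30)]

# Hash-style index: fast exact-match lookup, but keys have no useful order.
hash_index = dict(records)

# Ordered index: records kept sorted by key, as in the leaves of a B+ tree.
sorted_records = sorted(records)
sorted_keys = [key for key, _ in sorted_records]


def point_lookup(key):
    """Exact-match query; a hash table serves this well."""
    return hash_index.get(key)


def range_scan_ordered(low, high):
    """Range query on the ordered index: two binary searches, then a slice."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_records[start:end]


def range_scan_hash(low, high):
    """Same range query with only a hash index: every entry must be examined."""
    return sorted((k, v) for k, v in hash_index.items() if low <= k <= high)


print(point_lookup(30))
print(range_scan_ordered(20, 40))
```

An application dominated by exact-match lookups is well served by the hash structure, while one that scans key ranges benefits from the ordered structure, because the hash version has to visit every entry.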
                               Speed
   Configuration
     Configuration parameters include:
         the amount of memory used for secondary caches
         whether data should be written to disk or merely stored in memory
         the granularity the locking system uses to acquire locks on objects
     In general, disable any unnecessary subsystems to
      save time and space.
     Analyze the space that database records consume
      and size the cache appropriately.
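
Sizing the cache from record size is simple arithmetic. The sketch below makes that explicit under assumed figures: the record size, record count, page size and fill factor are illustrative numbers, not measurements from any real system, and the function name is hypothetical.

```python
import math


def cache_pages_needed(record_bytes, record_count, page_bytes, fill_factor):
    """Estimate the pages needed to hold record_count records.

    fill_factor is the assumed fraction of each page that holds record data;
    the remainder covers page headers and free space.
    """
    usable = int(page_bytes * fill_factor)
    records_per_page = usable // record_bytes
    if records_per_page == 0:
        raise ValueError("record does not fit in the usable part of a page")
    return math.ceil(record_count / records_per_page)


# Assumed figures: 64-byte records, 10,000 records, 4 KiB pages, 75% full.
pages = cache_pages_needed(64, 10_000, 4096, 0.75)
cache_bytes = pages * 4096
print(pages, cache_bytes)
```

With these assumed figures each page holds 48 records, so 209 pages (856,064 bytes) are enough to keep all 10,000 records in the cache.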
                      Predictability
   Why?
   Mismanaged system resources make behavior
    unpredictable
       For example, memory exhaustion causes later memory
        allocations to fail
   Many libraries include their own resource
    managers, such as file descriptor pools and
    memory managers
   Backups, recovery procedures and periodic
    reorganization need to be handled automatically
Performance Tuning
         Contention for hot data
   Most database systems lock the values they
    touch during processing
   Record-level locking greatly reduces
    contention
   Changing transactions so that they touch the
    hot data last, and so hold hot locks for a
    shorter time, also reduces contention.
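
The effect of touching hot data last can be sketched with a simple model. It assumes that a lock is acquired when a record is first touched and released only at commit, and it represents a transaction as a list of hypothetical (record, duration) steps; the record names and durations are made up for illustration.

```python
def lock_hold_time(steps, record):
    """Time a lock on `record` is held, assuming the lock is acquired when the
    record is first touched and released only at commit (the end of steps)."""
    for i, (name, _) in enumerate(steps):
        if name == record:
            return sum(duration for _, duration in steps[i:])
    return 0  # the transaction never touches the record


# Hypothetical transaction: each step is (record touched, time units spent).
hot_first = [("counter", 1), ("order", 5), ("customer", 4)]
hot_last = [("order", 5), ("customer", 4), ("counter", 1)]

print(lock_hold_time(hot_first, "counter"))  # 10
print(lock_hold_time(hot_last, "counter"))   # 1
```

The same work is done in both orderings, but the hot `counter` record is locked for 10 time units in the first and only 1 in the second, leaving far less time for other transactions to block on it.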
         Disk to memory transfers
   To reduce latency, database programmers keep
    read and write operations to a minimum
   Conventional vs. exotic storage technology
       Storing data persistently in flash memory avoids disk
        accesses altogether
   The cache should be large enough to hold the
    application’s complete working set
   Working-set size can be estimated from the size
    of commonly accessed data
   Cache hit percentage can be improved by
    ensuring that the queries and data have the same
    locality patterns.
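
The link between cache size and working set can be demonstrated with a small simulation. It assumes a least-recently-used (LRU) replacement policy, which is a common choice but not necessarily what any given database uses, and a hypothetical access trace that cycles over four pages.

```python
from collections import OrderedDict


def hit_rate(accesses, cache_size):
    """Replay an access trace through an LRU cache and return the hit fraction."""
    cache = OrderedDict()
    hits = 0
    for page in accesses:
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # mark as most recently used
        else:
            cache[page] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used page
    return hits / len(accesses)


# Hypothetical trace: the application cycles over a working set of 4 pages.
trace = [0, 1, 2, 3] * 25

print(hit_rate(trace, 4))  # 0.96: cache holds the whole working set
print(hit_rate(trace, 3))  # 0.0: cache is one page too small
```

With room for all four pages only the first four accesses miss. With room for three, this cyclic pattern evicts each page just before it is needed again, so every access misses.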
    Available Embedded DBMSs
   Empress - Runs on major embedded operating
    systems. Supports both SQL and programmatic
    interfaces to manage data.
   Sybase’s SQL Anywhere - can be configured to
    run only the queries a particular application
    requires.
   TimesTen - runs primarily from memory, limited
    support for embedded operating systems, forces
    write-throughs to disk for persistence.
    Berkeley DB – A Case Study
   Sleepycat’s Berkeley DB
     Full-blown, concurrent, recoverable database
      management
     Open source licensing
     Multiple API support (C, C++, Java, Tcl, Perl)
     Can be configured to include or exclude components
      as the embedded application requires
                  Berkeley DB
   Portable
     Runs under almost all UNIX and Linux variants,
      Windows, and a number of embedded real-time
      operating systems.
     Runs on both 32-bit and 64-bit systems
     Deployed in palmtop computers, set-top boxes,
      network switches, and elsewhere
                  Berkeley DB
   Scalable
     Quite compact (under 300 kilobytes of text space on
      common architectures)
     Small enough to run in tightly constrained embedded
      systems, but can take advantage of gigabytes of
      memory and terabytes of disk on high-end server
      machines.
     Supports high concurrency - thousands of users can
      operate on the same database at the same time
                   Berkeley DB
   Data Access Services
     Supports hash tables, B+ trees, simple record-number-based
      storage, and persistent queues
     An application can choose any of these storage
      structures when it creates a table.
     Operations on different kinds of tables can be mixed
      in a single application.
                   Berkeley DB
   Data Management Services
     Two-phase locking ensures that concurrent
      transactions are isolated from one another
     Write-ahead logging guarantees that committed
      changes survive application, system, or hardware
      failures.
     An application can specify, when it starts up, which
      data management services it will use
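
The idea behind write-ahead logging can be shown with a toy key-value store. This is a deliberately simplified sketch and not Berkeley DB's implementation: the class name, log format and file name are assumptions, and it omits transactions, checkpoints and protection against partially written log records.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store: every change is appended to a log before it is applied."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._recover()

    def _recover(self):
        """Rebuild the in-memory state by replaying the log from the start."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                entry = json.loads(line)
                self.data[entry["key"]] = entry["value"]

    def put(self, key, value):
        """Log first and force the log to stable storage, then apply the change."""
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data[key] = value


log_file = os.path.join(tempfile.mkdtemp(), "store.log")
store = TinyStore(log_file)
store.put("mode", "standby")
store.put("mode", "active")

# Simulate a crash: discard the in-memory state and reopen from the log alone.
recovered = TinyStore(log_file)
print(recovered.data)
```

Because each change reaches the log before it is applied, reopening the store after the simulated crash replays the log and restores the last written value for `mode`.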
             Size Statistics

Subsystem                 Text (bytes)  Data (bytes)  BSS (bytes)  Lines of code
Access methods (total)         108,697            52            0         22,000
Locking                         12,533             0            0          2,500
Logging                         37,367             0            0          8,000
Transactions/Recovery           26,948             8            4          5,000
Include                                                                   15,000
Total                          185,545            60            4         52,500
                  Conclusion
   Data management is an integral part of embedded
    systems.
   Embedded databases are fundamentally
    different from enterprise databases, and
    require a fundamentally different solution.
   The embedded market faces many challenges.
   The right trade-off between functionality and
    size/complexity is required.
Thank You!!