

                Software Repository Architecture

   An Enterprise Computing Example

    CSE691 - Software Modeling & Analysis
                  Fall 2000

                Jim Fawcett
                Enterprise Computing

   Implements processes necessary for the conduct of business.
   Provides access to corporate information to all those who need
    it wherever they work and live.
   Controls the state of corporate information.
     – applies business rules to ensure coherent transformations
     – probably requires transaction management
     – manages user roles, allowing different levels of access as
       appropriate for the individual user (not everyone has access to
       payroll attributes)
   Provides decision support for management.
   Is usually distributed, has very large data sets, but must deliver
    acceptable performance.

SW Repository                                                            2
                Software Repository

   A place to store and dispense an organization’s reusable
    software components
   Supports very productive development of software by making
    accessible a large store of existing software components.
   Its prime mission is to help users quickly locate useful parts out
    of a base of many thousands of components.
   A secondary mission is to provide tools to help administer the repository:
     – carefully controlling change
     – supporting certified status for proven components
     – providing associate status for new components, being considered
       for certification

          Software Repository Activities

    Stores and manages a very large base of reusable software.
    Supports browsing a well defined hierarchy of components.
    Provides information and status attachments at any level.
    Uses internet to provide access for on-site teams through encrypted
    Provides access based on role models.
    Supports configuration of subsets for individual projects and
     programs, building of test-beds.
    Supports management of software builds.
    Supports planning, tracking status and issue resolutions.
    Supports configuration management and code analyses, perhaps
     by attaching a third party tool.

                Repository Structure

   The repository holds systems, programs, modules, and files,
    with information that relates to each of these.
   Repository Data Structure:
     – a system is a list of programs, summary and status information,
       system-level documents, and a make file
     – a program is a list of components, summary and status information,
       program-level documents, and a make file
     – a component is a list of files, summary and status information,
       component-level documents, and a make file
     – a file has versions and status attachments
   Indexes:
     – an index is a text summary, status information, and a list of lower-
       level elements with links to their indexes, e.g., a program index has
       a set of links to its constituent modules and files.
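
   The index structure above can be sketched in a few lines of Python. This is
   only an illustration: the Index class, its field names, and the two status
   values (taken from the "associate" and "certified" wording earlier in the
   deck) are assumptions, not part of the original design.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):          # hypothetical status enumeration
    ASSOCIATE = "associate"  # new component, being considered for certification
    CERTIFIED = "certified"  # proven component


@dataclass
class Index:
    """Text summary, status, key words, and links to lower-level indexes."""
    name: str
    summary: str = ""
    status: Status = Status.ASSOCIATE
    keywords: list[str] = field(default_factory=list)
    children: list["Index"] = field(default_factory=list)

    def walk(self):
        """Yield this index and every index below it, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


# A small hierarchy: system -> program -> module -> files.
files = [Index(n) for n in ("test1.h", "test1.cpp", "test1.mak", "test1.doc")]
module = Index("test1.mod", children=files)
program = Index("test1.prg", children=[module])
system = Index("s", children=[program])

names = [ix.name for ix in system.walk()]
```

   Because every level is the same kind of record, browsing and searching can
   use one traversal for systems, programs, modules, and files.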

                               Repository Structure
          Data Structure Diagram - Persistent Repository Association Index Structure

   Index records (each holds a name, its constituent elements, key words,
    a status enumeration, and a summary info text block):
     – system component, name: s
          test1.prg, test2.prg
     – program component, name: test1.prg
          test1.mod, test2.mod, test3.mod
     – module component, name: test1.mod
          test1.h, test1.cpp, test1.mak, test1.doc
   Stores:
     – systemStore
     – programStore: test1.prg, test2.prg, test3.prg, test4.prg
     – moduleStore: test1.mod, test2.mod, test3.mod, test4.mod, test5.mod
     – FileStore: test1.h, test1.cpp, test1.doc, test1.mak, test1.nts
   Client directory

                  Repository Services

   Browsing through reusable components
     – extracting indexes (lists of systems, programs, components, files)
     – viewing code
     – examining summary and status information
   Searching based on:
     –    hierarchy subset
     –   keywords supplied with each index
     –   component name fragment
     –   date interval
     –   content of status information
     –   content of summary information.
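
   A minimal sketch of how the search criteria listed above might combine. The
   Entry record and the search() signature are hypothetical; hierarchy-subset
   filtering is left out, and every supplied criterion is assumed to be ANDed
   with the others.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Entry:                 # hypothetical flattened index record
    name: str
    keywords: set
    modified: date
    status: str
    summary: str


def search(entries, keywords=None, name_fragment=None, start=None, end=None,
           status_text=None, summary_text=None):
    """Return the entries that satisfy every criterion that was supplied."""
    hits = []
    for e in entries:
        if keywords and not set(keywords) <= e.keywords:
            continue                      # missing one of the key words
        if name_fragment and name_fragment not in e.name:
            continue
        if start and e.modified < start:
            continue                      # before the date interval
        if end and e.modified > end:
            continue                      # after the date interval
        if status_text and status_text not in e.status:
            continue
        if summary_text and summary_text not in e.summary:
            continue
        hits.append(e)
    return hits


catalog = [
    Entry("test1.mod", {"queue", "thread"}, date(2000, 9, 1),
          "certified", "blocking queue"),
    Entry("test2.mod", {"socket"}, date(2000, 10, 15),
          "associate", "socket wrapper"),
]
found = search(catalog, keywords=["queue"], end=date(2000, 9, 30))
```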

                       Repository Structure

   Client Desktop
     Client Repository Software
       - views
       - browse service
       - data cache

        <-- communication service -->

   Repository Server
     Server Repository Software
       - file serving
       - searching
       - access and security
       - updates and synchronization
       - code analysis
       - build service
       - status collection
       - shipping

                       More Services

   Managing access based on user role.
   Enforcing security.
   Managing updates:
     – adding (and infrequently deleting) files
     – changing file entries shown on system, program, and component indexes
     – managing file caches on a user’s desktop
   Synchronizing database updates between core library and project libraries.
   Creating views based on role models and what is being browsed.
     – A make file might not show up in a view unless explicitly requested.
     – A view could be restricted to code or to attached documents.

                 Software Repository - Services and Processes

                             browse, search, addition/deletion
                                build, packaging, analysis
                                        and status

                                     core processes
                            association, updating, communication
                                        and indexing

                Services are repository processes that the user interacts with

              Software Repository - Functional Partitions and Dependencies

   Services are repository processes that the user interacts with directly.
   Services:
     – build service: extract component and build it
     – browse service: display all selected components, display file text,
       display status information for selected components
     – search service: find all components and files that match a set of
       key words, a time interval, or a name fragment
     – addition/deletion service: add/delete components or files by name
     – packaging service: extract all files described by some manifest to a
       named directory
     – code analysis service: run an external tool on every file in some
       component
     – status service: attach or extract status information from components
     – security service: authorize an action based on user's role
   Core processes:
     – create/modify index process: add new system, program, or module;
       build search indexes
     – update process: extract file from repository to file cache on desktop
       if not already there; add or remove component or file from repository
       database; extract components to some destination
     – association process: find all components and files associated with a
       higher level
     – communication process: send a file or message between repository and
       desktop
                  Role-Based Access

   Provides access based on role models:
     – developers and quality assessment personnel have unlimited read-only access
     – team leaders have write access to their team’s assigned
     – configuration management and repository administrators have
       unlimited access
   Supports configuring the role model attributes by function, e.g.:
     – core library
     – project library
     – project baselines
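
   The role model above could be checked with a function along these lines.
   The role names, the authorize() signature, and the idea that each item
   carries an owning team are assumptions made for illustration; the slide
   does not say how a team's assignment is recorded, and read access for team
   leaders is assumed rather than stated.

```python
UNLIMITED_ROLES = {"cm", "admin"}                  # unlimited access
READ_ROLES = {"developer", "qa", "team_leader"}    # may read anything


def authorize(role, action, owner_team=None, user_team=None):
    """Return True if a user in `role` may perform `action` on an item.

    `owner_team` is the team the item is assigned to (hypothetical field);
    `user_team` is the team the requesting user belongs to.
    """
    if role in UNLIMITED_ROLES:
        return True
    if action == "read":
        return role in READ_ROLES
    if action == "write":
        # Team leaders may write only to what is assigned to their own team.
        return (role == "team_leader"
                and owner_team is not None
                and owner_team == user_team)
    return False
```

   A security service would call such a check before forwarding any request
   to the update, build, or packaging services.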

                  Still More Services

   Providing tools and scripts for code analysis.
   Building execution images from manifests.
     – a manifest is just a temporary system
   Collecting status information from a manifest.
   Packaging for shipment.
   Client-side repository software would provide view control and
    caching of data.
   Server-side repository software would provide a well-defined set
    of relatively independent services on the repository data.
   Client-side will provide significant local support for repository
    services to unload server.

               Software Repository - Client Processing Activity Diagram

   [Activity diagram, summarized]
     – accept user's request; check authorization (no: display error message)
     – quit? yes: wait for pending results
     – browse request: in cache? no: post update request, wait for reply;
       yes: display item
     – search or status request: post refresh index, wait for reply,
       search local
     – build request: in cache? no: post update request, wait for reply;
       yes: build image
     – package request: in cache? no: post update request, wait for reply;
       yes: build package
     – add/delete request, analysis request: forward request to server, wait
       for confirmation or denial of service, display results
   Implement with threads so user can initiate other requests while waiting.
                    Software Repository - File Serving Activity Diagram

   [Activity diagram, summarized]
     – activities: database integrity, listen for client connection, connect
       to new, accept request, process request, send back reply, send back
       request, disconnect client
     – decisions: disconnect (yes: disconnect client), shutdown (yes / no)

                Estimating Workload

   1,500 developers

   500 managers, QA personnel, CM personnel, librarians,

   Assume one-half of these clients request simultaneous service at
    9:30 A.M. and 4:30 P.M.

   Assume that each service:
     – requires 500 KBytes of RAM to execute
     – opens 10 files

                      Software Repository Server Load

   [Diagram: repository clients connected to a repository server]

   Server Load:
     – 1000 clients login from 9:30 to 11:00 A.M.
     – avg. arrival rate = 1000 clients / 90 minutes = 11 clients/min.
     – service rate needs to be about 4 times the arrival rate to ensure
       low queuing delay
     – avg. service rate = 44 requests/min. = 0.73 requests/sec.
     – avg. service time = 1/0.73 = 1.4 sec/request


                Need for Middleware

   1000 concurrent clients want to run applications that average:
     – 500 KB memory
     – maintain 10 open files

   Without some sort of process management clients at peak load
    need from the server:
     –   1000 connections
     –   1000 processes
     –   500 MB memory
     –   10,000 open files

   No server can survive this assault.

                              System Load

   1000 Clients  -->  1000 Connections
                      1000 Processes
                      500 MBytes of RAM
                      10,000 Open Files   -->  System Dies

                    Estimating Server Load

     If 1000 clients login between 9:30 and 11:00 A.M.
        – Arrival rate is 11 clients per minute.
        – Service rate should be at least 4 times the arrival rate to keep
          queuing delays from impacting the user.
        – Service rate, averaged over all services and weighted by frequency
          of use of each service, should be at least 44 requests per minute.
        – So average service time should be no more than about 1.4 seconds.
     Conclusions:
        – We can’t allow a user to stay connected to the server. Clients must
          send requests that take on the average no more than 1.4 seconds to
          complete. Requests result in files being served to the client for
          browsing on the client’s desktop.
        – At quitting time, clients send files to be checked in to a holding area to
          be processed overnight (but only if this is a project server).
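
   The load estimate above can be reproduced with a few lines of arithmetic.
   The variable names are ours. The last two lines go beyond the slide: they
   assume a single-server M/M/1 queue (Poisson arrivals, exponential service
   times) to show what a 4:1 service-to-arrival ratio buys in queuing delay.

```python
clients = 1000
window_min = 90                                  # 9:30 to 11:00 A.M.

arrival_per_min = clients / window_min           # about 11.1 clients/min
arrival_rate = round(arrival_per_min)            # the slide rounds to 11
service_rate = 4 * arrival_rate                  # 44 requests/min
service_per_sec = service_rate / 60              # about 0.73 requests/sec
service_time_sec = 1 / service_per_sec           # about 1.4 sec/request

# Assumption: M/M/1 model, not stated in the original slides.
utilization = arrival_rate / service_rate        # 0.25
mean_queue_wait_sec = utilization / (service_rate - arrival_rate) * 60
```

   Under that assumed model the server is busy a quarter of the time and a
   request waits well under a second, on average, before service begins.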

                Middleware Support

   Maintains a pool of idle processes for the various services.
   When a request for service arrives it wakes a process of the
    proper type and starts it working on the request.
   Maintains a pool of file handles. Supplies them as needed to a
    requesting process.
   May support the queuing of messages from multiple clients to a
    single server.
   May support the perception of a client/server session by saving
    session state in cookies on the client side while working on
    other clients’ requests.
   May support multiple servers sharing a single service queue.
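
   A rough sketch of the pooling idea, using Python's standard thread pool as
   a stand-in for the pool of idle processes. The serve_all() function and the
   request format are hypothetical. The point is only that 1000 requests can
   be served while never occupying more than 50 workers at once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def serve_all(n_requests, pool_size):
    """Serve n_requests with at most pool_size workers.

    Returns (replies, peak), where peak is the largest number of requests
    that were in service at the same moment.
    """
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def handle(request_id):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        reply = f"served {request_id}"      # stand-in for real service work
        with lock:
            state["active"] -= 1
        return reply

    # Idle workers wait in the pool; each queued request wakes one of them.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        replies = list(pool.map(handle, range(n_requests)))
    return replies, state["peak"]


results, peak = serve_all(n_requests=1000, pool_size=50)
```

   Requests beyond the pool size wait in the executor's internal queue, which
   plays the role of the middleware's service queue.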

                    Need for Middleware

   1000 Clients  -->  50 Shared Connections
                      50 Processes
                      25 MBytes RAM
                      500 Open Files   -->  Simulated Exclusive Access

                     Browse Function

   Browse activity probably entails a few brief flurries of activity
    between desktop and library interspersed with long periods of
    quiescence (user is busy at the desktop).
   Browse mode has state:
     –   part of hierarchy being browsed
     –   user’s role
     –   current working data set
     –   current build manifest
   Implication is that we should do all stateful browsing on the
    client desktop, and ask the server only to serve up files and
    indexes that don’t exist on desktop or are out of date there.
    How do we do that?
     – use message queuing middleware (request file transfer, status, …)
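
   One way to picture "serve only what is missing or out of date" is a client
   cache keyed by file name and version. FakeServer and ClientCache are
   hypothetical names, and the integer version number is an assumption; the
   slides do not say how staleness is detected.

```python
class FakeServer:
    """Hypothetical server holding name -> (version, content)."""

    def __init__(self, files):
        self.files = dict(files)
        self.transfers = 0          # counts full file transfers

    def version(self, name):
        return self.files[name][0]  # a small status message, not a transfer

    def fetch(self, name):
        self.transfers += 1
        return self.files[name]


class ClientCache:
    """Keeps files on the desktop; asks the server only when it must."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def get(self, name):
        cached = self.cache.get(name)
        if cached is None or cached[0] != self.server.version(name):
            self.cache[name] = self.server.fetch(name)   # missing or stale
        return self.cache[name][1]


server = FakeServer({"test1.h": (1, "// version 1")})
client = ClientCache(server)
first = client.get("test1.h")
second = client.get("test1.h")                  # served from the desktop cache
transfers_before_update = server.transfers
server.files["test1.h"] = (2, "// version 2")   # file changes on the server
third = client.get("test1.h")                   # stale copy is refreshed
```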

                     Queue Manager

   Queue Manager software provides services for enqueuing and
    dequeuing messages on a user’s desktop and on the server,
    supporting asynchronous message transfer.
     – The queue manager uses OS and network support to establish a
       connection between source and destination.
     – These should be transient connections.
     – On the sender’s side application software threads call a thread-safe
       enqueue function to post the message to a queue for delivery.
     – The queue manager’s thread calls its thread-safe dequeue function
       to extract the message to send across the network.
     – On the receiver’s side the queue manager’s thread receives the
       message and enqueues it to the destination queue. The receiver’s
       application thread dequeues it for processing.
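
   The enqueue/dequeue flow above can be sketched with Python's thread-safe
   queue.Queue. This is a single-process illustration: the queue_manager()
   function simply moves messages between two queues where a real queue
   manager would send them across the network, and the STOP sentinel and
   message strings are assumptions.

```python
import queue
import threading

STOP = object()   # sentinel that tells the queue manager thread to finish


def queue_manager(outbox, inbox):
    """Move messages from the sender's queue to the receiver's queue."""
    while True:
        msg = outbox.get()     # thread-safe dequeue, blocks until a message
        inbox.put(msg)         # thread-safe enqueue at the destination
        if msg is STOP:
            break


outbox = queue.Queue()         # sender-side queue
inbox = queue.Queue()          # receiver-side queue
manager = threading.Thread(target=queue_manager, args=(outbox, inbox))
manager.start()

# The application thread posts messages and continues without waiting.
for text in ("request file test1.h", "request status test1.mod"):
    outbox.put(text)
outbox.put(STOP)
manager.join()

# The receiver's application thread dequeues messages for processing.
received = []
while True:
    msg = inbox.get()
    if msg is STOP:
        break
    received.append(msg)
```

   Because the sender only touches its local queue, it is never blocked by a
   slow network or a busy server, which is what makes the transfer
   asynchronous.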

               Project #3 - Messaging System Architecture

   client/server model
     Client  <-- send and receive messages and data -->  Server

   MessageMgr model
     MessageMgr  <-- asynchronous messages and data transfers -->  MessageMgr
     (each MessageMgr has thread-safe queues)
       - develops interface for clients and servers
       - implements protocols for message and data transfer
       - uses sockets interface to effect transfers
       - queues messages at receiver
       - handles socket with one thread, parses messages and handles
         queue with another

   sockets model
     sockets  <-- bidirectional byte stream -->  sockets
       - server listens, spawns a thread for each client

          Software Repository - Messaging Service for File Transfer

   [Module diagram, summarized]
     – msgClnt Module: client thread; worker thread (clientWorker, convert
       messages into files); mtqueue Module
     – msgServ Module: server thread (startServer)
     – msgThrd Module:
          receiver thread (socketServer), mtqueue<msg> in
          worker thread (serverWorker, processFileMessages, convert files
           into messages), file name
          sender thread (sender), mtqueue<msg> out
     – supporting modules: mtqueue Module, basicSoc Module, timer Module,
       Win32Utils Module
     – platform APIs: Win32 API, WinSock API
     – data directory: .\sData

         Estimating a User’s Data Set

   While in browse mode a developer will need to access a lot of data:
     – search by keyword, date, status content, summary content, name
     – may access every index in the data set
     – may examine manual pages of several hundred files

   This implies the need for high performance access to about:

          250 components * 500 lines * 25 bytes per line
                 = 3,125,000 bytes of data

   Practical to send to the user’s desktop as long as server sends
    only updates to data cached on the client’s desktop. Client side
    software needs to help clean up the resulting mess.

     Estimating Server Data Set Size
   Assume that the organization has:
     – projects ranging in size from a few hundred thousand source lines of
       code (SLOCs) to about 5,000,000 lines.
   Assume that 90 percent of each is built from reused components
    once repository has been refined and established for a few years.
   Assume that any one project uses at most about 20 percent of the
    available components because we support programs in many
    different business areas.
   This implies a library five times the reused portion of the largest
    project (1 / 0.2 = 5):
          5,000,000 * 0.9 * 5 = 22,500,000 SLOCs
   If the average component is 500 lines there are:
          45,000 components under control
   If there are 5 files on average per component:
          225,000 files under control
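
   The sizing estimate can be checked with integer arithmetic. The variable
   names are ours; the figures are the ones assumed on the slide.

```python
largest_project_sloc = 5_000_000
reused_percent = 90            # share of a project built from reused components
project_share_percent = 20     # share of the library any one project uses
lines_per_component = 500
files_per_component = 5

# Reused code in the largest project, scaled up to the whole library.
library_sloc = largest_project_sloc * reused_percent // project_share_percent
components = library_sloc // lines_per_component
files = components * files_per_component
```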

                Repository Data Set Size

                       5,000,000 SLOC Systems

                        45,000 components

                           225,000 files

                Partitioning the Data Set

   No single project wants to be buried under this mountain of data:
     – it’s important that any project has access to the entire core library
     – a project wants to create a subset for its development operations
       which will evolve as the project matures
     – this will need to be synchronized with the core library at well
       defined points in the program

   Both usability and performance (but not cost) will benefit from
    partitioning the repository into a set of dedicated servers with
    access to a core library server.

                    Repository Partitions
                Project A                  Project B
                 Library                    Library

                            Core Library

           Project C                          Project D
            Library                            Library

                   Performance Issues

   Some of the repository activities are local:
     – administrator connected to the core library server
   Some are very remote:
     – project team working in Australia communicating over satellite links
   Communication levels are:
     –   function to function
     –   thread to thread
     –   process to process
     –   process to disk
     –   computer to computer
     –   network to network
     –   continent to continent
   Each level could imply an order of magnitude speed difference.

                  Architecture Issues
   If we partition activities badly we could wind up:
     – sending mouse clicks from Australia to Syracuse.
     – sending megabytes of data to the desktop at each step of a browse
     – sharing a communication channel among hundreds of users

   Middleware
    One role of middleware is to support high performance processing for
    many users across machine and network boundaries.

   Three-Tier Architecture
    Looks like we need three tiers of processing:
     – Client side presentation and execution of local services
     – Server side file serving and database synchronization
     – Transportation and caching services as middleware

                       Design Issues

   If we design the repository badly we could wind up:
     – sending file duplicates to many places with no way to track them
     – letting the satellite libraries and core library become incoherent, e.g.,
       having different data describing the same item or the same data
       describing different items
     – allowing routine operations to corrupt configuration management

   Transactions
    One way to manage these issues is to use transactions.
    Transactions define a set of operations as a group which must
    all succeed or all fail.

         Desktop - Library Middleware

   Activities - controlled by role model
     –   transfer indexes and files
     –   execute build
     –   construct package
     –   add status and summary information
     –   create new components from existing files
     –   add new file versions
     –   add new files
     –   delete components
     –   delete files

Architecture Concept Document Outline

   Executive Summary
   Introduction
     – definition of repository
     – goals of repository tool
          • distributed database
          • effective access
          • tools for maintaining reusable component library
     – services provided
   Estimate of server load
     – arrival rate of clients
     – required service time based on queue analyses
     – implications for repository architecture
          • can’t stay connected
          • must be message-based

                  Outline - Continued

   Distributed Architecture
     – partitioning of database into core library and project libraries
     – partitioning of repository activities
          • services and core processes
               – association, communication, update
                     • association based on association indexing
                     • communication is based on messages layered over sockets
                     • update user cache, repository database, and synchronization
               – browse, search, build, status, analysis, packaging, security,
          • client vs server activities
               – client: data caching, browse, search, build, packaging, analysis
               – server: file serving, security, add/delete, synchronization, status

                  Outline - Continued

   Detailed Service Analysis
     – browsing:
          • show activity diagram
          • discuss desktop data caching in some detail, e.g., how does the update
            process work on the user’s desktop
          • when should the server send indexes?
               – at startup or on demand?
          • with data caching as you’ve described it, analyze the client load on
          • how does partitioning the database into core library and project libraries
            affect server load - be quantitative
     – another service
          • you choose and then carry out an analysis something like the analysis
            of the browse service outlined above
          • each service will have somewhat different analysis requirements
               – you can always substitute prototyping code for detailed analysis

                  Outline - Continued

   Extension of the Repository Architecture to parallel servers
     – use m-server queue performance chart to show how using m
       servers will affect the number of clients the repository can handle
     – plot client load vs. number of servers
          • our analysis showed that one server can handle about 1000 clients
          • keep the avg. queue length the same in all cases, e.g. 0.4 waiting
   Messaging system
     – use your project #3 to provide prototype data for analysis
     – show your prototype architecture (builds customer confidence that
       you can actually build the system)
   Make sure that you provide an Executive Summary and a
    summary for each of the sections described above.
     – load estimate, distributed architecture, browsing service, other
       service, extension, messaging
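
   The client load vs. number of servers study can be prototyped before
    plotting. The sketch below assumes an M/M/m queue (Poisson arrivals,
    exponential service times), which the outline does not state explicitly.
    The function names and the service rate of 1.0 are illustrative only. For
    each server count it finds, by bisection, the largest arrival rate that
    keeps the average number waiting at 0.4.

```python
import math

def erlang_c(m: int, a: float) -> float:
    """Probability that an arrival must wait in an M/M/m queue (offered load a = lam/mu)."""
    rho = a / m
    if rho >= 1.0:
        return 1.0
    head = sum(a**k / math.factorial(k) for k in range(m))
    tail = a**m / (math.factorial(m) * (1.0 - rho))
    return tail / (head + tail)

def mean_queue_length(m: int, lam: float, mu: float) -> float:
    """Mean number of requests waiting (not yet in service), Lq."""
    a = lam / mu
    rho = a / m
    if rho >= 1.0:
        return math.inf
    return erlang_c(m, a) * rho / (1.0 - rho)

def max_arrival_rate(m: int, mu: float, target_lq: float = 0.4) -> float:
    """Largest arrival rate whose Lq stays at or below target_lq, found by bisection."""
    lo, hi = 0.0, m * mu
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if mean_queue_length(m, mid, mu) <= target_lq:
            lo = mid
        else:
            hi = mid
    return lo

if __name__ == "__main__":
    mu = 1.0  # hypothetical service rate: requests per unit time per server
    for m in range(1, 6):
        print(m, round(max_arrival_rate(m, mu), 3))
```

    Scaling the resulting rates by the measured single-server capacity gives
    the points for the client load plot. Under these assumptions the
    sustainable load grows faster than linearly with m, because pooled
    servers share one queue.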

                    Appendix A

                Transaction Processing


   Transaction
     – A series of processing steps that results in a specific function or
       activity being completed, ensuring that a set of actions is treated as
       a single unit of work.
   Transactions Support the ACID Properties:
     – Atomicity: a transaction is an indivisible unit of work. All of its
       actions either succeed (commit) or they all fail (rollback).
     – Consistency: after execution a transaction leaves the system in a
       correct state or it must abort.
     – Isolation: a transaction’s behavior is not affected by other
       transactions that execute concurrently.
     – Durability: After a transaction commits its effects are permanent,
       even if the system fails.
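
   Atomicity can be seen in a few lines. The sketch below uses Python's
    standard sqlite3 module as a stand-in for the repository database. The
    table, the library names, and the transfer_component function are
    hypothetical and not part of the repository design.

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    """Create an in-memory database holding one component in a project library."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE library (lib TEXT, component TEXT)")
    conn.execute("INSERT INTO library VALUES ('project_a', 'parser')")
    conn.commit()
    return conn

def transfer_component(conn, name, src, dst, fail_midway=False) -> bool:
    """Move a component record between libraries as a single unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "DELETE FROM library WHERE lib = ? AND component = ?", (src, name)
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the two steps")
            conn.execute(
                "INSERT INTO library (lib, component) VALUES (?, ?)", (dst, name)
            )
        return True
    except RuntimeError:
        return False
```

    If the failure is simulated, the delete is rolled back and the component
    is still recorded in its original library. Otherwise both steps commit
    together.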

       Build and Update Transaction

   Suppose a system is composed of several programs. Some of
    these share the use of a lower-level module which is being
    updated to install a bug-fix.
   The transaction consists of
     – building each of the affected programs with the new version of
       the fixed module
     – if each build is successful and if individual programs pass an
       automated battery of regression tests then the transaction commits.
       Otherwise it rolls back.
     – Committing the transaction implies that the system definition and
       each of its program definitions are updated in the corresponding
       indexes to use the new module version.
   Evidently a specialized transaction processor is needed here.
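
   A minimal sketch of such a build and update transaction follows. The
    index is modeled as a dictionary from program name to module version, and
    the build and regression_test callables are hypothetical hooks standing
    in for the real build and test tools.

```python
from typing import Callable, Dict, Iterable

def build_and_update(
    index: Dict[str, str],
    affected: Iterable[str],
    new_version: str,
    build: Callable[[str, str], bool],
    regression_test: Callable[[str], bool],
) -> bool:
    """Rebuild each affected program against new_version; update the index only if all pass."""
    staged = dict(index)  # work on a copy so a failure leaves the index untouched
    for program in affected:
        if not build(program, new_version) or not regression_test(program):
            return False  # rollback: the staged copy is simply discarded
        staged[program] = new_version
    index.update(staged)  # commit: publish all index changes together
    return True
```

    Staging the changes in a copy is one simple way to get all-or-nothing
    behavior in a single process. A real transaction processor would also
    have to handle concurrent users and durable storage.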

                  Two Phase Commits

   If a transaction is created with a nested structure of lower-level
    transactions a two-phase commit is needed.
     – The root transaction (at the top level) asks if all its child
       transactions are ready to commit.
     – Each child in turn asks its children (if it has any) if they are ready to
       commit.
     – When all the leaves are ready and any transactional processing
       that is needed at each upper-level also succeeds then the
       transaction is committed.
     – This commit processing proceeds in two phases:
          • phase 1: the root recursively queries its children about their
            readiness to commit, and all the replies flow back up
          • phase 2: if the commit is affirmed at every level, the root flows a
            commit command down recursively to each child
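
   The nested protocol can be sketched as a small tree of transaction
    objects. The Transaction class, its ready flag, and the run function are
    hypothetical names used only for illustration. A real coordinator would
    also log each decision so it survives a crash.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Transaction:
    name: str
    ready: bool = True  # outcome of this node's own transactional work
    children: List["Transaction"] = field(default_factory=list)
    state: str = "active"

    def prepare(self) -> bool:
        """Phase 1: query every descendant, then answer for this whole subtree."""
        votes = [child.prepare() for child in self.children]
        return self.ready and all(votes)

    def finish(self, commit: bool) -> None:
        """Phase 2: flow the root's decision down to every descendant."""
        self.state = "committed" if commit else "rolled back"
        for child in self.children:
            child.finish(commit)

def run(root: Transaction) -> bool:
    """Run both phases from the root and return the final decision."""
    decision = root.prepare()
    root.finish(decision)
    return decision
```

    A single leaf that is not ready causes every node in the tree to roll
    back, which is the all-or-nothing behavior the slide describes.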

    Transaction Processing Monitors
   TP Monitors
     – TP Monitors manage transactions by coordinating an application’s
       use of a set of services.
     – They were introduced to run classes of applications that can serve
       thousands of clients.
     – TP Monitors control the traffic that links lots of clients with lots of
       application programs.
     – TP Monitors manage transactions from their point of origin (a client)
       across one or more servers and back to the origin.
     – Using TP Monitors an application requires no specialized code other
       than begin/end transaction.
     – TP Monitors guarantee that unrelated services work together in ACID
   References:
     – 3-Tier Client/Server at Work, Edwards, Wiley, 1999

                   Appendix B

                Communication Styles

                Communication Styles

   Synchronous (blocked) or asynchronous (unblocked)
   Connection-oriented or connectionless
   Conversational or pseudo-conversational
   Direct or queued
   Client/server or cooperative roles
   Partner availability and connection issues
   Number of partners issue

   Reference
     – High-Performance Client/Server, Loosley and Douglas, Wiley, 1998

    (A)synchronous Communication

   Synchronous communication
     – The initiating process sends a message (request) and waits for a
       response before it can continue processing. Remote Procedure
       Calls (RPCs) are synchronous.

   Asynchronous communication
     – The initiating process is unblocked - it can continue processing
       after sending a request, perhaps sending more messages to the
       same destination server or to others without waiting for a reply.
       Asynchronous communication requires middleware between client
       and server that supports a messaging model.
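
   The difference can be sketched in a single process. Here a thread pool
    from Python's standard library stands in for the messaging middleware,
    and the server function with its fixed delay is a hypothetical stand-in
    for a remote service.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List

def server(request: str) -> str:
    """Stand-in for a remote service; the sleep models network and service delay."""
    time.sleep(0.05)
    return request.upper()

def synchronous(requests: List[str]) -> List[str]:
    """Each call blocks until its reply arrives, as with an RPC."""
    return [server(r) for r in requests]

def asynchronous(requests: List[str]) -> List[str]:
    """All requests are sent before any reply is awaited."""
    with ThreadPoolExecutor(max_workers=max(1, len(requests))) as pool:
        futures = [pool.submit(server, r) for r in requests]
        return [f.result() for f in futures]
```

    Both functions return the same replies. The synchronous version pays the
    delay once per request, while the asynchronous version overlaps the
    delays.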

      Connection or Connection-less

   Connection-Oriented
     – Two parties first connect, exchange messages, then disconnect.

   Connection-less
     – No connection is established. The sender simply sends a message
       and the server acts on the message with no further relationship with
       the requester.

Conversational or Pseudo-Conversational

   Conversational
     – A connection-oriented exchange of messages is conducted.
     – If a server has many active connections at any given time, many
       are likely to be idle, waiting for the next client message. This
       wastes server memory and CPU cycles.

   Pseudo-Conversational
     – Upon completion of a request/response pair the server process
       terminates, or is reassigned, after saving any necessary session
       state with the client in a cookie. When a new message arrives it
       carries a cookie so that its session can continue.
     – This allows both the process and connection to be shared among
       several clients.
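
   A pseudo-conversational handler can be sketched as a function that keeps
    no state between calls. The handle function, the JSON cookie format, and
    the browse commands are hypothetical. The session modeled here is a walk
    through the component hierarchy.

```python
import json
from typing import Optional, Tuple

def handle(request: str, cookie: Optional[str]) -> Tuple[str, str]:
    """Serve one request; all session state arrives in, and leaves in, the cookie."""
    state = json.loads(cookie) if cookie else {"path": []}
    if request == "up":
        state["path"] = state["path"][:-1]  # move to the parent level
    else:
        state["path"].append(request)       # descend into the named level
    reply = "/" + "/".join(state["path"])
    return reply, json.dumps(state)
```

    Because the cookie carries the whole session, any free server process
    can handle the next message from the same client.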

    Direct or Queued Communication

   Direct Communication
     – Middleware accepts a message from the initiating process and
       passes it directly to the server process. This is usually done with a
       synchronous call from the client.

   Queued Communication
     – Queue Manager middleware places the message in a queue. The
       target process retrieves the message from the queue when it is
       ready. A response, if required, is posted by the server, via the
       middleware queue manager, to a client queue. The posting
       process unblocks as soon as the Queue Manager accepts the
       message.
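
   Queued communication can be sketched with Python's standard queue
    module standing in for the Queue Manager. The run_exchange function and
    the message format are hypothetical. The requests are posted before the
    server thread even starts, and they are held until it is ready.

```python
import queue
import threading
from typing import List

def run_exchange(messages: List[str]) -> List[str]:
    """Post requests to a queue, start the server afterwards, and collect its replies."""
    request_q: queue.Queue = queue.Queue()  # stands in for the Queue Manager
    reply_q: queue.Queue = queue.Queue()    # the client's reply queue

    def server() -> None:
        while True:
            msg = request_q.get()           # retrieve the next message when ready
            if msg is None:                 # sentinel: no more work
                break
            reply_q.put(f"done: {msg}")     # response posted to the client queue

    # put() returns as soon as the queue accepts the message,
    # even though no server is running yet.
    for msg in messages:
        request_q.put(msg)
    request_q.put(None)

    worker = threading.Thread(target=server)
    worker.start()
    worker.join()
    return [reply_q.get() for _ in messages]
```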

         Client/Server or Peer-to-Peer

   Client/Server
     – One process makes requests, the other responds.

   Peer-to-Peer
     – Either process can initiate a request. Responses may or may not
       be paired with requests.

Availability, Connection, and Binding

   Point-to-Point Middleware
      – P2P middleware relays messages between partners that are
        available and connected.
      – P2P requires the binding of source to a specific destination.
      – Remote Procedure Calls (RPCs) are one form of synchronous P2P

   Message Queued Middleware (MQM)
      – Message queued middleware does not require availability. A
        queued message can be held until the target becomes available.
      – In MQM the source binds to a queue, which may be served by a
        single server or possibly many servers.

                Number of Partners

   Some processes will need to communicate with many partners
    during a transaction. Most communication styles allow
    concurrent requests to partners, usually with one connection per
    partner.

   In queued communication, processes connect to a queue
    manager to get access to one or more queues. Each of the
    queues can, in turn, be read by more than one partner.
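
   The case of one queue read by more than one partner can be sketched
    with threads. The serve_with_partners function and the partner names are
    hypothetical, and job names are assumed to be unique.

```python
import queue
import threading
from typing import Dict, List

def serve_with_partners(jobs: List[str], partners: int) -> Dict[str, str]:
    """Let several partner threads read one shared queue; return who handled each job."""
    work_q: queue.Queue = queue.Queue()
    handled: Dict[str, str] = {}
    lock = threading.Lock()

    def partner(name: str) -> None:
        while True:
            job = work_q.get()
            if job is None:          # one sentinel per partner ends its loop
                break
            with lock:
                handled[job] = name  # each job is delivered to exactly one partner

    for job in jobs:
        work_q.put(job)
    for _ in range(partners):
        work_q.put(None)

    threads = [
        threading.Thread(target=partner, args=(f"partner-{i}",))
        for i in range(partners)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled
```

    The sender binds only to the queue, so partners can be added or removed
    without changing the sending process.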
