									                            Data Management and
                         Information Delivery (DMID)
                                 Case Study
                                  Sibongile Madonsela
                                    Matile Malimabe
                                    Bubele Vakalisa

                                  Statistics South Africa

                                UNECE Workshop on the
                              Common Metadata Framework
                              Vienna, Austria, July 4-7, 2007
Statistics South Africa
Case Study - ESMDF Project                 1
                                          Introduction



     Programme Providing the Frame for Stats SA Projects
              Providing Relevant Statistical Information to Meet User Needs
              Enhancing the Quality of Products and Services
              Developing and Promoting Statistical Coordination and Partnerships
              Building Human Capacity

           This project is aimed at supporting the strategic theme “Enhancing the Quality of Products and Services”. Within the DMID, the metadata management system addresses this strategic theme.


     Overall Project Objective
            The metadata management system forms part of the organisation’s broader objective to continuously improve the quality of its products
            The survey metadata tool consists of elements that provide the overall description of a statistical survey
            The survey metadata component is fashioned along the lines of Statistics Canada’s Integrated Metadata Database (IMDB), Metastat




                                         Organisation Chart
                                    Stats SA and current projects
   Statistician-General
       Internal Audit
       SG Support and Strategic Planning
       Economic Statistics
          • Industry and Trade Statistics
          • Employment and Price Statistics
          • CPI Project
          • Financial Statistics
       Population and Social Statistics
          • Population Census
          • Census Comm. Project
          • Census 2011 Project
          • Social Statistics
          • Labour Force Survey Re-engineering Project
          • Health and Vital Statistics
       Quality and Integration
          • National Statistics System Division
          • Methodologies and Standards
          • Integrative Analysis
          • SAS 9 Migration Project
          • National Accounts
       Statistical Support and Informatics
          • Geography
          • System of Registers
          • Data Management and Technology
          • Data Management and Information Delivery Project
          • Statistical Information Services
          • Provincial Coordination
          • Programme Office
       Corporate Services
          • Finance and Provisioning
          • Human Resource Management
          • Facilities Management, Security and Logistics
          • Human Capacity Development



      The Data Management and Information Delivery (DMID) project (magenta shaded box) is located within the Data Management and Technology Division.

      The yellow shaded boxes indicate some of the ongoing projects that are concurrent with the DMID project.


                                Organisation Chart
                     DMID project, including supplier’s resources




    Prescient Business Technologies (PBT) - the supplier to the DMID project, developing the ESDMF System
    ESDMF – End to end Statistical Data Management Facility
    PM – Project Manager



                      DMID Project Structure at a high level




        Standards Development and Implementation
               Led by the Chief Standards Officer
               Develops policies, standards and procedures before components of the facility can be implemented, using the Standards Life Cycle
               For Phase One, the policies for Data Quality and Metadata were implemented
               For future phases, related policies will be developed

        End to end Statistical Data Management Facility (ESDMF)
               Led by the Technical Lead/Project Manager
               Uses the policies developed by the standards team to generate requirements for the system, which is then implemented using software technologies


                                              Standards
                                              Life Cycle




       Policies, standards and procedures are developed before components of the facility can be implemented, using the Standards Life Cycle

                                      Conceptual components of the
                                                 ESDMF
         1 Need    2 Design    3 Build    4 Collect    5 Process    6 Analyse    7 Disseminate

         The End to end Statistical Data Management Facility (ESDMF) uses the policies developed by the standards team to generate requirements for the system, which is then implemented using software technologies
                        Statistical Metadata in Each Phase of
                                 the Statistical Cycle
   Description
       Metadata is used during various stages of statistical production as essential input to
      production processes
       The production processes in turn, produce metadata
       Metadata is also important in documenting the trail of activities during the statistical
      production process

   List of Metadata Groups (Categories of Metadata)
       Survey Metadata (Dataset Metadata)
          • Used to describe, access and update datasets and data structures
          • Called survey rather than dataset metadata because some of the metadata, such as information about “the population which the data describe”, refers to the broader aspects of the survey, and not only to the dataset
       Definitional Metadata
          • Describes the concepts used in producing statistical data
          • These concepts are often encapsulated in the measurement variables used to collect statistical data
       Methodological Metadata
          • Relates to the procedures by which data are collected and processed
          • These include sampling, collection methods, editing processes, etc.
       System Metadata
          • Refers to active metadata used to drive automated operations
          • Examples include file size, access methods to databases, etc.
       Operational Metadata
          • Metadata arising from and summarising the results of implementing the procedures
          • Examples include respondent burden, response rates, edit failure rates, costs and other quality and performance indicators



                                      Detailed Process Model
                                              Scheme
          1 Need    2 Design    3 Build    4 Collect    5 Process    6 Analyse    7 Disseminate

   Need
       Understand the need for the required statistics, i.e. what the required statistics are going to be used for, in concrete terms, by their users
   Design
       Prepares the ground for the execution of a statistical production project
       Examples: questionnaire design, capturing tool design, tabulation plans, etc.
   Build
       The build phase puts together all the pieces of the infrastructure for a statistical production project
       E.g. the data capturing and scanning tools are developed, tested and implemented
   Collect
       Refers to both direct and administrative methods of data collection
       In direct collection, Stats SA sources data directly from the respondents
       In administrative collection, data are drawn from databases of other organizations, which in turn source them from their respondents
   Process
       Includes capturing collected data into databases so that data processing may be done
   Analyse
       After data have been cleaned during the Process phase, they are ready for manipulation using analytical tools
   Disseminate
       Publications are created from the datasets produced by the Analyse phase
       Publications are disseminated in various forms, e.g. electronic, printed output and compact discs
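
   Illustrative sketch (not part of the ESDMF): the seven phases of the process model can be represented as an ordered enumeration. The names Phase and next_phase are hypothetical and chosen only for this example; the strictly linear ordering is an assumption read off the numbered scheme above.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The seven phases, numbered as in the process model scheme."""
    NEED = 1
    DESIGN = 2
    BUILD = 3
    COLLECT = 4
    PROCESS = 5
    ANALYSE = 6
    DISSEMINATE = 7


def next_phase(phase):
    """Return the phase that follows `phase`, or None after Disseminate.

    Assumes a strictly linear cycle (hypothetical simplification).
    """
    if phase is Phase.DISSEMINATE:
        return None
    return Phase(phase + 1)


if __name__ == "__main__":
    for p in Phase:
        print(p.value, p.name.title())
```

   An ordered type like this would let tooling record which phase produced or consumed a given piece of metadata.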
                       How the Stats SA process model maps
                          to the METIS Metadata cycles

                             METIS                                Stats SA
          Survey planning and design                     Need and Design Phases
          Survey preparation                             Part of Design Phase
          Data collection                                Collection Phase
          Input processing                               Processing Phase
          Derivation, Estimation, Aggregation            Processing Phase
          Analysis                                       Analysis Phase
          Dissemination                                  Dissemination Phase
          Post Survey Evaluation



   Post Survey Evaluation
      This is currently done outside the statistical cycle. It is performed only for the large
      surveys such as the population census and the community survey
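
   Illustrative sketch: the mapping in the table above can be written as a simple lookup. The dictionary name METIS_TO_STATS_SA and the helper metis_steps_for are hypothetical; the phase names follow the seven-phase scheme (Need to Disseminate), and the empty list for Post Survey Evaluation reflects that it is done outside the statistical cycle.

```python
# METIS metadata cycle step -> Stats SA phase(s), as listed in the table.
METIS_TO_STATS_SA = {
    "Survey planning and design": ["Need", "Design"],
    "Survey preparation": ["Design"],
    "Data collection": ["Collect"],
    "Input processing": ["Process"],
    "Derivation, Estimation, Aggregation": ["Process"],
    "Analysis": ["Analyse"],
    "Dissemination": ["Disseminate"],
    "Post Survey Evaluation": [],  # currently outside the statistical cycle
}


def metis_steps_for(stats_sa_phase):
    """Return the METIS steps that map to the given Stats SA phase."""
    return [
        step
        for step, phases in METIS_TO_STATS_SA.items()
        if stats_sa_phase in phases
    ]


if __name__ == "__main__":
    print(metis_steps_for("Process"))
    print(metis_steps_for("Design"))
```

   Looking up "Process" returns two METIS steps, which shows that the Stats SA Processing Phase covers both input processing and derivation, estimation and aggregation.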




                      How Metadata Fits into Other Stats SA
                                   Systems




  ESDMF Core (Metadata Subsystem is one part thereof)
      This consists of all the components that make up the ESDMF system (only a few shown)
  Integration Layer
      This gives the existing systems within Stats SA access to the ESDMF functionality

                                               Metadata
                                              Description
   Survey Metadata Capture Tool
       This is the principal mechanism by which metadata is documented
       The tool is based on an approved Survey Metadata Standard Template
       The standard is made up of groups of metadata elements (e.g. Overview, Generic Information, Methodology, Data Quality Report, Documentation and Contacts) which describe certain characteristics of the survey data
   Survey
       Metadata is organized around an entity known as the survey
       A survey can be:
          direct: data is collected directly from the respondents, either as a sample or as a census
          administrative: data is sourced from another organization, which collected the data for its own purposes
          derived: a statistical program uses administrative data, or a data integration activity is carried out
   Series Metadata
       The group of metadata topics that remain constant for a period of time or change infrequently (e.g. the history, objective or abstract of the survey)
   Instance Metadata
       This is the metadata that is compiled and produced frequently
       A new instance metadata set is produced every time a release is produced
   Storage
       Captured metadata is stored in a database and can be saved and viewed at any time
   Benefits
       Once captured, metadata is always accessible to users from a centralised storage location (centralisation of metadata)
       Users across the organisation use the same mechanism to capture metadata (quality of metadata)
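
   Illustrative sketch: one possible way to model the entities described on this slide, with a survey that carries one slowly changing series metadata record and one instance metadata record per release. All class and field names are hypothetical and are not taken from the ESDMF; the free-form items dictionary is an assumption, since the slide does not list which elements belong to instance metadata.

```python
from dataclasses import dataclass, field
from enum import Enum


class SurveyKind(Enum):
    """The three kinds of survey named on the slide."""
    DIRECT = "direct"
    ADMINISTRATIVE = "administrative"
    DERIVED = "derived"


@dataclass
class SeriesMetadata:
    """Topics that stay constant for a period of time."""
    objective: str
    abstract: str
    history: str = ""


@dataclass
class InstanceMetadata:
    """Metadata produced each time a release is produced."""
    release: str                               # hypothetical release label
    items: dict = field(default_factory=dict)  # assumed free-form content


@dataclass
class Survey:
    name: str
    kind: SurveyKind
    series: SeriesMetadata
    instances: list = field(default_factory=list)

    def add_release(self, instance):
        """Attach the instance metadata for a new release."""
        self.instances.append(instance)


if __name__ == "__main__":
    survey = Survey(
        name="Example survey",
        kind=SurveyKind.DIRECT,
        series=SeriesMetadata(objective="placeholder", abstract="placeholder"),
    )
    survey.add_release(InstanceMetadata(release="release-1"))
    survey.add_release(InstanceMetadata(release="release-2"))
    print(len(survey.instances), "instance metadata sets, one series record")
```

   Separating the two record types means that adding a release never touches the series record, which matches the distinction drawn above.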

                             Survey Metadata Capture
                                      Tool
  The implemented Survey Metadata Capture Tool of the ESDMF captures the
  following metadata:
   Overview
       The Overview section comprises the following items: Objective, Abstract, History, etc.
   Generic Information
       Provides generic information about the survey time frames, e.g. frequency, collection start and end dates
   Primary Data Source
       Data inputs to the survey, e.g. external or internal data sources
   Methodology
       The activities conducted and the methods and processes used which are specific to the survey, e.g. survey population, instrument design, sample design, etc.
   Data Quality Report
       Comprises the quality dimensions of the data, e.g. relevance, accuracy, accessibility, etc.
   Documentation
       Links to additional documentation related to the survey
   Contact
       Contact person who will manage enquiries related to the data or information produced by the survey
   Active Metadata Sets
       The file identifier and status of the current/active metadata set, i.e. the metadata set that the user is currently capturing, editing or viewing, are displayed immediately under this section
   Loaded Metadata Sets
      Lists the file identifiers and statuses of metadata sets created by the current user
      Enables the current user to switch between metadata sets
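
   Illustrative sketch: a completeness check over the seven content sections listed above. The rule that a metadata set may be approved only when every section has content is an assumption made for this example, not a documented ESDMF rule, and the names SECTIONS, missing_sections and can_approve are hypothetical.

```python
# Content sections captured by the tool, in the order listed on the slide.
SECTIONS = [
    "Overview",
    "Generic Information",
    "Primary Data Source",
    "Methodology",
    "Data Quality Report",
    "Documentation",
    "Contact",
]


def missing_sections(metadata_set):
    """Return the sections that are absent or empty in `metadata_set`."""
    return [name for name in SECTIONS if not metadata_set.get(name)]


def can_approve(metadata_set):
    """Assumed rule: approval requires content in every section."""
    return not missing_sections(metadata_set)


if __name__ == "__main__":
    draft = {
        "Overview": {"Objective": "placeholder"},
        "Contact": {"name": "placeholder"},
    }
    print("Missing:", missing_sections(draft))
    print("Can approve:", can_approve(draft))
```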
                               Survey Metadata Capture Tool
                             User Interface – Activity Selection




                        Instance Metadata – Create instance metadata
                        Create a Report – Create a report in PDF format
                        View (metadata) – View approved metadata
                        Approve (metadata) – Approve metadata for use in a survey
                        Series Metadata – Create series metadata

                                  Survey Metadata Capture Tool
                                   User Interface – Navigation




                                Active Metadata Sets
                                Overview
                                Generic Information
                                Primary Data Source
                                Methodology
                                Data Quality Report
                                Documentation
                                Contact
                                Loaded Metadata Sets
                                         IT Infrastructure
                                          Specifications
     Operating Systems
         Desktops run Microsoft Windows
         The application is deployed on an open-source operating system (Novell SUSE Linux)

     Networks
         The network architecture is based on open protocols and industry standards
         Allows remote access for some employees
         Supports both local area (LAN) and wide area (WAN) networks

     Servers
         The system is developed as a client-server application
         This means that there is a need for powerful computer servers capable of handling
        intensive processing

     Data Storage
         Storage management is via the Storage Area Network (SAN)

     Environments
         Three environments:
              • Application Development
              • User Acceptance Testing (UAT)
              • Production




                                              IT Infrastructure
                                            Specifications - Details

A. Development Environment
Function             Make/Model                                              Operating System/Database Engine                            Comment
Application Server   HP BL45p, quad processor, 4 GB RAM, 2 x 72 GB HDD       SUSE Linux Ver. 10                                          Make/model exceeds recommendation
Database Server      HP BL45p, quad processor, 16 GB RAM, 2 x 72 GB HDD      Oracle 10g or Sybase ASE and Sybase IQ; Unix/Linux/Windows  Make/model exceeds recommendation
Build Server         HP DL 320, dual processor, 2 GB RAM, 2 x 72 GB HDD      SUSE Linux Ver. 10                                          Make/model exceeds recommendation

B. User Acceptance Test (UAT) Environment
Function             Make/Model                                              Operating System/Database Engine                            Comment
Application Servers  2 x HP BL45p, quad processor, 8 GB RAM, 2 x 72 GB HDD   SUSE Linux Ver. 10                                          Make/model exceeds recommendation
Database Servers     2 x HP BL45p, quad processor, 32 GB RAM, 2 x 72 GB HDD  Oracle 10g or Sybase ASE and Sybase IQ; Linux

C. Production Environment
Function             Make/Model                                              Operating System/Database Engine                            Comment
Application Servers  2 x HP BL45p, quad processor, 8 GB RAM, 2 x 72 GB HDD   SUSE Linux Ver. 10                                          Make/model exceeds recommendation
Database Servers     2 x HP BL45p, quad processor, 32 GB RAM, 2 x 72 GB HDD  Oracle 10g or Sybase ASE and Sybase IQ; Linux
                                 IT Infrastructure
                             Specifications - Diagrams




                      Components of Metadata Management
                                 Application

   User Interface
       The user interfaces for all the metadata management system applications are web-based
       Client workstations only need a web browser to access the server-based applications
       The main supported web browsers are Microsoft Internet Explorer and Firefox

   Database
       The application is supported by a relational database management system (RDBMS)
       The RDBMS engine of choice for this project is Sybase
       The project is currently using the open source RDBMS, MySQL

   Business Logic
       The business logic controlling the interaction between the UI and the underlying database is coded in server-side Java
       Some business logic is also coded as stored procedures. This mostly performs housekeeping within the database

   Application/Web Server
       The application is served to the client via Tomcat, which processes Java code.
       Tomcat also handles HTTP calls from the web browser




                     Partnerships and Cooperation between
                                   Agencies
   Latvia
       Metadata model is also based on Bo Sundgren’s model
       Their outsourced supplier took a while to understand the business of the statistical organization
   Ireland
       Issues regarding communication between the customer and the supplier
       Project took longer than planned
   Slovenia
       Metadata model is also based on Bo Sundgren’s model, with some modifications
   New Zealand
       We adopted their business process model, which is called the Statistical Value Chain in Stats SA
       We also adopted their breakdown of metadata into five categories
   Australia
       For a successful data warehouse project, there is a need to develop policies and standards
   Sweden
       Advised us on various aspects of metadata and statistical production processes
       Gave us a better idea of how to develop a data quality template, as well as how data quality should be reported on
   Canada
       We applied our knowledge of their Metastat (IMDB) during the development of our Survey Metadata Capturing Tool
       Consultants from Canada come to help on projects within Stats SA, including ours
   United States
       We used the Corporate Metadata Repository (CMR) model by Dan Gillman, from the US Bureau of Statistics, in our understanding of the metadata model
                                  Organizational and Cultural
                                            Issues

   Climate and Culture Assessment
       A key challenge for Stats SA is to focus the organisation on the strategic importance of the DMID project
       A Climate and Culture Assessment was done by holding focus groups and by running an online survey via the Stats SA intranet website

   Change Readiness Assessment
       A Change Readiness Assessment was conducted to determine the current capacity of Stats SA to change, and to identify areas of resistance towards DMID requiring Organisation Change Management (OCM) interventions
       The following ‘change readiness dimensions’ formed the basis of the Change Readiness Assessment:
             •   Clear vision
             •   Effective leadership
             •   Positive experience with past change initiatives
             •   Motivation to do the project
             •   Effective communication
             •   Adequate project team resources

   What is Change Readiness?
       The Change Readiness Assessment is a process used to determine the levels of
        understanding, acceptance and commitment likely to affect the success of the planned
        change.




                                       Change Commitment Curve

   Stages of the curve, from lowest to highest commitment, with the stakeholder statement for each:
       Contact – I know something is changing
       Awareness – I know what it is
       Understanding – I know the implications for me
       Engagement – I’ll look at doing it the new way
       Acceptance – I’ll do it the new way
       Commitment – This is the way we do things
       Internalisation – This is the way I do things

   The curve is divided into three areas, in order: Setting the scene, Achieving acceptance, Achieving commitment




   Commitment
       As the DMID project phases roll out, different stakeholders will need to be at specific
       levels of commitment
       The level of commitment required will be dependent on the role they play in the DMID
       project and their ability to influence the program

   Framework
       The Change Commitment Curve will provide a framework for understanding and tracking
       the requisite levels of commitment that stakeholders need to be facilitated through so
       that OCM interventions can be developed accordingly
                                Climate and Culture
                        plus Change Readiness Assessments
  The following were the findings from the assessments:
      Members of Executive Management do not share the same understanding of the DMID project
      Lack of communication between management and subordinates; this makes it difficult for subordinates to understand the purpose of the project and the impact it has on their working lives
      Lack of support from Executive Management will result in resistance and make it difficult for the project to succeed
      If management does not communicate, understand and promote the project, it will be difficult to deliver the message and get buy-in from staff in the organisation

  Next Steps from the Findings
      The findings of the assessments identified where some of the key staff members belonged on the Change Commitment Curve.
      In general, most were in the “Setting the Scene” and “Achieving Acceptance” areas, bounded by “Contact” (“I know something is changing”) and “Understanding” (“I know the implications for me”)
      A lot of effort is needed to move from that area to “Achieving Commitment”, demonstrated by “Internalisation”, wherein staff can claim that “This is the way I do things”
      Another outcome of these assessments was to organize a Leadership Alignment
       workshop.
      In this workshop, the Executive Committee was given a presentation of the findings and
       the path forward
      The path forward is to ensure that the leadership understands the goals of the project and
       how they line up with the vision of Stats SA
      The leadership was also instructed on how to communicate the same message about the
       project
                                      Lessons Learned


     The business of Stats SA
            The supplier had a difficult time understanding the business of Stats SA, namely statistical production processes

     Skills Transfer Plan
            Under pressure to meet the deliverables, the supplier ignored the Skills Transfer Plan, with the result that the Stats SA developers were not involved in the final design and development of the phase 1 deliverable.

     Breakdown of deliverables
            Each phase was planned to be three months long. Each phase was also planned to be a complete deliverable in its own right, even though the next phase was planned to build on the previous phases.
            The first phase was delivered late, mainly due to the supplier’s lack of understanding.
            Initially, the first deliverable did not meet the stated business objectives for data quality




                                    END



                                 Thank You

                             Contact information:

    Sibongile Madonsela (SibongileMA@statssa.gov.za)
        Matile Malimabe (MatileM@statssa.gov.za)
        Bubele Vakalisa (BubeleV@statssa.gov.za)
       Ashwell Jenneker (ashwellj@statssa.gov.za)

   all from Statistics South Africa (www.statssa.gov.za)



