DATA CAPTURE – PROCESSING 2006 POPULATION HOUSING CENSUS by sarob

									 DATA CAPTURE – PROCESSING
 2006 POPULATION & HOUSING CENSUS
 OF NIGERIA


Presented at UN Regional
Workshop on Census Data
Processing

  By
  Adesola Fatilewa
  NATIONAL POPULATION COMMISSION
  At Dar-es-Salaam, Tanzania
  9th-13th June 2008

                                    1
MAP OF NIGERIA 36 STATES AND FCT ABUJA




                                         2
    ABOUT NIGERIA
• NIGERIA IS THE MOST POPULOUS
  COUNTRY ON THE AFRICAN CONTINENT
  AND THE 10th BIGGEST IN THE WORLD.
• AN AREA OF ABOUT 0.92 MILLION SQ. KM.
• POPULATION OF 140.2 MILLION BY THE 2006
  CENSUS
• COMPRISES 36 STATES AND THE FEDERAL
  CAPITAL TERRITORY
• 774 LOCAL GOVERNMENT AREAS - LGAs
  (DISTRICTS)
• DELINEATED INTO OVER 662,000
  ENUMERATION AREAS

                                           3
        Background
• Since the late 1990s, NPopC
  had been inundated with
  proposals for various
  document scanning systems.
• As of 2005, statements were
  still being made suggesting that
  the idea of using scanning
  technology was utopian.
                                  4
    Processing Pre-test and Trial
    Census
• A scanning system was used to process the second
  pre-test of April 2004.
• About 100,000 forms were processed, as the survey
  covered one local government area (LGA) in each of
  the 36 states of the country and the Federal Capital
  Territory.
• The forms were optical mark readable (OMR) only, and
  editing served mainly to correct alignment errors.


                                                          5
    Processing Pre-test and Trial
    Census Continued
• Another solution provider supplied five scanners
  along with two servers for the processing of the Trial
  Census.
• The Trial Census, which took place in April 2005,
  covered about 5% of the country, or a population of
  about 10 million.
• Processing was distributed between two
  data processing centres (DPCs): Lagos and Kano.



                                                           6
          Lessons learnt
•   Staff were identified for suitable roles in data
    processing for the main census
•   Staff gained experience with the new technology
•   Alignment and recognition problems were detected
    and rectified
•   A decision was taken on an appropriate archiving
    system for storage and retrieval of documents
•   Various reports were needed to enable
    management to follow the progress of processing
•   A decision was taken to eliminate manual coding
    and editing completely

                                                     7
Data capture 2006 census
• Scanning technology was fully deployed in
  processing Nigeria's 2006 Population and
  Housing Census.
• This was achieved with 21 scanners
  distributed across 7 DPCs located strategically
  around the country.
• Immediately after the census, the OMR/ICR
  forms (questionnaires) used to collect data
  started arriving at the DPCs.
• Inventory control was done using an EA
  tracking system.
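The slides do not describe how the EA tracking system worked internally. As a minimal sketch of the idea only, assuming hypothetical EA codes and an assumed status sequence (expected, received, archived, scanned), inventory control of this kind could be modelled as follows:

```python
from enum import Enum


class Status(Enum):
    # Assumed life cycle of one EA's envelope inside a DPC.
    EXPECTED = 1
    RECEIVED = 2
    ARCHIVED = 3
    SCANNED = 4


class EATracker:
    """Tracks each enumeration area (EA) through the DPC (hypothetical model)."""

    def __init__(self, expected_ea_codes):
        self.status = {code: Status.EXPECTED for code in expected_ea_codes}

    def advance(self, ea_code, new_status):
        """Move an EA to the next status; reject unknown EAs and skipped steps."""
        if ea_code not in self.status:
            raise KeyError(f"EA {ea_code} is not in the expected list")
        current = self.status[ea_code]
        if new_status.value != current.value + 1:
            raise ValueError(
                f"EA {ea_code}: cannot go from {current.name} to {new_status.name}"
            )
        self.status[ea_code] = new_status

    def outstanding(self, status=Status.EXPECTED):
        """Return the sorted EA codes currently at the given status."""
        return sorted(code for code, s in self.status.items() if s is status)


# Made-up EA codes for illustration.
tracker = EATracker(["LA-001-0001", "LA-001-0002", "LA-001-0003"])
tracker.advance("LA-001-0001", Status.RECEIVED)
tracker.advance("LA-001-0001", Status.ARCHIVED)
tracker.advance("LA-001-0002", Status.RECEIVED)
not_yet_received = tracker.outstanding()
```

Forcing each EA through the steps in order is one simple way to catch envelopes that reach the scanner without ever having been logged at the archive.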
                                                  8
Data capture 2006 census
• Documents were placed in envelopes by EA, tied in
  convenient batches, and stacked on labelled
  shelves.
• At the end of the receiving/archiving
  exercise, batches were retrieved for data
  capture.




                                                 9
                      Paper Preparation before
                      Scanning

[Flow diagram: ARCHIVE, Envelope, Batch Header (NPC0x)]
• Print the Batch Header (NPC0x), store it in the
  program, and add it to the batch
• Bring the envelopes with the questionnaires from the
  Archive room
• Remove the envelopes
• Cut the paper with the cutting machine. Otherwise:
  paper damaged, dirt introduced on the scanned image,
  rejects increased
• Jog the paper with the supplied jogger
                                                                          10
  Data Processing Steps
        at DPCs
• Schematic diagram:
  Jog Docs -> Scanner -> Server -> Edit Stations
                                            11
        Scanner Views


Scanner Feeder
                        Questionnaire
                        processing




                                        12
 Scanning

• Sheets were loaded on the feeder in batches separated
  by batch headers and passed through the transport
  system of the HR80 SC scanners.
• Scanner speed was 8,000 sheets/hr, barring jams
  and other loading difficulties.
• Scanning was carried out by the ProScan software, and
  scanned documents were collected at the output
  tray.
• The sheets were then returned to their envelopes and
  sent back to the archive.
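As a rough worked example of what the rated speed implies, the figures quoted in this presentation (8,000 sheets/hr per scanner, 21 scanners, about 35 million forms) can be combined into an ideal lower bound on scanning time. The sketch assumes one sheet per form and no jams or idle time, neither of which is stated in the slides:

```python
SHEETS_PER_HOUR = 8_000   # rated scanner speed quoted in the slides
SCANNERS = 21             # scanners deployed across the 7 DPCs
FORMS = 35_000_000        # approximate number of forms scanned (Conclusion slide)

# Assumption (not from the slides): one sheet per form, all scanners
# running continuously at the rated speed.
combined_rate = SHEETS_PER_HOUR * SCANNERS   # sheets per hour, all scanners
ideal_hours = FORMS / combined_rate          # hours each scanner would run

print(f"Combined rate: {combined_rate:,} sheets/hr")
print(f"Ideal scanning time: {ideal_hours:.1f} hours")
```

Under these assumptions each scanner would need only a few hundred hours of actual scanning, which is small compared with the nine months reported for scanning and first-level editing. That fits the challenge noted later in the presentation that documents could not be made ready as fast as they could be scanned.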


                                                      13
             SC80HC + ProSort + kEOPs

[Workflow diagram; labels as numbered on the slide]
MANUAL WORK
  1. Preparation for scanning: cut & jog,
     add Batch Header (NPC0x)
  6. Correction / Balancing
  8. Local reports
SYSTEM
  2. Scanner
  3. Paper Archive
  4. Data + Images -> Work Data Storage
  5. kEOPs recognition
  Archive Data Storage
  8. DVD -> CS Pro
  9. TAPE -> HQ
  Carto
                                                                                14
     Editing
• Two levels of editing:
    • First level at the DPC
    • Second level at the DVU at
      NPopC HQ in Abuja




                              15
       First Level Editing
• Scanned data was stored in XML format on a SAN
  attached to servers networked to the scanners
• Forms in XML were loaded onto edit stations
• The editing system, called KEOPs, was designed
  to check geographic IDs against the batch
  headers and to check ‘mandatory fields’
• Transactions or whole batches could be
  passed to the ‘balancing correction level’, which
  was handled by more experienced staff
  designated ‘Supervisor’
                                                  16
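The two first-level checks named above (geographic IDs against the batch header, and mandatory fields) can be illustrated with a short sketch. This is not how KEOPs itself is implemented; the field names, the code values and the routing rule are all hypothetical:

```python
MANDATORY_FIELDS = ("state", "lga", "ea", "sex", "age")  # hypothetical field names
GEO_FIELDS = ("state", "lga", "ea")


def check_form(form, batch_header):
    """Return a list of problems found on one captured form."""
    problems = []
    for field in MANDATORY_FIELDS:
        if form.get(field) in (None, ""):
            problems.append(f"missing mandatory field: {field}")
    for field in GEO_FIELDS:
        value = form.get(field)
        if value not in (None, "") and value != batch_header[field]:
            problems.append(
                f"{field} {value!r} does not match batch header {batch_header[field]!r}"
            )
    return problems


def route_batch(forms, batch_header):
    """Split a batch into clean forms and forms flagged for correction."""
    clean, flagged = [], []
    for form in forms:
        problems = check_form(form, batch_header)
        (flagged if problems else clean).append((form, problems))
    return clean, flagged


# Made-up batch header and forms for illustration.
header = {"state": "24", "lga": "05", "ea": "0012"}
forms = [
    {"state": "24", "lga": "05", "ea": "0012", "sex": "1", "age": "34"},
    {"state": "24", "lga": "06", "ea": "0012", "sex": "2", "age": ""},
]
clean, flagged = route_batch(forms, header)
```

In this sketch the second form is flagged twice: once for the empty age field and once because its LGA code disagrees with the batch header.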
Typical KEOPs Edit Screen




                            17
EXPORT




         18
  Second Level Editing
• Data in ASCII format was encrypted, backed up on CDs
  at the DPC, and sent to NPopC HQ, Abuja
• The data was decrypted, validated, collated and
  further edited at Abuja
• The data was then checked for completeness to ensure
  that each delineated EA in every local government
  area had data associated with it
• The CSPro package was then used to edit the data and
  aggregate it appropriately
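The completeness check described above amounts to comparing the list of delineated EAs with the EAs actually present in the captured data. A minimal sketch, assuming hypothetical LGA and EA codes held in plain dictionaries:

```python
def missing_eas(delineated, captured):
    """Map each LGA to the sorted list of delineated EAs with no captured data."""
    gaps = {}
    for lga, ea_codes in delineated.items():
        missing = sorted(set(ea_codes) - captured.get(lga, set()))
        if missing:
            gaps[lga] = missing
    return gaps


# Hypothetical frame: LGA code -> EA codes delineated before the census.
delineated = {"LGA-A": ["0001", "0002", "0003"], "LGA-B": ["0001", "0002"]}
# Hypothetical capture result: LGA code -> EA codes found in the received data.
captured = {"LGA-A": {"0001", "0003"}, "LGA-B": {"0001", "0002"}}

gaps = missing_eas(delineated, captured)
print(gaps)  # {'LGA-A': ['0002']}
```

An LGA that is absent from the captured data altogether is reported with every one of its EAs missing, because the lookup falls back to an empty set.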



                                                       19
    Second Level Editing Continued
• Structure checks
• Range checks
• Skip pattern checks
• Inter-record and intra-record consistency checks
• Imputation methods applied for missing or invalid
  values:
    • ‘Hot deck’
    • ‘Cold deck’, or a combination of both


                                                        20
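The slide on second-level editing names range checks and hot-deck and cold-deck imputation without giving the rules used. In general, a hot deck fills a missing or invalid value from a valid "donor" record in the same data set, while a cold deck uses a value fixed in advance from an outside source. The sketch below combines the two in the simplest sequential form; the age range, the imputation class (sex) and the cold-deck constants are illustrative assumptions, not the Commission's rules:

```python
def in_range(value, low, high):
    """Range check: True only for an integer between low and high inclusive."""
    return isinstance(value, int) and low <= value <= high


def hot_deck_impute(records, field, class_key, low, high, cold_deck):
    """Sequential hot deck: replace invalid values with the last valid value
    seen in the same class, falling back to a fixed cold-deck value when no
    donor has been seen yet."""
    last_donor = {}
    result = []
    for rec in records:
        rec = dict(rec)  # work on a copy; the caller's records stay unchanged
        cls = rec[class_key]
        if in_range(rec.get(field), low, high):
            last_donor[cls] = rec[field]
        else:
            rec[field] = last_donor.get(cls, cold_deck[cls])
            rec["imputed"] = True
        result.append(rec)
    return result


records = [
    {"sex": "F", "age": None},  # no donor yet, so the cold-deck value is used
    {"sex": "F", "age": 31},
    {"sex": "M", "age": 45},
    {"sex": "F", "age": 250},   # fails the range check, filled from the F donor
]
cold_deck = {"F": 20, "M": 21}  # illustrative constants, not census values
imputed = hot_deck_impute(records, "age", "sex", 0, 120, cold_deck)
```

Marking imputed records with a flag, as done here, keeps it possible to report afterwards how many values were changed.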
           Occupational Coding

• Occupation was the only item that was not coded in
  the field
• Occupational coding was carried out using a
  computer-assisted coding system
• ‘Exceptional Coding’ was applied where the coding
  clerk could not find an appropriate code for an
  occupation
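The slides do not say how the computer-assisted coding system matched responses to codes. One common approach, sketched here under that assumption, is a dictionary lookup on the normalised response text, with anything unmatched queued for ‘Exceptional Coding’ by a clerk. The code list below is a placeholder, not a real occupation classification:

```python
# Placeholder code list; a real system would load the adopted classification.
OCCUPATION_CODES = {
    "farmer": "X01",
    "primary school teacher": "X02",
    "taxi driver": "X03",
}


def normalise(text):
    """Lower-case and collapse whitespace so lookups tolerate capture noise."""
    return " ".join(text.lower().split())


def code_occupations(responses, code_list):
    """Assign codes automatically; queue unmatched responses for a clerk."""
    coded, exceptions = {}, []
    for record_id, text in responses:
        code = code_list.get(normalise(text))
        if code is None:
            exceptions.append((record_id, text))
        else:
            coded[record_id] = code
    return coded, exceptions


responses = [(1, "Farmer"), (2, "  taxi   driver"), (3, "palm wine tapper")]
coded, exceptions = code_occupations(responses, OCCUPATION_CODES)
```

Here records 1 and 2 are coded automatically and record 3 goes to the exception queue.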




21
                Challenges


• Ensuring that documents for particular
  geographic locations were archived in the sections of
  the archive and on the shelves designated for them
• Ensuring that all forms were separated before they
  were taken for scanning
• Breakdown of the jogger
• Documents were made ready for scanning more slowly
  than they could be scanned
• Difficulty in maintaining belts and refitting them over
  the pulleys
• Ensuring that correct batch headers were properly
  placed on EA batches and that, after scanning, EAs
  were ...
22
             Challenges Continued

• Instances of poor fieldwork, which resulted in
  ‘missing values’ in ‘mandatory fields’ and outright
  wrong values in other fields
• Difficulty in linking forms for households of more
  than 8 persons
• Integrating the two solution providers: form design
  on the one hand, and equipment and software solutions
  on the other, were provided by two different companies
• Cleaning blank records of the data associated with
  them at data capture
• Dealing with the sensitivity of Nigerians to census
  figures
• Lack of a reliable and uninterrupted power supply
23
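The difficulty in linking forms for households of more than 8 persons suggests that large households were spread over several forms. The slides do not describe the linking rule, so the following is only a sketch under assumed field names: forms are grouped by EA and household number, and a household is accepted only if its form sequence numbers run 1, 2, 3 and so on without gaps.

```python
from collections import defaultdict


def link_household_forms(forms):
    """Group forms by (EA, household number). Return the person count for each
    household whose forms are in unbroken sequence, plus the keys of
    households with a missing or duplicated form."""
    households = defaultdict(list)
    for form in forms:
        households[(form["ea"], form["household"])].append(form)

    linked, broken = {}, []
    for key, group in households.items():
        group.sort(key=lambda f: f["form_no"])
        expected = list(range(1, len(group) + 1))
        if [f["form_no"] for f in group] == expected:
            linked[key] = sum(f["persons"] for f in group)
        else:
            broken.append(key)
    return linked, broken


# Hypothetical captured forms; all field names are assumptions.
forms = [
    {"ea": "0012", "household": 7, "form_no": 1, "persons": 8},
    {"ea": "0012", "household": 7, "form_no": 2, "persons": 3},
    {"ea": "0012", "household": 9, "form_no": 2, "persons": 2},  # form 1 missing
]
linked, broken = link_household_forms(forms)
```

In this example household 7 links cleanly to 11 persons, while household 9 is reported as broken because its first form is absent.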
             Conclusion

• The Commission was proud that the decision to deploy a new
  technology for part of the processing of Nigeria's 2006
  Population and Housing Census was a success
• About 35 million forms were scanned and edited using 21
  scanners and over 220 edit stations, with data in XML and
  ASCII formats stored on about 76 TB of SAN storage. All
  scanning and first-level editing was completed within nine
  months of the enumeration period.
• About 1,000 Nigerians were trained and gained expertise in
  various aspects of the scanning technology
• There is a need for intensive training in OMR/OCR form
  design and in the development of appropriate scanning
  software.

24
      End

 Thank you for your
 attention




                        25

								