IPUMS-International: Lessons from 10 Years of Archiving

					               IPUMS-International
and Integrated European Census Microdata Projects
Reduce Risks of Managing Trans-border Access and
               Add Significant Value
                       * * *
      Robert McCaa and Albert Esteve Palos
Minnesota Population Center and Centre d’Estudis Demografics--Barcelona


                www.ipums.org/international
                  www.iecm-project.org
                             “Dissemination [means]
          opening up the value inherent in our data.”

         -- Walter Radermacher and Pieter Everaers
Seminar on Emerging Trends in Data Communication
      and Statistics, UNSC, New York, Feb. 19, 2010*
     Trans-border access is essential in the 21st century.
    Many researchers (e.g., demographers, members of
       IUSSP) reside outside their country of birth
•    New Zealanders        60% reside outside country of birth
•    Dutch                 40%
•    Germans               38%
•    Danes                 34%
•    Chinese               30%
•    Belgians              31%
•    British               25%
•    Australians           22%
•    Canadians, Finns, French, Japanese, Swiss, etc.  ~20%

Limiting access to in-country researchers is old-fashioned, inefficient,
costly, & unfair. It encourages violations and brain drain.
    IPUMS-International: 2012 (weighted by population size)
    Map: Mollweide projection
     dark green = anonymized, harmonized and disseminating
      (69 countries, 212 censuses, 480 million person records)
medium green = to be integrated (29 countries, 75 censuses, ~100 million person records)

2012 launch:
El Salvador (2)
Indonesia (9)
Mexico (2010)
Morocco (3)
Nicaragua (3)
Turkey (3)
Uruguay (5)

Work began in 1999. By 2020 we hope to integrate census microdata
of 100 countries, including 2010 round censuses.
 IECM/IPUMS-Europe: 2012 (weighted by population size)
 Map: Mollweide projection
   dark green = anonymized, harmonized and disseminating
     (17 countries, 56 censuses, 93 million person records)
medium green = to be integrated (2 countries, 6 censuses, ~5 million person records)

Countries not yet participating are invited to consider doing so:
Albania, Belgium, Bosnia-H, Croatia, Denmark, Estonia, Finland,
Iceland, Latvia, Lithuania, Moldova R., Norway, Russia, Serbia,
Slovak R., Sweden, etc.
       Outline: IPUMS-International & IECM
    Reduce Risks of Managing Trans-border Access
              and Add Significant Value

•   NSOs that disseminate microdata by “going it alone” incur
    significant risks, substantial costs, & much user dissatisfaction
I. IPUMS & IECM offer a “one-stop” comprehensive solution to
    managing access to census microdata
II. Statistical Confidentiality and Security
III. Integration
IV. Manage trans-border access
V. Conclusion: Invitation to cooperate,
    entrust 2010 round census microdata as soon as feasible.
           I. One-stop, comprehensive solution
    to disseminating census microdata & metadata…
                of Europe and the world
•   Organize        Uniform agreement with each NSO
•   Administer      We manage approval/denial of user access
•   Anonymize       We are responsible for data anonymization
•   Integrate       We do the work
     Metadata       Official language and integrated in English
     Microdata      Integrated globally & optimized for Europe
•   Disseminate     Extracts, custom-tailored to each request
•   Share           We share: results,
                    comprehensive electronic bibliography
It is no longer enough to prepare a CD or post a dataset on a website
          II. Statistical Confidentiality and Security
A. Microdata security and confidentiality protections
     • Employees face fines, job loss, and possible
         imprisonment for violations
     • Security: “best practice” – Dennis Trewin, former Australian Statistician
B.    Statistical disclosure control protections:
     • Suppression: sub-sampling of records; removal of names,
         low-level geography, and unique variates
     • Paired swapping of geographical identifiers of
         households to create uncertainty
     • Top/bottom coding, global recodes, deletion of digits, etc.
C.    Managing restricted access to microdata (next slide)
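
The disclosure controls listed under B can be illustrated with a minimal Python sketch. It is not the IPUMS or IECM implementation: the record layout (dicts with a "geo" key), the function names, and the swap fraction are all assumptions made for illustration.

```python
import random


def top_code(value, cap):
    """Top-coding: report any value above `cap` as `cap`."""
    return min(value, cap)


def bottom_code(value, floor):
    """Bottom-coding: report any value below `floor` as `floor`."""
    return max(value, floor)


def swap_geography(households, fraction, rng):
    """Paired swapping: exchange the "geo" code between random pairs.

    `households` is a list of dicts with a "geo" key (hypothetical layout).
    Roughly `fraction` of the households are paired off and swapped.
    Returns new records; the input list is left unchanged.
    """
    result = [dict(h) for h in households]
    n_pairs = int(len(result) * fraction) // 2
    chosen = rng.sample(range(len(result)), 2 * n_pairs)
    for i in range(0, len(chosen), 2):
        a, b = chosen[i], chosen[i + 1]
        result[a]["geo"], result[b]["geo"] = result[b]["geo"], result[a]["geo"]
    return result


def subsample(households, rate, rng):
    """Sub-sampling: keep each household with probability `rate`."""
    return [h for h in households if rng.random() < rate]


if __name__ == "__main__":
    rng = random.Random(0)
    households = [{"serial": i, "geo": i % 5} for i in range(10)]
    swapped = swap_geography(households, fraction=0.4, rng=rng)
    print([h["geo"] for h in swapped])
    print(top_code(103, cap=95))
```

Swapping leaves the overall distribution of geographic codes intact, so aggregate tabulations are unaffected while any single household's location becomes uncertain.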
   II. Statistical Confidentiality and Security (cont’d.)
A. Microdata security and confidentiality protections
B. Statistical disclosure control protections:
C. Managing restricted access to microdata
   • Detailed registration form to establish bona-fides
   • 4/5ths of viewers do not complete the form!
       --automatic denial
   •   Conditions of use bind researcher & institution;
       violations penalize every researcher at institution
   •   Custom-tailored extracts encourage researchers to
       jealously guard their downloads.
   •   More than 5,000 researchers approved for access
           III. Integration: Metadata & Microdata
D. Comprehensive source metadata in official language(s)
   • Questionnaires, instructions, manuals, etc.
E. Integrated, DDI compatible metadata: definitions, concepts,
      variable names, value labels, codes--all link back to sources
     • Descriptions of censuses and samples,
     • Variables defined, comparability discussions,
     • Example: educational attainment (next slide)
F.    Integrated, pooled microdata: multiple censuses in a single
      file
G.    Integrated boundary files (GIS) linked to microdata
H.    IPUMS value added variables
Example of composite coding: Educational attainment
     III. Integration: Metadata & Microdata (cont’d.)
D. Comprehensive source metadata in official language(s)
E. Integrated, DDI compatible metadata: definitions, concepts,
      variable names, value labels, codes--all link back to sources
F.    Integrated, pooled microdata: many censuses in single file
G.    Integrated boundary files (GIS) linked to microdata
H.    IPUMS value added variables:
     • Technical variables: weights, identifiers
     • Family, household info: summary indicators
     • Person variables: Locations of mother, father, spouse
          and rules for linking (momloc, poploc, sploc)
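
A pointer variable such as momloc can be used to attach a mother's characteristics to her child's record. The Python sketch below assumes that momloc holds the mother's person number within the same household and is 0 when no mother is present; the field names serial (household identifier), pernum (person number), and age_mom are illustrative assumptions, not taken from the slides.

```python
def attach_mother_age(persons):
    """Return copies of the person records with the mother's age attached.

    Each record is a dict with the (assumed) keys:
      serial  household identifier
      pernum  person number within the household
      age     age in years
      momloc  pernum of the person's mother, or 0 if she is not in the household
    """
    # Index every person by household and person number for direct lookup.
    by_key = {(p["serial"], p["pernum"]): p for p in persons}
    out = []
    for p in persons:
        mother = by_key.get((p["serial"], p["momloc"])) if p["momloc"] else None
        rec = dict(p)
        rec["age_mom"] = mother["age"] if mother else None
        out.append(rec)
    return out


if __name__ == "__main__":
    persons = [
        {"serial": 1, "pernum": 1, "age": 34, "momloc": 0},
        {"serial": 1, "pernum": 2, "age": 8, "momloc": 1},
    ]
    for rec in attach_mother_age(persons):
        print(rec)
```

The same lookup would work for poploc and sploc by swapping the pointer field.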
            IV. Managing Trans-border Access
I.  Trans-border access: uniform experience for access to all
    countries, regardless of nationality
J. Custom-tailored extracts: user selects country(ies),
    censuses, variables, sub-populations
   • Extract engine fulfills request, generates custom-tailored
       microdata and metadata
   • 3 unique IPUMS extract tools:
      • Select cases
      • Attach characteristics
      • Customize sample size
K. Usage: 8,048 extracts in 2011; 40,142 samples. See next
    page.
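
The custom-tailored extract described under J can be sketched in a few lines of Python. This is only an illustration of the idea (pick samples and variables, select cases, cap the sample size), not the IPUMS extract engine; the function name, parameters, and record layout are hypothetical.

```python
import random


def make_extract(records, samples, variables, case_filter=None,
                 households=None, seed=0):
    """Build a custom extract from pooled person records.

    records      list of dicts with "country", "year", "serial" plus variables
    samples      set of (country, year) pairs the user selected
    variables    names of the variables to include
    case_filter  optional function(record) -> bool ("select cases")
    households   optional cap on the number of households kept
                 ("customize sample size")
    """
    rows = [r for r in records if (r["country"], r["year"]) in samples]
    if case_filter is not None:
        rows = [r for r in rows if case_filter(r)]
    if households is not None:
        # Sample whole households so that co-residents stay together.
        ids = sorted({(r["country"], r["year"], r["serial"]) for r in rows})
        if households < len(ids):
            keep = set(random.Random(seed).sample(ids, households))
            rows = [r for r in rows
                    if (r["country"], r["year"], r["serial"]) in keep]
    keys = ["country", "year", "serial"] + list(variables)
    return [{k: r[k] for k in keys} for r in rows]


if __name__ == "__main__":
    records = [
        {"country": "MX", "year": 2000, "serial": 1, "age": 30, "sex": 2},
        {"country": "MX", "year": 2000, "serial": 1, "age": 5, "sex": 1},
        {"country": "BR", "year": 2000, "serial": 1, "age": 62, "sex": 1},
    ]
    adults = make_extract(records, {("MX", 2000), ("BR", 2000)}, ["age"],
                          case_filter=lambda r: r["age"] >= 18)
    print(adults)
```

In this sketch the case filter works on individual persons; a household-level variant would keep every member of any household containing a matching person.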
     IPUMS-International Google Analytics: 2011
    Trans-Border Access via a Single License, Access Point:
    169 countries/territories, 3,033 cities, 45,000 page views. Up 4X from 2010

   Disclosure Controls for Trans-Border Access to Census Microdata
            The IPUMS-IECM partnership
                          * * *
       Robert McCaa and Albert Esteve Palos
Minnesota Population Center and Centre d’Estudis Demografics--Barcelona


                www.ipums.org/international


  “You have to do due diligence, something to assure yourself
   that the people you’re giving your data to can be trusted.”
                --http://www.nytimes.com/2011/09/09/us/09breach.html?hp
  Table 2. Rank of the top five and all European countries, plus Canada and the USA,
    by number of extracts for the 2000 round census (statistics for calendar year 2011)

Rank  Country           Sample %*  Variables (n)*  Years of census samples         Extracts
   1  Brazil                    5             106  1960, 70, 80, 91, 2000               712
   2  Mexico                   10             120  1960p, 70, 90, 95, 2000, 05          626
   3  United States             5              92  1960, 70, 80, 90, 2000, 05           554
   4  Colombia                 10             120  1964p, 72, 85, 93, 2005              516
   5  South Africa             10             108  1996, 2001, 2007                     428
   7  Canada                  2.5              59  1971p, 81p, 91p, 2001p               409
   9  France                   33              94  1962, 68, 75, 82, 90, 99, 06         380
  10  Spain                     5              99  1981, 91, 2001                       366
  13  Greece                   10              89  1971, 81, 91, 2001                   327
  18  Austria                  10              75  1971, 81, 91, 2001                   310
  25  Italy                     5              81  2001                                 285
  26  Portugal                  5              96  1981, 91, 2001                       283
  29  Romania                  10              97  1976, 92, 2002                       272
  30  Switzerland               5              79  1970, 80, 90, 2000                   266
  32  United Kingdom            3              47  1991, 2001p                          263
  38  Hungary                   5              74  1970, 80, 90, 2001                   222
  42  The Netherlands           1              33  1960p, 71p, 2001p                    211
  45  Slovenia                 10              80  2002                                 185
  48  Belarus                  10              84  1999                                 179

Total samples extracted for 55 countries (162 samples) available from January 1, 2011:  8,048

*2000 round census; refers to all integrated variables, including IPUMS
        IECM value-added (in beta test):
Password protected, trans-border on-line tabulator
                              Reflections

•   Substantial returns to NSOs; no cost: economies of scale, low
    risk.
•   96 NSOs are participating
•   If yours is not, let’s discuss how to resolve the obstacles:
    »   Amend legislation,
    »   Revise regulations,
    »   Advocate statistical transparency, etc.
•   Entrust 2011 census microdata, as soon as feasible
•   Provide boundary files at low-level geography for each census
    possible
IPUMS at the 59th ISI (Hong Kong, Aug 24-30, 2013)
                http://www.isi2013.hk/


                »   IPUMS Workshop
                »   Microdata session
                »   IPUMS Funding for delegates from developing countries
                »   IPUMS booth
                Thank you

If your NSO is not participating yet, please
        contact: rmccaa@umn.edu

 When processing of your 2011 census
 microdata is completed, please contact:
           rmccaa@umn.edu

				