ITSEW - Schouten

Decomposing the maximal impact of
undercoverage, unit nonresponse and
item nonresponse on bias and MSE




Barry Schouten
Statistics Netherlands



ITSEW 2008, June 3
Data set – POLS 2002

2002 Survey on Living Conditions
• Two-stage sampling design
• CATI for listed fixed-line phones
• 35.994 persons aged 14 years and older
• Key topics: Health, Safety monitor
• Also: Educational level, employment, PC ownership




Available variables from administrative data/registers:
• Age, gender, ethnicity, marital status
• Household composition, house value
• Social security allowance, disability allowance, any allowance
• Province, degree of urbanization, proportion non-native
Non-sampling errors

Three nested types of error:
• Undercoverage of non-listed addresses
• Unit-nonresponse
• Item-nonresponse
Non-sampling errors - POLS 2002

Educational level

Error            Size    Relative rate   Cumulative rate

Sample          35.994     100,0%           100,0%
Coverage        24.052      67,6%            67,6%
Unit-response   14.275      59,4%            40,1%
Item-response   12.094      84,7%            34,0%
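
The cumulative rates in the table follow from multiplying the relative rates of the nested stages. A minimal Python sketch of that arithmetic, using the rounded rates from the table (the variable names are illustrative, and small differences from the reported figures are due to rounding of the inputs):

```python
from itertools import accumulate
from operator import mul

# Relative rates from the POLS 2002 table, written as fractions.
relative_rates = {
    "coverage": 0.676,
    "unit_response": 0.594,
    "item_response": 0.847,
}

# Each stage is nested in the previous one, so the cumulative rate
# is the running product of the relative rates.
cumulative = dict(zip(relative_rates, accumulate(relative_rates.values(), mul)))

for stage, rate in cumulative.items():
    print(f"{stage}: {rate:.1%}")
```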
Coverage, unit-response and item-response probabilities

Individual coverage probability:        $\rho_{C,i} = P[C_i = 1]$

Individual unit-response probability:   $\rho_{U,i} = P[U_i = 1 \mid C_i = 1]$

Individual item-response probability:   $\rho_{I,i} = P[I_i = 1 \mid U_i = 1, C_i = 1]$

Overall response probability:           $\rho_i = P[I_i = 1]$
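
Because the three events are nested (an item response requires a unit response, which in turn requires coverage), the overall probability factorizes into the three conditional probabilities. The subscripted symbols $\rho_{C,i}$, $\rho_{U,i}$ and $\rho_{I,i}$ below are labels assumed here for the coverage, unit-response and item-response probabilities:

```latex
\rho_i = P[I_i = 1]
       = P[C_i = 1]\, P[U_i = 1 \mid C_i = 1]\, P[I_i = 1 \mid U_i = 1, C_i = 1]
       = \rho_{C,i}\, \rho_{U,i}\, \rho_{I,i}
```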



Interpretation of underlying probability mechanisms is key:
• Frequentist approach
• Randomness in unfixed circumstances
Measures for composition


Three measures:
1. Non-missing rate
2. An indicator for representativeness
3. An indicator for maximal absolute bias

MSE is not considered, because the maximal bias dominates the
standard error.
Indicator for representativeness

Variation of propensities in reference population

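\[ R(\rho) = 1 - 2S(\rho) = 1 - 2\sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(\rho_i - \bar{\rho})^2} \]

Estimated variation of estimated propensities

\[ \hat{R}(\hat{\rho}) = 1 - 2\hat{S}(\hat{\rho}) = 1 - 2\sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\frac{s_i}{\pi_i}(\hat{\rho}_i - \hat{\bar{\rho}}_{HT})^2} \]

Here $s_i$ is the sample indicator, $\pi_i$ the inclusion probability and $\hat{\bar{\rho}}_{HT}$ the Horvitz-Thompson estimate of the mean propensity.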
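
A minimal Python sketch of the estimated indicator, assuming the propensities have already been estimated for the sampled units. The function name, the argument names and the default of equal design weights are assumptions made for illustration, not part of the slides:

```python
import math

def r_indicator(propensities, design_weights=None, population_size=None):
    """Estimate R = 1 - 2*S from estimated propensities of sampled units.

    design_weights are inverse inclusion probabilities (1 / pi_i); equal
    weights of 1 are assumed when omitted. population_size (N) defaults
    to the sum of the design weights.
    """
    if design_weights is None:
        design_weights = [1.0] * len(propensities)
    if population_size is None:
        population_size = sum(design_weights)

    # Weighted (Horvitz-Thompson type) mean of the propensities.
    mean = sum(w * p for w, p in zip(design_weights, propensities)) / population_size

    # Weighted variance with the N - 1 denominator used in the formula.
    variance = sum(
        w * (p - mean) ** 2 for w, p in zip(design_weights, propensities)
    ) / (population_size - 1)

    return 1.0 - 2.0 * math.sqrt(variance)

# Equal propensities give R = 1; more variation pushes R down.
print(r_indicator([0.5, 0.5, 0.5]))
print(r_indicator([0.2, 0.8]))
```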
Indicator for representativeness


Indicator depends on model and auxiliary variables

Two estimation strategies:
1. Fixed model: age, ethnicity, household type, allowance, province
2. Best fit model: all variables
Indicators for maximal absolute bias
Bias of mean under worst case scenario of item Y

\[ |B(\hat{\bar{y}})| \approx \frac{|C(\rho, y)|}{\bar{\rho}} \]

Worst-case scenario upper limit

\[ |B(\hat{\bar{y}})| \le \frac{S(\rho)S(y)}{\bar{\rho}} = \frac{(1 - R(\rho))S(y)}{2\bar{\rho}} = B_m \]


Survey item specific and general standardized indicator

\[ \hat{B}_m = \frac{(1 - \hat{R}(\hat{\rho}))\hat{S}(y)}{2\hat{\bar{\rho}}}, \qquad \hat{B}_m^s = \frac{1 - \hat{R}(\hat{\rho})}{2\hat{\bar{\rho}}} \]
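
A minimal Python sketch of the standardized indicator. The function name is illustrative, and using the reported rate as the mean propensity is an assumption; with the fixed-model item-response figures from the slides (rate 84,7%, R 72,8%) it gives a value close to the reported 16,1%:

```python
def standardized_max_bias(r_value, mean_propensity):
    """Standardized maximal absolute bias: (1 - R) / (2 * mean propensity)."""
    return (1.0 - r_value) / (2.0 * mean_propensity)

# Fixed-model item-response figures from the slides, as fractions.
# Assumption: the reported rate is used as the mean propensity.
b_s = standardized_max_bias(r_value=0.728, mean_propensity=0.847)
print(f"{b_s:.1%}")
```

For a given R, a lower mean propensity increases the indicator, which is why the overall indicator can be large even when R is relatively high.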
Analysis strategy


Steps in the analysis:
• Estimate item-response probabilities
• Estimate unit-response probabilities
• Estimate coverage probabilities
• Estimate overall probabilities
• Compute rate and indicators
• Adjust item for missing data using fixed model
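
The nested estimation steps can be sketched in Python as follows. The slides do not state the model type, so this sketch assumes the simplest possible model, a cell model in which each probability is estimated by the observed rate within a cell of one auxiliary variable. The toy sample, the cell labels and the helper `cell_rates` are hypothetical:

```python
from collections import defaultdict

def cell_rates(cells, outcomes):
    """Estimate a probability per cell as the observed rate of 0/1 outcomes."""
    totals, hits = defaultdict(int), defaultdict(int)
    for cell, outcome in zip(cells, outcomes):
        totals[cell] += 1
        hits[cell] += outcome
    return {cell: hits[cell] / totals[cell] for cell in totals}

# Hypothetical toy sample: (cell, covered, unit response, item response).
sample = [
    ("urban", 1, 1, 1), ("urban", 1, 1, 0), ("urban", 1, 0, 0), ("urban", 0, 0, 0),
    ("rural", 1, 1, 1), ("rural", 1, 1, 1), ("rural", 1, 0, 0), ("rural", 1, 1, 1),
]

# Coverage probabilities are estimated on the full sample.
coverage = cell_rates([s[0] for s in sample], [s[1] for s in sample])

# Unit-response probabilities are estimated on covered units only.
covered = [s for s in sample if s[1] == 1]
unit = cell_rates([s[0] for s in covered], [s[2] for s in covered])

# Item-response probabilities are estimated on unit respondents only.
respondents = [s for s in covered if s[2] == 1]
item = cell_rates([s[0] for s in respondents], [s[3] for s in respondents])

# Overall probabilities are estimated on the full sample.
overall = cell_rates([s[0] for s in sample], [s[3] for s in sample])

for cell in overall:
    product = coverage[cell] * unit[cell] * item[cell]
    print(cell, round(overall[cell], 3), round(product, 3))
```

With a cell model the product of the three stage estimates equals the overall estimate in each cell; with other models the two need not coincide exactly.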
Non-item specific indicators
Model item response                  Rate     R       B
Fixed model                          84,7%   72,8%   16,1%
Best fit – auxiliary                 84,7%   69,5%   18,0%
Best fit – auxiliary + pc + employ   84,7%   67,3%   19,3%

Model unit response                  Rate     R       B
Fixed model                          59,4%   87,3%   10,7%
Best fit – auxiliary                 59,4%   85,4%   12,3%

Model coverage                       Rate     R       B
Fixed model                          67,6%   73,2%   19,8%
Best fit – auxiliary                 67,6%   68,0%   23,7%

Model overall                        Rate     R       B
Fixed model                          34,0%   80,0%   29,4%
Best fit – auxiliary                 34,0%   76,1%   35,2%
Item-specific bias (fixed model)

                    Response   Item R   Unit R   Coverage   Total

Primary               6,7%     4,0%     2,7%      5,0%      7,4%
Junior secondary     12,2%     5,3%     3,5%      6,5%      9,6%
Pre-vocational       19,5%     6,4%     4,2%      7,9%      11,7%
Senior secondary      6,9%     4,1%     2,7%      5,0%      7,5%
Senior vocational    31,0%     7,4%     4,9%      9,2%      13,6%
High professional    17,2%     6,1%     4,0%      7,5%      11,1%
University            6,5%     4,0%     2,6%      4,9%      7,2%
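
The bias columns appear to equal the maximal absolute bias for a 0/1 category indicator, whose standard deviation is sqrt(p(1 - p)) for a category share p. That reading is an assumption, but the following Python sketch (function name and dictionary layout are illustrative) comes close to the Primary row when fed the fixed-model rates and R values reported in the slides:

```python
import math

def item_max_bias(share, r_value, mean_propensity):
    """Maximal absolute bias for a 0/1 item with the given category share."""
    s_y = math.sqrt(share * (1.0 - share))  # standard deviation of a 0/1 item
    return (1.0 - r_value) * s_y / (2.0 * mean_propensity)

# Fixed-model (rate, R) per error type, as fractions.
# Assumption: the reported rate is used as the mean propensity.
stages = {
    "item": (0.847, 0.728),
    "unit": (0.594, 0.873),
    "coverage": (0.676, 0.732),
    "total": (0.340, 0.800),
}

# Category "Primary" has a response share of 6,7%.
primary = {
    name: item_max_bias(0.067, r_value, rate)
    for name, (rate, r_value) in stages.items()
}

for name, bias in primary.items():
    print(f"{name}: {bias:.1%}")
```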
Adjustment for missing data
GREG estimator with age, ethnicity, household type, allowance, province

                    Response   SE     Item R   Unit R   Cov     Cumulative

Primary               6,7%     0,1%   +0,5%    -1,2%    +0,2%     -0,5%
Junior secondary     12,2%     0,2%    0,0%    -0,2%    -0,1%     -0,3%
Pre-vocational       19,5%     0,2%   +1,1%     0,0%    -0,5%     +0,6%
Senior secondary      6,9%     0,1%   -0,3%    +0,1%    +0,2%      0,0%
Senior vocational    31,0%     0,2%   -0,6%    +0,3%    +0,2%     -0,1%
High professional    17,2%     0,2%   -0,4%    +0,5%    +0,1%     +0,2%
University            6,5%     0,1%   -0,2%    +0,4%    -0,1%     +0,1%

MAD                                   0,4%     0,4%     0,2%      0,3%
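
The Cumulative column and the MAD row can be checked with a short Python sketch. It rests on two assumptions that are not stated in the slides: blank cells are treated as 0,0, and MAD is read as the mean absolute adjustment over the seven categories. Under those assumptions the rounded results match the table:

```python
# Adjustments in percentage points per category: (item, unit, coverage).
# Assumption: blank cells in the table are treated as 0.0.
adjustments = {
    "Primary": (0.5, -1.2, 0.2),
    "Junior secondary": (0.0, -0.2, -0.1),
    "Pre-vocational": (1.1, 0.0, -0.5),
    "Senior secondary": (-0.3, 0.1, 0.2),
    "Senior vocational": (-0.6, 0.3, 0.2),
    "High professional": (-0.4, 0.5, 0.1),
    "University": (-0.2, 0.4, -0.1),
}

# The cumulative adjustment is the sum of the three separate adjustments.
cumulative = {name: round(sum(values), 1) for name, values in adjustments.items()}

def mean_abs(values):
    """Mean absolute value (assumed meaning of MAD in the table)."""
    values = list(values)
    return sum(abs(v) for v in values) / len(values)

mad = {
    "item": mean_abs(v[0] for v in adjustments.values()),
    "unit": mean_abs(v[1] for v in adjustments.values()),
    "coverage": mean_abs(v[2] for v in adjustments.values()),
    "cumulative": mean_abs(cumulative.values()),
}

for name, value in mad.items():
    print(f"{name}: {value:.1f}")
```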
Conclusions
Impact of single errors
• Unit nonresponse produces the highest missing rate but has the
  smallest impact on composition
• Undercoverage and item nonresponse have a strong impact on composition
• Educational level adjustments are smallest for undercoverage

Combination of errors
• Non-missing rate decreases and max bias increases
• Indicator for representativeness does not in general decrease
• Educational level adjustments partially cancel out

Estimation strategy
• Best fit models show lower indicators for representativeness


Discussion:
• Interpretation of individual probabilities?
• Are indicators useful given dependence on model?

				