					Dealing with Item Non-response
     in a Catering Survey
                   Pauli Ollila
                   Statistics Finland

                  Kaija Saarni
     Finnish Game and Fisheries Research Institute

               Asmo Honkanen
     Finnish Game and Fisheries Research Institute




        The Finnish Catering Survey
• Studying the use of fish, crawfish, red
  deer, elk and reindeer in the catering
  sector during year 2005.
• Carried out by Finnish Game and
  Fisheries Research Institute together
  with the interview organisation of
  Statistics Finland
• Computer assisted telephone interviews
  at the beginning of 2006
• Population 14740, sample size 2263,
  stratification by “portion classes” (7).
• Respondents 1741, unit non-response
  498, over-coverage 24
           Information on amounts
• The questionnaire was divided
  into three sections for fish, crawfish
  and game (red deer, elk, reindeer)
• Among other questions, every
  section asked for amounts in
  kilograms, both as totals and broken
  down by category (type of product,
  species) and by origin
  (domestic/imported)
• The amounts in categories could
  also be given as percentages
                               EXAMPLE: Question 8a
  What was the total amount of fish as raw material you used in 2005?
                       ________________ kg

  Furthermore, estimate the form in which the fish as raw material was delivered to you.
  (If you cannot estimate the distribution in kilograms, estimate the proportion of the
  total in percent.)

                                             kg / year        %
                  1. Fresh whole/gutted
                  2. Fresh fillet
                  3. Frozen whole/gutted
                  4. Frozen fillet
                  5. Other frozen products
                  6. Prepared
                  7. Canned
                  8. Salted or spiced
                  9. Smoked
                  10. In other form
                  Total                                       100 %

  Memo 1(1), 30.5.2007
            The quality of response
• It was obvious that some respondents could not provide full
  and exact information for these questions, for various
  reasons.
• For example, the amounts given in the classifying questions
  contradicted the answers to the overall questions. Further, the
  questions on domestic and foreign fish gave different results
  from the overall fish consumption question.
• A lot of editing work was carried out at the Finnish Game
  and Fisheries Research Institute in order to clean the data
  (e.g. logical deduction between questions) and to
  convert the percentage information into kilograms.
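
The percentage-to-kilogram conversion mentioned above can be sketched as follows. This is a minimal illustration only; the function name and the example figures are hypothetical and do not come from the survey data.

```python
def percentages_to_kg(total_kg, percentages):
    """Convert category percentages of a reported total into kilograms.

    total_kg: the overall amount reported by the respondent.
    percentages: dict mapping a category name to its share in percent.
    """
    return {name: total_kg * pct / 100.0 for name, pct in percentages.items()}


# Hypothetical kitchen: 2000 kg of fish, split 60/30/10 between three forms.
example = percentages_to_kg(
    2000.0,
    {"fresh fillet": 60.0, "frozen fillet": 30.0, "canned": 10.0},
)
```

When the percentages add up to 100, the converted kilograms add up to the reported total by construction.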

  • Still, some contradictory and insufficient responses
    that could not be resolved were left for statistical
    processing.

  • For example, regarding total kilograms and the sum of
    kilograms of categories we had:

                               sums   no total /   categ.    categ. sum   categ. sum    all
                               ok     zero total   missing   more         less
    Fish                       1270       26         237        96          112        1741
    Foreign fish               1059       93         344        92          153        1741
    Crawfish                     79     1646          14         2            0        1741
    Red deer, elk, reindeer     172     1560           8         1            0        1741

      NOTE: A difference of less than 10 % between the total kilograms and the
      sum of category kilograms was allowed in the interview situation.
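
The column headings of the table above suggest a simple classification rule, sketched here. The order of the checks, the use of `None` for a missing answer, and the reading of the 10 % allowance as relative to the reported total are all assumptions made for illustration; the source does not spell them out.

```python
def classify_response(total_kg, category_kg, tolerance=0.10):
    """Assign one response to a column of the consistency table.

    total_kg: reported overall total (None if not given).
    category_kg: list of category amounts (None marks a missing answer).
    tolerance: allowed relative difference (assumed relative to the total).
    """
    if total_kg is None or total_kg == 0:
        return "no total / zero total"
    reported = [kg for kg in category_kg if kg is not None]
    if not reported:
        return "categ. missing"
    category_sum = sum(reported)
    if abs(category_sum - total_kg) < tolerance * total_kg:
        return "sums ok"
    if category_sum > total_kg:
        return "categ. sum more"
    return "categ. sum less"
```

For instance, a total of 1000 kg with categories summing to 950 kg falls within the allowance and counts as "sums ok", while categories summing to 1200 kg do not.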
              Item non-response
• The most common case of item non-response: the
  category kilograms are totally missing when the
  overall total exists.
• The sum of the existing category kilograms may
  either exceed or fall below the overall total given in
  the response.
• In principle the latter alternative can be considered
  item non-response.
• However, it is not clear how many categories are
  affected by item non-response, or whether the existing
  category sums are simply partly erroneous.
                  How to correct?
 • How to treat full missingness of the category sums?
 • How to deal with category sums not matching the
   overall sum (mismatch sums)?

  Alternatives for dealing with the problems
    •   Donor imputation
    •   Mean imputation
    •   Regression imputation
    •   Weight adjustments
The method used in the final statistical processing was chosen from
these alternatives, which were considered in the following forms:

Corrections considered: donor imputation
 Full missingness of the category sums
 - Selecting a donor within a stratum (“portion category”),
 applying its percentages for creating the imputed values as
 proportions from the overall total.
 - Nearest neighbour class criterion by “number of kitchen staff”,
 “number of days serving fish”.
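
 A minimal sketch of this donor step, under stated assumptions: the record layout, the field names and the distance measure (sum of absolute differences in kitchen staff and fish-serving days) are hypothetical. The source only says that a nearest-neighbour class criterion on these two variables was used within the stratum.

```python
def impute_from_donor(recipient, donors):
    """Impute category kilograms for a recipient from its nearest donor.

    recipient: dict with 'stratum', 'staff', 'fish_days', 'total_kg'.
    donors: list of dicts with the same keys plus 'category_kg'
            (a dict of category amounts with a positive sum).
    Returns a dict of imputed category kilograms, or None if the
    recipient's stratum contains no donor.
    """
    candidates = [d for d in donors if d["stratum"] == recipient["stratum"]]
    if not candidates:
        return None

    def distance(d):
        # Assumed distance: absolute differences on the two auxiliary variables.
        return (abs(d["staff"] - recipient["staff"])
                + abs(d["fish_days"] - recipient["fish_days"]))

    donor = min(candidates, key=distance)
    donor_sum = sum(donor["category_kg"].values())
    # Apply the donor's proportions to the recipient's own overall total.
    return {name: recipient["total_kg"] * kg / donor_sum
            for name, kg in donor["category_kg"].items()}


recipient = {"stratum": 2, "staff": 5, "fish_days": 3, "total_kg": 1000.0}
donors = [
    {"stratum": 2, "staff": 4, "fish_days": 3, "total_kg": 400.0,
     "category_kg": {"fresh": 300.0, "frozen": 100.0}},
    {"stratum": 2, "staff": 20, "fish_days": 1, "total_kg": 100.0,
     "category_kg": {"fresh": 50.0, "frozen": 50.0}},
    {"stratum": 1, "staff": 5, "fish_days": 3, "total_kg": 100.0,
     "category_kg": {"fresh": 10.0, "frozen": 90.0}},
]
imputed = impute_from_donor(recipient, donors)
```

 Because only the donor's proportions are borrowed, the imputed amounts always add up to the recipient's reported total.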
 Mismatch sums
 - When the category sum is lower than the overall sum, imputation
 is hard to apply: there is no information on which category or
 categories should receive the imputed values, and the mismatch
 may still remain. In the opposite case imputation is not
 applicable.
 - In order to retain distribution information on categories, the
 relations are proportioned up or down with a ratio

 \[ r_i = \frac{y_{\text{overall},i}}{\sum_{\text{category}} y_{\text{category},i}} \]
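
 As an illustration of this ratio, the sketch below rescales one respondent's category amounts so that they add up to the reported overall total. The function name and the example figures are hypothetical.

```python
def rescale_categories(total_kg, category_kg):
    """Scale category amounts so that they sum to the overall total.

    Returns the ratio r_i and the rescaled category amounts.
    """
    category_sum = sum(category_kg.values())
    if category_sum == 0:
        raise ValueError("category sum is zero; nothing to rescale")
    r = total_kg / category_sum
    return r, {name: r * kg for name, kg in category_kg.items()}


# Hypothetical respondent: total 500 kg, but the categories sum to 1000 kg.
r, rescaled = rescale_categories(500.0, {"fresh": 600.0, "frozen": 400.0})
```

 Here the ratio is 0.5, so the categories are halved while their relative shares (60 % and 40 %) stay unchanged.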
Corrections considered: group mean imputation
Full missingness of the category sums
- Using group means of percentages for every amount category.
“Portion categories” and “number of days serving fish” used as
groups.
Mismatch sums (as in donor imputation)
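
A minimal sketch of group mean imputation of percentages. It assumes unweighted means and that every donor reports a percentage (possibly zero) for every category; the source does not say whether survey weights were used, and the group labels and figures below are hypothetical.

```python
from collections import defaultdict


def group_mean_percentages(donors):
    """Compute mean category percentages per group.

    donors: list of (group, {category: percentage}) pairs from respondents
            with complete category information.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for group, percentages in donors:
        counts[group] += 1
        for name, value in percentages.items():
            sums[group][name] += value
    return {group: {name: s / counts[group] for name, s in cats.items()}
            for group, cats in sums.items()}


def impute_group_mean(total_kg, group, means):
    """Apply the group's mean percentages to a respondent's overall total."""
    return {name: total_kg * pct / 100.0 for name, pct in means[group].items()}


donors = [
    ("small", {"fresh": 60.0, "frozen": 40.0}),
    ("small", {"fresh": 40.0, "frozen": 60.0}),
    ("large", {"fresh": 10.0, "frozen": 90.0}),
]
means = group_mean_percentages(donors)
imputed = impute_group_mean(200.0, "small", means)
```

Every recipient in a group receives the same proportions, which is why this method tends to reduce variation between respondents.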

Corrections considered: regression imputation
Full missingness of the category sums
- Modelling the percentages in categories; various
auxiliary variables were tried, e.g. “number of kitchen staff” and
“number of days serving fish”, separately for “portion categories”
(only for kitchens that had served fish). No better explanatory
variables were available for all observations.
Mismatch sums (as in donor imputation)
Corrections considered: weight adjustments
Full missingness of the category sums & mismatch sums
- Correcting the category results by adjusting the
weight separately for the different questions including
amounts with a ratio

\[ \frac{\sum_{i \in s} w_i \, y_{\text{overall},i}}{\sum_{i \in s} w_i \left( \sum_{\text{category}} y_{\text{category},i} \right)} \]

i.e. the weighted overall total sum divided by the
weighted sum of the category sums.
- Separate weights cause inconsistencies: statistics based
on variables with no item non-response differ depending on
whether they are computed with the normal weights or with
the adjusted weights. Practical problems in tabulations
and analysis may also occur.
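
The adjustment factor described above can be sketched as follows. The record layout is an assumption, and a respondent whose category amounts are entirely missing is assumed to enter with a category sum of zero; the figures are hypothetical.

```python
def item_nonresponse_factor(records):
    """Weighted overall total divided by the weighted sum of category sums.

    records: list of (weight, overall_kg, category_sum_kg) tuples.
    """
    numerator = sum(w * overall for w, overall, _ in records)
    denominator = sum(w * cat_sum for w, _, cat_sum in records)
    return numerator / denominator


# Hypothetical respondents; the second one has no category information.
records = [
    (10.0, 100.0, 100.0),
    (10.0, 200.0, 0.0),
    (5.0, 400.0, 400.0),
]
factor = item_nonresponse_factor(records)

# Adjusted weights for this question only: each weight times the factor.
adjusted_weights = [w * factor for w, _, _ in records]
```

With the adjusted weights, the weighted sum of the category sums equals the weighted overall total. The sum of the adjusted weights, however, no longer equals the sum of the original weights, which is the source of the inconsistency mentioned above.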
             Actions at that time
• Due to lack of time at the estimation phase, weight
  adjustments were chosen: a conservative and quick
  solution that kept all the information on amounts in
  line with each other (a kind of calibration).
• The purposes of the catering survey were purely
  descriptive, and analyses were made only at the general
  level and for some simple classes (e.g. region).
• Complex cross-tabulations and analyses were not
  conducted.
  WHAT DID THE SUBSEQUENT TESTS WITH THE
  CORRECTION ALTERNATIVES REVEAL?

            Subsequent test experiences
• The inflating item non-response factor in the weight adjustments
  varied from 1.00689 to 1.47618
• Practical choice: mean and regression imputation were conducted
  for all classes except the biggest one, which was given the value
  100 % minus the sum of the other percentage estimates. This
  ensured that the percentage estimates did not sum to more than
  100 %.
• Regression imputation performed so poorly (e.g. negative
  percentage values) that it was not considered further
• Only weight adjustment replicates the original distribution of the
  classification amounts
• The standard deviations are affected by all methods
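
The practical choice for the biggest class can be sketched as below. The function name and the percentages are hypothetical, and the sketch does not guard against imputed percentages that already exceed 100 % in total.

```python
def close_to_100(imputed_pct, largest):
    """Give the largest class the remainder so that all classes sum to 100 %.

    imputed_pct: imputed percentages for every class except `largest`.
    largest: name of the biggest class, which is not imputed directly.
    """
    result = dict(imputed_pct)
    result[largest] = 100.0 - sum(imputed_pct.values())
    return result


shares = close_to_100({"salmon": 15.0, "tuna": 10.0}, "rainbow trout")
```

In this hypothetical case the biggest class receives 75 %, and the full set of percentages sums to exactly 100 %.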


 The inconsistency problem with weight
 adjustments (example: portion classes)

                             1-49    50-99   100-199   200-499   500-999   1000-     all
  Original                   4593     3522      2986      2317       923     399   14740
    %                       31.16    23.89     20.26     15.72      6.26    2.71
  Fish                       4845     3716      2858      2268       926     427   15039
    %                       32.22    24.71     19.00     15.08      6.16    2.84
  Imported fish              5147     3847      2931      2261       920     422   15529
    %                       33.15    24.77     18.87     14.56      5.93    2.72
  Species of foreign fish    5089     3820      3020      2288       954     425   15596
    %                       32.63    24.49     19.37     14.67      6.12    2.73
  Domestic fish              5969     4431      3473      2670      1052     473   18069
    %                       33.04    24.52     19.22     14.78      5.82    2.62

     totals rounded to integers
                       The distribution problem
               (example: species of fish, overall total 14036226)

                              no correction      weight adjustment   donor imputation   group mean imputation
                              amount       %     amount       %      amount       %     amount       %
  Salmon                      1777326   15.19     2131542   15.19    2357228   16.76    2086781   14.87
  Rainbow trout               3348190   28.61     4015476   28.61    3778266   26.86    3905681   27.83
  Baltic herring               936373    8.00     1122990    8.00    1119551    7.96    1144067    8.15
  European whitefish           289291    2.47      346946    2.47     338279    2.40     335130    2.39
  Pikeperch                    282990    2.42      339389    2.42     338290    2.40     318147    2.27
  Vendace                      184875    1.58      221720    1.58     214977    1.53     217264    1.55
  Perch                        143687    1.23      172324    1.23     164907    1.17     161378    1.15
  Herring                      208967    1.79      250613    1.79     242182    1.72     245398    1.74
  Cod and other whitefish     2989394   25.54     3585172   25.54    3688276   26.22    3820289   27.22
  Tuna                        1224018   10.46     1467962   10.46    1445786   10.28    1429165   10.18
  Other                        318597    2.72      382093    2.72     348485    2.48     372927    2.66
  Total                      11703710   100.0    14036226   100.0   14036226   100.0   14036226   100.0
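
The table above shows identical percentages in the "no correction" and "weight adjustment" columns. A short check, using three rows and the two totals from the table, illustrates why: multiplying every amount by one common factor leaves the shares unchanged. The variable names are illustrative only.

```python
# Figures copied from the "no correction" column of the species table.
uncorrected = {"Salmon": 1777326, "Rainbow trout": 3348190, "Tuna": 1224018}
uncorrected_total = 11703710   # sum of the category amounts, no correction
overall_total = 14036226       # overall total reported for the question

# One common inflation factor for this question (about 1.199).
factor = overall_total / uncorrected_total

shares = {}
for name, amount in uncorrected.items():
    share_before = 100 * amount / uncorrected_total
    share_after = 100 * (amount * factor) / overall_total
    shares[name] = (share_before, share_after)

salmon_adjusted = uncorrected["Salmon"] * factor
```

The factor of roughly 1.199 lies within the range of 1.00689 to 1.47618 reported for the weight adjustments, and the adjusted salmon amount matches the table value to within rounding.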
         Weighted standard deviation changes
              (example: species of fish)

                            respondents   weight        donor        regression
                            without       adjustments   imputation   imputation
                            correction
  Salmon                       1220          1336          1673         1293
  Rainbow trout                5324          5830          5346         5348
  Baltic herring                630           690           700          695
  European whitefish            278           305           305          288
  Pikeperch                     399           437           424          409
  Vendace                       203           222           214          207
  Perch                         181           198           192          182
  Herring                       229           251           241          236
  Cod and other whitefish      2103          2304           929         2305
  Tuna                          867           950           528          889
  Other                         524           574            73          528
                   Conclusions
• The inconsistency caused by the weight adjustment method
  was not serious
• Both donor and mean imputation had a slight, but not
  remarkable, effect on the distribution of amounts
• The weighted standard deviations were clearly inflated by
  the weight adjustments, but donor imputation tended to give
  standard deviations that varied more between amount
  categories. As expected, mean imputation reduced the
  variation.
• Current recommendation: the Banff package for statistical
  editing and imputation (by Statistics Canada, built in the
  SAS environment)

				