ESTIMATION IN THE PRESENCE OF TAX DATA IN BUSINESS SURVEYS

David Haziza, Gordon Kuromi and Joana Bérubé
    Université de Montréal & Statistics Canada



                    ICESIII
                  June 20, 2007

OUTLINE

• Introduction

• Current sampling design

• Current point estimators

• Alternative sampling design

• Alternative estimators

• Domain estimation

• When the tax variable is missing

TAX DATA PROGRAM

• Goal: to increase the use of tax data in business surveys in
  order to

      - reduce respondent burden

      - reduce costs

      - potentially improve the quality of point estimators



TYPES OF VARIABLES
• We distinguish between three types of variables:
      - Financial survey variables (total revenue, total expenditure,
        etc.)
      - Financial tax variables (total revenue, total expenditure, etc.)
      - Non-financial variables


• There is a direct link between the financial survey variables and
  the financial tax variables
• There is no direct link between the non-financial variables and the
  tax variables

TAX DATA

• Three types of tax data:

      - T1 data: unincorporated businesses (Unified Enterprise
        Survey, UES)

      - T2 data: incorporated businesses (Unified Enterprise
        Survey, UES)

      - GST data: both incorporated and unincorporated businesses
        (monthly surveys)



CURRENT SAMPLING DESIGN

• Stratification by province, NAICS and size

• Three types of strata:

      - Take-all strata (typically complex units)

      - Take-some strata (simple and complex units)

      - Take-none strata (simple units)

• The use of tax data is limited to take-some strata (simple units
  only) and take-none strata

CURRENT SAMPLING DESIGN
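[Diagram: two-phase design within a stratum (province x NAICS). The
stratum population $U_h$ (size $N_h$) yields a first-phase sample $s_h$
(size $n_h$), which is split into noneligible units $s_{neh}$ (size
$n_{neh}$) and eligible units $s_{eh}$ (size $n_{eh}$); a further sample
of the eligible units is marked 50%.]
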
CURRENT SAMPLING DESIGN

• Advantages:
 - The current design fits the imputation and estimation systems


• Disadvantages:
 - It is a two-phase sampling design
 - The sample sizes for collection in both the eligible and
   noneligible strata are random variables, which may increase the
   variance of the estimators and add uncertainty to the collection
   costs
 - The use of tax data is limited to the first-phase sample
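
To illustrate why a random sample size matters, here is a minimal sketch. It assumes (the slides do not say so) that the first-phase sample is a simple random sample without replacement, so the number of eligible units it contains follows a hypergeometric distribution. The stratum sizes are hypothetical.

```python
from math import comb

def eligible_count_distribution(N, N_e, n):
    """Exact distribution of the number of eligible units in the sample.

    Assumes simple random sampling without replacement (an assumption):
    N   = stratum size (hypothetical),
    N_e = number of eligible units in the stratum (hypothetical),
    n   = first-phase sample size.
    Returns a dict {k: P(n_e = k)} of hypergeometric probabilities.
    """
    total = comb(N, n)
    return {k: comb(N_e, k) * comb(N - N_e, n - k) / total
            for k in range(0, min(n, N_e) + 1)}

dist = eligible_count_distribution(N=100, N_e=40, n=20)
mean = sum(k * p for k, p in dist.items())
var = sum((k - mean) ** 2 * p for k, p in dist.items())
print(f"E(n_e) = {mean:.2f}, V(n_e) = {var:.2f}")
```

The nonzero variance of the eligible count is the extra source of variability, and of cost uncertainty, mentioned in the disadvantages.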

FINANCIAL VARIABLES
• Survey variables: $y$, available only for the units in $s_h$

• Tax variables: $x$, available for all the units in $U_h$

• For many financial variables, there is a corresponding tax
  variable

• We assume that both types of variables are known without
  error (no measurement error, no nonresponse)

• These two assumptions are not satisfied in practice!



CURRENT TAX REPLACEMENT METHODS
• Model describing the relationship between $x$ and $y$:
      $y_i = f(x_i) + \varepsilon_i$
      $E(\varepsilon_i) = 0$, $V(\varepsilon_i) = \sigma^2 c_i$

• Special cases:

      - Direct tax replacement: $f(x_i) = x_i$, $c_i = 1$, so that $y_i = x_i + \varepsilon_i$

      - Ratio-type replacement: $f(x_i) = \beta x_i$, $c_i = x_i$, so that $y_i = \beta x_i + \varepsilon_i$



PREDICTED VALUES

• $\hat{y}_i$: predicted value for $y_i$

      - Direct tax replacement: $\hat{y}_i = x_i$ (used in the UES)

      - Ratio-type replacement: $\hat{y}_i = \hat{B} x_i$ (used in monthly surveys)

• The estimate $\hat{B}$ is obtained from the sampled units for which $y$ is observed.
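
The slides do not give the formula for $\hat{B}$. A minimal sketch, assuming the usual weighted ratio $\hat{B} = \sum w_i y_i / \sum w_i x_i$ over the units with observed $y$ (a common choice under the ratio model, not necessarily the one used in production), with hypothetical data:

```python
def ratio_estimate(y, x, w):
    """Weighted ratio estimate B_hat = sum(w*y) / sum(w*x).

    This particular formula is an assumption; the slides only say that
    B_hat is estimated from the sample.
    """
    num = sum(wi * yi for wi, yi in zip(w, y))
    den = sum(wi * xi for wi, xi in zip(w, x))
    return num / den

# Hypothetical respondents: survey revenue y, tax revenue x, design weight w.
y = [105.0, 210.0, 290.0]
x = [100.0, 200.0, 300.0]
w = [2.0, 2.0, 4.0]

B_hat = ratio_estimate(y, x, w)
y_hat_direct = list(x)                   # direct tax replacement: y_hat_i = x_i
y_hat_ratio = [B_hat * xi for xi in x]   # ratio-type replacement: y_hat_i = B_hat * x_i
print(B_hat, y_hat_ratio)
```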




NOT ONLY DIRECT TYPE REPLACEMENT?

• Considerable efforts have been made to standardize the
  concepts and definitions between the tax variables and the
  survey variables (Chart of Accounts compliance for T1 and T2)

• As a result, we expect the model $y = x + \varepsilon$ to be valid.
  Sometimes it is not, because of

      - Differences in the reporting of data and other issues (Jocelyn,
        Mach and Pelletier, 2006)

      - Differences in the reference period (GST data)

CURRENT POINT ESTIMATORS:
PREDICTION TYPE
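• In the noneligible portion: Horvitz-Thompson estimator

• In the eligible portion: prediction-type (or imputed-type) estimator

      - $y_i$ is observed for all $i$ in the collected subsample, written $\tilde{s}$ here

      - $\hat{y}_i$ is used for $i \in s - \tilde{s}$

      - We have
            $\hat{Y}_{PRED} = \sum_{i \in \tilde{s}} w_i y_i + \sum_{i \in s - \tilde{s}} w_i \hat{y}_i$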



CURRENT POINT ESTIMATORS:
PREDICTION TYPE
• Advantages:
 - They are similar to the imputed estimators used under item
   nonresponse
 - They are simple and fit the current imputation and estimation
   systems
 - They fit the so-called micro approach for displaying the data


• Disadvantages:
 - They are generally p-biased (biased with respect to the sampling
   design)
 - They may be pm-biased (biased with respect to the design and the
   model jointly) if the tax replacement model is incorrectly
   specified

DISPLAYING THE DATA: MICRO VS.
MACRO APPROACH
• We distinguish between two approaches for displaying the data:

  (i) Micro approach: consists of reporting the observed y-values
  as well as the predicted values (similar to an imputed file in the
  context of item nonresponse)

  (ii) Macro approach: consists of reporting the observed y-values
  along with a calibration weight

• Currently, the micro approach is used
PREDICTION TYPE ESTIMATORS
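• Micro approach: domain estimators are potentially p-biased and
  pm-biased

      Set                                Unit             $\tilde{y}_i$              $w_i$
      $\tilde{s}$ (observed $y$)         $1$              $y_1$                      $w_1$
                                         $2$              $y_2$                      $w_2$
                                         ...
                                         $\tilde{n}$      $y_{\tilde{n}}$            $w_{\tilde{n}}$
      $s - \tilde{s}$ (predicted $y$)    $\tilde{n}+1$    $\hat{y}_{\tilde{n}+1}$    $w_{\tilde{n}+1}$
                                         ...
                                         $n$              $\hat{y}_n$                $w_n$

  A domain corresponds to a subset of the rows.
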
ALTERNATIVE SAMPLING DESIGN
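[Diagram: single-phase design within a stratum (province x NAICS). The
stratum population $U_h$ (size $N_h$) is divided into noneligible units
$U_{neh}$ (size $N_{neh}$) and eligible units $U_{eh}$ (size $N_{eh}$);
samples $s_{neh}$ (size $n_{neh}$) and $s_{eh}$ (size $n_{eh}$) are
selected from the respective parts.]
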
ALTERNATIVE SAMPLING DESIGN

• Advantages:
 - It is a single-phase sampling design, which simplifies the
   estimation procedures, particularly variance estimation
 - The sample sizes are known prior to sampling
 - Full use is made of the available tax data


• Disadvantages:
 - The estimation systems need to be modified to fit the new
   procedure


POINT ESTIMATION

• For financial variables, we have two options:
      - Tax/survey-based framework: we simply use $X = \sum_{i \in U_E} x_i$ for
        the eligible part and a design-consistent estimator for the
        noneligible part
      - Survey-based framework: we want to estimate $Y = \sum_{i \in U} y_i$.
        Use design-consistent estimators (calibration estimators such
        as the GREG or optimal estimator) that make use of all the
        available tax data (monthly surveys)
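
A minimal sketch of the tax/survey-based option. It assumes the Horvitz-Thompson estimator as the design-consistent estimator for the noneligible part (one possible choice; the slides do not fix it), and all figures are hypothetical.

```python
def tax_survey_total(x_eligible, w_noneligible, y_noneligible):
    """Tax total over all eligible units plus a weighted survey total.

    x_eligible: tax values for every eligible unit in the population.
    w_noneligible, y_noneligible: design weights and survey values for
    the sampled noneligible units (Horvitz-Thompson is an assumption).
    """
    tax_part = sum(x_eligible)
    ht_part = sum(w * y for w, y in zip(w_noneligible, y_noneligible))
    return tax_part + ht_part

# Hypothetical stratum: four eligible units, two sampled noneligible units.
x_eligible = [100.0, 200.0, 300.0, 400.0]
w_noneligible = [5.0, 5.0]
y_noneligible = [1000.0, 1400.0]

total = tax_survey_total(x_eligible, w_noneligible, y_noneligible)
print(total)
```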



GREG TYPE ESTIMATORS

• The GREG estimator is usually written as

      $\hat{Y}_G = \sum_{i \in s} w_i^{CAL} y_i$

• The GREG estimator fits the macro approach but it can
  also fit the micro approach:

      $\hat{Y}_G = \sum_{i \in U} \hat{y}_i + \sum_{i \in s} w_i (y_i - \hat{y}_i)$




GREG TYPE ESTIMATORS
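• Micro approach: domain estimators are asymptotically p-unbiased

      Set        Unit     $\hat{y}_i$        $e_i = y_i - \hat{y}_i$    $w_i$
      $s$        $1$      $\hat{y}_1$        $y_1 - \hat{y}_1$          $w_1$
                 $2$      $\hat{y}_2$        $y_2 - \hat{y}_2$          $w_2$
                 ...
                 $n$      $\hat{y}_n$        $y_n - \hat{y}_n$          $w_n$
      $U - s$    $n+1$    $\hat{y}_{n+1}$    $0$                        $0$
                 ...
                 $N$      $\hat{y}_N$        $0$                        $0$

  A domain corresponds to a subset of the rows.
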
DOMAIN ESTIMATION

• Three situations are encountered in practice:
  (i) The domain coincides with the model group
  (ii) The domain is contained in the model group
  (iii) The domain intersects more than one model group




DOMAIN ESTIMATION

• Even if the prediction-type estimators are pm-unbiased at the
  model group level, they could be significantly biased if the
  model prevailing at the domain level differs from the model
  prevailing at the model group level

• The GREG-type estimators are always asymptotically p-unbiased
  at the domain level. However, they could be inefficient if the
  model prevailing at the domain level differs from the model
  prevailing at the model group level



DOMAIN ESTIMATION: MICRO vs.
MACRO

• Macro and micro approaches lead to identical estimators of
  parameters at the model group level

• At the domain level, the two approaches lead to different
  estimators

• No definite comparison is possible, but we expect that $\hat{Y}_d^{micro}$
  will perform better than $\hat{Y}_d^{macro}$ if the domain size is small
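
A sketch of how the two approaches can diverge for a domain. It assumes a ratio working model and one reading of the two displays: the micro domain estimator sums $\hat{y}_i + w_i e_i$ over the domain rows of the full file, while the macro domain estimator sums $w_i^{CAL} y_i$ over the sampled domain units. The data and the domain indicator are hypothetical.

```python
def domain_estimates(y_s, x_s, w_s, x_U, d_U):
    """Return (micro, macro) domain totals under a ratio working model.

    d_U[i] is 1 if population unit i belongs to the domain, else 0.
    The sampled units are assumed to be the first len(y_s) units of U
    (a convention chosen for this sketch).
    """
    d_s = d_U[:len(y_s)]
    swx = sum(w * x for w, x in zip(w_s, x_s))
    swy = sum(w * y for w, y in zip(w_s, y_s))
    B_hat = swy / swx
    g = sum(x_U) / swx   # calibration adjustment factor
    micro = (sum(d * B_hat * x for d, x in zip(d_U, x_U))
             + sum(d * w * (y - B_hat * x)
                   for d, w, y, x in zip(d_s, w_s, y_s, x_s)))
    macro = sum(d * w * g * y for d, w, y in zip(d_s, w_s, y_s))
    return micro, macro

x_U = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
x_s = x_U[:3]
y_s = [105.0, 210.0, 290.0]
w_s = [2.0, 2.0, 2.0]

# Whole model group: both approaches coincide.
micro_all, macro_all = domain_estimates(y_s, x_s, w_s, x_U, [1] * 6)
# Small domain (units 1 and 4, only one of them sampled): they differ.
micro_d, macro_d = domain_estimates(y_s, x_s, w_s, x_U, [1, 0, 0, 1, 0, 0])
print(micro_all, macro_all, micro_d, macro_d)
```

In the small domain the micro estimator borrows the tax-based prediction for the unsampled unit, whereas the macro estimator relies on the single sampled unit alone.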




WHEN THE TAX VARIABLE IS MISSING

•   In practice, the tax variable is subject to nonresponse, in which
    case it is imputed

•   Let $z$ be a new variable defined as $z = x$ if the tax variable is
    observed and $z = x^*$ if the tax variable is missing

•   Inference can be made conditional on $z$
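
A minimal sketch of building $z$, assuming $x^*$ stands for the imputed tax value and using `None` to mark a missing entry (both are assumptions made for the illustration):

```python
def build_z(x, x_star):
    """z_i = x_i when the tax value is observed, otherwise the value x_star_i.

    A missing tax value is encoded as None (hypothetical convention).
    """
    return [xi if xi is not None else xsi for xi, xsi in zip(x, x_star)]

x = [100.0, None, 300.0]        # reported tax values, second one missing
x_star = [None, 180.0, None]    # values used where x is missing
z = build_z(x, x_star)
print(z)
```

Conditioning on $z$ then amounts to using $z$ in place of $x$ as the auxiliary variable in the estimators.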




FUTURE WORK

•   Find a compromise calibration weight if the macro approach
    is used

•   For non-financial variables, find the best set of auxiliary
    variables $x^*$ and use it to calibrate




For more information, please contact

             David.haziza@statcan.ca
                 (613) 951-5221



