Lecture 1 Review of probability and distributions _appendix A-C_ by dffhrtcv3


              Lecture 2: Review of probability and distributions (appendix A-C)

              Dr. S. Chen
              Outline and keywords
●   Relationship between 2 variables
●   Random sampling
●   Parameter, Estimator and estimate
●   What does it take to be a good estimator?
       Small sample
       Large sample
●   Normal distribution and standard normal distribution
●   Confidence interval and hypothesis testing
●   Accuracy of an estimator: standard error and standard
    deviation
    Relationship between 2 variables
●   Example: Linear relationship between monthly
    housing expenditure and monthly income:
                   Housing=164+.27 income
●   Predict the changes in housing expenditure
    using changes in income
       Marginal effect of income:
         ●   For each additional dollar of income, 27 cents are
             spent on housing. Or,
         ●   Marginal propensity to consume = .27
         ●   For rich people, this may mean nothing...
●   Scale down by income:
    Suppose income increases from 100 to 200:
●   Percentage point change

●   Income elasticity of housing expense
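A minimal sketch of the three measures on this slide for the fitted line Housing = 164 + .27 income, using the slide's income change from 100 to 200. The function and variable names are illustrative and not from the lecture; the elasticity is computed as the percentage change in housing divided by the percentage change in income, starting from income = 100.

```python
def housing(income):
    """Predicted monthly housing expenditure from the fitted line."""
    return 164 + 0.27 * income

income_0, income_1 = 100, 200
h0, h1 = housing(income_0), housing(income_1)

marginal_effect = (h1 - h0) / (income_1 - income_0)         # dollars of housing per dollar of income
pct_change_housing = 100 * (h1 - h0) / h0                   # percent change in housing expenditure
pct_change_income = 100 * (income_1 - income_0) / income_0  # percent change in income
elasticity = pct_change_housing / pct_change_income

print(round(marginal_effect, 2), round(pct_change_housing, 1), round(elasticity, 3))
```

Housing rises from 191 to 218, so doubling income raises predicted housing expenditure by only about 14 percent.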
     Shortcoming of linear functions
●   When income=0, predicted housing expenditure = 164

●   For low levels of income, linear functions often
    fail to capture the housing expenditure correctly.
             Nonlinear functions
●   Example 1 (Quadratic wage equation)
         wage = 5.3 + .10 educ + .5 exper - .01 exper^2
    What is the Marginal Effect of one additional year
     of work experience on wages?

    What's the Percentage Point Change in wages for
     one additional year of work experience?

    What is the Elasticity of wage?
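One way to approach the first question, sketched under the assumption that the last term of the wage equation is exper squared: the marginal effect of experience is the derivative .5 - 2(.01) exper, which depends on the level of experience. The function names and the schooling level below are hypothetical.

```python
def wage(educ, exper):
    """Quadratic wage equation from the slide."""
    return 5.3 + 0.10 * educ + 0.5 * exper - 0.01 * exper ** 2

def marginal_effect_exper(exper):
    """Derivative of wage with respect to exper: .5 - 2(.01) exper."""
    return 0.5 - 0.02 * exper

educ = 12  # hypothetical schooling level; it drops out of the marginal effect
for exper in (0, 10, 25):
    exact_change = wage(educ, exper + 1) - wage(educ, exper)  # effect of one more full year
    print(exper, round(marginal_effect_exper(exper), 2), round(exact_change, 2))
```

The derivative falls as experience grows and reaches zero at exper = 25, which is where the fitted wage profile peaks.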
●   Example 2 (Quadratic log wage equation)
          log(wage) = 2.7 + .10 educ + .2 exper - .01 exper^2
       Marginal Effect of one additional year of work experience
       Elasticity=
●   Other two examples
       Labor supply function
                      hours = 33+45 log(wage)
    ✔   Demand function for beer
                log(bottles) = 4.7+1.25 log(price)
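As a reading aid, the logarithmic forms above can be interpreted by differentiating each equation. This is a sketch of the standard calculus approximations (valid for small changes), not a reconstruction of formulas missing from the slides; it assumes the last term of the log wage equation is exper squared.

```latex
\begin{aligned}
\log(wage) = 2.7 + .10\,educ + .2\,exper - .01\,exper^2
  &\;\Rightarrow\; \%\Delta wage \approx 100\,(.2 - .02\,exper)\,\Delta exper \\
hours = 33 + 45\,\log(wage)
  &\;\Rightarrow\; \Delta hours \approx (45/100)\,\%\Delta wage \\
\log(bottles) = 4.7 + 1.25\,\log(price)
  &\;\Rightarrow\; \%\Delta bottles \approx 1.25\,\%\Delta price
\end{aligned}
```

In the log-log form the slope coefficient is itself the elasticity.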
                 Random sampling
●   Example: survey for UAlbany students about
    their drinking behaviour
       Possible locations of interviews
       Ideal method of survey
●   Definition of random sampling:
       If {Y1,Y2,...,Yn} are independent random variables
        that come from a common distribution, then
        {Y1,Y2,...,Yn} is called a random sample from this
        distribution. They are also called independent, identically
        distributed (i.i.d.) random variables from that
        distribution.
    Parameter, estimator and estimate
●   Example (population mean and sample mean)
        Suppose we have a random sample from the
         previous survey, {y1, y2, y3, y4, y5} = {1,0,0,1,1}.
        The sample mean is an estimator (i.e. a formula) to
         approximate the population mean:
         $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
        When we plug in the actual survey numbers, we get
         the value of the sample average, also called the
         estimate of the population mean.
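The distinction between estimator and estimate can be illustrated with the survey numbers from the slide. In this small sketch, the function (a hypothetical name) plays the role of the estimator and its return value for this particular sample is the estimate.

```python
sample = [1, 0, 0, 1, 1]        # survey answers from the slide

def sample_mean(ys):
    """The estimator: a rule that maps any sample to a number."""
    return sum(ys) / len(ys)

estimate = sample_mean(sample)  # the estimate: the rule applied to this sample
print(estimate)                 # 0.6
```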
What does it take to be a good estimator?
1. Unbiasedness
     An estimator is unbiased if its expectation equals the
      true parameter.
     Examples of unbiased estimators
       ●   sample average
       ●   y1
       ●   sample variance
     Measure the bias of an estimator
                  bias=E[estimator] – true parameter
     Example of a biased estimator
       ●   Natural sample variance
2. Efficiency
       Comparing two unbiased estimators W1 and W2 for
        parameter $\mu$, we say W1 is more efficient than W2 when
        Var(W1) < Var(W2) for any value of $\mu$.
       Example: the sample average is more efficient than y1.
●   Mean square error
          $\mathrm{MSE}(W) = E[(W-\mu)^2] = \mathrm{Var}(W) + [\mathrm{bias}(W)]^2$
       Take account of both unbiasedness and efficiency
●   Examples: calculate the mean and variance of the
    following estimators:
       Sample average
       Y1
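The theoretical answers are that both estimators have mean equal to the population mean, while the sample average has variance equal to the population variance divided by n and Y1 has the full population variance. The simulation below is a rough check of those facts; the Bernoulli population, its mean of 0.6, the sample size, and the number of replications are all hypothetical choices.

```python
import random
from statistics import mean, pvariance

random.seed(0)
theta, n, reps = 0.6, 25, 20000   # hypothetical population mean, sample size, replications

ybar_draws, y1_draws = [], []
for _ in range(reps):
    sample = [1 if random.random() < theta else 0 for _ in range(n)]
    ybar_draws.append(sum(sample) / n)   # estimator 1: sample average
    y1_draws.append(sample[0])           # estimator 2: first observation only

pop_var = theta * (1 - theta)            # variance of a Bernoulli(theta) variable
print("means:", round(mean(ybar_draws), 3), round(mean(y1_draws), 3), "theory:", theta)
print("var of sample average:", round(pvariance(ybar_draws), 4), "theory:", pop_var / n)
print("var of Y1:", round(pvariance(y1_draws), 4), "theory:", pop_var)
```

Both simulated means should sit near 0.6, but the spread of the sample average is roughly 25 times smaller, which is why it is the more efficient of the two.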
            Sampling distribution
●   Estimators are random variables too (because
    functions of random variables are random
    variables). Example: sample average.
●   Thus any estimator has a distribution called the
    sampling distribution.
●   Figure C.2
          Large sample properties of estimators
●   Figure C.3
       When the sample size increases, the sampling
        distribution becomes more and more concentrated
        around the true parameter.
●   Example:
       The notorious unbiased estimator Y1 for the population
        mean has the widest sampling distribution.
       The sample average is much more narrowly
        distributed around the population mean. In fact, its
        variance decreases whenever n increases.
●   Consistency
    An estimator is called a consistent estimator
     if the probability that it misses the true parameter by more
     than any fixed margin decreases with sample size
     and eventually converges to zero in large samples.
●   The sample average of a random sample is
    consistent for the population mean (Law of Large Numbers)
●   Example:
       Sample variance is also a consistent estimator of population
        variance (and also unbiased)
       Natural sample variance is also consistent (but biased).
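Consistency of the sample average can be illustrated by simulation: the share of samples whose average misses the population mean by more than a fixed margin should shrink as n grows. This is a sketch only; the Bernoulli population with mean 0.6, the margin of 0.1, and the replication count are hypothetical choices.

```python
import random

random.seed(1)
theta, eps, reps = 0.6, 0.1, 2000   # hypothetical population mean, margin, replications

def miss_rate(n):
    """Share of simulated samples whose average misses theta by more than eps."""
    misses = 0
    for _ in range(reps):
        ybar = sum(random.random() < theta for _ in range(n)) / n
        misses += abs(ybar - theta) > eps
    return misses / reps

rates = {n: miss_rate(n) for n in (10, 100, 1000)}
print(rates)
```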
Useful facts about consistent estimators
●   Functions of consistent estimators are also
    consistent (for continuous functions)
●   Examples
       sample variance is consistent so its square root (i.e.
        sample standard deviation) is a consistent estimator
        of population standard deviation.
       The difference between two sample averages is
        a consistent estimator for the difference in their
        population means.
●   Consistency basically tells us that the
    distribution of an estimator collapses around
    the true parameter as the sample size gets large.
●   But this provides no information about the
    shape of the distribution.
●   Asymptotic normality
       If the distribution of an estimator looks more and
        more like a normal distribution as the sample size
        gets large, then this estimator is said to be
        asymptotically normal.
      Review of normal distributions
●   Suppose a random variable X is normally
    distributed (i.e. has a bell shape). We often write
     $X \sim \mathrm{Normal}(\mu, \sigma^2)$
     to indicate that it has mean $\mu$ and variance $\sigma^2$.
●   Standard normal distribution is a normal
    distribution with zero mean and unit variance.
    I.e. X~Normal(0,1).
●   Standard normal table p. 847 (can you read it?)
    P{Z < a given number at the table margin}
    = number inside the table
●   Let Z be standard normal. Use the table to
    answer the following questions:
●   Normalization:
       i.e. demeaned by the mean and rescaled by the standard
        deviation.
       Any normal random variable $Y \sim \mathrm{Normal}(\mu, \sigma^2)$ can be
        normalized to be standard normal:
        $Z = \frac{Y-\mu}{\sigma} \sim \mathrm{Normal}(0,1)$

       Example:
         ●   Normalization of sample average
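A table lookup and a normalization can both be reproduced with Python's standard library, as in the sketch below. The mean of 10, standard deviation of 2, and cutoff of 13 are hypothetical numbers chosen only to show that normalizing first gives the same probability as working with the original distribution.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)
print(round(Z.cdf(1.96), 4))          # P{Z < 1.96}, the kind of entry a standard normal table reports

# Normalizing Y ~ Normal(mu, sigma^2): P{Y < y} = P{Z < (y - mu) / sigma}
mu, sigma = 10, 2                     # hypothetical mean and standard deviation
y = 13
z = (y - mu) / sigma                  # demean, then rescale by the standard deviation
print(z, round(Z.cdf(z), 4), round(NormalDist(mu, sigma).cdf(y), 4))
```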
        Central Limit Theorem (CLT)
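●   A normalized sample average from any
    random sample (with finite variance) is approximately
    standard normal in large samples.
●   Formally, let {y1,…,yn} be a random sample with
    mean $\mu$ and variance $\sigma^2$. Then, as n gets large,
    $\frac{\bar{y}-\mu}{\sigma/\sqrt{n}} \overset{a}{\sim} \mathrm{Normal}(0,1)$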
●   Furthermore, even if we replace the population
    variance in the normalization with the sample
    variance, the CLT still holds.
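The CLT can be checked informally by simulation: draw many samples from a clearly non-normal population, normalize each sample average, and see how often the result falls between -1.96 and 1.96. The exponential population, the sample size, and the replication count below are hypothetical choices for this sketch.

```python
import math
import random

random.seed(2)
n, reps = 200, 5000        # hypothetical sample size and number of replications
mu, sigma = 1.0, 1.0       # mean and standard deviation of the exponential(1) population

inside = 0
for _ in range(reps):
    ybar = sum(random.expovariate(1.0) for _ in range(n)) / n
    z = (ybar - mu) / (sigma / math.sqrt(n))   # normalized sample average
    inside += -1.96 < z < 1.96

share = inside / reps
print(round(share, 3))     # close to 0.95 when the normal approximation is good
```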
                 Applications of CLT
●   Remember that the sample average is a random variable.
    After normalization, the CLT tells us that the normalized
    sample average is approximately standard normal in large samples.

●   This will be very useful when we construct a confidence
    interval for the population mean.
       Recall that P{-1.96<Z<1.96}=.95 if Z is standard normal.
       Can you use the sample average to construct a 95%
        confidence interval for the population mean?
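One possible answer, sketched for a large sample: rearranging P{-1.96 < Z < 1.96} = .95 with Z equal to the normalized sample average gives the interval from the sample average minus 1.96 standard errors to the sample average plus 1.96 standard errors. The data below (60 ones and 40 zeros) are hypothetical.

```python
import math
from statistics import mean, stdev

sample = [1] * 60 + [0] * 40         # hypothetical large sample of yes/no answers
n = len(sample)
ybar = mean(sample)
se = stdev(sample) / math.sqrt(n)    # stdev divides by n - 1

lower, upper = ybar - 1.96 * se, ybar + 1.96 * se
print(round(lower, 3), round(upper, 3))
```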
     What if the sample is not large
●   Then the CLT does not apply, so the normalized
    sample average (using the sample standard deviation)
    is not standard normal.
●   Student-t
       $\frac{\bar{Y}-\mu}{s/\sqrt{n}} \sim t_{n-1}$

●   Student-t table (p. 849; can you read it?)
●   Suppose you have a small sample (n=20). Can you
    use the sample average to construct the 95%
    confidence interval for the population mean?
       the critical points (for 2 sides):
       the critical points (for 1 side)

●   If the sample is large, then use the Standard Normal
    table instead.
       The critical points (for 2 tails)
       the critical points (for 1 side)
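For reference, the critical points asked for above can be looked up or computed as in this sketch. The standard normal values are computed with the standard library; the Student-t values for n = 20 (19 degrees of freedom) are copied from a standard t table rather than computed, because the standard library has no t distribution. All variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()
z_two_sided = Z.inv_cdf(0.975)   # 95% two-sided: 2.5% in each tail
z_one_sided = Z.inv_cdf(0.95)    # 95% one-sided: 5% in one tail
print(round(z_two_sided, 2), round(z_one_sided, 3))   # 1.96 1.645

# Student-t critical values with 19 degrees of freedom, taken from a t table.
t19_two_sided, t19_one_sided = 2.093, 1.729
print(t19_two_sided, t19_one_sided)
```

The t critical points are larger than the normal ones, so small-sample intervals are wider for the same standard error.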
                Hypothesis testing
●   Example: we want to test whether it's true that
    more than ½ of UAlbany students drink weekly.
       Null hypothesis:        H0: $\theta$ = 0.5
       Alternative hypothesis: H1: $\theta$ > 0.5 (one-sided)
       where $\theta$ is the population share of students who drink weekly.
●   Procedure (use confidence intervals):
       Survey and get a random sample {1,0,0,1,1}
       Compute the sample average $\bar{y}$ = 0.6
       Construct the 95% one-sided confidence interval
        using the hypothesized value (0.5):
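The procedure can be sketched as an equivalent one-sided t test: reject H0 only if the sample average lies more than the critical number of standard errors above 0.5. The symbol name theta_0 and the use of a t critical value with 4 degrees of freedom (2.132, copied from a t table) are assumptions of this sketch; with only five zero-one observations the t approximation is rough at best.

```python
import math
from statistics import mean, stdev

sample = [1, 0, 0, 1, 1]              # survey answers from the slide
theta_0 = 0.5                         # value of the population share under H0
n = len(sample)
ybar = mean(sample)
se = stdev(sample) / math.sqrt(n)     # standard error of the sample average
t_stat = (ybar - theta_0) / se

t_crit = 2.132   # one-sided 5% critical value, 4 degrees of freedom, from a t table
reject = t_stat > t_crit
print(round(t_stat, 3), reject)
```

The statistic is far below the critical value, so this tiny sample gives no grounds to reject H0.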
    Example: Race discrimination in
            hiring (p.787)
●   Consider 5 pairs of people interviewing for several jobs.
    In each pair, one person was black and the other was
    white. Their resumes show they are virtually the same
    in terms of education and experience. We observe
    their outcomes for the 241 interviews. Let $\theta_B$ and $\theta_W$
    indicate the probability of having a job offer for black
    and for white applicants, resp.
       Construct hypotheses: H0: $\theta_B - \theta_W = 0$; H1: $\theta_B - \theta_W \neq 0$
       Calculate the sample average of the difference
                    $\bar{B} - \bar{W} = .224 - .357 = -.133$
       Calculate the sample standard deviation of the diff: s=.482.
       Construct the 95% confidence interval
       Construct the 99% confidence interval
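Both intervals follow from the numbers on the slide, taking the difference in sample averages as .224 - .357, s = .482, and n = 241. The sketch below uses the large-sample normal critical values 1.96 and 2.58, which is an assumption about the intended method; the variable names are illustrative.

```python
import math

n = 241
diff = 0.224 - 0.357          # sample average of the black-minus-white differences
s = 0.482                     # sample standard deviation of the differences
se = s / math.sqrt(n)         # standard error of the average difference

ci = {}
for level, crit in ((95, 1.96), (99, 2.58)):
    ci[level] = (diff - crit * se, diff + crit * se)
    print(level, round(ci[level][0], 3), round(ci[level][1], 3))
```

Neither interval contains zero, so H0 is rejected at both the 5% and the 1% level.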
    Accuracy of the sample average
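Suppose we have a random sample with $y \sim (\mu, \sigma^2)$; then its
 sample average satisfies
                            $\bar{y} \sim (\mu, \sigma^2/n)$
●   Standard deviation of the sample average: $\mathrm{sd}(\bar{y}) = \sigma/\sqrt{n}$
●   To estimate $\sigma^2$, we use the sample variance
                           $s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}$
●   We call s the sample standard deviation of y.
●   The estimate of $\mathrm{sd}(\bar{y})$ obtained by replacing $\sigma$ with s is called
    the standard error of the sample average
                           $\mathrm{se}(\bar{y}) = s/\sqrt{n}$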
      Example (Problem set C.8)
●   Larry Bird has FGA=1206 and FGM=455. The
    outcome of each shot (denoted by Yi) is a
    zero-one Bernoulli variable.
●   Yi =   1 with probability $\theta$
           0 with probability $1-\theta$
1. To estimate $\theta$, we use the sample average $\bar{Y}$.
2. Find the standard deviation of the sample average:
 Given that Y is Bernoulli with mean equal to $\theta$, the
 variance of Y is
                      $\mathrm{Var}(Y) = \theta(1-\theta)$

 Thus the variance of the sample average is $\mathrm{Var}(\bar{Y}) = \theta(1-\theta)/n$,
 where n is the FGA of a given player.
 The standard deviation of the sample average is
                     $\mathrm{sd}(\bar{Y}) = \sqrt{\theta(1-\theta)/n}$
 Note that the sample counterpart of the standard
 deviation is the standard error
                     $\mathrm{se}(\bar{Y}) = \sqrt{\bar{Y}(1-\bar{Y})/n}$
3. By the Central Limit Theorem, the normalized
  sample average is approximately standard normal in large
  samples:
                   $\frac{\bar{Y}-\theta}{\mathrm{se}(\bar{Y})} \overset{a}{\sim} N(0,1)$

 Hypothesis testing for Larry Bird at the 1%
 significance level:
 H0: $\theta$ = .5
 H1: $\theta$ > .5
