# Statistics

```					     Simple regression
Statistics for dummies

Statistics

Gabriel V. Montes-Rojas


y = β0 + β1x + u
Much of applied econometrics is concerned with the simple linear
regression model, which explains the relationship between y and x:

y = β0 + β1 x + u

where

y                        x
dependent variable       independent variable
explained variable       explanatory variable
response variable        control variable
regressand               regressor or covariate

u is called the error term, residual or disturbance and represents
all other factors, different from x, that affect y.

y = β0 + β1x + u

Our interest is the effect of x on the variable y in some
population. The error term u is assumed to have no systematic
influence on y, and therefore only x is of importance. Then, we
believe that y ≡ f(x) = β0 + β1 x.
The following definitions will be used extensively during the course:
β0 is the intercept, f(0) = β0.
This represents the value of y when x is set at 0.
β1 is the slope, ∆y/∆x = β1.
This represents the change in y associated with a one-unit
change in x.


Example 2.7 (p.41 in Wooldridge): Returns to education

wage = β0 + β1 educ + u

Wages are expected to be an increasing function of education,
i.e. more education means, on average, higher wages. Then, in
this linear model, we expect that β1 > 0.
What does u mean? Other factors, different from education,
that affect wages, such as age or ability.


Statistics for dummies


Random variables (RV)
Why do we need random variables in econometrics?
We will (almost) never observe the whole population, only a
small portion of it.
A random sample is a subset of a population.
If we consider the random variable X, a random sample is
{x_i}_{i=1}^{n}, or x_1, x_2, ..., x_n: n realisations of the
variable X, indexed by i.
Example: If X is the return of an asset, a random sample consists
of actual market observations of the asset's returns. Say, for a
sample of three observations:
x_1 = $1000, x_2 = −$567, x_3 = $0
Example: Flipping a coin: let X = 0 be HEADS and X = 1
be TAILS. Then X takes values in {0, 1}. Moreover,
P[X = 0] = P[X = 1] = 0.5. (This is called the Bernoulli
distribution.)

Discrete vs Continuous RVs

A discrete random variable is one that takes on only a finite or
countably infinite number of values.

Example: Flipping a coin: let X = 0 be HEADS and X = 1 be
TAILS. Two possible values: 0 or 1.

Example: Number of £50 bills in your wallet: X can take any
value in {0, 1, 2, 3, ...}.

Each outcome of X has an associated probability,
p_j = P(X = x_j), j = 1, ..., k. This probability measure satisfies:
p_j ≥ 0, j = 1, 2, ..., k
∑_{j=1}^{k} p_j = 1


Discrete vs Continuous RVs

A continuous random variable is one that can take on any real
value.

Let X be a continuous random variable. Its probability measure is
described by a density function f(x) that satisfies
f(x) ≥ 0 for all x ∈ 𝒳, where 𝒳 is the domain of X, usually
𝒳 = ℝ
∫_𝒳 f(x) dx = 1

Although the density function plays the role of a probability for
each value of x, its interpretation is tricky: 𝒳 contains so many
values that each one individually has probability zero (?!).


Expectation of a RV

Random variables can be described by some of their features:

Expectation: E[X]

What value should we expect from X? If we have a considerable
number of draws from the random variable X, what would be their
average?
For the coin example:
E[X] = 0 × P[X = 0] + 1 × P[X = 1] = 0 × 0.5 + 1 × 0.5 = 0.5.
For discrete RVs: E[X] = ∑_{j=1}^{k} x_j × P[X = x_j].

For continuous RVs: E[X] = ∫_𝒳 x f(x) dx, with 𝒳 the domain of X.


Property of expectation: Let A and B be two random variables,
and c and d two constants. Then, E[cA + dB] = cE[A] + dE[B].
Property of expectation: Let A and B be two independent
random variables. Then, E[A × B] = E[A] × E[B].


An estimator of the expectation of a random variable X is the
sample average.
Given a random sample {x_i}_{i=1}^{n}, define x̄ = n^{-1} ∑_{i=1}^{n} x_i, which is
simply the average.

An estimator µ̂ is unbiased for a given parameter µ if E(µ̂) = µ.

In words, if we consider all possible random samples, on average,
we will obtain the parameter we want to estimate.
In our case, we can prove that E(x̄) = E(X).
Proof:...


Variance of a RV

However, for a given realisation of X, denoted x, we may have
that x ≠ E[X].
So how much does this random variable deviate from E[X]?

Variance: Var[X] ≡ E[(X − E[X])^2]


Prove that Var[X] = E[X^2] − (E[X])^2.
Property of variance: Var[aX] = a^2 × Var[X]
Property of variance:
Var[aX + bY] = a^2 × Var[X] + b^2 × Var[Y] + 2ab × Cov[X, Y],
where Cov[X, Y] = E[XY] − E[X]E[Y]


Covariance

The covariance of two random variables measures how much they
move together.

Covariance: Cov[Y, X] ≡ E[YX] − E[Y]E[X]

Property of covariance: Let A and B be two independent random
variables. Then, Cov[A, B] = 0.


In the simple regression model...

In the simple regression model, Y, X and U are random
variables. β0 and β1 are population parameters, i.e. constants
that describe the relation between Y and X. Then,

E[Y] = E[β0 + β1 X + U] = β0 + β1 E[X] + E[U]
(Since U captures other factors, we will assume that E[U] = 0.)
However, our main interest is in the conditional expectation that
defines the population regression model:

E[Y|X] = E[β0 + β1 X + U|X] = β0 + β1 X + E[U|X] = β0 + β1 X

Assumption: U and X are independent, so that E[U|X] = E[U] = 0.


Parameters vs Estimators

Note:
β0 and β1 are population parameters to be estimated.
β̂0 and β̂1 will be their estimators.
The parameters are just numbers; they are fixed. However,
the estimators will be random variables.

```
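
The definitions in the slides can be checked numerically. The sketch below is an illustration only, not part of the original course material. It assumes NumPy is available, and the parameter values (`beta0 = 1.0`, `beta1 = 2.0`), the distributions chosen for X and U, the sample sizes, and the helper `cov` are all hypothetical choices made for the example. It computes E[X] and Var[X] for the coin flip, averages many sample means to illustrate unbiasedness, and simulates Y = β0 + β1 X + U with U independent of X. The recovery step uses Cov[Y, X] = β1 Var[X], which follows from the covariance definition and Cov[U, X] = 0 but is not stated in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# --- Coin flip: X = 0 (HEADS) or X = 1 (TAILS), each with probability 0.5 ---
values = np.array([0.0, 1.0])
probs = np.array([0.5, 0.5])
expectation = float(np.sum(values * probs))                     # E[X] = sum_j x_j P[X = x_j]
variance = float(np.sum((values - expectation) ** 2 * probs))   # E[(X - E[X])^2]
second_moment = float(np.sum(values ** 2 * probs))              # E[X^2]

# --- Unbiasedness of the sample average ---
# Draw many random samples of size n and average the resulting sample means.
n, n_samples = 20, 50_000                        # hypothetical sizes
flips = rng.integers(0, 2, size=(n_samples, n))  # each entry is 0 or 1
sample_means = flips.mean(axis=1)                # one x-bar per random sample
mean_of_sample_means = float(sample_means.mean())

# --- Simple regression: Y = beta0 + beta1 * X + U, U independent of X ---
beta0, beta1 = 1.0, 2.0                  # hypothetical population parameters
x = rng.normal(3.0, 1.5, size=200_000)   # assumed distribution for X
u = rng.normal(0.0, 1.0, size=x.size)    # E[U] = 0, drawn independently of X
y = beta0 + beta1 * x + u


def cov(a, b):
    """Sample analogue of Cov[A, B] = E[AB] - E[A]E[B]."""
    return float(np.mean(a * b) - np.mean(a) * np.mean(b))


# With Cov[U, X] = 0: Cov[Y, X] = beta1 * Var[X] and E[Y] = beta0 + beta1 * E[X].
beta1_hat = cov(y, x) / cov(x, x)
beta0_hat = float(y.mean() - beta1_hat * x.mean())

print("E[X], Var[X] for the coin:", expectation, variance)
print("average of sample means:", mean_of_sample_means)
print("Cov[X, U] in the simulation:", cov(x, u))
print("estimates of beta0, beta1:", beta0_hat, beta1_hat)
```

Rerunning with a different seed changes the estimates slightly while the parameters stay fixed, which is the distinction between parameters and estimators drawn in the final slide.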