A regression line, a + bX, is interpreted as E[Y|X], the conditional expectation of Y given X. In general, the slope of a regression line can be estimated as b = Cov(X,Y)/Var(X), and the intercept as a = E[Y] - b*E[X].

For a discrete joint PDF with only two possible outcomes for Y (Y1, Y2) and two possible outcomes for X (X1, X2), we can find the regression line in a very simple, intuitive manner:

1) Estimate E[Y|X=X1]
2) Estimate E[Y|X=X2]
3) Set up two equations in two unknowns and solve for a and b:

E[Y|X=X1] = a + b*X1
E[Y|X=X2] = a + b*X2

Consider the following joint PDF of rA and rI:

             rA = 18    rA = -10
rI = 12       0.40       0.05
rI = -5       0.25       0.30

Given rI = 12, which has probability 0.40 + 0.05 = 0.45:

P(A=18|I=12) = 0.40/0.45 = 0.889
P(A=-10|I=12) = 0.05/0.45 = 0.111
E[A|I=12] = 18*0.889 + (-10)*0.111 = 14.89

Given rI = -5, which has probability 0.25 + 0.30 = 0.55:

P(A=18|I=-5) = 0.25/0.55 = 0.455
P(A=-10|I=-5) = 0.30/0.55 = 0.545
E[A|I=-5] = 18*0.455 + (-10)*0.545 = 2.74

So we want an a and b such that

14.89 = a + b*12
2.74 = a + b*(-5)

First solve for b by subtracting the second equation from the first:

12.15 = 17b
b = 12.15/17 = 0.715 (about 0.71)

Then solve for a by substituting b into the first equation:

14.89 = a + 0.715*12
a = 14.89 - 8.58 = 6.31 (about 6.3)

How can we solve for the intercept and slope of a regression line in general?

b = Cov(X,Y)/Var(X)
a = E[Y] - b*E[X]

As long as we have a sample of historical outcomes, we know how to estimate Cov(X,Y), Var(X), E[Y], and E[X]. We therefore know how to estimate a and b from a sample of historical data.

How do we know these formulas work? Do they work in the case of a joint PDF? From the joint PDF:

Cov(rA,rI) = 51.17
Var(rI) = 71.53
E[rA] = 8.2
E[rI] = 2.65

Let's check our regression formulas:

b = 51.17/71.53 = 0.715
a = 8.2 - 0.715*2.65 = 6.3

Yes, they work: both values match the a and b found from the two conditional expectations.
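The check above can also be reproduced numerically. The following is a minimal Python sketch, not part of the original material: the function names and the dictionary layout for the joint PDF are illustrative assumptions. It computes a and b two ways, first from the two conditional expectations and then from b = Cov(X,Y)/Var(X) and a = E[Y] - b*E[X], using unrounded intermediate values, so the results can differ from hand calculations in the second decimal place.

```python
# Joint PDF from the example: keys are (rA, rI) outcomes, values are probabilities.
# (This dictionary layout is an illustrative choice, not from the source.)
joint = {(18, 12): 0.40, (-10, 12): 0.05, (18, -5): 0.25, (-10, -5): 0.30}


def conditional_mean_of_a(joint, i_value):
    """Return E[rA | rI = i_value] computed from the joint PDF."""
    # Marginal probability P(rI = i_value)
    p_i = sum(p for (a, i), p in joint.items() if i == i_value)
    # Sum of a * P(rA = a, rI = i_value), divided by the marginal
    return sum(a * p for (a, i), p in joint.items() if i == i_value) / p_i


def line_from_conditional_means(joint, x1, x2):
    """Solve the two equations E[Y|X=x] = a + b*x for a and b."""
    y1 = conditional_mean_of_a(joint, x1)
    y2 = conditional_mean_of_a(joint, x2)
    b = (y1 - y2) / (x1 - x2)  # subtract the equations to eliminate a
    a = y1 - b * x1            # substitute b back into the first equation
    return a, b


def line_from_moments(joint):
    """Use b = Cov(X,Y)/Var(X) and a = E[Y] - b*E[X], with X = rI and Y = rA."""
    e_a = sum(a * p for (a, i), p in joint.items())
    e_i = sum(i * p for (a, i), p in joint.items())
    cov = sum((a - e_a) * (i - e_i) * p for (a, i), p in joint.items())
    var_i = sum((i - e_i) ** 2 * p for (a, i), p in joint.items())
    b = cov / var_i
    a = e_a - b * e_i
    return a, b


a1, b1 = line_from_conditional_means(joint, 12, -5)
a2, b2 = line_from_moments(joint)
print(f"two-equation method: a = {a1:.2f}, b = {b1:.3f}")
print(f"moment formulas:     a = {a2:.2f}, b = {b2:.3f}")
```

Both methods should print the same intercept and slope. The agreement is exact here because X takes only two values, so the line through the two conditional expectations is the regression line.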