Quantitative Business Analysis for Decision Making
Simple Linear Regression
403.7

Lecture Outlines
– Scatter Plots
– Correlation Analysis
– Simple Linear Regression Model
– Estimation and Significance Testing
– Coefficient of Determination
– Confidence and Prediction Intervals
– Analysis of Residuals

What Is Regression Analysis?
Regression analysis is used for modeling the mean of a "response" variable Y as a function of "predictor" variables X1, X2, ..., Xk. When k = 1, it is called simple regression analysis.

Random Sample
Y: response variable; X: predictor variable.
For each unit in a random sample of size n, the pair (X, Y) is observed, resulting in a random sample: (x1, y1), (x2, y2), ..., (xn, yn).

Scatter Plot
A scatter plot is a graphical display of the sample (x1, y1), (x2, y2), ..., (xn, yn) as n points in two dimensions. It suggests whether there is a relationship between X and Y.

A Scatter Plot Showing Linear Trend
[Figure: a scatter plot showing the linear trend of Peoples Ratings and Nielsen Ratings; axes labeled "Nielsen" and "PeopleM".]

A Scatter Plot Showing No Linear Trend
[Figure: a scatter plot showing no linear trend of today's with yesterday's DJIA; axes labeled "Today" and "Yesterday".]

Modeling Linear Trend
A perfect linear relationship between Y and X exists if $Y = \alpha + \beta X$. The coefficient $\beta$ of X is the slope, quantifying the amount of change in y corresponding to a one-unit change in x.
There are no perfect linear relationships in the practical world.

Simple Linear Regression Model
Model: $Y = \mu_{y|x} + \varepsilon$
– $\mu_{y|x} = \alpha + \beta X$ is a linear function (nonrandom).
– $\varepsilon$ is random error. It is assumed to be normally distributed with mean 0 and standard deviation $\sigma$.
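So $\alpha$, $\beta$, and $\sigma$ are the parameters of the model.

Estimation
Simple linear regression analysis estimates the mean of Y (linear trend) by $\hat{y} = a + bx$, where
$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}, \qquad a = \bar{y} - b\bar{x}.$$

Standard Deviation
The standard deviation (s) of the sample of n points in the scatter plot around the estimated regression line $\hat{y} = a + bx$ is:
$$s = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$$

Testing the Slope of Linear Trend
For testing $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$, compute the t-statistic and its p-value:
$$t\text{-statistic} = \frac{b - 0}{s_b}$$
Here $s_b = s / \sqrt{\sum (x - \bar{x})^2}$ is the standard error of $b$, and the p-value comes from the t distribution with $n - 2$ degrees of freedom.

Coefficient of Determination: $R^2$
The coefficient of determination, $R^2$, quantifies the significance of the estimated model $\hat{y} = a + bx$: it is the proportion of the variation in Y explained by the linear trend. As a rule of thumb:
– $R^2$ > 85%: significant model.
– $R^2$ < 85%: model is perceived as inadequate.
A low $R^2$ suggests a need for additional predictors for modeling the mean of Y.

Correlation Coefficient: r
The correlation coefficient r is the square root of $R^2$, taken with the sign of the slope b. It is a number between -1 and 1.
– The closer r is to -1 or 1, the stronger the linear trend.
– Its sign is positive for an increasing trend (slope b is positive).
– Its sign is negative for a decreasing trend (slope b is negative).

Confidence and Prediction Intervals
To estimate $\mu_{y|x_0}$ by a confidence interval, or to predict the response Y corresponding to the predictor value x = x0:
– 1. Compute: $\hat{y} = a + b x_0$
– 2. Compute: $\hat{y} \pm t \cdot \text{s.e.}(\hat{y})$, where t is the critical value of the t distribution with $n - 2$ degrees of freedom.

What is $\text{s.e.}(\hat{y})$?
It is the standard error of $\hat{y}$.
For estimating $\mu_{y|x_0}$:
$$\text{s.e.}(\hat{y}) = s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x - \bar{x})^2}}$$
For predicting Y:
$$\text{s.e.}(\hat{y}) = s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x - \bar{x})^2}}$$

Analysis of Residuals
Residuals are defined as $e_i = y_i - \hat{y}_i$, i = 1, 2, ..., n.
Residual analysis is used to check the normality and homogeneity-of-variance assumptions on the random errors $\varepsilon$.
A histogram or box plot of the residuals will help to ascertain whether the errors are normally distributed.

Analysis of Residuals (cont'd)
A plot of the residuals $e_i$ against the observed predictor values $x_i$ will help ascertain the homogeneity assumption.
– Random appearance: the homogeneity-of-variance assumption is valid.
– Non-random appearance: the homogeneity assumption is not valid and the variance depends on the predictor values.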
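
Worked Example (illustrative sketch)
The estimation, slope-test, R-squared, and interval formulas from the slides can be traced on a small data set. The sketch below is one possible way to do so in plain Python, not part of the lecture. The five (x, y) pairs are made up for illustration, the function names `fit_simple_regression` and `interval` are hypothetical, and the critical value 3.182 is assumed to have been looked up in a t table for 95% confidence with n − 2 = 3 degrees of freedom.

```python
import math


def fit_simple_regression(xs, ys):
    """Least-squares fit of y-hat = a + b*x with the summary statistics from the slides."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)                      # sum of (x - x_bar)^2
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    b = sxy / sxx                  # estimated slope
    a = y_bar - b * x_bar          # estimated intercept

    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]        # e_i = y_i - y_hat_i
    sse = sum(e ** 2 for e in residuals)
    sst = sum((y - y_bar) ** 2 for y in ys)

    s = math.sqrt(sse / (n - 2))   # standard deviation around the fitted line
    s_b = s / math.sqrt(sxx)       # standard error of the slope
    r_squared = max(0.0, 1 - sse / sst)

    return {
        "n": n, "a": a, "b": b, "s": s, "s_b": s_b,
        "t": (b - 0) / s_b,                                # t-statistic for H0: beta = 0
        "r_squared": r_squared,
        "r": math.copysign(math.sqrt(r_squared), b),       # r carries the sign of b
        "x_bar": x_bar, "sxx": sxx, "residuals": residuals,
    }


def interval(fit, x0, t_crit, predict=False):
    """Confidence interval for the mean of Y at x0, or prediction interval if predict=True."""
    y_hat = fit["a"] + fit["b"] * x0
    extra = 1.0 if predict else 0.0    # the additional "1 +" term used for prediction
    se = fit["s"] * math.sqrt(
        extra + 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    return (y_hat - t_crit * se, y_hat + t_crit * se)


# Made-up illustrative data (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

fit = fit_simple_regression(xs, ys)

# Assumed table value: t critical value for 95% confidence, 3 degrees of freedom.
T_CRIT_95_DF3 = 3.182

conf_int = interval(fit, x0=6, t_crit=T_CRIT_95_DF3)
pred_int = interval(fit, x0=6, t_crit=T_CRIT_95_DF3, predict=True)

print("a =", round(fit["a"], 4), " b =", round(fit["b"], 4))
print("s =", round(fit["s"], 4), " t =", round(fit["t"], 2))
print("R^2 =", round(fit["r_squared"], 4), " r =", round(fit["r"], 4))
print("95% confidence interval at x0 = 6:", conf_int)
print("95% prediction interval at x0 = 6:", pred_int)
```

The prediction interval is always wider than the confidence interval at the same x0, because it also carries the variability of a single new observation. The `residuals` list returned by the fit can be passed to any plotting tool for the histogram and residual-versus-x checks described on the residual slides.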