Part 15 – Regression Models
Statistics and Data Analysis
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics
Linear Regression Models
Analyzing residuals
Violations of assumptions
Unusual data points
Hints for improving the model
Model building
Linear models – cost functions
Semilog models – growth models
Logs and elasticities
An Enduring Art Mystery
Why do larger paintings command higher prices?
Graphics show relative sizes of the two works.
The Persistence of Statistics. Hildebrand, Ott and Gray, 2005
The Persistence of Memory. Salvador Dali, 1931
The Data
[Figure: Histogram of ln (US$); horizontal axis ln (US$), vertical axis Frequency]
[Figure: Histogram of ln (SurfaceArea); horizontal axis ln (SurfaceArea), vertical axis Frequency]
Monet in Large and Small
Sale prices of 328 signed Monet paintings
[Figure: Fitted Line Plot of ln (US$) vs ln (SurfaceArea)]
ln (US$) = 2.825 + 1.725 ln (SurfaceArea)
S 1.00645, R-Sq 20.0%, R-Sq(adj) 19.8%
The residuals do not show any obvious patterns that seem inconsistent with the assumptions of the model.
Log of $price = a + b log surface area + e
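One way to read the fitted line is that the slope of a log-log model acts as an elasticity: multiplying surface area by a factor k multiplies the predicted price by k raised to the power b. The sketch below only does that arithmetic with the reported coefficients 2.825 and 1.725. The helper names and the two example areas are illustrative assumptions, and the prediction ignores the disturbance e.

```python
import math

# Coefficients reported on the fitted line plot:
# ln(US$) = 2.825 + 1.725 ln(SurfaceArea)
A = 2.825
B = 1.725


def predicted_price(surface_area):
    """Price implied by the fitted line, ignoring the disturbance (hypothetical helper)."""
    return math.exp(A + B * math.log(surface_area))


def price_ratio(area_factor):
    """Factor applied to the predicted price when area is multiplied by area_factor."""
    return area_factor ** B


# Illustrative areas only; units are whatever SurfaceArea was measured in.
small, large = 500.0, 1000.0
print("ratio from two predictions:", predicted_price(large) / predicted_price(small))
print("ratio from the elasticity :", price_ratio(2.0))
```

Both lines print the same ratio, because the intercept cancels when two predictions from the same log-log line are divided.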
Monet Regression
Using the Residuals
How do you know the model is “good?”
Various diagnostics to be developed over the semester.
But, the first place to look is at the residuals.
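To make "look at the residuals" concrete, the following minimal sketch fits a least squares line to a small made-up data set and computes the residuals e = y - (a + bx). The data and the helper names are illustrative assumptions, not part of the lecture. Least squares residuals always sum to zero and are uncorrelated with x, so the diagnostic information is in whatever pattern remains (curvature, changing spread, isolated large values).

```python
def fit_line(x, y):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


def residuals(x, y, a, b):
    """Observed y minus fitted value a + b*x, one residual per observation."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Made-up illustrative data (not from the lecture).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

a, b = fit_line(x, y)
e = residuals(x, y, a, b)

print("intercept, slope:", round(a, 3), round(b, 3))
print("residuals:", [round(ei, 3) for ei in e])
print("sum of residuals:", sum(e))
```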
Residuals Can Signal a Flawed Model
Standard application: Cost function for output of a production process.
Compare linear equation to a quadratic model (in logs)
(124 American Electric Utilities)
Electricity Cost Function
Candidate Model for Cost
Log c = a + b log q + e
[Figure: Scatterplot of logCost vs logOutput, with the fitted regression line]
Annotations on the plot mark areas where most of the points are above the regression line and an area where most of the points are below the regression line.
A Missing Variable?
Residuals from the (log)linear cost model
[Figure: Residuals Versus logOutput (response is logCost)]
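The kind of residual pattern that hints at a missing term can be reproduced with synthetic data. The sketch below is an illustration under stated assumptions, not the utility data: it generates 124 observations whose true relation is quadratic in logs (the coefficients 1.0, -0.1, 0.05 and the noise level are made up), fits only a straight line, and averages the residuals in the low, middle and high thirds of logOutput.

```python
import random


def fit_line(x, y):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic data, NOT the utility data: the true relation is quadratic in logs.
n = 124
log_q = [12.0 * i / (n - 1) for i in range(n)]
log_c = [1.0 - 0.1 * q + 0.05 * q ** 2 + rng.gauss(0.0, 0.1) for q in log_q]

# Fit the (log)linear model only, leaving out the squared term.
a, b = fit_line(log_q, log_c)
resid = [c - (a + b * q) for q, c in zip(log_q, log_c)]

# Average residual in the low, middle and high thirds of logOutput.
third = n // 3
low = mean(resid[:third])
middle = mean(resid[third:2 * third])
high = mean(resid[2 * third:])
print("mean residual (low, middle, high):", round(low, 3), round(middle, 3), round(high, 3))
```

In this construction the residuals are positive on average at both ends and negative in the middle, which is the U shape a residual plot would show when a squared term has been left out.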
A Better Model?
Log Cost = α + β1 logOutput + β2 [logOutput]^2 + ε
Candidate Models for Cost
The quadratic equation is the appropriate model.
Log c = a + b1 log q + b2 [log q]^2 + e
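A minimal sketch of how the linear and quadratic candidates can be compared, assuming synthetic data rather than the 124 utilities: the squared log of output is simply added as another regressor, and the standard error of the regression s is computed for each fit. The data-generating coefficients, the seed, and the helper names are assumptions made for illustration. The normal equations are solved directly to keep the example free of external libraries.

```python
import math
import random


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z


def least_squares(columns, y):
    """Least squares coefficients for y regressed on the given columns."""
    k = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


def regression_s(columns, y, coefs):
    """Standard error of the regression: sqrt(SSE / (n - k))."""
    n, k = len(y), len(columns)
    sse = 0.0
    for i in range(n):
        fitted = sum(coefs[j] * columns[j][i] for j in range(k))
        sse += (y[i] - fitted) ** 2
    return math.sqrt(sse / (n - k))


rng = random.Random(1)  # fixed seed for reproducibility

# Synthetic data, NOT the utility data: true b2 is 0.05.
n = 124
log_q = [12.0 * i / (n - 1) for i in range(n)]
log_c = [1.0 - 0.1 * q + 0.05 * q ** 2 + rng.gauss(0.0, 0.1) for q in log_q]

ones = [1.0] * n
log_q_sq = [q ** 2 for q in log_q]

linear = least_squares([ones, log_q], log_c)
quadratic = least_squares([ones, log_q, log_q_sq], log_c)

s_linear = regression_s([ones, log_q], log_c, linear)
s_quadratic = regression_s([ones, log_q, log_q_sq], log_c, quadratic)
print("b2 estimate:", round(quadratic[2], 3))
print("s (linear), s (quadratic):", round(s_linear, 3), round(s_quadratic, 3))
```

The quadratic model is still linear in its coefficients, which is why ordinary least squares applies without modification.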
Missing Variable Included
Residuals from the quadratic cost model
[Figure: Residuals Versus logOutput (response is logCost), quadratic cost model; Residual axis from -0.50 to 0.50]
[Figure: Residuals Versus logOutput (response is logCost), residuals from the linear cost model; Residual axis from -1.0 to 2.0]
Heteroscedasticity
Hetero - differences
Scedastic - function, variation around the mean
Arises when y is “proportional” to x
Arises sometimes when there are natural, heterogeneous groups
Heteroscedasticity
[Figure: Scatterplot of Residuals vs Predictions]
Residuals from a regression of salaries on years of experience.
Standard deviation of the residuals seems not to be constant.
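A simple numerical counterpart to eyeballing the plot is to compare the spread of the residuals at low and high predicted values. The sketch below does this on synthetic salary data in which the disturbance standard deviation is assumed to grow with experience. The numbers, the seed, and the helper names are illustrative assumptions, not the data behind the slide.

```python
import math
import random


def fit_line(x, y):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def sd(values):
    """Sample standard deviation."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


rng = random.Random(2)  # fixed seed for reproducibility

# Synthetic data: the noise standard deviation grows with years of experience.
n = 200
years = [30.0 * i / (n - 1) for i in range(n)]
salary = [20000.0 + 1500.0 * t + rng.gauss(0.0, 300.0 * (1.0 + t)) for t in years]

a, b = fit_line(years, salary)
pred = [a + b * t for t in years]
resid = [s - p for s, p in zip(salary, pred)]

# Sort by prediction and compare residual spread in the lower and upper halves.
pairs = sorted(zip(pred, resid))
half = n // 2
sd_low = sd([r for _, r in pairs[:half]])
sd_high = sd([r for _, r in pairs[half:]])
print("residual sd, low predictions :", round(sd_low, 1))
print("residual sd, high predictions:", round(sd_high, 1))
```

A clearly larger spread in one half than the other is the numerical version of the fan shape in a residuals versus predictions plot.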
Problem with the Model?
[Figure: Residuals Versus Output (response is Cost)]
This usually suggests the model should be defined in terms of logs of the variable.
Sometimes Heteroscedasticity Can Be Cured By Taking Logs
[Figure: Scatterplot of Residual vs LogSalary]
Residuals from a regression of logs of salaries on years of experience.
Salary = α e^(βt) e^ε
We will explore this model below.
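Taking logs of the multiplicative salary model gives ln Salary = ln α + βt + ε, which is linear in t with an additive disturbance. The sketch below illustrates the effect on synthetic data generated from that model. The parameter values (α = 20000, β = 0.04, disturbance standard deviation 0.2), the seed, and the helper names are assumptions for illustration. It fits both the level and the log regression and compares how much the residual spread differs between the lower and upper halves of experience.

```python
import math
import random


def fit_line(x, y):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def sd(values):
    """Sample standard deviation."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def spread_ratio(x, y):
    """Residual sd in the upper half of x divided by residual sd in the lower half."""
    a, b = fit_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    half = len(x) // 2  # x is already sorted in this sketch
    return sd(resid[half:]) / sd(resid[:half])


rng = random.Random(4)  # fixed seed for reproducibility

# Synthetic data from Salary = alpha * exp(beta * t) * exp(eps); values are assumptions.
alpha, beta = 20000.0, 0.04
n = 200
t = [30.0 * i / (n - 1) for i in range(n)]
salary = [alpha * math.exp(beta * ti) * math.exp(rng.gauss(0.0, 0.2)) for ti in t]
log_salary = [math.log(s) for s in salary]

ratio_level = spread_ratio(t, salary)
ratio_log = spread_ratio(t, log_salary)
_, beta_hat = fit_line(t, log_salary)

print("spread ratio, salary in levels:", round(ratio_level, 2))
print("spread ratio, log salary      :", round(ratio_log, 2))
print("estimated beta from log model :", round(beta_hat, 4))
```

A ratio near 1 for the log regression, against a larger ratio in levels, is what "cured by taking logs" looks like numerically in this constructed example.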
Should I Worry About Heteroscedasticity?
Not a problem for using least squares to estimate α or β.
But, there is a better method than least squares.
Assessment of the uncertainty of the least squares estimates may be too optimistic.
(Not contagious)
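A small simulation can illustrate two of the points above: the least squares slope stays centered on the true β, while the conventionally reported standard error can understate how much the slope actually varies from sample to sample. Everything in this sketch is an assumption chosen for illustration, including the variance pattern (disturbance standard deviation proportional to x squared), the sample size, and the number of replications. Other variance patterns can give a smaller or even opposite effect.

```python
import math
import random

rng = random.Random(3)  # fixed seed for reproducibility

n, reps = 100, 1000
alpha, beta = 1.0, 2.0
x = [10.0 * i / (n - 1) for i in range(n)]
sigma = [0.1 * xi ** 2 for xi in x]  # assumed form of heteroscedasticity

xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

slopes = []
reported_se = []
for _ in range(reps):
    y = [alpha + beta * xi + rng.gauss(0.0, si) for xi, si in zip(x, sigma)]
    ybar = sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    slopes.append(b)
    # Conventional formula, which assumes a constant disturbance variance.
    reported_se.append(math.sqrt(sse / (n - 2) / sxx))

mean_b = sum(slopes) / reps
actual_sd = math.sqrt(sum((s - mean_b) ** 2 for s in slopes) / (reps - 1))
mean_reported_se = sum(reported_se) / reps

print("true beta, mean of estimates:", beta, round(mean_b, 3))
print("actual sd of slope estimates:", round(actual_sd, 3))
print("average reported std. error :", round(mean_reported_se, 3))
```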
Unusual Data Points
Outliers have (what appear to be) very large disturbances, ε
[Figure: Scatterplot of Weight vs TLength. Wolf weight vs. tail length]
[Figure: Regression of Foreign Box Office on Domestic. The 500 most successful movies]
Overseas = 6.693 + 1.051 Domestic
S 73.0041, R-Sq 52.2%, R-Sq(adj) 52.1%
Outliers (?)
Remember the empirical rule, about 99.7% of observations will lie within mean ± 3 standard deviations? (We show (a+bx) ± 3se below.)
[Figure: Regression of Foreign Box Office on Domestic, with bounds at (a+bx) ± 3se]
Overseas = 6.693 + 1.051 Domestic
S 73.0041, R-Sq 52.2%, R-Sq(adj) 52.1%
These observations might deserve a close look.
Titanic is 8.1 standard deviations from the regression!
Only 0.86% of the 466 observations lie outside the bounds. (We will refine this later.)
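The ± 3se screen amounts to dividing each residual by S and flagging values beyond 3 in absolute value. The sketch below applies that arithmetic with the reported intercept, slope and S. The three (domestic, overseas) pairs are hypothetical values chosen for illustration, not actual movies, and the function names are assumptions.

```python
# Values reported on the fitted line plot.
A = 6.693      # intercept
B = 1.051      # slope on Domestic
S = 73.0041    # standard error of the regression


def standardized_residual(domestic, overseas):
    """Residual from the fitted line, measured in units of S."""
    return (overseas - (A + B * domestic)) / S


def outside_bounds(domestic, overseas, k=3.0):
    """True if the point lies outside (a + b x) +/- k * S."""
    return abs(standardized_residual(domestic, overseas)) > k


# Hypothetical (domestic, overseas) pairs for illustration; not actual movies.
points = [(100.0, 120.0), (300.0, 250.0), (400.0, 1000.0)]
for domestic, overseas in points:
    z = standardized_residual(domestic, overseas)
    print(domestic, overseas, round(z, 2), outside_bounds(domestic, overseas))
```

A flag from this rule is only a prompt to look more closely at the observation, in line with the advice that such points "might deserve a close look".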
Prices paid at auction for Monet paintings vs. surface area (in logs)
logPrice = a + b logArea + e
Not an outlier: Monet chose to paint a small painting.
Possibly an outlier: Why was the price so low?
What to Do About Outliers
(1) Examine the data
(2) Are they due to measurement error or obvious “coding errors”? If so, delete the observations.
(3) Are they just unusual observations? Do nothing.
(4) Generally, resist the temptation to remove outliers, especially if the sample is large. (500 movies is large. 10 wolves is not.)
(5) Question why you think it is an outlier. Is it really?
Regression Options
Minitab’s Opinions
Minitab uses ± 2S to flag “large” residuals.
Influential observations have very large values of | x_i − x̄ |.
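The two flags described above can be sketched in a few lines. This is a minimal illustration, not Minitab's actual rule: it compares raw residuals with ±2s as the slide states, and the cutoff of two standard deviations of x for "far from the mean" is an assumption chosen only for the example. The data are hypothetical.

```python
import math

def fit_line(x, y):
    """Least squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def flag_unusual(x, y, k=2.0):
    """Return (indices with |residual| > k*s, indices with |x - xbar| > k*sd(x))."""
    n = len(x)
    a, b = fit_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))  # standard error of the regression
    xbar = sum(x) / n
    sd_x = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
    large_resid = [i for i, e in enumerate(resid) if abs(e) > k * s]
    far_x = [i for i, xi in enumerate(x) if abs(xi - xbar) > k * sd_x]  # assumed cutoff
    return large_resid, far_x

# Hypothetical data: points on y = 2 + 3x, one shifted up by 20, one with x far from the rest.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]
y = [2 + 3 * xi for xi in x]
y[4] += 20
large_resid, far_x = flag_unusual(x, y)
```

In this example the shifted point is flagged for its residual and the point at x = 30 is flagged for its distance from the mean of x, even though its residual is small.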
On Removing Outliers
Be careful about singling out particular
observations this way.
The resulting model might be a product of your opinions.
Removing outliers might create new outliers
that were not outliers before.
Statistical inferences from the model will be
incorrect.
Mechanically Remove
Outliers?
Removing Outliers Creates Outliers
[Figure: Number of Observations After Removing Outliers. NUMOBS (y-axis, 300 to 450) vs. ITER (x-axis, 0 to 30).]
Were they really outliers?
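The way mechanical trimming keeps producing new outliers can be sketched with simulated data. This is an illustration only: the data are random draws from a well-behaved linear model (an assumption, not the movie data), and the ±2s rule is applied repeatedly after refitting.

```python
import math
import random

def fit_line(x, y):
    """Least squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def trim_once(x, y, k=2.0):
    """Refit, then drop every observation whose residual exceeds k*s in absolute value."""
    a, b = fit_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (len(x) - 2))
    keep = [i for i, e in enumerate(resid) if abs(e) <= k * s]
    return [x[i] for i in keep], [y[i] for i in keep]

def trimming_path(x, y, iterations=30, k=2.0):
    """Number of observations remaining after each round of trimming."""
    counts = [len(x)]
    for _ in range(iterations):
        if len(x) < 4:  # too few points left to refit sensibly
            break
        x, y = trim_once(x, y, k)
        counts.append(len(x))
    return counts

# Simulated sample: 450 observations with normal errors and no true outliers.
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(450)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]
counts = trimming_path(x, y)
```

Each round shrinks s, so points that were inside ±2s before now fall outside it. The sample keeps shrinking even though none of the simulated observations was an outlier to begin with.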
Normal Distribution of e_i?
[Figure: Probability Plot of e_movies, Normal - 95% CI. Mean = -4.89299E-14, StDev = 72.93, N = 466, AD = 4.307, P-Value < 0.005. Axes: Percent vs. e_movies.]
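The AD value reported with the probability plot is the Anderson-Darling statistic, which measures how far the residuals are from a normal distribution; larger values indicate a worse fit. A minimal sketch of the statistic follows, assuming the mean and standard deviation are estimated from the sample. It does not reproduce Minitab's p-value calculation, and the two samples are artificial grids built only for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def log_normal_cdf(z):
    """Log of the standard normal CDF, computed with erfc for accuracy in the tails."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))

def anderson_darling(values):
    """Anderson-Darling A-squared statistic for normality with estimated mean and sd."""
    n = len(values)
    m, s = mean(values), stdev(values)
    z = sorted((v - m) / s for v in values)
    total = 0.0
    for i in range(1, n + 1):
        # log F(z_(i)) + log(1 - F(z_(n+1-i))), using 1 - F(z) = F(-z)
        total += (2 * i - 1) * (log_normal_cdf(z[i - 1]) + log_normal_cdf(-z[n - i]))
    return -n - total / n

# Artificial samples: an evenly spaced grid of normal quantiles, and a skewed transform of it.
grid = [NormalDist().inv_cdf((i - 0.5) / 200) for i in range(1, 201)]
ad_normal = anderson_darling(grid)
ad_skewed = anderson_darling([math.exp(v) for v in grid])
```

The statistic is close to zero for the normal-looking grid and much larger for the skewed sample, which is the pattern behind a large AD and a small p-value.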
Probability Plot
Graph -> Probability Plots …
Using and Interpreting the Model
Interpreting the linear model
Semilog and growth models
Log-log model and elasticities
Statistical Cost Analysis
[Figure: Fitted Line Plot, Cost = 2.444 + 0.005291 Output. S = 20.5111, R-Sq = 92.4%, R-Sq(adj) = 92.3%. Axes: Cost vs. Output.]
Generation cost ($M) and output (Millions of KWH) for 124 American electric utilities (1970).
The units of the LHS and RHS must be the same.
$M cost = a + b MKWH
Y = $ cost
a = $ cost = 2.444 $M
b = $M/MKWH = 0.005291 $M/MKWH
So:
a = fixed cost = total cost if MKWH = 0
b = marginal cost = dCost/dMKWH
b * MKWH = variable cost
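As a small worked example of this interpretation, the fitted coefficients can be used to split predicted cost into its fixed and variable parts. The coefficients come from the fitted line above; the function names and the output level of 10,000 MKWH are chosen here only for illustration.

```python
A_FIXED = 2.444        # $M, intercept of the fitted line (fixed cost)
B_MARGINAL = 0.005291  # $M per MKWH, slope of the fitted line (marginal cost)

def variable_cost(mkwh):
    """Variable cost in $M: marginal cost times output."""
    return B_MARGINAL * mkwh

def predicted_cost(mkwh):
    """Predicted total cost in $M: fixed cost plus variable cost."""
    return A_FIXED + variable_cost(mkwh)

# Illustrative output level (hypothetical): 10,000 MKWH.
output = 10_000
variable = variable_cost(output)   # about 52.91 $M
total = predicted_cost(output)     # about 55.354 $M
```

Both sides of the equation are in $M: the slope carries units of $M per MKWH, so multiplying by MKWH gives $M.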
Semilog Models and Growth Rates
[Figure: Fitted Line Plot, LogSalary = 9.841 + 0.04998 YEARS. S = 0.154111, R-Sq = 86.4%, R-Sq(adj) = 86.1%. Axes: LogSalary vs. YEARS.]
LogSalary = 9.84 + 0.05 Years + e
Salary = e^{9.84 + 0.05 Years}
Years = 0 at Starting Salary
Salary = e^{9.84} = 18,770
Marginal change. From year t to year t+1, log Salary goes up by 0.05. Salary changes from e^{9.84+0.05Y} to e^{9.84+0.05(Y+1)}.
Growth in a Semilog Model
logSalary = 9.84 + 0.05 Years + e
Salary = e^{9.84 + 0.05 Years}
Years = 0 at Starting Salary = 18,770
Marginal change. From year t to year t+1, log Salary goes up by 0.05. Salary changes from e^{9.84+0.05Y} to e^{9.84+0.05(Y+1)}. The proportional salary change is
\[
\frac{\text{Salary}(t+1) - \text{Salary}(t)}{\text{Salary}(t)}
= \frac{\text{Salary}(t+1)}{\text{Salary}(t)} - 1
= \frac{e^{9.84+0.05(Y+1)}}{e^{9.84+0.05Y}} - 1
= e^{0.05} - 1 = 0.0513.
\]
Conclude: The slope is approximately the growth rate per period, or the proportional increase for a 1 unit change in the "x."
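The arithmetic in this slide can be checked directly. The sketch below uses the rounded coefficients 9.84 and 0.05 from the slide; the function names are chosen here for illustration.

```python
import math

A = 9.84  # intercept of the fitted semilog model (rounded, as on the slide)
B = 0.05  # slope: change in log Salary per year

def salary(years):
    """Predicted salary from logSalary = A + B * Years."""
    return math.exp(A + B * years)

def growth_rate(slope):
    """Exact proportional change in Salary for a one unit change in Years."""
    return math.exp(slope) - 1.0

starting_salary = salary(0)       # about 18,770
exact_growth = growth_rate(B)     # about 0.0513, slightly above the slope 0.05
ratio_check = salary(11) / salary(10) - 1.0  # same growth rate at any starting year
```

The exact growth rate does not depend on the year chosen, and for small slopes it is close to the slope itself, which is why the slope is read as the growth rate.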
Using Semilog Models for Trends
[Figure: Scatterplot of Flights vs Month. Flights (y-axis, 0 to 350) vs. Month (x-axis, 0 to 80).]
Frequent Flyer Flights for 72 Months.
(Text, Ex. 11.1, p. 508)
Regression Approach
logFlights = α + β Month + ε
a = 2.770, b = 0.03710, s = 0.2470 (s² = 0.06102)
[Figure: Fitted Line Plot, LogFlights = 2.770 + 0.03710 Month. S = 0.247017, R-Sq = 90.9%, R-Sq(adj) = 90.8%. Axes: LogFlights vs. Month.]
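The regression approach amounts to taking logs of the series and fitting a straight line against time. The sketch below shows the mechanics on synthetic data: 72 monthly values placed exactly on the reported fitted line, which is an assumption made for illustration and not the textbook's flight data.

```python
import math

def fit_line(x, y):
    """Least squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

# Synthetic series: values lying exactly on logFlights = 2.770 + 0.03710 * Month.
months = list(range(1, 73))
flights = [math.exp(2.770 + 0.03710 * m) for m in months]

# Semilog regression: regress log(Flights) on Month.
a, b = fit_line(months, [math.log(f) for f in flights])

# Implied trend growth per month, from the slope.
monthly_growth = math.exp(b) - 1.0
```

With a slope of 0.0371, the implied trend growth is roughly 3.8% per month.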
Elasticity and Loglinear Models
logY = α + βlogx + ε
The “responsiveness” of one variable to changes
in another
E.g., in economics
demand elasticity = (%ΔQ) / (%ΔP)
Math: Ratio of percentage changes
%ΔQ / %ΔP = {100%[(ΔQ)/Q]} / {100%[(ΔP)/P]}
Units of measurement and the 100% fall out of this equation.
Elasticity = (ΔQ/ΔP)*(P/Q)
Elasticities are unit-free.
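A short numerical sketch of these two points: the elasticity does not change when the units of measurement change, and in a loglinear model the slope in logs is the elasticity. All numbers below are hypothetical and the function names are chosen for illustration.

```python
import math

def elasticity(q0, q1, p0, p1):
    """Ratio of percentage changes: (dQ/Q) / (dP/P), evaluated at the starting point."""
    return ((q1 - q0) / q0) / ((p1 - p0) / p0)

def log_slope(q0, q1, p0, p1):
    """Slope of log Q against log P between two points."""
    return (math.log(q1) - math.log(q0)) / (math.log(p1) - math.log(p0))

# Hypothetical demand change: price rises from 10 to 11, quantity falls from 100 to 95.
e_dollars = elasticity(100, 95, 10, 11)
# Same change with price in cents and quantity in thousands of units.
e_rescaled = elasticity(0.100, 0.095, 1000, 1100)

# Hypothetical constant elasticity demand curve Q = 50 * P**(-0.5).
def demand(p):
    return 50.0 * p ** -0.5

beta = log_slope(demand(4.0), demand(9.0), 4.0, 9.0)
```

Both versions of the first calculation give an elasticity of -0.5, and the slope in logs for the demand curve recovers the exponent -0.5.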
Monet Regression
Summary
Residual analysis
Consistent with model assumptions?
Suggest missing elements in the
model
Building the regression model
Interpreting the model – cost function
Growth model – semilog
Double log and estimating elasticities