# Slides FINALgrp 1


```
What decides the price of used cars?

Group 1

Jessica Aguirre
Keith Cody
Rui Feng
Jennifer Griffeth
Joonhee Lee
Hans-Jakob Lothe
Teng Wang
How we got data

   Collected from kbb.com (Kelley Blue Book)
   Used random number generator

   First collected 140 sets of data from various
types of cars
   Then collected 160 sets of data from
Toyota Camrys
Brand Population

Brand         Vehicles
Toyota        1350
Nissan        1347
Ford          1323
Chevrolet     1069
Honda          840
Dodge          433
Volkswagen     425

6787 Vehicles in Population
Models

Model type    Vehicles
Sedan         2807
SUV           1463
Pick-Up       1059
Coupe          383
Hatchback      382
Minivan        242
Convertible    188

6524 Model Types in Population
Average Selling Price by Brand
[Bar chart, y-axis $0 to $25,000. Brands on the x-axis: Chevrolet, Toyota,
Nissan, Ford, Volkswagen, Honda, Dodge. Bar labels in descending order:
$21,513.60, $20,543.20, $20,451.60, $19,968.70, $16,441.10, $15,465.60,
$14,677.65]
Assumptions

   Random sample is representative of
population
   All prices are the selling price
   Residuals are homoskedastic
   Residuals are normally distributed
   The variables we choose affect the price of
used cars: age, color, etc
Preparations

   Created dummy variables
   e.g. Transmission, automatic = 0, manual = 1
   Color
   Type
   Engine (coded by cylinder count: 4 cylinder = 4, V8 = 8, etc)
All Cars: Regression of price against independent
variables (age, color, engine, miles and transmission)
Dependent Variable: PRICE
Method: Least Squares
Date: 11/29/10 Time: 16:43
Sample: 1 140
Included observations: 140

Variable        Coefficient   Std. Error   t-Statistic   Prob.
AGE             -671.2805     191.5316     -3.504803     0.0006
COLOR            151.6366     156.6386      0.968067     0.3348
ENGINE           1793.689     292.1268      6.140105     0.0000
MILES           -3798.259     590.6794     -6.430323     0.0000
TRANSMISSION     1462.702     1248.129      1.171916     0.2433
C                48055.05     6425.417      7.478900     0.0000

R-squared            0.555991    Mean dependent var      16859.54
Adjusted R-squared   0.539424    S.D. dependent var      6831.670
S.E. of regression   4636.365    Akaike info criterion   19.76316
Sum squared resid    2.88E+09    Schwarz criterion       19.88923
Log likelihood      -1377.421    F-statistic             33.55917
Durbin-Watson stat   1.676795    Prob(F-statistic)       0.000000

All Cars: Regression of price against
significant independent variables (p<0.05)
Dependent Variable: PRICE
Method: Least Squares
Date: 11/29/10 Time: 16:40
Sample: 1 140
Included observations: 140

Variable   Coefficient   Std. Error   t-Statistic   Prob.
AGE        -630.8791     190.0333     -3.319834     0.0012
ENGINE      1751.229     289.7728      6.043457     0.0000
LNMILE     -4013.936     576.0638     -6.967866     0.0000
C           51372.11     6106.150      8.413175     0.0000

R-squared            0.547257    Mean dependent var      16859.54
Adjusted R-squared   0.537271    S.D. dependent var      6831.670
S.E. of regression   4647.190    Akaike info criterion   19.75407
Sum squared resid    2.94E+09    Schwarz criterion       19.83812
Log likelihood      -1378.785    F-statistic             54.79716
Durbin-Watson stat   1.655862    Prob(F-statistic)       0.000000

Price = -630.8791*AGE + 1751.229*ENGINE - 4013.936*LNMILE + 51372.11
Some reasons why this model fails

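   Color is assigned an arbitrary number (red = 9, blue = 7, etc), so its
coefficient has no meaningful interpretation
   Engines: coding by cylinder count (4 cylinder = 4, V8 = 8) assumes the
price effect of a V8 is exactly twice that of a 4 cylinder
   We suspect that pooling many different car models leads to the low R-squared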
Our solution: New model

    New model where we look at one model and brand (Toyota
Camry), only two engines (4 cylinder and 6 cylinder), and
disregard color
    Dummy variable for engine: 6 cylinder = 1, 4 cylinder = 0
    We also introduce a new variable called trim
    Dummy variable for trim: luxury = 1, standard = 0

    Toyota Camry
o Most Popular Car in America*

* Motor Trend
http://www.motortrend.com/features/auto_news/2010/112_1004_america_top_10_best_selling_vehicle_comparison_2009_2000/index.html
Camry Price Histogram
[Histogram: Frequency (y-axis, 0 to 40) by Price (x-axis); the tallest bin
holds 37 cars]
Toyota Camry: Regression of price against independent
variables (age, engine, mileage, trim and transmission)

Dependent Variable: PRICE
Method: Least Squares
Date: 11/29/10 Time: 20:09
Sample: 1 160
Included observations: 160

Variable        Coefficient   Std. Error   t-Statistic   Prob.
AGE             -625.4328     64.45118     -9.703978     0.0000
ENGINE           917.0942     324.9508      2.822256     0.0054
MILEAGE         -0.051027     0.005406     -9.438689     0.0000
TRIM             1972.208     309.7351      6.367400     0.0000
TRANSMISSION     967.9415     1104.742      0.876170     0.3823
C                17888.66     1141.740      15.66789     0.0000

R-squared            0.828216    Mean dependent var      14937.87
Adjusted R-squared   0.822638    S.D. dependent var      3587.486
S.E. of regression   1510.845    Akaike info criterion   17.51551
Sum squared resid    3.52E+08    Schwarz criterion       17.63082
Log likelihood      -1395.240    F-statistic             148.4947
Durbin-Watson stat   1.275033    Prob(F-statistic)       0.000000

Toyota Camry: Regression of price against
independent variables (age, engine, mileage and trim)
Dependent Variable: PRICE
Method: Least Squares
Date: 11/29/10 Time: 20:10
Sample: 1 160
Included observations: 160

Variable   Coefficient   Std. Error   t-Statistic   Prob.
AGE        -631.9880     63.96748     -9.879833     0.0000
ENGINE      949.8378     322.5527      2.944753     0.0037
MILEAGE    -0.051251     0.005396     -9.497941     0.0000
TRIM        1977.688     309.4398      6.391189     0.0000
C           18866.11     242.7181      77.72850     0.0000

R-squared            0.827359    Mean dependent var      14937.87
Adjusted R-squared   0.822904    S.D. dependent var      3587.486
S.E. of regression   1509.713    Akaike info criterion   17.50798
Sum squared resid    3.53E+08    Schwarz criterion       17.60408
Log likelihood      -1395.638    F-statistic             185.7048
Durbin-Watson stat   1.286429    Prob(F-statistic)       0.000000

Price = -631.9880 * AGE + 949.8378 * ENGINE -0.051251 * MILEAGE + 1977.688 * TRIM + 18866.11
All Cars: mileage against price
[Scatter plot: MILEAGE (y-axis, 0 to 120,000) against PRICE (x-axis,
0 to 40,000); R-squared ≈ 22%]
Toyota Camrys: mileage against price
[Scatter plot: MILEAGE (y-axis, 0 to 200,000) against PRICE (x-axis,
0 to 25,000); R-squared ≈ 66%]
Alternative Model
PRICE^(1/2) = -0.0002263673136*MILEAGE + 4.59824795*ENGINE - 2.952776402*AGE +
7.704044111*TRIM + 139.1536581
Dependent Variable: NewPRICE (= PRICE^(1/2))
Method: Least Squares
Date: 11/30/10 Time: 12:16
Sample: 1 160
Included observations: 160

Variable   Coefficient   Std. Error   t-Statistic   Prob.
MILEAGE    -0.000226     2.23E-05     -10.16426     0.0000
ENGINE      4.598248     1.331262      3.454051     0.0007
AGE        -2.952776     0.264011     -11.18429     0.0000
TRIM        7.704044     1.277142      6.032253     0.0000
C           139.1537     1.001764      138.9087     0.0000

R-squared            0.851118    Mean dependent var      121.1827
Adjusted R-squared   0.847276    S.D. dependent var      15.94425
S.E. of regression   6.230994    Akaike info criterion   6.527700
Sum squared resid    6017.920    Schwarz criterion       6.623799
Log likelihood      -517.2160    F-statistic             221.5239
Durbin-Watson stat   1.222753    Prob(F-statistic)       0.000000
New Price vs Original Price
[Two line plots of residual, actual and fitted values for observations
1 to 160: left panel for New Price (PRICE^(1/2)), right panel for
Original Price]
Conclusions

   As expected, older, higher mileage cars cost less
   Bigger engines and nicer levels of trim cost more
   Our Camry model explains about 82% of the variation in price
(adjusted R-squared 0.823)
What we learned from this project

   Communication can be difficult
   EViews is amazingly fun and can be useful in
analyzing social and economic phenomena

   Thanks!

```
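
The fitted Camry equations on the slides can be turned into a small price calculator. The sketch below is only an illustration: the coefficients are copied from the four-variable Camry regression and from the alternative PRICE^(1/2) model, while the function names and the example car (5 years old, 4 cylinder, 60,000 miles, standard trim) are assumptions made here, not part of the project. It also recomputes adjusted R-squared and the F-statistic from R-squared, with n = 160 observations and k = 4 regressors (intercept excluded), as a check on the reported output.

```python
# Coefficients copied from the Toyota Camry regression (age, engine, mileage, trim).
LINEAR = {"AGE": -631.9880, "ENGINE": 949.8378, "MILEAGE": -0.051251,
          "TRIM": 1977.688, "C": 18866.11}

# Coefficients of the alternative model, whose dependent variable is PRICE^(1/2).
SQRT = {"AGE": -2.952776402, "ENGINE": 4.59824795, "MILEAGE": -0.0002263673136,
        "TRIM": 7.704044111, "C": 139.1536581}


def fitted_value(coef, age, engine, mileage, trim):
    """Intercept plus the coefficient-weighted regressors.

    engine: 1 for 6 cylinder, 0 for 4 cylinder; trim: 1 for luxury, 0 for standard.
    """
    return (coef["C"] + coef["AGE"] * age + coef["ENGINE"] * engine
            + coef["MILEAGE"] * mileage + coef["TRIM"] * trim)


def predict_linear(age, engine, mileage, trim):
    """Predicted price in dollars from the linear model."""
    return fitted_value(LINEAR, age, engine, mileage, trim)


def predict_sqrt(age, engine, mileage, trim):
    """Predicted price in dollars from the square-root model.

    The model fits PRICE^(1/2), so the fitted value is squared to get dollars.
    """
    return fitted_value(SQRT, age, engine, mileage, trim) ** 2


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors (no intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F-statistic implied by R-squared."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


if __name__ == "__main__":
    # Hypothetical car chosen for illustration only.
    car = dict(age=5, engine=0, mileage=60_000, trim=0)
    print("linear model price:", round(predict_linear(**car), 2))
    print("sqrt model price:  ", round(predict_sqrt(**car), 2))
    print("adjusted R-squared:", round(adjusted_r2(0.827359, 160, 4), 6))
    print("F-statistic:       ", round(f_statistic(0.827359, 160, 4), 2))
```

For this example car the two models give similar prices, roughly $12,600 from the linear model and $12,300 from the square-root model. The recomputed adjusted R-squared and F-statistic agree with the EViews output for the four-variable Camry model up to rounding.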