Oregon Property Tax Estimator

					Business Statistics I
     MGT 515
  Intro Statistics
             Simple examples
• Should male drivers be charged higher auto
  insurance premiums than female drivers?
  – Types of cars driven
• What is the probability of mortgage default?
  – How is it conditioned on the macroeconomic
    environment, such as unemployment, GDP, ...?
• Do lower taxes attract businesses and create
  employment growth?
           Population versus Sample
• Observing the entire population is too costly
   – Drawing a small sample out of the population usually
     significantly reduces the cost
• Population
   – All observations
      • All US drivers constitute the population of US drivers
• Sample
   – A small selection drawn from the population
      • You, the students in this class, constitute a small sample of US
        drivers.
• Observation
   – Each driver represents an observation
   – Each observation contains information on one or more
     variables
      • Driver information: gender, age, type of car driven, ...
           – Variables: GENDER, AGE, TYPE OF CAR
       Population versus Sample
• Parameter
   – Population characteristic
   – Population mean is a parameter
      • Average age of US drivers
• Statistic
   – Sample characteristic
   – Sample mean is a statistic
      • Average age of the drivers in our class
                 Types of variables
• Continuous
   – Age of the driver
   – Miles driven per year
• Discrete Variables (integer count)
   – Number of cars in the household
• Categorical (Qualitative)
   – Driver Gender
   – Type of vehicle (sedan versus SUV)
      • Treated as binary variables
          – Quantifying qualitative variables
• Ranked (ordered) data
   – Censored data
      • Income (given in brackets rather than in the exact values)
                  US States
• Population: 50 States + DC
• Each State is an Observation
• Sample would be a selection of several states
                                   High Income    High Income Tax    Sales Tax     Private Employment    Private Employment    Employment    No Income Tax
Observation           State        Tax Rate (%)   Rate Bracket ($)   Rate (%)          1999 (000)            2008 (000)         growth %       (1 = yes)
     1            ALABAMA               5             3,000             4           1568.7        1610.9       2.6901256         0
     2             ALASKA               0                               0            204.1         239.5       17.344439         1
     3            ARIZONA             4.54          150,000            5.6           1809         2182.8        20.66335         0
     4           ARKANSAS               7            31,000             6            954.3         989.9       3.7304831         0
     5          CALIFORNIA            9.3            44,815           7.25          11752.4       12474.8      6.1468296         0
     6           COLORADO             4.63                             2.9          1804.3        1965.2       8.9175858         0
     7         CONNECTICUT              5            10,000             6            1434         1447.1       0.9135286         0
     8           DELAWARE             5.95           60,000             0            357.8         370.5        3.549469         0
     9            FLORIDA               0                               6           5850.4        6635.7       13.423014         1
    10            GEORGIA               6             7,000             4            3265          3409        4.4104135         0
    11              HAWAII            8.25           48,000             4            422.2         494.2       17.053529         0
    12              IDAHO             7.8            24,736             6            433.7         529.2       22.019829         0
    13             ILLINOIS             3                             6.25          5132.9        5093.3      -0.7714937         0
    14             INDIANA            3.4                               6           2572.4        2518.3      -2.1030944         0
    15               IOWA             8.98           62,055             5           1229.1        1270.3       3.3520462         0
    16             KANSAS             6.45           30,000            5.3          1088.8        1130.9       3.8666422         0
    17           KENTUCKY               6            75,000             6           1494.3        1531.6        2.496152         0
    18           LOUISIANA              6            25,000             4           1523.7        1576.7        3.478375         0
    19              MAINE             8.5            19,450             5            489.6         511.8       4.5343137         0
    20           MARYLAND             5.5           500,000             6           1947.5        2111.1       8.4005135         0
    21        MASSACHUSETTS           5.3                               5           2814.5        2847.9       1.1867117         0
    22            MICHIGAN            4.35                              6           3917.7        3511.3      -10.373433         0
    23          MINNESOTA             7.85           71,591            6.5          2225.6        2340.7       5.1716391         0
    24          MISSISSIPPI             5            10,000             7            926.1         898.8      -2.9478458         0
    25            MISSOURI              6             9,000          4.225          2305.5        2346.3       1.7696812         0
    26            MONTANA             6.9            14,900             0            301.4         358.5       18.944924         0
    27           NEBRASKA             6.84           27,001            5.5           742.5         800.8       7.8518519         0
    28             NEVADA               0                              6.5           865.6        1104.7       27.622458         1
    29        NEW HAMPSHIRE             0                               0            524.2         550.9       5.0934758         1
    30          NEW JERSEY            8.97          500,000             7           3323.5        3407.1       2.5154205         0
    31          NEW MEXICO            5.3            16,000             5            549.4         649.2       18.165271         0
    32           NEW YORK             6.85           20,000             4           7013.6        7282.7       3.8368313         0
    33        NORTH CAROLINA          7.75           60,000           4.25          3252.1        3422.5       5.2396913         0
    34         NORTH DAKOTA           5.54          349,701             5            252.7         290.9       15.116739         0
    35               OHIO             6.24          200,000            5.5          4791.4        4571.5      -4.5894728         0
    36           OKLAHOMA             5.5             8,701            4.5          1170.3        1270.1       8.5277279         0
    37            OREGON                9             7,300             0           1313.6         1422        8.2521315         0
    38         PENNSYLVANIA           3.07                              6           4870.5        5051.6       3.7183041         0
    39          RHODE ISLAND      25% of federal                        7           402.1         418.3       4.0288485         0
    40         SOUTH CAROLINA           7            13,350             6            1515         1583.3      4.5082508         0
    41          SOUTH DAKOTA            0                               4             300         335.4           11.8          1
    42           TENNESSEE              0                               7           2295.1         2350       2.3920526         1
    43               TEXAS              0                             6.25          7625.5        8839.8      15.924202         1
    44               UTAH               5                             4.65            869         1043.2       20.04603         0
    45            VERMONT             9.5           357,700             6           243.9         252.1       3.3620336         0
    46             VIRGINIA           5.75           17,000             5           2801.1        3062.9      9.3463282         0
    47           WASHINGTON             0                              6.5          2174.4         2414       11.019132         1
    48          WEST VIRGINIA         6.5            60,000             6           585.1         614.4        5.007691         0
    49            WISCONSIN           6.75          145,460             5           2385.1        2449.7      2.7084818         0
    50            WYOMING               0                               4           173.6           229       31.912442         1
    51        DIST. OF COLUMBIA       8.5            40,000           5.75          404.9         470.2       16.127439         0
Visually Presenting The Data
[Figure: Employment growth % for observations 1 to 51. Private Employment Growth by State, 1999 – 2008, as % of State's starting employment values]
[Figure: States by Growth Category. Frequency chart with large categories: <0, 0 to 10, >10]
                                         Frequency Distribution

     Category        Count      Proportion   Cumulative
                                 (P.D.F.)     (C.D.F.)
      -10..-5          1        0.019608      0.019608
       -5..0           4        0.078431      0.098039
        0..5          20        0.392157      0.490196
       5..10          11        0.215686      0.705882
      10..15           3        0.058824      0.764706
      15..20           7        0.137255      0.901961
      20..25           3        0.058824      0.960784
      25..30           1        0.019608      0.980392
      30..50           1        0.019608          1
      Total           51            1

          P.D.F.: Probability Density Function (the proportion of observations in each category)
          C.D.F.: Cumulative Distribution Function (the running total of the proportions)

[Figure: Proportion by category, -10..-5 through 30..50]
[Figure: Cumulative proportion by category]
[Figure: Count by category, -10..-5 through 30..50]
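A minimal Python sketch of how the Count, Proportion and Cumulative columns relate to each other. The counts are copied from the frequency distribution table; the variable names are illustrative assumptions, not part of the slides.

```python
from itertools import accumulate

# Category counts copied from the frequency distribution table.
categories = ["-10..-5", "-5..0", "0..5", "5..10", "10..15",
              "15..20", "20..25", "25..30", "30..50"]
counts = [1, 4, 20, 11, 3, 7, 3, 1, 1]

total = sum(counts)                          # 51 observations
proportions = [c / total for c in counts]    # P.D.F. column
cumulative = list(accumulate(proportions))   # C.D.F. column (running total)

for cat, c, p, cum in zip(categories, counts, proportions, cumulative):
    print(f"{cat:>8} {c:>3} {p:.6f} {cum:.6f}")
```

Each proportion is the category count divided by the total, and each cumulative value adds the current proportion to everything before it, so the last cumulative value is 1.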
P.D.F. and C.D.F. Uniform Distribution

           Raw Data                        Sorted Data                    P.D.F.       C.D.F.
Observation ID        X   Observation ID          X      Count   Proportion   Cumulative
      1               4         2                 3       1      0.142857     0.142857
      2               3         1                 4       1      0.142857     0.285714
      3               5         3                 5       1      0.142857     0.428571
      4               7         5                 6       1      0.142857     0.571429
      5               6         4                 7       1      0.142857     0.714286
      6               9         7                 8       1      0.142857     0.857143
      7               8         6                 9       1      0.142857        1




 Range of the distribution: Maximum – Minimum = 9 – 3
 Number of Observations: 7
 P.D.F. = 1/7 = 1/N, as each observation has equal weight of one
 C.D.F. = (X – min+1)/(max-min+1) Discrete Variable Case
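The two formulas can be checked against the table with a short Python sketch. The data are the seven raw X values from the slide; the function name `cdf` is an illustrative assumption.

```python
# Raw data from the slide (observation IDs 1..7).
x = [4, 3, 5, 7, 6, 9, 8]

n = len(x)
lo, hi = min(x), max(x)

pdf = 1 / n  # every observation carries the same weight, 1/N


def cdf(value):
    """Discrete uniform C.D.F. on the integers lo..hi, as given on the slide."""
    return (value - lo + 1) / (hi - lo + 1)


for value in sorted(x):
    print(value, round(pdf, 6), round(cdf(value), 6))
```

The printed rows should reproduce the Proportion and C.D.F. columns of the sorted table.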
P.D.F. of the Normal Distribution

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$
             Descriptive Measures
• The Mean
  – Arithmetic Mean
     • Grade Point Average
  – Weighted Mean
     • Consumer Price Index
• The Median
  – Center of the sorted (ranked) distribution
  – If the number of observations is odd, the middle ranked
    value is the median
  – If the number of observations is even, the median is the
    average of the two middle observations
• Mode
  – Most frequent occurrence
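A small Python sketch of the three measures, using the standard `statistics` module. All numbers here are hypothetical illustration values and weights, not data from the slides.

```python
import statistics

# Hypothetical values and weights (not from the slides).
values = [4.0, 3.0, 3.0, 2.0]
weights = [3, 4, 3, 2]

arithmetic_mean = statistics.mean(values)
weighted_mean = sum(v * w for v, w in zip(values, weights)) / sum(weights)

odd = [3, 4, 5, 6, 7, 8, 9]   # 7 observations: the median is the middle value
even = [3, 4, 5, 6, 7, 8]     # 6 observations: the median averages 5 and 6

print(arithmetic_mean, weighted_mean)
print(statistics.median(odd), statistics.median(even))
print(statistics.mode(values))  # most frequent value
```

The weighted mean differs from the arithmetic mean whenever the weights are unequal, which is why an index built from a basket of items uses it.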
                Geometric Mean

$$ \bar{x} = (x_1 \times x_2 \times \cdots \times x_n)^{1/n} $$

One application is in the computation of average rate of return

$$ 1 + i = \left[ (1 + i_1) \times \cdots \times (1 + i_n) \right]^{1/n} $$

                  interest          1+int
                   0.05             1.05
                   0.06             1.06
                   0.05             1.05
                   0.06             1.06
                   0.055            1.055

                  Product of (1+int)      1.306901
                  Geometric Mean          1.054991
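The average rate of return can be reproduced with a few lines of Python. This is a sketch that assumes the five interest rates listed on the slide are the inputs; `math.prod` requires Python 3.8 or later.

```python
import math

rates = [0.05, 0.06, 0.05, 0.06, 0.055]   # interest rates from the slide
growth = [1 + r for r in rates]           # the "1+int" column

product = math.prod(growth)
geometric_mean = product ** (1 / len(growth))

print(round(product, 6))   # 1.306901
print(geometric_mean)      # close to the 1.054991 shown on the slide
```

Subtracting 1 from the geometric mean gives the average rate of return per period, roughly 5.5%.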
                 Splitting the data into QUARTILES

$$ Q_i = \frac{i\,(n + 1)}{4} $$

Where i represents the ith quartile and n represents the number of RANKED observations

Interquartile Range: Q3 – Q1: the middle 50% of the ranked observations

The ranked observations are split into four groups: First Quartile, Second Quartile, Third Quartile, Fourth Quartile

Observation    Employment growth %
    22            -10.37
    35             -4.59
    24             -2.95
    14             -2.10
    13             -0.77
    7               0.91
    21              1.19
    25              1.77
    42              2.39
    17              2.50
    30              2.52
    1               2.69
    49              2.71
    15              3.35
    45              3.36
    18              3.48
    8               3.55
    38              3.72
    4               3.73
    32              3.84
    16              3.87
    39              4.03
    10              4.41
    40              4.51
    19              4.53
    48              5.01
    29              5.09
    23              5.17
    33              5.24
    5               6.15
    27              7.85
    37              8.25
    20              8.40
    36              8.53
    6               8.92
    46              9.35
    47             11.02
    41             11.80
    9              13.42
    34             15.12
    43             15.92
    51             16.13
    11             17.05
    2              17.34
    31             18.17
    26             18.94
    44             20.05
    3              20.66
    12             22.02
    28             27.62
    50             31.91
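A Python sketch of the quartile position formula i(n + 1)/4 applied to the 51 ranked growth values. The helper names are illustrative, and the linear interpolation for fractional ranks is an assumption added for completeness; with n = 51 every quartile position is a whole number, so it is not exercised here.

```python
# Employment growth % for the 51 observations, already ranked (from the slide).
growth = [
    -10.37, -4.59, -2.95, -2.10, -0.77, 0.91, 1.19, 1.77, 2.39, 2.50,
    2.52, 2.69, 2.71, 3.35, 3.36, 3.48, 3.55, 3.72, 3.73, 3.84,
    3.87, 4.03, 4.41, 4.51, 4.53, 5.01, 5.09, 5.17, 5.24, 6.15,
    7.85, 8.25, 8.40, 8.53, 8.92, 9.35, 11.02, 11.80, 13.42, 15.12,
    15.92, 16.13, 17.05, 17.34, 18.17, 18.94, 20.05, 20.66, 22.02, 27.62,
    31.91,
]


def quartile_position(i, n):
    """Rank of the i-th quartile among n ranked observations: i(n + 1)/4."""
    return i * (n + 1) / 4


def value_at(rank, ranked):
    """Value at a 1-based rank; fractional ranks are interpolated (assumption)."""
    lower = int(rank)
    frac = rank - lower
    if frac == 0:
        return ranked[lower - 1]
    return ranked[lower - 1] + frac * (ranked[lower] - ranked[lower - 1])


n = len(growth)
positions = [quartile_position(i, n) for i in (1, 2, 3)]
q1, q2, q3 = (value_at(p, growth) for p in positions)
iqr = q3 - q1

print(positions)              # [13.0, 26.0, 39.0]
print(q1, q2, q3, round(iqr, 2))
```

Under this formula the quartiles fall on the 13th, 26th and 39th ranked observations, and the interquartile range is the distance between the 13th and 39th values.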
     Shape of the Distribution
• Mean
• Median
• Spread
                       Frequency                                          PDF
Category  INTC   HPQ     T    VZ   CAT    SO         INTC       HPQ         T        VZ       CAT        SO

     -80     1     0     0     0     0     0     0.008333         0         0         0         0         0
     -75     0     0     0     0     0     0            0         0         0         0         0         0
     -70     0     0     0     0     0     0            0         0         0         0         0         0
     -65     0     0     0     0     0     0            0         0         0         0         0         0
     -60     0     0     0     0     0     0            0         0         0         0         0         0
     -55     0     0     0     0     0     0            0         0         0         0         0         0
     -50     1     0     0     0     1     0     0.008333         0         0         0  0.008333         0
     -45     0     1     0     0     0     0            0  0.008333         0         0         0         0
     -40     0     1     0     0     1     0            0  0.008333         0         0  0.008333         0
     -35     1     0     0     0     0     0     0.008333         0         0         0         0         0
     -30     1     0     0     0     0     0     0.008333         0         0         0         0         0
     -25     3     1     0     1     1     0        0.025  0.008333         0  0.008333  0.008333         0
     -20     2     4     2     1     1     0     0.016667  0.033333  0.016667  0.008333  0.008333         0
     -15     5     5     5     0     4     0     0.041667  0.041667  0.041667         0  0.033333         0
     -10     7     7     8    10     6     7     0.058333  0.058333  0.066667  0.083333      0.05  0.058333
      -5    15    13    16    18    13     7        0.125  0.108333  0.133333      0.15  0.108333  0.058333
       0    22    24    31    30    28    31     0.183333       0.2  0.258333      0.25  0.233333  0.258333
       5    25    30    29    34    24    54     0.208333      0.25  0.241667  0.283333       0.2      0.45
      10    14    18    21    20    28    17     0.116667      0.15     0.175  0.166667  0.233333  0.141667
      15    13     8     5     3     7     3     0.108333  0.066667  0.041667     0.025  0.058333     0.025
      20     9     4     2     2     4     1        0.075  0.033333  0.016667  0.016667  0.033333  0.008333
      25     0     3     1     0     1     0            0     0.025  0.008333         0  0.008333         0
      30     1     1     0     1     1     0     0.008333  0.008333         0  0.008333  0.008333         0

   total   120   120   120   120   120   120            1         1         1         1         1         1
                            Shape of the Distribution
                           monthly stock price change
[Figure: six histograms of monthly stock price change, one panel each for INTC, HPQ, T, VZ, CAT and SO; horizontal axis from -80 to 30]

     Data source: Yahoo.com.
     Data: monthly stock price changes from July 1999 – July 2009
[Figure: p.d.f. of monthly stock price change for INTC, HPQ, T, VZ, CAT and SO; horizontal axis from -80 to 30 in steps of 5, vertical axis from 0 to 0.5]
                    Spread
• In the case of stocks, spread is a measure of
  risk and should be captured in option
  pricing
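                                               SPREAD
Difference:       $x_i - \bar{x}$

Sum of Squares:   $\sum_i (x_i - \bar{x})^2$

Sample Variance:                 $$ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} $$

Population Variance:             $$ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} $$

Sample Standard Deviation:       $$ S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$

Population Standard Deviation:   $$ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} $$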
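A short Python sketch of the sample and population formulas side by side. It reuses the seven X values from the uniform distribution slide as illustration data; the variable names are assumptions.

```python
import statistics

x = [4, 3, 5, 7, 6, 9, 8]   # the seven X values from the uniform example

mean = sum(x) / len(x)
sum_of_squares = sum((xi - mean) ** 2 for xi in x)

sample_variance = sum_of_squares / (len(x) - 1)   # divide by n - 1
population_variance = sum_of_squares / len(x)     # divide by N

sample_sd = sample_variance ** 0.5
population_sd = population_variance ** 0.5

# Cross-check against the standard library implementations.
assert abs(sample_variance - statistics.variance(x)) < 1e-12
assert abs(population_variance - statistics.pvariance(x)) < 1e-12

print(sample_variance, population_variance)
print(sample_sd, population_sd)
```

The only difference between the two variances is the divisor, so the sample figure is always the larger of the two for the same data.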
           INTCchange                      HPQchange                        Tchange

 Mean                   -1.27715 Mean                   -0.51072 Mean                  -0.62194
 Standard Error         1.308779 Standard Error          1.07527 Standard Error        0.744948
 Median                 0.542984 Median                 0.592682 Median                -0.24774
 Mode                    #N/A    Mode                    #N/A    Mode                   #N/A
 Standard Deviation     14.33695 Standard Deviation     11.77899 Standard Deviation    8.160501
 Sample Variance        205.5482 Sample Variance        138.7446 Sample Variance       66.59378
 Kurtosis               7.822148 Kurtosis               2.740998 Kurtosis              0.612623
 Skewness               -1.98331 Skewness               -0.91781 Skewness              -0.39071
 Range                  105.4057 Range                  73.20115 Range                 45.68781
 Minimum                -80.1469 Minimum                -47.0547 Minimum               -23.0226
 Maximum                 25.2588 Maximum                26.14648 Maximum               22.66521
 Sum                    -153.258 Sum                    -61.2867 Sum                   -74.6324
 Count                       120 Count                       120 Count                     120



           VZchange                         CATchange                       SOchange

Mean                    -0.48819 Mean                   0.010754 Mean                    0.831933
Standard Error          0.694313 Standard Error         0.992893 Standard Error          0.492301
Median                  0.097096 Median                  0.91357 Median                  1.030665
Mode                     #N/A    Mode                    #N/A     Mode                    #N/A
Standard Deviation      7.605816 Standard Deviation      10.8766 Standard Deviation      5.392892
Sample Variance         57.84844 Sample Variance        118.3003 Sample Variance         29.08328
Kurtosis                2.054396 Kurtosis               6.289378 Kurtosis                1.683112
Skewness                0.065181 Skewness               -1.56712 Skewness                -0.29751
Range                   54.74623 Range                  80.40594 Range                   33.39245
Minimum                 -26.5565 Minimum                  -54.464 Minimum                -13.9316
Maximum                 28.18969 Maximum                 25.9419 Maximum                 19.46083
Sum                     -58.5828 Sum                    1.290532 Sum                     99.83195
Count                        120 Count                        120 Count                       120
      Coefficient of Variation
$$ CV = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100\% = \frac{S}{\bar{x}} \times 100 $$


         INTC    HPQ      T       VZ     CAT      SO

 CV      41.79   38.67   25.48   15.95   52.25   33.92
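The CV formula can be sketched in Python as below. The data are a small illustrative series, not the stock series behind the table, and the function name is an assumption.

```python
import statistics


def coefficient_of_variation(data):
    """Sample standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100


x = [4, 3, 5, 7, 6, 9, 8]   # illustrative data, not the stock series
cv = coefficient_of_variation(x)
print(round(cv, 2))
```

Because the CV is scaled by the mean, it allows the spread of series with very different levels to be compared on one scale.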
                     Skewness of the Distribution
             INTCchange

Mean                       -1.27715
Standard Error            1.308779
Median                    0.542984
Mode                      #N/A
Standard Deviation        14.33695
Sample Variance           205.5482
Kurtosis                  7.822148
Skewness                   -1.98331
Range                     105.4057
Minimum                    -80.1469
Maximum                     25.2588
Sum                        -153.258
Count                           120

[Figure: INTC histogram of monthly stock price change, categories from -80 to 30]

     Negative, or left skewed


          Mean < Median (mean is pushed to the left by “outlying” observations on the left
          end of the distribution)

          Skewness < 0
          Tchange

Mean                 -0.62194
Standard Error       0.744948
Median               -0.24774
Mode                   #N/A
Standard Deviation   8.160501
Sample Variance      66.59378
Kurtosis             0.612623
Skewness             -0.39071
Range                45.68781
Minimum              -23.0226
Maximum              22.66521
Sum                  -74.6324
Count                     120

[Figure: T histogram of monthly stock price change, categories from -80 to 30]

      This distribution is only slightly skewed to the left

      Mean is slightly less than the median

      Skewness measure is close to zero
                         Chebyshev Rule

$$ \left(1 - \frac{1}{k^2}\right) \times 100 $$

The Rule:
            No matter what the distribution is,
            the Rule defines the minimum percentage of observations that will be found
            within k standard deviations of the mean (for k > 1)
                                Chebyshev Rule
        INTCchange

Mean                 -1.27715
Standard Error       1.308779
Median               0.542984
Mode                   #N/A
Standard Deviation   14.33695
Sample Variance      205.5482
Kurtosis             7.822148
Skewness             -1.98331
Range                105.4057
Minimum              -80.1469
Maximum               25.2588
Sum                  -153.258
Count                     120

Based on the Rule, at least 75% of all observations should be
found within 2 standard deviations of the mean

Defining the 75% interval:

$$ \bar{x} \pm 2 \times S = -1.27715 \pm 2 \times 14.33695 $$

              At least 75% of the time, the stock of INTC is likely to exhibit a monthly change between
              -29.95% and +27.40%

              In reality, during the past decade, INTC stock demonstrated a monthly change within
              two standard deviations of the mean 116 times out of 120.
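The Rule and the INTC interval can be worked through in Python. This is a sketch using the mean and standard deviation from the INTC table; the function and variable names are assumptions.

```python
def chebyshev_min_percent(k):
    """Minimum % of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k ** 2) * 100


# INTC monthly change statistics from the slide.
mean = -1.27715
std_dev = 14.33695
k = 2

lower = mean - k * std_dev
upper = mean + k * std_dev

observed_percent = 116 / 120 * 100   # share of months reported on the slide

print(chebyshev_min_percent(k))           # 75.0
print(round(lower, 2), round(upper, 2))
print(round(observed_percent, 1))
```

The observed share is well above the 75% floor, which is expected: the Rule is a worst-case bound that holds for any distribution.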
                      Covariance and Correlation
                       Stock Price Distribution
$$ COV(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1} $$

Covariance matrix:
             INTC       HPQ         T        VZ       CAT        SO
INTC       103.47
HPQ         45.48    134.69
T          18.517    56.478      43.2
VZ          19.14    32.899    28.279    24.377
CAT        -79.73    108.82    38.508    7.0726    396.84
SO            -54    21.263    -0.342    -8.726    133.16    63.539

$$ r = \frac{cov(x, y)}{S_x S_y} $$

Correlation matrix:
             INTC       HPQ         T        VZ       CAT        SO
INTC            1
HPQ      0.385253         1
T        0.276967  0.740395         1
VZ       0.381103  0.574143  0.871444         1
CAT      -0.39346  0.470676  0.294101  0.071908         1
SO       -0.66604  0.229844  -0.00653  -0.22172  0.838598         1

 SO (Southern Company) is strongly correlated with CAT and negatively correlated with INTC
 VZ and T are strongly correlated
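A Python sketch of the covariance and correlation formulas. The first part cross-checks one correlation entry (T and VZ) from the covariance matrix values; the toy series at the end is hypothetical and only exercises the functions.

```python
import math


def sample_covariance(x, y):
    """Sum of cross-deviations divided by n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """r = cov(x, y) / (S_x * S_y)."""
    s_x = math.sqrt(sample_covariance(x, x))   # cov(x, x) is the sample variance
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)


# Cross-check with entries of the covariance matrix (T and VZ).
var_t, var_vz, cov_t_vz = 43.2, 24.377, 28.279
r_t_vz = cov_t_vz / math.sqrt(var_t * var_vz)
print(round(r_t_vz, 3))   # 0.871

# Hypothetical toy series with a perfect linear relationship.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
print(sample_covariance(x, y), correlation(x, y))
```

Dividing the covariance by both standard deviations removes the units, which is why correlations can be compared across stock pairs while raw covariances cannot.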
                     ? Regression ?
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.871444
R Square            0.759415
Adjusted R Square   0.757394
Standard Error      3.250819
Observations        121

ANOVA
              df    SS         MS         F          Significance F
Regression      1   3969.577   3969.577   375.6287   1.28E-38
Residual      119   1257.571   10.56782
Total         120   5227.148

            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   -10.1602       1.88385          -5.3933    3.57E-07   -13.8904    -6.42996    -13.8904      -6.42996
VZ          1.160089       0.059857         19.38114   1.28E-38   1.041567    1.27861     1.041567      1.27861
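
Several entries of this output can be cross-checked against each other and against the covariance table. The sketch below assumes the regression is T on VZ; the slide does not say so, but the Multiple R equals the T/VZ correlation and the slope matches cov(T, VZ)/var(VZ). Variable names are illustrative.

```python
import math

# Figures copied from the covariance table and the regression output.
cov_t_vz, var_vz = 28.279, 24.377
multiple_r = 0.871444
coef, std_err = 1.160089, 0.059857
ss_resid, df_resid = 1257.571, 119

slope = cov_t_vz / var_vz                       # simple-regression slope = cov(x, y) / var(x)
r_square = multiple_r ** 2                      # R Square is the squared correlation
t_stat = coef / std_err                         # t Stat = coefficient / its standard error
f_stat = t_stat ** 2                            # with a single regressor, F = t^2
std_error_reg = math.sqrt(ss_resid / df_resid)  # square root of the residual MS

print(slope, r_square, t_stat, f_stat, std_error_reg)
```

Each computed value agrees with the printed output up to rounding of the inputs.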
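                Covariance (revisited)

$$\sigma_{XY} = \sum_{i=1}^{N} (X_i - E[X])(Y_i - E[Y])\, P(X_i, Y_i)$$

• Measure of the linear relationship between the X and Y variables
   – Independent variables have zero covariance; zero covariance by itself
     indicates no linear relationship, not necessarily independence

Each of the four (X, Y) pairs has probability 1/4.

                   X         Y     X - E[X]    Y - E[Y]   (X - E[X])(Y - E[Y]) P(X, Y)
                   2         4       -1.75       -0.75        0.328125
                   3         6       -0.75        1.25       -0.23438
                   6         5        2.25        0.25        0.140625
                   4         4        0.25       -0.75       -0.04688

     E[.]         3.75      4.75                              0.1875  (sum = covariance)

Positive relationship: higher values of X correspond to higher values of Y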
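
The table can be reproduced in a few lines. This is a minimal sketch that assumes, as the last column of the table implies, that each of the four (X, Y) pairs has probability 1/4.

```python
xs = [2, 3, 6, 4]
ys = [4, 6, 5, 4]
probs = [0.25] * 4  # assumption: the four pairs are equally likely

# Expected values: probability-weighted sums.
e_x = sum(p * x for p, x in zip(probs, xs))
e_y = sum(p * y for p, y in zip(probs, ys))

# Covariance: probability-weighted sum of the products of deviations.
cov_xy = sum(p * (x - e_x) * (y - e_y) for p, x, y in zip(probs, xs, ys))

print(e_x, e_y, cov_xy)  # 3.75 4.75 0.1875
```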
Properties of the sum of two random
               variables

$$E[X + Y] = E[X] + E[Y]$$

$$\mathrm{Var}(X + Y) = \sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y + 2\sigma_{XY}$$
              Expected Return
• Lottery sells 20 tickets with one winning ticket.
  The winning ticket pays 10 dollars. What is the
  expected return from purchasing one ticket?
  – Probability the ticket is winning: 1/20 = 0.05
  – Expected Return: Payoff times its probability
     • From holding only one ticket:
     Probability (Winning) times Prize
     0.05 * $10 = $0.5
             Expected Return II
• What if the lottery has three winning tickets:
  – One ticket paying 10 dollars
  – Two tickets paying 5 dollars each
• What is the expected return now?

$$E[X] = \sum_{i=1}^{N} \Pr(X_i)\, X_i$$
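
Both lottery questions are instances of the same probability-weighted sum. A small sketch, assuming each of the 20 tickets is equally likely to be the one you hold and that losing tickets pay nothing (the function name is illustrative):

```python
def expected_value(outcomes):
    """Sum of probability * payoff over all (probability, payoff) pairs."""
    return sum(prob * payoff for prob, payoff in outcomes)

# One winning ticket out of 20 paying $10.
one_winner = [(1 / 20, 10), (19 / 20, 0)]

# Three winning tickets out of 20: one pays $10, two pay $5 each.
three_winners = [(1 / 20, 10), (2 / 20, 5), (17 / 20, 0)]

print(expected_value(one_winner))     # about 0.5
print(expected_value(three_winners))  # about 1.0
```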
    Expected Return of an investment
               portfolio

                          Daily Stock Return ($)     Weighted Return        Expected
                             INTC         T          INTC          T        (sum)
                            -0.17      -0.11        -0.119      -0.033      -0.152
                             0.45      -0.02         0.315      -0.006       0.309
                             0.02       0.08         0.014       0.024       0.038
                             0.73       0.08         0.511       0.024       0.535
                            -0.36      -0.01        -0.252      -0.003      -0.255

E[.]                         0.134      0.004        0.0938      0.0012      0.095
Weights                      0.7        0.3
Weighted Average Return      0.095




Standard Deviation of the Portfolio Return as the Measure of Risk of the Portfolio

$$\sigma_p = \left( w_x^2 \sigma_x^2 + w_y^2 \sigma_y^2 + 2 w_x w_y \sigma_{xy} \right)^{1/2}$$
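
A sketch of the portfolio calculation using the five daily returns from the table, with weights 0.7 (INTC) and 0.3 (T). It assumes each day is equally likely, so variances and the covariance divide by N rather than n - 1; the helper names are illustrative. As a cross-check, the formula result is compared with the variance of the daily portfolio returns computed directly.

```python
import math

intc = [-0.17, 0.45, 0.02, 0.73, -0.36]
t = [-0.11, -0.02, 0.08, 0.08, -0.01]
w_x, w_y = 0.7, 0.3

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Covariance with equal probabilities 1/N (an assumption of this sketch)."""
    m_a, m_b = mean(a), mean(b)
    return sum((x - m_a) * (y - m_b) for x, y in zip(a, b)) / len(a)

# Expected portfolio return: weighted average of the expected returns.
exp_return = w_x * mean(intc) + w_y * mean(t)

# Portfolio variance from the formula, then its square root as the risk measure.
var_p = w_x**2 * cov(intc, intc) + w_y**2 * cov(t, t) + 2 * w_x * w_y * cov(intc, t)
sigma_p = math.sqrt(var_p)

# Cross-check: variance of the weighted daily returns themselves.
portfolio = [w_x * a + w_y * b for a, b in zip(intc, t)]
var_direct = cov(portfolio, portfolio)

print(exp_return, sigma_p)
```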
                         PROBABILITY
• Likelihood
• Coin has two sides: reverse, obverse
   – Number of all possible outcomes of a coin toss
      • 2
   – Probability that obverse shows:
      • One particular outcome (obverse)
            – Probability = outcome/all possible outcomes = ½ = 0.5 = 50%
• Mutually Exclusive events
   – Obverse OR Reverse
      • Cannot occur simultaneously
• Collectively Exhaustive
   – A set of events of which one must occur; the probability that some
     event from the set occurs is 100%
      • Set of events: Obverse, Reverse. One of these must occur.
        Between these two events all possibilities are covered.
 Growth
Category    INTC     Probability
  (%)       count    (p.d.f.)

  -80        1       0.008333
  -75        0              0
  -70        0              0
  -65        0              0
  -60        0              0
  -55        0              0
  -50        1       0.008333
  -45        0              0
  -40        0              0
  -35        1       0.008333
  -30        1       0.008333
  -25        3          0.025
  -20        2       0.016667
  -15        5       0.041667
  -10        7       0.058333
   -5       15          0.125
    0       22       0.183333
    5       25       0.208333
   10       14       0.116667
   15       13       0.108333
   20        9          0.075
   25        0              0
   30        1       0.008333

  total    120

[Figure: p.d.f. of monthly INTC growth, with two labeled regions: "Probability that the stock of Intel will lose value during a month" and "Probability that the stock of Intel will gain value during a month"]

Probability that the stock of Intel will lose at least 20% would be: 7.5%

What is the probability that the stock of Intel will gain between 5 and 15%?
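
Probabilities like these are sums of relative frequencies over the relevant categories. The sketch below treats each growth category label as the value itself and includes both endpoints of a range; that reading of the bins is an assumption, since the slide does not define the bin edges.

```python
# Non-zero counts from the frequency table: growth category (%) -> months.
counts = {
    -80: 1, -50: 1, -35: 1, -30: 1, -25: 3, -20: 2, -15: 5, -10: 7,
    -5: 15, 0: 22, 5: 25, 10: 14, 15: 13, 20: 9, 30: 1,
}
total = sum(counts.values())  # 120 months

def prob_between(low, high):
    """Relative frequency of categories with low <= growth <= high."""
    return sum(c for growth, c in counts.items() if low <= growth <= high) / total

lose_at_least_20 = prob_between(-80, -20)  # 9 / 120 = 0.075, the 7.5% on the slide
gain_5_to_15 = prob_between(5, 15)         # (25 + 14 + 13) / 120 under this reading

print(lose_at_least_20, gain_5_to_15)
```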
                     Conditional Probability
What is the probability that the stock of INTC will increase after falling?

Possible States of the World (120 outcomes in total: 58 declines and 62 increases)
          Joint Probabilities:
          1)         Stock declines after an increase: 34 outcomes; 28.33%
          2)         Stock declines after a decline: 24 outcomes; 20%
          3)         Stock increases after an increase: 27 outcomes; 22.5%
          4)         Stock increases after a decline: 35 outcomes; 29.17%

Conditional Probability (applied to the subset on which the conditioning is being made):
Pr(A|B) = Pr(A and B)/Pr(B)
          The conditioning subset is the 24 + 35 = 59 months that followed a decline.
          Probability that the stock of Intel will increase after falling in the previous
          month is: Pr(Increase|Decline) = 35/59 = 59.3%

          Pr(Decline|Decline) = 24/59 = 40.7%

          Pr(Decline) = 58/120 = 48.3%

          Pr(Increase) = 62/120 = 51.7%

Two Events are Independent if: Pr(A|B) = Pr(A)
Since Pr(Increase|Decline) = 59.3% differs from Pr(Increase) = 51.7%, we can argue that
in the case of INTC, the behavior of the stock in a given month depends on its behavior
in the prior month.
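
The conditional probabilities follow directly from the four joint counts. In this sketch the counts are keyed by (previous month, current month); conditioning on the previous month means dividing by the number of months that followed that outcome (24 + 35 = 59 after a decline). The names are illustrative.

```python
# Joint counts from the slide, keyed by (previous month, current month).
counts = {
    ("up", "down"): 34,
    ("down", "down"): 24,
    ("up", "up"): 27,
    ("down", "up"): 35,
}
total = sum(counts.values())  # 120

def p_current_given_previous(current, previous):
    """Pr(current | previous) = count(previous, current) / count(previous, any)."""
    previous_total = sum(c for (prev, _), c in counts.items() if prev == previous)
    return counts[(previous, current)] / previous_total

p_up_after_down = p_current_given_previous("up", "down")  # 35 / 59
p_up = sum(c for (_, cur), c in counts.items() if cur == "up") / total  # 62 / 120

# Independence would require Pr(up | down) == Pr(up).
print(round(p_up_after_down, 3), round(p_up, 3))
```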
                Multiplication Rule

$$\Pr(A \mid B) = \frac{\Pr(A \text{ and } B)}{\Pr(B)}$$

Solve for the joint probability:

$$\Pr(A \text{ and } B) = \Pr(A \mid B)\,\Pr(B)$$

Pr(Increase and Decline) = 35/120 = 29.17%
Pr(Decline in the prior month) = 59/120 = 49.2%
Pr(Increase | Decline) = 35/59 = 59.3%
Multiplication Rule for INDEPENDENT
               EVENTS

Events A and B are statistically independent iff Pr(A|B) = Pr(A)

Multiplication rule simplifies:

Pr(A and B) = Pr(A|B) Pr(B) = Pr(A) Pr(B)
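             Additional Probability Rules

$$\Pr(A) = \sum_{i=1}^{N} \Pr(A \mid B_i)\,\Pr(B_i)$$

where B_1, ..., B_N are N mutually exclusive and collectively exhaustive events.

Pr(INTC increases) = Pr(increase|decline) Pr(decline) + Pr(increase|increase) Pr(increase)

Here the conditioning events are the prior month's outcome: 59 of the 120 months
followed a decline and 61 followed an increase.

62/120 = (35/59)(59/120) + (27/61)(61/120)

$$\Pr(A \text{ and } B) = \Pr(A \mid B)\,\Pr(B)$$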
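                               Bayes’ Theorem

Conditional probability:

$$\Pr(A \mid B) = \frac{\Pr(A \text{ and } B)}{\Pr(B)} \quad\Rightarrow\quad \Pr(A \text{ and } B) = \Pr(A \mid B)\,\Pr(B)$$

Similarly:

$$\Pr(B \mid A) = \frac{\Pr(A \text{ and } B)}{\Pr(A)} = \frac{\Pr(A \mid B)\,\Pr(B)}{\Pr(A)}$$

Note (B and C are mutually exclusive and collectively exhaustive):

$$\Pr(A) = \Pr(A \mid B)\,\Pr(B) + \Pr(A \mid C)\,\Pr(C)$$

Combining these:

$$\Pr(B \mid A) = \frac{\Pr(A \mid B)\,\Pr(B)}{\Pr(A \mid B)\,\Pr(B) + \Pr(A \mid C)\,\Pr(C)}$$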
                                  Bayes’ Theorem

  •    Consider the following scenario: on average the stock of INTC posts monthly
       declines 40% of the time. The stock is rated by a number of analysts. In the past,
       in 70% of the months in which the stock posted an increase, the analysts had
       assigned INTC an accumulate rating. However, they had also assigned an accumulate
       rating in 30% of the months in which the stock declined. Recently, the stock of
       INTC again received the average rating of accumulate. What is the probability that
       the stock will increase given this rating?
   • Pr(Increase) = 60%
   • Pr(Decline) = 40%
   • Pr(Accumulate|Increase) = 70%
   • Pr(Accumulate|Decline) = 30%
   • The question is: what is Pr(Increase|Accumulate)?

$$\Pr(\text{Increase} \mid \text{Accumulate}) = \frac{\Pr(\text{Accumulate} \mid \text{Increase})\,\Pr(\text{Increase})}{\Pr(\text{Accumulate} \mid \text{Increase})\,\Pr(\text{Increase}) + \Pr(\text{Accumulate} \mid \text{Decline})\,\Pr(\text{Decline})}$$

$$= \frac{0.7 \times 0.6}{0.7 \times 0.6 + 0.3 \times 0.4} = \frac{0.42}{0.54} \approx 0.78$$
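
The arithmetic can be wrapped in a small helper for the two-event case (an event and its complement). This is a sketch, not a general implementation; the function and parameter names are illustrative.

```python
def bayes_posterior(prior_b, p_a_given_b, p_a_given_not_b):
    """Pr(B | A) when B and its complement are the only two possibilities."""
    numerator = p_a_given_b * prior_b
    # Denominator is Pr(A) by the law of total probability.
    evidence = numerator + p_a_given_not_b * (1 - prior_b)
    return numerator / evidence

# B = Increase, A = Accumulate rating, using the probabilities listed on the slide.
p_increase_given_accumulate = bayes_posterior(
    prior_b=0.6, p_a_given_b=0.7, p_a_given_not_b=0.3
)
print(round(p_increase_given_accumulate, 4))  # 0.42 / 0.54, about 0.7778
```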
             Practice example
• In the past, the student failed on average
  every fifth exam he took. Furthermore, for 50%
  of the exams he failed, he had studied hard.
  However, for 90% of the exams he passed, he
  had also studied hard. For the upcoming exam
  the student has been studying hard. What is
  the probability that he will pass this exam?
                                       Solution
• Pr(F) = 0.2
• Pr(P) = 0.8
These two are mutually exclusive and exhaustive events

•     Pr(S|F) = 0.5: half of the time when the exam was failed, the student had been studying
•     Pr(S|P) = 0.9: 90% of the time when passing occurred, the student had been studying
•     Pr(P|S) = ?

$$\Pr(P \mid S) = \frac{\Pr(S \mid P)\,\Pr(P)}{\Pr(S \mid P)\,\Pr(P) + \Pr(S \mid F)\,\Pr(F)} = \frac{0.9 \times 0.8}{0.9 \times 0.8 + 0.5 \times 0.2} = \frac{0.72}{0.82} \approx 0.88$$
      Discrete Random Variable
• Discrete Variable – variables generated from a
  counting process
  - Enrollment in class
     - Enrollment in econ 101 depending on the time of day it
       is offered
  - Number of accidents on Buffalo highways on a
    given day.
  - Daily number of listings on eBay of a given item
- These variables are numerical but NOT
  continuous
         Characteristics of The Distribution
   • Each observed value has its own probability.
             number of         number of      Number of
       Day    listings   Day    listings       Listings       Count   Probability
         1       12       4         8              8            1        0.05
         2       15      12         9              9            1        0.05
         3       10       3        10             10            3        0.15
         4        8       7        10             11            2         0.1
         5       14      17        10             12            5        0.25
         6       17      10        11             13            2         0.1
         7       10      18        11             14            3        0.15
         8       14       1        12             15            1        0.05
         9       12       9        12             16            1        0.05
        10       11      11        12             17            1        0.05
        11       12      15        12
        12        9      19        12
        13       16      16        13
        14       14      20        13
        15       12       5        14
        16       13       8        14
        17       10      14        14
        18       11       2        15
        19       12      13        16
        20       13       6        17




Computing the average, mean (expected value):

$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N} \qquad E[x] = \sum_{i=1}^{N} x_i \Pr(x_i)$$

Variance of a Discrete Random Variable:

$$\sigma^2 = \sum_{i=1}^{N} (x_i - E[x])^2 \Pr(x_i)$$

Standard Deviation:

$$\sigma = \sqrt{\sigma^2}$$
       Binary (“dummy”) Variables

• Assume two values only, usually 0, 1
• Weather: good/bad
  – Good weather = 1 if good, = 0 otherwise
• Passing grade/failing grade
• Winning/losing the lottery
                        Possible Combinations
  •       Consider a simple setup: On average 80% of the time we have good weather in
          summer time. Given that, what is the probability that next weekend we will have
          two good weather days and one bad weather day?
  •       ASSUMPTION: weather is independent from day to day! That is, the probability of
          good weather on any day is 0.8, independent of the weather on the previous day
  •       One possibility: Friday – good; Saturday – good; Sunday – bad. What is the
          probability of that?
            – Pr(Friday=1)*Pr(Sat=1)*Pr(Sun=0) = 0.8*0.8*0.2 = 0.128,
              where Pr(day=1) = p and Pr(day=0) = (1-p)
                 •     p – probability of success
            – The above is just one particular draw.
  •       Another possibility: Friday – bad; Saturday – good; Sunday – good, etc.
               Using Factorials to determine the number of possible combinations
                              Drawing X objects from a set of n objects

$$_nC_X = \frac{n!}{X!\,(n - X)!}$$

$$_3C_2 = \frac{3!}{2!\,(3 - 2)!} = \frac{3 \times 2 \times 1}{2 \times 1 \times (1)} = 3$$

      Fri        Sat        Sun       Probability
      G          G           B        $p \times p \times (1 - p) = p^2 (1 - p)^1$
      G          B           G        $p \times (1 - p) \times p = p^2 (1 - p)^1$
      B          G           G        $(1 - p) \times p \times p = p^2 (1 - p)^1$

Each particular arrangement has probability $p^X (1 - p)^{n - X}$, so Pr(2G) = 0.128*3 = 0.384
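
The count of arrangements and the resulting probability can be confirmed by brute-force enumeration of all 2^3 weekends. A minimal sketch with p = 0.8 and day-to-day independence, as stated in the setup:

```python
from itertools import product
from math import comb

p = 0.8  # probability of good weather on any given day

# Enumerate every Fri/Sat/Sun pattern and add up those with exactly two good days.
prob_two_good = 0.0
arrangements = 0
for days in product("GB", repeat=3):
    if days.count("G") == 2:
        arrangements += 1
        prob = 1.0
        for day in days:
            prob *= p if day == "G" else 1 - p
        prob_two_good += prob

# Same result from the counting formula: 3C2 * p^2 * (1 - p)^1.
formula = comb(3, 2) * p**2 * (1 - p)

print(arrangements, round(prob_two_good, 3))  # 3 0.384
```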
              Simple Practice
• What is the probability that we will have two
  bad days in a weekend?
• What is the probability that we will have AT
  LEAST two bad days in a weekend?
• What is the probability that two consecutive
  days in a weekend will be good?
                 Example 2
• Suppose there are only about 20 people in the
  US who are used as referees by journals that
  publish papers in a certain area of research.
  Each journal assigns 2 referees to a paper.
  If you submit two papers, one to each of two
  journals, what is the probability that at
  least one of the referees will be the same?
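
One way to set this up is through the complement: the referees overlap unless the second journal draws both of its referees from the 18 people the first journal did not use. The sketch assumes each journal picks its 2 referees uniformly at random from the same pool of 20, independently of the other journal; the slide does not state how referees are chosen.

```python
from math import comb

pool = 20       # referees available in the field
per_paper = 2   # referees assigned to each paper

# Probability that the second journal avoids both of the first journal's referees.
p_no_overlap = comb(pool - per_paper, per_paper) / comb(pool, per_paper)  # 153 / 190

p_at_least_one_same = 1 - p_no_overlap
print(round(p_at_least_one_same, 4))  # 37 / 190, about 0.1947
```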
                 Binomial Distribution

$$\Pr(X) = \frac{n!}{X!\,(n - X)!}\, p^X (1 - p)^{n - X}$$

Pr(X) – probability of X successes in a sample of n observations, with
p representing the probability of success of each observation

Assumptions:
Probability p stays constant with each draw (effectively this assumes that the
population being sampled is large)
Outcomes are independent
The variable is binary

$$E[X] = np$$

$$\sigma^2 = np(1 - p)$$
Binomial Distribution Review Example
• Let’s say that historically it so happens that in
  the first 10 days of September it only rains for
  two days. What is the probability that it will
  not rain during the 4 day labor day weekend?
  – Note how the probability simplifies to the simple
    multiplication of individual probabilities
• What is the probability that it will not rain on
  any three of the four days of the weekend?
Continuous Distributions
                           UNIFORM DISTRIBUTION

$$\mu = \frac{\min + \max}{2}$$

$$\sigma^2 = \frac{(\max - \min)^2}{12}$$

$$\mathrm{PDF} = f(x) = \frac{1}{\max - \min}, \quad x \in [\min, \max]$$

Unsorted    Sorted
X           X             X    Count   PDF         CDF
        2             0    0     2      0.047619    0.047619
        4             0    1     2      0.047619    0.095238
        3             1    2     2      0.047619    0.142857
        1             1    3     2      0.047619    0.190476
        5             2    4     2      0.047619    0.238095
        7             2    5     2      0.047619    0.285714
        8             3    6     2      0.047619    0.333333
        6             3    7     2      0.047619    0.380952
        9             4    8     2      0.047619    0.428571
       10             4    9     2      0.047619     0.47619
        0             5   10     2      0.047619     0.52381
       12             5   11     2      0.047619    0.571429
       11             6   12     2      0.047619    0.619048
       15             6   13     2      0.047619    0.666667
       13             7   14     2      0.047619    0.714286
       14             7   15     2      0.047619    0.761905
       20             8   16     2      0.047619    0.809524
       18             8   17     2      0.047619    0.857143
       19             9   18     2      0.047619    0.904762
       17             9   19     2      0.047619    0.952381
       16            10   20     2      0.047619           1
        2            10
        4            11
        3            11
        1            12
        5            12
        7            13
        8            13
        6            14
        9            14
       10            15
        0            15
       12            16
       11            16
       15            17
       13            17
       14            18
       20            18
       18            19
       19            19
       17            20
       16            20
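
The three uniform formulas are simple enough to code directly. The sketch below evaluates them for a continuous variable on [0, 20], the range of the data in the table. Note that the tabulated data are integers 0 to 20, so each value there has probability 1/21 (0.047619), whereas the continuous density on [0, 20] is 1/20. Function names are illustrative.

```python
def uniform_pdf(x, low, high):
    """Density 1 / (max - min) inside [min, max], zero outside."""
    return 1 / (high - low) if low <= x <= high else 0.0

def uniform_cdf(x, low, high):
    """Share of the interval lying at or below x."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def uniform_mean(low, high):
    return (low + high) / 2

def uniform_variance(low, high):
    return (high - low) ** 2 / 12

print(uniform_mean(0, 20), uniform_variance(0, 20), uniform_pdf(5, 0, 20))
```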
             Normal Distribution
• Sometimes called Gaussian Distribution
• The Bell Curve
   – Symmetrical
   – Greater mass at the center
      • PDF diminishes as you move away from the center
• Measures of central tendency are equal
   – Mean, median, mode
• Interquartile Range is approximately 4/3 standard deviations
   – 50% of all observations are contained between the mean plus
     or minus approximately 2/3 of a standard deviation
• Infinite Range
    Why Study Normal Distribution


• Central Limit Theorem Property
• Extensive use of the Normal Distribution in
  Statistics/Econometrics
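 Standardizing the Normal Distribution
          Normal Distribution

P.D.F. (X):

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

$$X \sim N(\mu, \sigma)$$

   Standardized Normal Distribution

$$Z = \frac{X - \mu}{\sigma}$$

P.D.F. (Z):

$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} Z^2}$$

$$Z \sim N(0, 1)$$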
  Computing Normal CDF, CDF as the
            probability
• In Excel use NORMDIST
   – NORMDIST(X, μ, σ, cumulative) = CDF(X)
      • cumulative = TRUE: the CDF is computed (up to the X value)
      • cumulative = FALSE: the PDF is computed instead (at X)
   – NORMINV(prob, μ, σ) = X
   – NORMSDIST(Z) = CDF(Z)
      • Computes the CDF of the standardized normal distribution
   – NORMSINV(probability) = Z
• For Standardized Normal Distribution
   – Use Textbook Z tables
   Example using the textbook Z-dist.
                tables
• Consider that X ~ N (10, 4)
  – What is the probability that we will draw at
    random a value of X that is less than 8?
     • Do it using the tables first
     • Do it using Excel
  – What is the probability that we will draw at
    random a value of X that is greater than 11?
  – What is the probability that we will draw at
    random a value of X between 7 and 9?
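
The same three probabilities can be checked outside Excel with Python's standard library. This sketch reads the 4 in N(10, 4) as the standard deviation, matching the arguments NORMDIST takes; that reading is an assumption, and if 4 is meant as the variance, `sigma=2` should be used instead.

```python
from statistics import NormalDist

# Assumption: second parameter of N(10, 4) is the standard deviation.
x = NormalDist(mu=10, sigma=4)

p_less_than_8 = x.cdf(8)            # Pr(X < 8)
p_greater_than_11 = 1 - x.cdf(11)   # Pr(X > 11)
p_between_7_and_9 = x.cdf(9) - x.cdf(7)

# Table route: standardize first, then look up Z in the standard normal CDF.
z = (8 - 10) / 4
p_from_z = NormalDist().cdf(z)      # same as p_less_than_8

print(round(p_less_than_8, 4), round(p_greater_than_11, 4), round(p_between_7_and_9, 4))
```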
      Simple examination of Data for
                Normality

1 Comparison of mean, mode and median
2 Kurtosis and Skewness
3 Quantile-Quantile Plot

Period_Year Period_Month UF_RetailAs%Total

      2009           8   13.15379
      2009           7   13.21556
      2009           6   13.20713
      2009           5   13.09604
      2009           4   13.29532
      2009           3   13.48186
      2009           2   13.61643
      2009           1    13.9316
      2008          12   14.12047
      2008          11    13.9948
      2008          10   13.57838
      2008           9   13.40963
      2008           8   13.45029
      2008           7   13.42442
      2008           6   13.44574
      2008           5    13.3493
      2008           4   13.33186
      2008           3   13.47661
      2008           2   13.49046
      2008           1    13.9232
      2007          12   14.25173
      2007          11     13.955
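
The first two checks can be sketched on the 22 rows shown above (only an excerpt of the series). The skewness and kurtosis here are the plain population-moment versions, which is an assumption of this sketch; Excel's SKEW and KURT apply small-sample adjustments and will give slightly different numbers.

```python
from statistics import mean, median

def central_moment(data, k):
    m = mean(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    """Third central moment scaled by sigma^3; 0 for a symmetric distribution."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Fourth central moment scaled by sigma^4, minus 3 (0 for a normal)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# UF_RetailAs%Total values from the rows shown above.
retail_share = [13.15379, 13.21556, 13.20713, 13.09604, 13.29532, 13.48186,
                13.61643, 13.9316, 14.12047, 13.9948, 13.57838, 13.40963,
                13.45029, 13.42442, 13.44574, 13.3493, 13.33186, 13.47661,
                13.49046, 13.9232, 14.25173, 13.955]

print(mean(retail_share), median(retail_share))
print(skewness(retail_share), excess_kurtosis(retail_share))
```

For normal data the mean and median should be close, and both skewness and excess kurtosis should be near zero.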
                    Confidence Interval

Distribution of X:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

Distribution of the means
•Samples of size n
•Distribution of the sample means
•The sample mean is an unbiased estimator of the population mean

Standard Error of the mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Central Limit Theorem implies:

$$\bar{X} \sim N(\mu, \sigma_{\bar{X}})$$
          Confidence Interval

$$\bar{X} - z\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z\,\frac{\sigma}{\sqrt{n}}$$

Increased sample size concentrates the distribution of the sample mean around the
population mean
As the sample size approaches the population size, the standard error of the mean
approaches zero, because the sample becomes equal to the population
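
A sketch of the interval when the population standard deviation is known. The z value is taken from the standard normal inverse CDF; the sample mean, sigma and n in the usage lines are illustrative numbers, not data from the slides.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, level=0.95):
    """Two-sided interval: sample_mean +/- z * sigma / sqrt(n), sigma known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Illustrative inputs: the interval narrows as n grows.
low_16, high_16 = z_confidence_interval(10, 4, 16)
low_64, high_64 = z_confidence_interval(10, 4, 64)
print((low_16, high_16), (low_64, high_64))
```

Quadrupling the sample size halves the standard error and therefore halves the width of the interval.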
                                Confidence Interval
 • General Form
     •If St. Dev. is known

$$\mu - z\sigma \le x \le \mu + z\sigma, \qquad z = \frac{x - \mu}{\sigma}$$

      •If St. Dev. is unknown

$$\bar{x} - t_{n-1} S \le x \le \bar{x} + t_{n-1} S, \qquad t = \frac{x - \bar{x}}{S}$$

• Confidence Interval for the mean
    •Standard Error for the distribution of the means is used

$$\bar{x} - t_{n-1} \frac{S}{\sqrt{n}} \le \mu \le \bar{x} + t_{n-1} \frac{S}{\sqrt{n}}$$

• As n increases the z and t distributions converge
• Increasing the sample size narrows the confidence interval
                           Sampling
• Probability versus non-probability sampling
   – Random sample
   – Stratified sample
      • Dividing data (strata)
          – Common characteristics, gender….
      • For instance, if Alaska represents 0.5% of the US population, a US
        population sample should allocate a 0.5% weight to Alaska. In this
        case, each state can be considered a stratum, and a number of random
        draws based on its representation in the population can be taken from it
   – Clustering Data
      • Clusters represent the population
          – Geographical sub samples
              Sampling Problems
• Selection Bias
  – Not all of population is represented in the frame
  – Non-response in survey (omitted) bias
• Measurement Error
  – Function of the sample size
     • As the sample size increases, the standard error diminishes

				