
Data and Scatter.xls

For years, Forbes magazine has put out the Top 100 Celebrity Power List, where they rank
celebrities by their Earnings in Millions, Number of Press Clippings, Number of Magazine
Covers and a few other categories. Somehow they average all these explanatory variables
together and get the top 100 list of powerful celebrities. Now, I'm just as fascinated with
these dazzling celebrities as the next person, so I thought that it would be interesting to
regress their ranking and see how much it is explained by the things mentioned above.

Whenever any normal (non-celebrity) person sees this ranking they ask, "Where did this come
from?" and "Does this really explain anything?" I thought it would be interesting to see if
these variables really do explain the power ranking of celebrities. I thought that Earnings,
Number of Press Clippings, and Number of Magazine Covers were the variables most likely to
explain these celebrities' power ranking out of the ones provided to me on the Forbes
website. I also thought it would be interesting to look at gender and the role it plays in
the power ranking. If you notice, there are not very many women in the top 50. And, in an
effort to keep my project manageable, I decided to only look at the top 50 of these
"powerful" celebrities, which is a large enough sample to represent the whole population of
100 celebrities. I hypothesize that the earnings of a celebrity (in millions) will have the
greatest effect on the celebrity's power ranking.

Data from www.forbes.com: Forbes Celebrity 100 2001

Rank  Name                      Earnings       Press    Magazine  Gender
                                (in millions)  Clips    Covers    (m=0, f=1)
1     Tom Cruise                43.2           11,715   11        0
2     Tiger Woods               53             47,149   5         0
3     Beatles                   70             26,142   1         0
4     Britney Spears            38.5           19,607   5         1
5     Bruce Willis              70             8,841    2         0
6     Michael Jordan            37             28,350   1         0
7     Backstreet Boys           35.5           11,666   3         0
8     'N Sync                   42             12,506   5         0
9     Oprah Winfrey             150            9,495    0         1
10    Mel Gibson                31.8           9,591    4         0
11    Mike Tyson                48             15,770   0         0
12    George Lucas              250            4,002    0         0
13    Stephen King              44             6,747    1         0
14    Steven Spielberg          51             10,950   1         0
15    Michael Schumacher        59             8,595    0         0
16    Julia Roberts             18.9           10,422   7         1
17    Shaquille O'Neal          24             21,380   2         0
18    Metallica                 28             5,077    0         0
19    Eddie Murphy              39.5           4,689    0         0
20    J.K. Rowling              36             3,109    0         1
21    Dr. Dre                   31.5           7,157    1         0
22    Regis Philbin             35             10,133   0         0
23    David Copperfield         60             2,010    0         0
24    David Letterman           20             12,576   2         0
25    Kobe Bryant               20             15,554   2         0
26    Rosie O'Donnell           25             7,207    2         1
27    Tina Turner               31             5,183    0         1
28    Rush Limbaugh             31             3,486    0         0
29    Brad Pitt                 23.8           6,814    2         0
30    Tom Clancy                37             2,761    0         0
31    Howard Stern              30             3,988    0         0
32    Nicolas Cage              28.4           4,823    0         0
33    Dixie Chicks              25             5,739    1         1
34    Jennifer Lopez            14.4           9,109    5         1
35    Dale Earnhardt            24.5           11,592   0         0
36    Keanu Reeves              25.5           3,942    0         0
37    Grant Hill                26             11,187   0         0
38    Lennox Lewis              23             9,807    0         0
39    John Grisham              28             3,959    0         0
40    Martin Lawrence           33.2           2,560    0         0
41    Jay Leno                  17             9,482    0         0
42    Siegfried & Roy           50             697      0         0
43    Andre Agassi              17.5           16,390   0         0
44    Ben Affleck               18.3           5,653    1         0
45    Robin Williams            17.1           6,075    0         0
46    Brian Grazer/Ron Howard   45             554      0         0
47    Kiss                      24.5           28,386   0         0
48    Arnold Palmer             18             7,487    0         0
49    Ridley Scott              26.2           3,014    0         0
50    Oscar De La Hoya          23             4,403    0         0




[Figure: scatter plot "The Effect of Magazine Covers on Power Ranking". X axis: Number of
Magazine Covers. Y axis: Ranking. Series: Magazine Covers, with a linear trend line.]

This scatter plot shows the effect of Magazine Covers on how a celebrity ranks on the power
scale. A trend can be seen: for some celebrities, the more magazine covers they have, the
closer their ranking is to number 1. The slope of the trend line is negative, implying that
the more magazine covers, the closer the celebrity will be to the number one ranking. But
there appear to be several outliers in this data, making it hard to fit a trend line, so
this data will need to be tested for outliers.
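
The linear trend line in this chart can be reproduced outside Excel. The following is a minimal Python sketch, assuming the trend line is an ordinary least-squares fit of Rank on Magazine Covers; the function name `linear_trend` is illustrative and is not part of the original workbook. The covers values are copied from the data table, in rank order.

```python
# Magazine covers for ranks 1..50, copied from the data table.
covers = [11, 5, 1, 5, 2, 1, 3, 5, 0, 4,
          0, 0, 1, 1, 0, 7, 2, 0, 0, 0,
          1, 0, 0, 2, 2, 2, 0, 0, 2, 0,
          0, 0, 1, 5, 0, 0, 0, 0, 0, 0,
          0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
ranks = list(range(1, 51))


def linear_trend(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = linear_trend(covers, ranks)
print(f"Rank = {intercept:.2f} + ({slope:.2f}) * Covers")
```

A negative slope here corresponds to the statement above: more covers go with a rank number closer to one.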




[Figure: scatter plot "The Effect of Press Clips on Power Ranking". X axis: Number of Press
Clippings. Y axis: Ranking. Series: Press Clips, with a logarithmic trend line.]

This scatter plot shows the effect of the number of Press Clips on a celebrity's power
ranking. The slope is negative, which makes sense, because the more press clips a celebrity
has, the closer the celebrity should be to the top of the power list, that being number 1.
There seems to be quite a bit of variability and several outliers. The best fit line for
this scatter plot seemed to be logarithmic, so I will convert the data for press clips to
logs and run a regression later in the project to see if that helps to explain my data
better. Press clips will also be checked for outliers.
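
A logarithmic trend line like the one in this chart can be sketched as a least-squares fit of Rank on the natural log of Press Clips. This is a minimal Python illustration under that assumption; `log_trend` is a hypothetical helper name, and the clips values are copied from the data table in rank order.

```python
import math

# Press clips for ranks 1..50, copied from the data table.
clips = [11715, 47149, 26142, 19607, 8841, 28350, 11666, 12506, 9495, 9591,
         15770, 4002, 6747, 10950, 8595, 10422, 21380, 5077, 4689, 3109,
         7157, 10133, 2010, 12576, 15554, 7207, 5183, 3486, 6814, 2761,
         3988, 4823, 5739, 9109, 11592, 3942, 11187, 9807, 3959, 2560,
         9482, 697, 16390, 5653, 6075, 554, 28386, 7487, 3014, 4403]
ranks = list(range(1, 51))


def log_trend(x, y):
    """Fit y = a + b*ln(x) by least squares on the transformed x values."""
    lx = [math.log(v) for v in x]  # requires every x to be positive
    n = len(lx)
    mean_lx = sum(lx) / n
    mean_y = sum(y) / n
    b = (sum((u - mean_lx) * (v - mean_y) for u, v in zip(lx, y))
         / sum((u - mean_lx) ** 2 for u in lx))
    return mean_y - b * mean_lx, b


a, b = log_trend(clips, ranks)
print(f"Rank = {a:.2f} + ({b:.2f}) * ln(Clips)")
```

The fit only works because every celebrity has at least one press clip; a zero would make the logarithm undefined.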
[Figure: scatter plot "The Effect of Earnings on Power Ranking". X axis: Earnings (in
millions). Y axis: Ranking. Series: Earnings in millions, with a logarithmic trend line.]

This scatter plot shows the effect of Earnings in millions on the Power Rank of a Celebrity.
The slope is negative because the higher the earnings, the higher the power ranking: the
trend line shows that as the celebrity earns more money, the closer they are to number one.
There seems to be curvature in the graph, and the best fit line seemed to be logarithmic
here as well. There also seem to be outliers that are upsetting the trend line. Later in the
project we will calculate the log of Earnings and we will check for outliers.
RESIDUAL OUTPUT
Observation Predicted Rank     Residuals Standard Residuals
           1   -4.196714645    5.196714645      0.518937435
           2   -5.529570889    7.529570889      0.751893546
           3    13.46347981   -10.46347981     -1.044870026
           4    9.182997449   -5.182997449     -0.517567653
           5    18.94110681   -13.94110681     -1.392141516
           6     18.0780916    -12.0780916     -1.206103144
           7    20.58552173   -13.58552173     -1.356633234
           8    13.19926818   -5.199268179     -0.519192427
           9    9.524224834   -0.524224834     -0.052348437
          10    19.30253125   -9.302531248     -0.928939153
          11     25.2237989    -14.2237989     -1.420370795
          12   -3.872394215    15.87239421      1.584997464
          13    27.36854184   -14.36854184     -1.434824644
          14    24.11987197   -10.11987197     -1.010557777
          15    26.81430125   -11.81430125     -1.179761369
          16    11.16616326    4.833836739         0.4827009
          17    20.77641333   -3.776413335      -0.37710792
          18    33.86647744   -15.86647744     -1.584406622
          19    32.07300661   -13.07300661     -1.305454114
          20    32.27482863   -12.27482863     -1.225749058
          21    29.32367069   -8.323670691      -0.83119136
          22    30.20333537   -8.203335371     -0.819174826
          23    29.84160104   -6.841601043     -0.683193736
          24    25.74366295   -1.743662953     -0.174120005
          25    24.29665452    0.703345482      0.070235202
          26    26.32160146   -0.321601461      -0.03211472
          27    32.12881231   -5.128812305     -0.512156792
           28     34.12249997    -6.122499967          -0.611385201
           29     27.88849367     1.111506334           0.110993634
           30     33.44069019    -3.440690186          -0.343583025
           31     34.05092627    -3.050926274          -0.304661688
           32     33.92095676    -1.920956762          -0.191824343
           33     29.96382044     3.036179561           0.303189099
           34     18.43755162     15.56244838           1.554046661
           35     31.30406013     3.695939874            0.36907194
           36     34.84884279     1.151157215           0.114953122
           37     31.24232767     5.757672327           0.574953968
           38     32.42991228     5.570087718           0.556222003
           39     34.40971297     4.590287026            0.45838033
           40     34.19327764      5.80672236           0.579852043
           41     33.62191644     7.378083559            0.73676621
           42     32.20306505     9.796934949           0.978309689
           43     30.17914943     12.82085057           1.280274127
           44     32.32945447     11.67054553           1.165406102
           45     35.26014094     9.739859059           0.972610162
           46     33.13428771     12.86571229           1.284753962
           47     23.14386548     23.85613452           2.382243801
           48     34.41893795     13.58106205           1.356187896
           49     35.17911397     13.82088603           1.380136419
           50     35.05571268     14.94428732           1.492317869
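
The columns of this residual output can be reconstructed from the fitted equation. The sketch below is a minimal Python illustration, assuming the coefficients and residual sum of squares reported in the SUMMARY OUTPUT of the regression without logs, and assuming that the Standard Residuals column is the residual divided by sqrt(SS residual / (n - 1)). That second assumption reproduces the printed values for the rows checked but is not stated in the source. The function names and the cutoff of 2 for flagging a possible outlier are illustrative choices, not the author's.

```python
import math

# Coefficients of the regression without logs (from its SUMMARY OUTPUT).
INTERCEPT = 41.15912737
B_EARNINGS = -0.172347809
B_CLIPS = -0.000485899
B_COVERS = -2.928918648
B_GENDER = -1.169116368

SS_RESIDUAL = 4913.87177  # residual sum of squares from the ANOVA table
N_OBS = 50


def predicted_rank(earnings, clips, covers, gender):
    """Predicted rank from the fitted linear equation."""
    return (INTERCEPT + B_EARNINGS * earnings + B_CLIPS * clips
            + B_COVERS * covers + B_GENDER * gender)


def standard_residual(residual):
    """Residual scaled by the sample standard deviation of the residuals."""
    return residual / math.sqrt(SS_RESIDUAL / (N_OBS - 1))


# (rank, name, earnings in millions, press clips, covers, gender)
rows = [
    (1, "Tom Cruise", 43.2, 11715, 11, 0),
    (12, "George Lucas", 250, 4002, 0, 0),
    (47, "Kiss", 24.5, 28386, 0, 0),
]
for rank, name, earnings, clips, covers, gender in rows:
    pred = predicted_rank(earnings, clips, covers, gender)
    resid = rank - pred  # residual = actual rank - predicted rank
    std = standard_residual(resid)
    flag = "  <- possible outlier" if abs(std) > 2 else ""
    print(f"{rank:2d} {name:13s} {pred:8.3f} {resid:8.3f} {std:6.3f}{flag}")
```

Small differences in the last decimals against the table are expected, because the coefficients are printed with limited precision.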

Now that we have identified some of the outliers in the residuals, we could eliminate them
altogether or we could turn that data set into binary numbers. Neither option seems viable
for this project, because the outliers were not the same in every data set and doing so
would do little to help explain the Power Ranking of Celebrities.




Even though there are outliers that may be causing the trend line to look more logarithmic,
I am going to convert Earnings and Press Clips into logs so as to control for curvature and
make the graphs more linear. I will only convert the X values to log, so I will use what
Excel calls semi-log. This may or may not have any impact on the regression. I cannot
convert Magazine Covers and Gender to logarithms because there are X values of 0, and it is
not possible to take the log of zero.

Earnings   Log earnings   Rank
43.2       1.635483747    1
53         1.72427587     2
70         1.84509804     3
38.5       1.58546073     4
70         1.84509804     5
37         1.568201724    6
35.5       1.550228353    7
42         1.62324929     8
150        2.176091259    9
31.8       1.50242712     10
48         1.681241237    11
250        2.397940009    12
44         1.643452676    13
51         1.707570176    14
59         1.770852012    15
18.9       1.276461804    16
24         1.380211242    17
28         1.447158031    18
39.5       1.596597096    19
36         1.556302501    20
31.5       1.498310554    21
35         1.544068044    22
60         1.77815125     23
20         1.301029996    24
20         1.301029996    25
25         1.397940009    26
31         1.491361694    27
31         1.491361694    28
23.8       1.376576957    29
37         1.568201724    30
30         1.477121255    31
28.4       1.45331834     32
25         1.397940009    33
14.4       1.158362492    34
24.5       1.389166084    35
25.5       1.40654018     36
26         1.414973348    37
23         1.361727836    38
28         1.447158031    39
33.2       1.521138084    40
17         1.230448921    41
50         1.698970004    42
17.5       1.243038049    43
18.3       1.26245109     44
17.1       1.23299611     45
45         1.653212514    46
24.5       1.389166084    47
18         1.255272505    48
26.2       1.418301291    49
23         1.361727836    50

[Figure: scatter plot "Effect of Log Earnings on Power Ranking". X axis: Log Earnings.
Y axis: Power Ranking. Series: Log Earnings, with a linear trend line.]

By taking the log of the data for Earnings in Millions and then re-graphing the data, I can
see that the graph has become more linear and easier to interpret. The slope is still
negative, but the relationship between increased Earnings in Millions and the effect on
Power Ranking is easier to see, although there are still a few outliers.

Press Clips   Log of Clipping   Rank
11,715        4.068742293       1
47,149        4.673472486       2
26,142        4.41733881        3
19,607        4.292411149       4
8,841         3.946501391       5
28,350        4.452553063       6
11,666        4.066921972       7
12,506        4.097118424       8
9,495         3.977494969       9
9,591         3.981863891       10
15,770        4.197831693       11
4,002         3.602277084       12
6,747         3.82911071        13
10,950        4.039414119       14
8,595         3.934245881       15
10,422        4.017951069       16
21,380        4.330007701       17
5,077         3.705607163       18
4,689         3.671080233       19
3,109         3.492620722       20
7,157         3.854731017       21
10,133        4.005738043       22
2,010         3.303196057       23
12,576        4.099542529       24
15,554        4.191842095       25
7,207         3.857754522       26
5,183         3.714581209       27
3,486         3.542327383       28
6,814         3.833402129       29
2,761         3.441066407       30
3,988         3.60075515        31
4,823         3.683317262       32
5,739         3.758836225       33
9,109         3.959470702       34
11,592        4.064158372       35
3,942         3.59571662        36
11,187        4.048713638       37
9,807         3.991536175       38
3,959         3.597585502       39
2,560         3.408239965       40
9,482         3.976899951       41
697           2.843232778       42
16,390        4.214578954       43
5,653         3.752278985       44
6,075         3.783546282       45
554           2.743509765       46
28,386        4.453104198       47
7,487         3.874307833       48
3,014         3.479143248       49
4,403         3.643748685       50

[Figure: scatter plot "Effect of Log Press Clippings on Power Ranking". X axis: Log of
Press Clippings. Y axis: Power Ranking. Series: Press Clippings, with a linear trend line.]

In this scatter plot, I can see that converting Press Clips to logarithms did in fact
control for curvature and helped the graph to become more linear. The graph of press clips
before taking the log showed some variability, but this graph may seem to show a little bit
more, and outliers are still present.

[Figure: "Gender Residual Plot". X axis: Gender. Y axis: Residuals.]

There do not seem to be any outliers for the gender residual plot, nor should there be any
since it is a binary variable.

[Figure: "Covers Residual Plot". X axis: Magazine Covers. Y axis: Residuals. Labeled points:
Jennifer Lopez, Julia Roberts.]

There are quite a few outliers in the data that increase its variability. Examples on the
residual plot for Magazine Covers are Jennifer Lopez, with 5 magazine covers and a rank of
only 34, and Julia Roberts, with 7 magazine covers and a rank of only 16. Most celebrities
ranked higher than them have 2 or fewer magazine covers.

[Figure: "Clips Residual Plot". X axis: Clips. Y axis: Residuals. Labeled points: Kiss,
Andre Agassi, Tiger Woods.]

The Press Clips Residual Plot shows a lot of variability, and there are several outliers.
Kiss has 28,386 press clips with a rank of 47, and Andre Agassi has 16,390 press clips with
a rank of 43. There is also Tiger Woods, with 47,149 press clips and a rank of number 2.
Most celebrities have fewer than 14,000 press clips.

These are some of the most powerful people in the world!???? SCARY!

[Figure: "Earnings Residual Plot". X axis: Earnings. Y axis: Residuals. Labeled points:
George Lucas, Oprah.]

The Earnings Residual Plot shows only 2 major outliers that are causing the data to be
skewed or varied. George Lucas earned 250 million dollars last year and is ranked only 12,
and Oprah Winfrey earned 150 million last year and is ranked only 9.
When examining the regression, I first looked at the Adjusted R Square term, which is
0.486131057, meaning that the explanatory variables explain 49% of the power ranking of
celebrities. I used the adjusted R Square because I have more than one explanatory
variable. The Significance F is 5.91E-07, which is less than 0.05. This indicates that this
model is statistically significant and we can have confidence in our regression model to be
fairly accurate. My p-values are all, except for Gender, less than 0.05, which means that at
the 0.05 level they are statistically significantly different from zero.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.726690816
R Square             0.528079542
Adjusted R Square    0.486131057
Standard Error       10.44974404
Observations         50

ANOVA
              df    SS            MS        F
Regression    4     5498.62823    1374.7    12.5887631
Residual      45    4913.87177    109.2
Total         49    10412.5

              Coefficients    Standard Error   t Stat     P-value
Intercept     41.15912737     2.920398184      14.094     4.0829E-18
Earnings      -0.172347809    0.040484706      -4.2571    0.00010384
Clips         -0.000485899    0.000189422      -2.5652    0.01371704
Covers        -2.928918648    0.743870326      -3.9374    0.00028326
Gender        -1.169116368    4.212586323      -0.2775    0.78264472

Equation: Power Ranking = 41.15912737 - 0.17234781(Earnings) - 0.0004859(Press Clips)
                   - 2.92891865(Magazine Covers) - 1.16911637(Gender)

The p-value for Earnings is less than 0.05, so its coefficient is statistically significant:
each additional million in earnings is associated with the ranking decreasing (coming closer
to one) by 0.1723, holding the other variables constant. Earnings has the smallest p-value,
making it the variable that most explains the power ranking.
The p-value for press clips is less than 0.05. Each additional press clipping is associated
with the ranking of a celebrity decreasing (closer to one) by 0.0004859.
The p-value of magazine covers is also less than 0.05, meaning that as the number of
magazine covers increases by one, the ranking of the celebrity will decrease (again, closer
to one) by about 2.93.
The p-value for gender is greater than 0.05, so at this level gender has no statistically
significant effect on the power ranking of celebrities.
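
The summary statistics quoted in this discussion follow directly from the ANOVA sums of squares. As a minimal Python sketch (variable names are illustrative; the inputs are the SS values and degrees of freedom from the SUMMARY OUTPUT of the regression without logs), R Square, Adjusted R Square, F and the Standard Error can be recomputed like this:

```python
import math

# ANOVA values from the SUMMARY OUTPUT of the regression without logs.
ss_regression = 5498.62823
ss_residual = 4913.87177
n_obs = 50   # observations
k = 4        # explanatory variables: Earnings, Clips, Covers, Gender

ss_total = ss_regression + ss_residual
df_residual = n_obs - k - 1

r_square = ss_regression / ss_total
# Adjusted R Square penalizes each additional explanatory variable.
adj_r_square = 1 - (ss_residual / df_residual) / (ss_total / (n_obs - 1))
f_stat = (ss_regression / k) / (ss_residual / df_residual)
standard_error = math.sqrt(ss_residual / df_residual)

print(f"R Square          {r_square:.6f}")
print(f"Adjusted R Square {adj_r_square:.6f}")
print(f"F                 {f_stat:.4f}")
print(f"Standard Error    {standard_error:.5f}")
```

Because Adjusted R Square divides each sum of squares by its degrees of freedom, it is the fairer figure to compare across models with several explanatory variables, which is the reason given above for using it.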




Since there did appear to be curvature in my [...] take the log of two of our explanatory
variables [...] above without the log values only explained [...] ranking, I decided to take
another regression [...] Earnings and Magazine Covers.

When examining the regression, I first looked at the Adjusted R Square term, which is
0.724758456, meaning that the explanatory variables explain 72% of the power ranking. This
is a higher percent explained than in the regression used above. The Significance F is [...]
less than 0.05. This indicates that this model is statistically significant and we can have
confidence in our regression model to be fairly accurate. The Significance F [...] believe
that I might have more confidence in this model [...]. My p-values are all, except for
Gender, less than 0.05, which means that at the 0.05 level they are statistically
significantly different from zero.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.864423018
R Square             0.747227154
Adjusted R Square    0.724758456
Standard Error       7.6477989
Observations         50

ANOVA
              df    SS             MS        F
Regression    4     7780.502739    1945.1    33.2563628
Residual      45    2631.997261    58.489
Total         49    10412.5

                Coefficients    Standard Error   t Stat     P-value
Intercept       147.47781       14.78077979      9.9777     5.5773E-13
Log earnings    -39.6358063     4.743043213      -8.3566    1.0494E-10
log clips       -15.06270601    3.226425648      -4.6685    2.7493E-05
Covers          -2.518800882    0.547950385      -4.5968    3.4757E-05
Gender          -1.902418836    3.049351464      -0.6239    0.53585863

Equation: Power Ranking = 147.47781 - 39.6358063(log earnings) - 15.062706(log press clips)
                   - 2.51880088(Magazine Covers) - 1.90241884(Gender)

The p-values here are much more statistically significant than the ones in the regression
above [...] much more of an impact on my coefficients of Log earnings, log clips, magazine
covers, and gender.

The p-value for log earnings is much less than 0.05, so we can conclude that a one-unit
increase in log earnings (a tenfold increase in earnings, since these are base-10 logs) is
associated with a 39.636 decrease in power ranking, making it much closer to one. This is a
huge jump from the coefficient in the regression that did not take the log of earnings,
although the two coefficients are measured in different units (per unit of log earnings
versus per million dollars). The p-value for log earnings in this regression (1.0494E-10) is
the smallest of all the variables, causing it to have the most effect on power ranking.
The p-value for log press clips is also much less than 0.05, and we can conclude that a
one-unit increase in log press clips is associated with a 15.063 decrease in power ranking,
making it closer to the number one ranking. This [...] the regression above.
The p-value for Magazine Covers is still less than 0.05, and it still has about the same
coefficient: one more magazine cover is associated with a 2.52 decrease in power ranking
(making it closer to one).
The p-value for Gender is still greater than 0.05, in this regression much greater than
0.05, and so it has no statistical significance.
 he regression, I first looked at the Adjusted R Square term which
meaning that the explanatory variables explain 49% of the power
 es. I used the adjusted R Square because I have more than one
  le. The Significance F is 5.91E-07 which is less than 0.05. This
 model is statistically significant and we can have confidence in
 del to be fairly accurate. My p-values are all, except for Gender,
 ich means at the 0.05 level they are statistically significantly




             Significance F
               5.91595E-07




              Lower 95%    Upper 95% Lower 95.0% Upper 95.0%
               35.27714374 47.04111099 35.2771437 47.04111099
              -0.253888189 -0.09080743 -0.25388819 -0.09080743
              -0.000867416 -0.00010438 -0.00086742 -0.00010438
              -4.427150327 -1.43068697 -4.42715033 -1.43068697
               -9.65370039 7.315467655 -9.65370039 7.315467655



  rnings) - 0.0004859(Press Clips)
 .16911637(Gender)
 at as earnings increase it will cause the
 ings has the smallest p-value, so it is

 er of press clippings increase it will
 one) by 0.0004959.
one) by 0.0004959.
 aning that as the number of magazine
(again, closer to one) by 2.9289.
 , it means that it has no statistically




 ppear to be curvature in my scatter plots and we did
wo of our explanatory variables and the regression
 e log values only explained 49% of the celebrity power
 d to take another regression using the log values for
agazine Covers.


gression, I first looked at the Adjusted R-Square term which is 0.72478456,
natory variables explain 72% of the power ranking of celebrities. This is a much
 d than in the regression used above. The Significance F is 6.49E-13 which is less
d that this model is statistically significant and we can have confidence in our
 fairly accurate. The Significance F is much lower in this regression, causing me to
e more confidence in this model than the one above. My p-values, again, are all,
  than 0.05, which means at the 0.05 level they are statistically significantly different



            Significance F
              6.48876E-13




              Lower 95%    Upper 95% Lower 95.0% Upper 95.0%
               117.7077926 177.2478273 117.707793 177.2478273
               -49.1887853 -30.0828273 -49.1887853 -30.0828273
              -21.56106056 -8.56435146 -21.5610606 -8.56435146
              -3.622429561   -1.4151722 -3.62242956  -1.4151722
              -8.044127688 4.239290016 -8.04412769 4.239290016

 6(log press clips)
1884(Gender)
ones in the regression above not using logs, and have
ones in the regression above not using logs, and have
ps, magazine covers, and gender.

we can conclude that an increase in earnings will cause
   This is a huge jump from the miniscule coefficient in
or earnings in this regression is again the lower p-value

  can conclude that an increase in press clips will
 number one ranking. This again, is a huge jump from

has about the same coefficient as before. An increase
aking it closer to one).
 much greater than 0.05 and therefore has no
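As a quick check on the SUMMARY OUTPUT for the log regression, the headline statistics can be recomputed from the ANOVA sums of squares and the coefficient table. The following is only a minimal Python sketch (Python is not part of the original Excel workbook). The variable names are my own, and the critical t value of 2.0141 for 45 degrees of freedom is an assumption that does not appear in the Excel output.

```python
import math

# Figures copied from the SUMMARY OUTPUT of the log regression.
N_OBS = 50                    # observations
N_VARS = 4                    # explanatory variables
SS_REGRESSION = 7780.502739
SS_RESIDUAL = 2631.997261

# Assumption: two-sided 95% critical value of Student's t with 45 degrees
# of freedom, rounded to four decimals (not printed in the Excel output).
T_CRIT_95 = 2.0141

# (coefficient, standard error) pairs from the coefficient table.
COEFFICIENTS = {
    "Intercept": (147.47781, 14.78077979),
    "Log earnings": (-39.6358063, 4.743043213),
    "log clips": (-15.06270601, 3.226425648),
    "Covers": (-2.518800882, 0.547950385),
    "Gender": (-1.902418836, 3.049351464),
}

df_residual = N_OBS - N_VARS - 1
ss_total = SS_REGRESSION + SS_RESIDUAL

# R Square is the share of the total variation explained by the regression.
r_square = SS_REGRESSION / ss_total
# Adjusted R Square penalises the number of explanatory variables.
adjusted_r_square = 1 - (1 - r_square) * (N_OBS - 1) / df_residual
mean_square_error = SS_RESIDUAL / df_residual
f_statistic = (SS_REGRESSION / N_VARS) / mean_square_error
standard_error = math.sqrt(mean_square_error)


def t_stat_and_interval(coefficient, std_error, t_crit=T_CRIT_95):
    """Return (t statistic, lower 95% bound, upper 95% bound)."""
    margin = t_crit * std_error
    return coefficient / std_error, coefficient - margin, coefficient + margin


print("R Square:", round(r_square, 6))
print("Adjusted R Square:", round(adjusted_r_square, 6))
print("F:", round(f_statistic, 4))
print("Standard Error:", round(standard_error, 5))
for name, (coef, se) in COEFFICIENTS.items():
    t_stat, lower, upper = t_stat_and_interval(coef, se)
    print(name, round(t_stat, 4), round(lower, 4), round(upper, 4))
```

The Gender interval computed this way spans zero, which is the same conclusion as its p-value being greater than 0.05.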
Now when looking at Hollywood Power Lists, at least the Forbes Hollywood Power list, we can look at the ranking with some confidence. The explanatory variables seemed to have some impact on the ranking. The regression without using logs explained almost 50% of the ranking and the regression taken when using the log of Earnings and Press Clips explained almost 75% of the ranking. So, Earnings, Press Clips, and Magazine Covers of celebrities seem to have a pretty vital impact on the ranking of celebrities in these Hollywood Power Lists. Gender, the variable that I put in because I was a little miffed that only 5 or 6 women were on the list and I hoped it would have some statistical significance, did not seem to have any effect on explaining the power ranking of celebrities (although I still think "the list" is a little gender biased).
There were a few problems with this project. There seemed to be heteroscedasticity in some of the data sets, especially the data set for press clips, that was very hard to control for, so perhaps recasting the model of some of the data in per capita terms might have helped to increase the validity of the data set and explanatory variables. Because of some of the extreme variability, it was difficult to control the curvature of the graph. I chose to use the semi-log to control for the curvature, but the quadratic form might also have been used, although I did not have much success with that method. And some of how people were ranked is still unclear... How did Tom Cruise get to be number one, when he only had the number one rating in magazine covers? How in the world is George Lucas number 12 when he earned 250 million dollars a year!? (That is A LOT of money!) There were other explanatory variables that were available to me, such as web hits, and TV and radio hits per celebrity, but I truly did not believe them to have a great explanatory effect on the power ranking.
All in all, most of my hypotheses at the beginning of the project were confirmed by my regression analysis. Earnings, magazine covers, and press clippings all do have an effect on how high on the power list a celebrity is ranked, and all of the outcomes of the regressions logically make sense. I thought gender might play a role, but my suspicions were not confirmed by the regression analysis. My hypothesis that Earnings would have the greatest effect on the power ranking was confirmed in the regression analysis using log values and in the regression without using log values. Although people like George Lucas with 250 million dollars and Oprah with 150 million dollars a year were certainly in the top 15, it would seem from my regression analysis that they would be higher up and Tom Cruise with a measly 43 million would have to take a back seat to these high powered stars. But I guess one cannot account for the factor of attractiveness, and maybe that's how Tom Cruise got his number one ranking!
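To see what the log regression itself says about the top of the list, the fitted equation can be evaluated for the first two rows of the data table (Tom Cruise and Tiger Woods). This is only a rough Python sketch, not part of the original workbook. The function name is hypothetical, and the use of base-10 logs is an assumption, because the spreadsheet does not say which log was taken.

```python
import math

# Coefficients from the log regression SUMMARY OUTPUT.
INTERCEPT = 147.47781
B_LOG_EARNINGS = -39.6358063
B_LOG_CLIPS = -15.06270601
B_COVERS = -2.518800882
B_GENDER = -1.902418836


def fitted_rank(earnings_millions, press_clips, covers, gender, log=math.log10):
    """Evaluate the fitted equation for one celebrity.

    gender follows the data table coding (m=0, f=1).
    log defaults to base 10, which is an assumption about the workbook.
    """
    return (
        INTERCEPT
        + B_LOG_EARNINGS * log(earnings_millions)
        + B_LOG_CLIPS * log(press_clips)
        + B_COVERS * covers
        + B_GENDER * gender
    )


# Rows 1 and 2 of the Forbes Celebrity 100 2001 data table.
tom_cruise = fitted_rank(43.2, 11715, 11, 0)
tiger_woods = fitted_rank(53, 47149, 5, 0)

print("Tom Cruise fitted rank:", round(tom_cruise, 2))
print("Tiger Woods fitted rank:", round(tiger_woods, 2))
```

Under the base-10 assumption both fitted values come out below 1, so the model places both men at the very top of the list, with Tom Cruise's eleven magazine covers making up for his lower earnings.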

				
Posted: 8/2/2012