An Intelligence Approach to Evaluation of Sports Teams

Shared by: sammyc2007
posted:
5/28/2008
language:
English
An Intelligence Approach to Evaluation of Sports Teams
by Edward Kambour, Ph.D.

Agenda
  I. College Football
  II. Linear Model
  III. Generalized Linear Model
  IV. Intelligence (Bayesian) Approach
  V. Results
  VI. Other Sports
  VII. Future Work

General Background: Goals
- Forecast winners of future games
  - Beat the Bookie!
- Estimate the outcome of unscheduled games
  - What's the probability that Iowa would have beaten Ohio St?
- Generate reasonable rankings

Major College Football
- No playoff system
- "Computer rankings" are an element of the BCS
- 114 teams
- 12 games for each in a season

Linear Model
- Rothman (1970s), Harville (1977), Stefani (1977), ..., Kambour (1991), ..., Sagarin???
- Response, $Y$, is the net result (point-spread)
- Parameter, $\beta$, is the vector of ratings
- For a game involving teams $i$ and $j$, $E[Y] = \beta_i - \beta_j$

Linear Model (cont.)
- Let $X$ be a row vector with
  $X_k = \begin{cases} 1 & \text{if } k = i \\ -1 & \text{if } k = j \\ 0 & \text{otherwise} \end{cases}$
- $E[Y] = X\beta$
- Regression model

Notes
- Least squares: assumes normality and homogeneity
- College Football
  - Estimate 100 parameters
  - Sample size for a full season is about 600
  - Design matrix is sparse and not full rank

Home-field Advantage
- Generic advantage (Stefani, 1980)
  - Force $i$ to be the home team and $j$ the visiting team
  - Add an intercept term to $X$
  - Adds one more parameter to estimate
  - UAB = Alabama; Rice = Texas A&M
- Team-specific advantage
  - Doubles the number of parameters to estimate

Linear Model Issues
- Normality
- Homogeneity
- Lots of parameters, with relatively small sample size
  - Overfitting
  - The bookie takes you to the cleaners!

Linear Model Issues (cont.)
- Should we model point differential?
  - A and B play twice
    - A by 34 in the first, B by 14 in the second
    - A by 10 each time
  - Running up the score (or lack thereof)
- BCS: Thou shalt not use margin of victory in thy ratings!
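The linear model with a generic home-field advantage can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's code: the function name `fit_ratings`, the three teams, and the four game margins are all hypothetical, and NumPy's minimum-norm least-squares solver is used here as one way to cope with a design matrix that is not full rank.

```python
import numpy as np

def fit_ratings(games, teams):
    """Least-squares ratings with a generic home-field intercept.

    games: list of (home, visitor, home_margin) tuples (hypothetical data).
    Returns (dict of centered ratings, home-field advantage).
    """
    index = {team: k for k, team in enumerate(teams)}
    X = np.zeros((len(games), len(teams) + 1))
    y = np.zeros(len(games))
    for row, (home, visitor, margin) in enumerate(games):
        X[row, index[home]] = 1.0      # +1 for team i (home)
        X[row, index[visitor]] = -1.0  # -1 for team j (visitor)
        X[row, -1] = 1.0               # intercept = generic home advantage
        y[row] = margin
    # X is rank deficient (adding a constant to every rating changes nothing),
    # so lstsq returns the minimum-norm solution.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ratings = beta[:-1] - beta[:-1].mean()  # center ratings at zero
    return dict(zip(teams, ratings)), beta[-1]

# Made-up, noise-free results generated from ratings A=5, B=0, C=-5
# and a home advantage of 3 points.
teams = ["A", "B", "C"]
games = [("A", "B", 8.0), ("B", "C", 8.0), ("C", "A", -7.0), ("B", "A", -2.0)]
ratings, home_adv = fit_ratings(games, teams)
```

Only rating differences are identified, which is why the ratings are centered before being returned. With real results the fit would be noisy, which is the overfitting concern the slides raise.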
Logistic Regression Rothman (1970s) Linear Model Use binary variable Winning is all that matters Avoid margin of victory Coin Flips Logistic Regression Issues Still have sample size issues Throw away a lot of information Undefeated teams Transformations Transform the differentials to normality Power transformations Rothman logistic transform Transforms points to probabilities for logistic regression “Diminishing returns” transforms Downweights runaway scores Power Transforms Transform the point-spread  Y = sign(Z)|Z|a a = 1  straight margin of victory a = 0  just win baby a = 0  Poisson or Gamma “ish” Maximum Likelihood Transform 1995-2002 seasons Power 0.1 0.3 0.5 -2ln(likelihood) 52487 41213 35128 0.67 0.8 1 32597 31418 31193 MLE = 0.98 Predicting the Score Model point differential Additionally model the sum of the points scored  Y2 = Si + Sj Fit a similar linear model (different parameter estimates)  Y1 = Si – Sj Forecast home and visitors score H = (Y1 + Y2 )/2, V = (Y2 - Y1)/2 Another Transformation Idea Scores (touchdowns or field goals) are arrivals, maybe Poisson Final score = 7 times a Poisson + 3 times a Poisson + … Transform the scores to homogeneity and normality first The differences (and sums) should follow suit Square Root Transform Since the score is “similar” to a linear combination of Poissons, square root should work Transformation T  S k  Why k? For small Poisson arrival rates, get better performance (Anscombe, 1948) Likelihood Test LRT: No transformation vs. 
square root with fitted k Used College Football results from 1995-2002 k = 21 Transformation was significantly better p-value = 0.0023, chi-square = 9.26 Predicting the Score with Transform Model point differential  Y1  Si  21  S j  21 Additionally model the sum of the points scored  Y2  Si  21  S j  21 Forecast home and visitors score H = ((Y1 + Y2 )/2)2 , V = ((Y2 - Y1)/2)2 Note the point differential is the product Unresolved Linear Model Issues Overfitting History Going into the season, we have a good idea as to how teams will do The best teams tend to stay the best The worst teams tend to stay the worst Changes happen Kansas State Intelligence Model Concept The ratings and home-ads for year t are similar to those of year t-1. There is some drift from one year to the next. Model   t   t 1   t where  t ~ N(0,  2 ) Intelligence Model (Details) Notation  L teams  M seasons of data  Ni games in the ith season Xi : the Ni by 2L “X” matrix for season i Yi : the Ni vector of results for season i  i : the Ni vector of results for season I Details (cont.) Data Distribution: For all i = 1, 2, …, M  Yi  N  Xi  i ,  2  (independent) Details (cont.) 
- Prior distribution
  - $\beta_1 \sim N\left(0,\ \sigma^2 \begin{pmatrix} I & 0 \\ 0 & 0.05I \end{pmatrix}\right)$
  - $\beta_i \sim N\left(\beta_{i-1},\ \sigma^2 \begin{pmatrix} 0.25I & 0 \\ 0 & 0.01I \end{pmatrix}\right)$ for $i = 2, \ldots, M$
  - $\sigma^{-2} \sim \Gamma(2, 0.5)$

Details (finally, the end)
- The posterior distribution of $\beta_M$ and $\sigma^{-2}$ is closed form and can be calculated by an iterative method
- The predictive distribution for future results (transformed sum or difference) is straightforward correlated normal (given the variance)

Forecasts
- For scores
  - Simply untransform
  - $E[Z^2] = \mathrm{Var}[Z] + E[Z]^2$
- For the point-spread
  - Product of two normals
  - Simulate 10000 results

Enhanced Model
- Fit the prior parameters
  - Hierarchical models
  - Drifts and initial variances
- No closed form for posterior and predictive distributions (at least as far as I know)
- The complete conditionals are straightforward, so Gibbs sampling will work (eventually)

(www.geocities.com/kambour/football.html)

Results: 2002 Final Rankings

  Team            Rating          Home
  Miami           72.23 (1.03)     0.21 (0.04)
  Kansas St       72.04 (1.04)     0.44 (0.03)
  USC             71.95 (1.03)     0.04 (0.03)
  Oklahoma        71.85 (1.02)     0.18 (0.03)
  Texas           71.57 (1.03)     0.36 (0.03)
  Georgia         71.49 (1.03)     0.02 (0.03)
  Alabama         71.45 (1.03)    -0.09 (0.03)
  Iowa            71.30 (1.03)     0.21 (0.04)
  Florida St      71.29 (1.02)     0.43 (0.03)
  Virginia Tech   71.25 (1.03)     0.12 (0.03)
  Ohio St         71.18 (1.03)     0.27 (0.03)

Bowl Predictions

  Team (line)         Predicted score
  Ohio St             17
  Miami Fl (-13)      31

  Washington St       21
  Oklahoma (-6.5)     31

  Iowa                21
  USC (-6)            30

  NC State (E)        20
  Notre Dame          17

  Florida St (+4)     24
  Georgia             27

  Probabilities listed on the slide (row assignment not recoverable): 0.8255, 0.7347, 0.7174, 0.5228, 0.5797, 0.5721, 0.5639, 0.5719, 0.5639, 0.5320

2002 Final Record
- Picking winners: 522 - 157 (0.769)
- Against the Vegas lines: 367 - 307 - 5 (0.544)
- Best bets: 9 - 7 (0.563)
  - In 2001, 11 - 4

ESPN College Pick'em (http://games.espn.go.com/cpickem/leader)
  1. Barry Schultz          5830
  2. Jim Dobbs              5687
  3. Michael Reeves         5651
  4. Fup Biz                5594
  5. Joe *                  5587
  6. Rising Cream           5562
  7. Intelligence Ratings   5559

Ratings System Comparison (http://tbeck.freeshell.org/fb/awards2002.html)
- Todd Beck, Ph.D., Statistician, Rush Institute
- Intelligence Ratings: Best Predictors

College Football Conclusions
- Can forecast the outcome of games
  - Capture the random nature
  - High variability
  - Sparse design
- Scientists should avoid BCS
  - Statistical significance is impossible
  - Problem complexity
  - Other issues

NFL
- Similar to College Football
  - Square root transform is applicable
- Drift is a little higher than College Football
- Better design matrix
- Small sample size
- Playoff

(www.geocities.com/kambour/NFL.html)

NFL Results: 2002 Final Rankings (after the Super Bowl)

  Team            Rating    Home
  Tampa Bay       70.72      0.29
  Oakland         70.57      0.28
  Philadelphia    70.55      0.10
  New England     70.16      0.12
  Atlanta         70.13      0.20
  NY Jets         70.10     -0.01
  Pittsburgh      69.95      0.28
  Green Bay       69.92      0.28
  Kansas City     69.90      0.51
  Denver          69.89      0.50
  Miami           69.89      0.49

2002 Final NFL Record
- Picking winners: 162 - 104 - 1 (0.609)
- Against the Vegas lines: 135 - 128 - 4 (0.513)
- Best bets: 9 - 8 (0.529)

NFL Europe
- Similar to College and NFL
  - Square root transform
- Dramatic drift
  - Teams change dramatically in mid-season
- Few teams
  - Better design matrix

College Basketball
- Transform?
  - Much more normal (Central Limit Theorem)
- A lot more games
  - Intersectional games
- Less emphasis on programs than in College Football
  - More drift
- NCAA tournament

NCAA Basketball Pre-tournament Ratings

  Team           Rating    Home
  Arizona        100.06    3.97
  Kentucky        99.33    4.32
  Kansas          95.89    3.85
  Texas           93.42    4.44
  Duke            92.90    4.66
  Oklahoma        90.19    4.31
  Florida         90.65    3.99
  Wake Forest     88.70    3.65
  Syracuse        88.50    3.49
  Xavier          87.89    3.37
  Louisville      87.88    4.16

NBA
- Similar to College Basketball
  - Normal: no transformation
- A lot more games, fewer teams
- Playoffs are completely different from regular season
  - Regular season: very balanced, strong home court
  - Post season: less balanced, home court lessened

Hockey
- Transform
  - Rare events = "Poissonish"
  - Square root with k around 1
- A lot more games
- History matters
- Playoffs seem similar to regular season
- Balance

Soccer
- Similar to hockey
- Transform
  - Square root with low k
- Not a lot of games
  - Friendlies versus cup play
- Home pitch is pronounced
  - Varies widely

Soccer Results
- Correctly forecasted 2002 World Cup final
  - Brazil over Germany
- Correctly forecasted US run to quarter-finals
- Won the PROS World Cup Soccer Pool

Future Enhancements
- Hierarchical approaches
  - Conferences
- More complicated drift models
  - Correlations
  - Individual drifts
  - Drift during the season
  - Mean correcting drift
- More informative priors
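
The square-root transform recurs across the sports in this deck (football with k = 21, hockey with k around 1, soccer with low k), so a small sketch of the forecasting step may help. It assumes the predictive distribution of the transformed difference Y1 and sum Y2 is bivariate normal with known means, variances, and covariance. The function names, the argument layout, the variances used, and the explicit subtraction of k when untransforming are assumptions made for illustration, not the author's implementation.

```python
import numpy as np

K = 21.0  # shift reported in the slides for college football

def transform(score, k=K):
    """Square-root transform T = sqrt(S + k)."""
    return np.sqrt(score + k)

def expected_scores(m1, v1, m2, v2, cov=0.0, k=K):
    """Untransform using E[Z^2] = Var[Z] + E[Z]^2, then remove the shift k.

    m1, v1: mean and variance of Y1 (transformed difference)
    m2, v2: mean and variance of Y2 (transformed sum)
    """
    mean_h, mean_v = (m1 + m2) / 2.0, (m2 - m1) / 2.0
    var_h = (v1 + v2 + 2.0 * cov) / 4.0
    var_v = (v1 + v2 - 2.0 * cov) / 4.0
    return var_h + mean_h**2 - k, var_v + mean_v**2 - k

def simulate_spread(m1, v1, m2, v2, cov=0.0, n=10_000, seed=0):
    """Simulate the point-spread as the product of two normals, Y1 * Y2."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal([m1, m2], [[v1, cov], [cov, v2]], size=n)
    return draws[:, 0] * draws[:, 1]

# Hypothetical forecast centered on a 31-17 home win.
m1 = transform(31.0) - transform(17.0)
m2 = transform(31.0) + transform(17.0)
home, visitor = expected_scores(m1, 0.04, m2, 0.04)
spreads = simulate_spread(m1, 0.04, m2, 0.04)
home_win_prob = float((spreads > 0).mean())
```

Because the spread is a product of two normals, its distribution is skewed, which is presumably why the slides simulate it rather than use a normal approximation.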
