An Intelligence Approach to Evaluation of Sports Teams
by Edward Kambour, Ph.D.
Agenda
I. College Football
II. Linear Model
III. Generalized Linear Model
IV. Intelligence (Bayesian) Approach
V. Results
VI. Other Sports
VII. Future Work
General Background
Goals
Forecast winners of future games
Beat the Bookie!
Estimate the outcome of unscheduled games
What’s the probability that Iowa would have beaten Ohio St?
Generate reasonable rankings
Major College Football
No playoff system
“Computer rankings” are an element of the BCS
114 teams
12 games for each in a season
Linear Model
Rothman (1970’s), Harville (1977), Stefani (1977), …, Kambour (1991), …, Sagarin???
Response, $Y$, is the net result (point-spread)
Parameter, $\theta$, is the vector of ratings
For a game involving teams $i$ and $j$,
$E[Y] = \theta_i - \theta_j$
Linear Model (cont.)
Let $X$ be a row vector with
$X_k = \begin{cases} 1 & \text{if } k = i \\ -1 & \text{if } k = j \\ 0 & \text{otherwise} \end{cases}$
$E[Y] = X\theta$
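The row-vector construction above can be sketched in a few lines. This is a minimal illustration, assuming teams are indexed from 0 and the ratings sit in a plain list; the function names and the rating values are hypothetical, not taken from the talk.

```python
def design_row(i, j, n_teams):
    """Row for a game between team i and team j: +1 at i, -1 at j, 0 elsewhere."""
    row = [0] * n_teams
    row[i] = 1
    row[j] = -1
    return row


def expected_margin(row, ratings):
    """Expected net result: the dot product of the row with the rating vector."""
    return sum(x * r for x, r in zip(row, ratings))


# Hypothetical ratings for four teams (illustrative values only).
ratings = [10.0, 3.0, -2.0, -11.0]
row = design_row(0, 2, len(ratings))
print(row, expected_margin(row, ratings))
```

Each row has exactly two nonzero entries, so stacking one row per game gives a sparse matrix whose columns sum to zero.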
Regression Model Notes
Least Squares
Normality, Homogeneity
College Football
Estimate 100 parameters
Sample size for a full season is about 600
Design Matrix is sparse and not full rank
Home-field Advantage
Generic Advantage (Stefani, 1980)
Force i to be home team and j the visiting team
Add an intercept term to X
Adds one more parameter to estimate
UAB = Alabama
Rice = Texas A&M
Team Specific Advantage
Doubles the number of parameters to estimate
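The two home-field options can be sketched as variations on the design row. This is an illustration only, assuming the home-advantage columns are appended after the rating columns; the function name and column layout are hypothetical.

```python
def design_row(home, visitor, n_teams, team_specific=False):
    """Design row with the home team at +1 and the visiting team at -1.

    Generic advantage: one extra intercept column shared by every home team.
    Team-specific advantage: one extra column per team (2 * n_teams in total).
    """
    n_extra = n_teams if team_specific else 1
    row = [0] * (n_teams + n_extra)
    row[home] = 1
    row[visitor] = -1
    # Switch on the home-advantage column that applies to this game.
    row[n_teams + (home if team_specific else 0)] = 1
    return row


print(design_row(0, 2, 4))                      # generic: 4 ratings + 1 intercept
print(design_row(0, 2, 4, team_specific=True))  # team-specific: 4 ratings + 4 advantages
```

With the generic version every home team shares one advantage; the team-specific version lets each stadium differ at the cost of doubling the parameter count.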
Linear Model Issues
Normality
Homogeneity
Lots of parameters, with relatively small sample size
Overfitting
The bookie takes you to the cleaners!
Linear Model Issues (cont.)
Should we model point differential?
A and B play twice
A by 34 in first, B by 14 in the second
A by 10 each time
Running up the score (or lack thereof)
BCS: Thou shalt not use margin of victory in thy ratings!
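The A versus B example can be checked numerically. With only two teams, the least-squares estimate of the rating gap is simply the average margin, so both scenarios give the same gap and differ only in how well it fits. The helper names below are hypothetical.

```python
from statistics import mean


def rating_gap(margins):
    """Least-squares estimate of (rating A - rating B) from A's margins."""
    return mean(margins)


def residual_ss(margins):
    """Sum of squared residuals around the fitted gap."""
    gap = rating_gap(margins)
    return sum((m - gap) ** 2 for m in margins)


split_results = [34, -14]   # A by 34, then B by 14
steady_results = [10, 10]   # A by 10 each time

for label, margins in (("split", split_results), ("steady", steady_results)):
    print(label, rating_gap(margins), residual_ss(margins))
```

A point-differential model rates A ten points better in both cases, even though A lost one of the two games in the first scenario.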
Logistic Regression
Rothman (1970s)
Linear Model
Use binary variable
Winning is all that matters
Avoid margin of victory
Coin Flips
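A minimal sketch of the win/loss version, assuming the log-odds of a win equal the difference of two ratings on a logistic scale. The function name and the rating values are hypothetical and only illustrate the "coin flip" view of a game.

```python
import math


def win_probability(rating_i, rating_j):
    """P(team i beats team j) when the log-odds equal rating_i - rating_j."""
    return 1.0 / (1.0 + math.exp(-(rating_i - rating_j)))


# Evenly matched teams are a fair coin; a larger gap biases the coin.
print(win_probability(1.0, 1.0))
print(win_probability(2.0, 0.5))
```

Only the sign of each result enters the likelihood, so a one-point win and a forty-point win count the same.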
Logistic Regression Issues
Still have sample size issues
Throw away a lot of information
Undefeated teams
Transformations
Transform the differentials to normality
Power transformations
Rothman logistic transform
Transforms points to probabilities for logistic regression
“Diminishing returns” transforms
Downweights runaway scores
Power Transforms
Transform the point-spread
$Y = \operatorname{sign}(Z)\,|Z|^a$
a = 1 straight margin of victory
a = 0 just win baby
a = 0 Poisson or Gamma “ish”
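A short sketch of the signed power transform. The function name is hypothetical, and the value a = 0.5 in the usage lines is included only as an illustrative square-root case, not as a value taken from the slide.

```python
import math


def signed_power(z, a):
    """Signed power transform: sign(z) * |z| ** a, with a tie mapped to 0."""
    if z == 0:
        return 0.0
    return math.copysign(abs(z) ** a, z)


print(signed_power(34, 1))     # straight margin of victory
print(signed_power(-14, 0))    # only the winner matters
print(signed_power(-16, 0.5))  # illustrative square-root case
```

Smaller exponents shrink large margins more, which is how the transform downweights runaway scores.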
Maximum Likelihood Transform
1995-2002 seasons
Power    -2ln(likelihood)
0.1      52487
0.3      41213
0.5      35128
0.67     32597
0.8      31418
1        31193
MLE = 0.98
Predicting the Score
Model point differential
$Y_1 = S_i - S_j$
Additionally model the sum of the points scored
$Y_2 = S_i + S_j$
Fit a similar linear model (different parameter estimates)
Forecast home and visitors score
H = (Y1 + Y2 )/2, V = (Y2 - Y1)/2
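The score recovery is plain arithmetic, sketched below with hypothetical fitted values for the difference and the sum.

```python
def scores_from_diff_and_sum(y1, y2):
    """Recover home and visitor scores from the difference y1 and the sum y2."""
    home = (y1 + y2) / 2
    visitor = (y2 - y1) / 2
    return home, visitor


# Hypothetical forecasts: home team by 14, 48 total points.
print(scores_from_diff_and_sum(14, 48))
```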
Another Transformation Idea
Scores (touchdowns or field goals) are arrivals, maybe Poisson
Final score = 7 times a Poisson + 3 times a Poisson + …
Transform the scores to homogeneity and normality first
The differences (and sums) should follow suit
Square Root Transform
Since the score is “similar” to a linear combination of Poissons, square root should work
Transformation
$T = \sqrt{S + k}$
Why k?
For small Poisson arrival rates, get better performance (Anscombe, 1948)
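The effect of the offset can be checked exactly for a single Poisson count. The sketch below uses Anscombe's constant 3/8 rather than the value fitted for football, and compares how close the variance of the transformed count is to the stabilized value 1/4. The function names and the truncation point are assumptions made for illustration.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability, computed on the log scale to avoid overflow."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def variance_of_sqrt(lam, k, max_count=200):
    """Variance of sqrt(X + k) for X ~ Poisson(lam), summing over 0..max_count."""
    counts = range(max_count + 1)
    first = sum(poisson_pmf(x, lam) * math.sqrt(x + k) for x in counts)
    second = sum(poisson_pmf(x, lam) * (x + k) for x in counts)
    return second - first ** 2


for lam in (2.0, 10.0):
    print(lam, variance_of_sqrt(lam, 0.0), variance_of_sqrt(lam, 0.375))
```

At a low arrival rate the plain square root is noticeably further from 1/4 than the shifted version, which is the motivation for fitting k.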
Likelihood Test
LRT: No transformation vs. square root with fitted k
Used College Football results from 1995-2002
k = 21
Transformation was significantly better
p-value = 0.0023, chi-square = 9.26
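The reported p-value can be reproduced from the chi-square statistic, assuming the test has one degree of freedom (the slide does not state the degrees of freedom; that is an assumption here). For one degree of freedom the upper tail has a closed form in terms of the complementary error function.

```python
import math


def chi2_upper_tail_1df(x):
    """P(chi-square with 1 degree of freedom > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_upper_tail_1df(9.26)
print(round(p_value, 4))
```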
Predicting the Score with Transform
Model point differential
$Y_1 = \sqrt{S_i + 21} - \sqrt{S_j + 21}$
Additionally model the sum of the points scored
$Y_2 = \sqrt{S_i + 21} + \sqrt{S_j + 21}$
Forecast home and visitors score
$H = ((Y_1 + Y_2)/2)^2 - 21$, $V = ((Y_2 - Y_1)/2)^2 - 21$
Note the point differential is the product
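A round-trip sketch of the transformed model, assuming the transform is the square root of the score plus k with k = 21. Under that reading, squaring recovers the score plus k, so k is subtracted to get back to points, and the point differential equals the product of the transformed difference and sum. The function names and the example scores are hypothetical.

```python
import math

K = 21  # offset reported for college football


def transform(score, k=K):
    """Square-root transform of a raw score."""
    return math.sqrt(score + k)


def forecast_scores(y1, y2, k=K):
    """Back-transform the modeled difference y1 and sum y2 to raw scores."""
    home = ((y1 + y2) / 2) ** 2 - k
    visitor = ((y2 - y1) / 2) ** 2 - k
    return home, visitor


# Hypothetical final score: home 31, visitor 17.
y1 = transform(31) - transform(17)
y2 = transform(31) + transform(17)
home, visitor = forecast_scores(y1, y2)
print(home, visitor, y1 * y2)
```

The offset cancels in the difference of the two back-transformed scores, which is why the spread reduces to the product.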
Unresolved Linear Model Issues
Overfitting
History
Going into the season, we have a good idea as to how teams will do
The best teams tend to stay the best
The worst teams tend to stay the worst
Changes happen
Kansas State
Intelligence Model
Concept
The ratings and home-field advantages for year t are similar to those of year t-1.
There is some drift from one year to the next.
Model
$\theta_t = \theta_{t-1} + \delta_t$, where
$\delta_t \sim N(0, \tau^2)$
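The drift idea can be sketched as a random walk for a single team's rating. This is an illustration only: the starting rating, the drift standard deviation, and the function name are hypothetical, and the seed is fixed so the path is reproducible.

```python
import random


def simulate_rating(start, drift_sd, n_seasons, seed=0):
    """Random-walk rating: each season equals the last plus normal drift."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_seasons):
        path.append(path[-1] + rng.gauss(0.0, drift_sd))
    return path


# Hypothetical team: starts at 70.0 and drifts with standard deviation 0.5.
print(simulate_rating(70.0, 0.5, 5))
```

Because each season starts from the previous one, last year's rating acts as the prior mean for this year, and the uncertainty grows by the drift variance each season without data.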
Intelligence Model (Details)
Notation
L teams
M seasons of data
$N_i$ games in the $i$th season
$X_i$: the $N_i$ by $2L$ “X” matrix for season $i$
$Y_i$: the $N_i$ vector of results for season $i$
$\theta_i$: the $2L$ vector of ratings and home-field advantages for season $i$
Details (cont.)
Data Distribution:
For all $i = 1, 2, \ldots, M$
$Y_i \sim N(X_i \theta_i, \sigma^2)$ (independent)
Details (cont.)
Prior Distribution
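$\theta_1 \sim N\left(0,\ \sigma^2 \begin{pmatrix} I & 0 \\ 0 & 0.05I \end{pmatrix}\right)$
$\theta_i \sim N\left(\theta_{i-1},\ \sigma^2 \begin{pmatrix} 0.25I & 0 \\ 0 & 0.01I \end{pmatrix}\right)$, for $i = 2, \ldots, M$
$\sigma^{-2} \sim \Gamma(2, 0.5)$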
Details (finally, the end)
The Posterior Distribution of $\theta_M$ and $\sigma^{-2}$ is closed form and can be calculated by an iterative method
The Predictive Distribution for future results (transformed sum or difference) is straightforward correlated normal (given the variance)
Forecasts
For Scores
Simply untransform
$E[Z^2] = \operatorname{Var}[Z] + E[Z]^2$
For the point-spread
Product of two normals
Simulate 10000 results
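Both forecasting steps can be sketched briefly. The first function applies the second-moment identity to a transformed score and then removes the offset k = 21, assuming the transform is the square root of the score plus k. The second simulates the spread as a product of two normals; it draws the two factors independently, which is a simplification, since the predictive distribution described above is correlated. All names and numeric inputs are hypothetical.

```python
import random


def expected_score(mean_z, var_z, k=21):
    """E[score] = Var[Z] + E[Z]^2 - k for a transformed score Z = sqrt(score + k)."""
    return var_z + mean_z ** 2 - k


def simulate_point_spread(mean_y1, sd_y1, mean_y2, sd_y2, n=10000, seed=0):
    """Draw n spreads as products of (independent, by assumption) normal draws."""
    rng = random.Random(seed)
    return [rng.gauss(mean_y1, sd_y1) * rng.gauss(mean_y2, sd_y2) for _ in range(n)]


print(expected_score(7.0, 0.5))

# Hypothetical predictive means and standard deviations on the transformed scale.
spreads = simulate_point_spread(0.5, 1.0, 12.5, 1.0)
home_win_prob = sum(s > 0 for s in spreads) / len(spreads)
mean_spread = sum(spreads) / len(spreads)
print(home_win_prob, mean_spread)
```

The same simulated spreads can be compared against a betting line to estimate the probability of covering it.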
Enhanced Model
Fit the prior parameters
Hierarchical models
Drifts and initial variances
No closed form for posterior and predictive distributions (at least as far as I know)
The complete conditionals are straightforward, so Gibbs sampling will work (eventually)
(www.geocities.com/kambour/football.html)
Results
2002 Final Rankings
Team            Rating          Home
Miami           72.23 (1.03)    0.21 (0.04)
Kansas St       72.04 (1.04)    0.44 (0.03)
USC             71.95 (1.03)    0.04 (0.03)
Oklahoma        71.85 (1.02)    0.18 (0.03)
Texas           71.57 (1.03)    0.36 (0.03)
Georgia         71.49 (1.03)    0.02 (0.03)
Alabama         71.45 (1.03)    -0.09 (0.03)
Iowa            71.30 (1.03)    0.21 (0.04)
Florida St      71.29 (1.02)    0.43 (0.03)
Virginia Tech   71.25 (1.03)    0.12 (0.03)
Ohio St         71.18 (1.03)    0.27 (0.03)
Bowl Predictions
Ohio St           17    Miami Fl (-13)    31    0.8255    0.5228
Washington St     21    Oklahoma (-6.5)   31    0.7347    0.5797
Iowa              21    USC (-6)          30    0.7174    0.5721
NC State (E)      20    Notre Dame        17    0.5639    0.5639
Florida St (+4)   24    Georgia           27    0.5719    0.5320
2002 Final Record
Picking Winners
522 – 157 (0.769)
Against the Vegas lines
367 – 307 – 5 (0.544)
Best Bets
9 – 7 (0.563)
In 2001, 11 – 4
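The percentages above can be reproduced from the records. The reported figures appear consistent with counting a push as half a win; that convention is an inference from the numbers, not something the slide states, and the function name is hypothetical.

```python
def win_pct(wins, losses, pushes=0):
    """Winning percentage with each push counted as half a win (assumed convention)."""
    return (wins + 0.5 * pushes) / (wins + losses + pushes)


print(round(win_pct(522, 157), 3))     # picking winners
print(round(win_pct(367, 307, 5), 3))  # against the Vegas lines
print(win_pct(9, 7))                   # best bets
```

Any rate above 0.5 against the line is only profitable if it also covers the bookmaker's commission.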
ESPN College Pick’em
(http://games.espn.go.com/cpickem/leader)
1. Barry Schultz          5830
2. Jim Dobbs              5687
3. Michael Reeves         5651
4. Fup Biz                5594
5. Joe *                  5587
6. Rising Cream           5562
7. Intelligence Ratings   5559
Ratings System Comparison
(http://tbeck.freeshell.org/fb/awards2002.html)
Todd Beck
Ph.D. Statistician Rush Institute
Intelligence Ratings – Best Predictors
College Football Conclusions
Can forecast the outcome of games
Capture the random nature
High variability
Sparse design
Scientists should avoid BCS
Statistical significance is impossible
Problem Complexity
Other issues
NFL
Similar to College Football
Square root transform is applicable
Drift is a little higher than College Football
Better design matrix
Small sample size
Playoff
(www.geocities.com/kambour/NFL.html)
NFL Results
2002 Final Rankings (after the Super Bowl)
Team            Rating    Home
Tampa Bay       70.72     0.29
Oakland         70.57     0.28
Philadelphia    70.55     0.10
New England     70.16     0.12
Atlanta         70.13     0.20
NY Jets         70.10     -0.01
Pittsburgh      69.95     0.28
Green Bay       69.92     0.28
Kansas City     69.90     0.51
Denver          69.89     0.50
Miami           69.89     0.49
2002 Final NFL Record
Picking Winners
162 – 104 – 1 (0.609)
Against the Vegas lines
135 – 128 – 4 (0.513)
Best Bets
9 – 8 (0.529)
NFL Europe
Similar to College and NFL
Square root transform
Dramatic drift
Teams change dramatically in mid-season
Few teams
Better design matrix
College Basketball
Transform?
Much more normal (Central Limit Theorem)
A lot more games
Intersectional games
Less emphasis on programs than in College Football
More drift
NCAA tournament
NCAA Basketball
Pre-tournament Ratings
Team           Rating    Home
Arizona        100.06    3.97
Kentucky       99.33     4.32
Kansas         95.89     3.85
Texas          93.42     4.44
Duke           92.90     4.66
Oklahoma       90.19     4.31
Florida        90.65     3.99
Wake Forest    88.70     3.65
Syracuse       88.50     3.49
Xavier         87.89     3.37
Louisville     87.88     4.16
NBA
Similar to College Basketball
Normal – No transformation
A lot more games – fewer teams
Playoffs are completely different from regular season
Regular season – very balanced, strong home court
Post season – less balanced, home court lessened
Hockey
Transform
Rare events = “Poissonish”
Square root with k around 1
A lot more games
History matters
Playoffs seem similar to regular season
Balance
Soccer
Similar to hockey
Transform
Square root with low k
Not a lot of games
Friendlies versus cup play
Home pitch is pronounced
Varies widely
Soccer Results
Correctly forecasted 2002 World Cup final
Brazil over Germany
Correctly forecasted US run to quarter-finals
Won the PROS World Cup Soccer Pool
Future Enhancements
Hierarchical Approaches
Conferences
More complicated drift models
Correlations
Individual drifts
Drift during the season
Mean correcting drift
More informative priors