        Simulating Sports: The Inputs and the Engines

                Paul Bessire
        General Manager, Co-Founder
          PredictionMachine.com
            September 29, 2010
               Table of Contents
• Intro

• PredictionMachine.com & Simulation Overview

• Simulating Baseball

• Plate Appearance Decision Tree

• Examples (more in the second presentation)
                    Introduction
• 2004 University of Cincinnati BBA, Finance and QA

• 2005 MSQA - Master’s Project (with Dr. Fry):

    Measuring Individual and Team Effectiveness in the NBA
                Through Multivariate Regression

• 2004 – 2009 WhatIfSports.com/FOXSports.com, Director,
  Content and Quantitative Analysis

• 2010 Launched PredictionMachine.com in February
      About PredictionMachine.com
•   “We play the game 50,000 times before it’s actually played.”

•   Built by Paul Bessire to focus on content after six years at WhatIfSports.com/FOXSports.com

•   February 2010 - Launched with Super Bowl Prediction (Indianapolis 28 – New Orleans 27)

•   “Predictalator” – Simulation engine plays entire NFL season 50,000 times in 8 seconds

•   March Madness, NBA Playoffs, MLB Daily, College Football, NFL

•   Customizable Predictalator – Any teams, Any where, Any line

•   Fantasy Football Projections

•   Live simulator built to analyze in-game winning probabilities and value in coaching decisions
                            Sports Simulation
•   Play-by-play
     –   A “play” means something different for each sport
     –   Probabilities for every individual outcome
     –   Random number generation
     –   Pitch-by-pitch (or basketball/hockey pass-by-pass) not needed
     –   Account for every possible statistical interaction during a game

•   Can be recreated quickly
     –   50,000+ games/second
     –   All data tracked
     –   Every outcome is different
     –   Boxscores
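
The ingredients listed above (a probability for every outcome, one random number per play, many repeated runs) can be illustrated with a minimal Python sketch. The outcome table and its probabilities are invented for illustration and are not PredictionMachine.com inputs; `sample_outcome` and `simulate` are hypothetical names.

```python
import random
from collections import Counter

# Hypothetical per-play outcome probabilities (illustrative only; they sum to 1).
OUTCOMES = {"out": 0.68, "single": 0.15, "walk": 0.09, "double": 0.05, "home_run": 0.03}

def sample_outcome(probs, rng):
    """Pick one outcome by comparing a uniform random number to cumulative probabilities."""
    r = rng.random()
    cumulative = 0.0
    for outcome, p in probs.items():
        cumulative += p
        if r < cumulative:
            return outcome
    return outcome  # guard against floating-point rounding at the top end

def simulate(n_plays, seed=0):
    """Simulate n_plays independent plays and count how often each outcome occurs."""
    rng = random.Random(seed)
    return Counter(sample_outcome(OUTCOMES, rng) for _ in range(n_plays))

counts = simulate(50_000)
```

Because every play is tracked, `counts` can be turned into box-score style totals, and a different seed gives a different set of outcomes.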
                            Significant Stats
                   Pitchers                              Hitters
•   HBP/BF                           •   HBP/PA
•   BB/(BF – HBP)                    •   BB/(PA – HBP)
•   OAV                              •   AVG
•   1B/Hit Allowed                   •   1B/Hit
•   2B/Hit Allowed                   •   2B/Hit
•   3B/Hit Allowed                   •   3B/Hit
•   HR/Hit Allowed                   •   HR/Hit
•   K/Out                            •   K/Out
•   GO/FO                            •   GO/FO
•   BF                               •   PA
•   Pitches Thrown/BF                •   Relative Range Factor
•   Relative Range Factor            •   Fielding Percentage
•   Fielding Percentage              •   Catcher Arm Rating
•   Handedness                       •   CS% (Runner)
•   Ballpark Effects                 •   Speed Rating
•   League Averages                  •   Handedness
                                     •   Ballpark Effects
                                     •   League Averages
                         Insignificant Stats
Pitchers                               Hitters
•   Wins                               •   RBI
•   Losses                             •   IBB
•   Saves                              •   Runs (kind of – in Speed Formula)
•   Holds                              •   GIDP (kind of – in Speed Formula)
•   Complete Games                     •   SF (kind of – in PA, but also situational)
•   Shutouts                           •   SH (kind of – in PA, but also situational)
•   ERA (kind of – 2B and 3B approx)   •   SBA (kind of – attempts, but also setting)
•   Unearned Runs                      •   Performance in Counts
•   Games Started                      •   Other Situational Stats
•   Pitch Types
•   Performance in Counts
•   Other Situational Stats
Ballpark Effects
       Ballparks – Extremes (Min. 3 seasons)
Effect          Ballpark (High)      High    Ballpark (Low)             Low
Hits            Coors Field          1.182   Petco Park                 .908
2B              Baker Bowl           1.291   Dodger Stadium             .795
3B              Palace of the Fans   1.868   Great American Ball Park   .523
HR_RF           Coors Field          1.374   Municipal Stadium          .636
HR_LF           Coors Field          1.385   Municipal Stadium          .634
Runs (unused)   Coors Field          1.380   Petco Park                 .830
      PA Decision Tree - Normalization
Every step in PA uses modified* log5 normalization (Bill James AVG example):

                 H/AB = ((AVG * OAV) / LgAVG) /
                        ((AVG * OAV) / LgAVG + (1 - AVG) * (1 - OAV) / (1 - LgAVG))

                                   Where, LgAVG = (PLgAVG + BLgAVG)/2



                                     2000 Pedro vs. 1923 Ruth Example:

                 H/AB = ((.393 * .167) / .2791) /
                        ((.393 * .167) / .2791 + (1 - .393) * (1 - .167) / (1 - .2791))

                                  Where, LgAVG = (.283 + .276)/2 or .2791

                                                Result = .2504

* Modified because the formula above assumes that the batter and pitcher carry equal (50/50) weight on
   each possible outcome of the PA event, which is flawed. The modified version also accounts for handedness and ballpark.
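
As a check on the arithmetic, the unmodified log5 formula above can be written as a short Python function. This is a sketch of the textbook formula only, not the modified version the footnote describes; `log5` is a hypothetical name. With the rounded inputs printed on the slide it returns roughly .251, close to but not identical to the slide's .2504, presumably because the slide worked from unrounded inputs.

```python
def log5(avg, oav, lg_avg):
    """Unmodified log5 estimate of H/AB for a batter (avg) facing a pitcher (oav)."""
    hit = (avg * oav) / lg_avg
    no_hit = (1 - avg) * (1 - oav) / (1 - lg_avg)
    return hit / (hit + no_hit)

# Rounded inputs as printed on the slide: 1923 Ruth AVG, 2000 Pedro OAV, LgAVG.
estimate = log5(avg=0.393, oav=0.167, lg_avg=0.2791)
print(round(estimate, 4))
```

A useful sanity property: when batter, pitcher and league all share the same average, the function returns that average unchanged.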
            PA Decision Tree – Steps 1*
Plate Appearance
• Unusual Event (IBB, WP, PB, SB, CS, SH, Hit and Run, Pickoff, Balk)
• Normal PA
     – HBP (per PA or BFP)
     – Not HBP
          – BB (per PA or BFP – HBP)
          – At Bat…

* No ballpark or handedness adjustments made yet.
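
The order of checks in this part of the tree explains why walks are rated per (PA – HBP): the BB test only runs once HBP has been ruled out. A minimal Python sketch, assuming a normal PA (unusual events are omitted) and hypothetical rates that are not taken from the slides:

```python
import random

def resolve_step1(hbp_per_pa, bb_per_non_hbp_pa, rng):
    """Walk the first branches of the tree for a normal PA: HBP, then BB, else at-bat."""
    if rng.random() < hbp_per_pa:
        return "HBP"
    # BB is rated per (PA - HBP), so it is checked only after HBP has been ruled out.
    if rng.random() < bb_per_non_hbp_pa:
        return "BB"
    return "AB"

# Hypothetical rates for illustration: 1% HBP/PA and 9% BB/(PA - HBP).
rng = random.Random(1)
results = [resolve_step1(0.01, 0.09, rng) for _ in range(100_000)]
bb_share = results.count("BB") / len(results)
```

Under these assumed rates the overall walk share per PA should come out near 0.99 × 0.09, slightly below the conditional rate.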
              PA Decision Tree – Steps 2
At-Bat
• Out
     – Strikeout (K/Out)
     – Normal (logic to determine direction and GO or FO)
          – Hit (Poor Play)
          – Error (Fielding Percentage)
          – Normal
• Hit… (AVG vs. OAV)*

* Historical handedness adjustment and ballpark hits multiplier used.
            PA Decision Tree – Steps 3
Hit*
• HR* (HR/Hit)
• Normal – In Play
     – Out (Plus Play)
     – Normal Hit
          – 3B* (3B/Hit * multiplier for lost HR)
          – 2B* (2B/Hit * multiplier for lost HR)
          – 1B

* Ballpark multipliers used.
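
The slides do not define the "multiplier for lost HR". One plausible reading is that, once HR has been ruled out, the per-hit 2B and 3B rates are rescaled by 1 / (1 – HR/Hit) so that they apply to the remaining non-HR hits. The Python sketch below follows that assumed reading; it leaves out ballpark multipliers and the "Plus Play" out, and all rates are hypothetical.

```python
import random

def resolve_hit_type(hr_per_hit, triple_per_hit, double_per_hit, rng):
    """Split a hit into HR, 3B, 2B or 1B, checking HR first as in the tree."""
    if rng.random() < hr_per_hit:
        return "HR"
    # Assumed reading of "multiplier for lost HR": rescale the per-hit rates so
    # they apply to the non-HR hits that remain.
    lost_hr_multiplier = 1.0 / (1.0 - hr_per_hit)
    p_triple = triple_per_hit * lost_hr_multiplier
    p_double = double_per_hit * lost_hr_multiplier
    r = rng.random()
    if r < p_triple:
        return "3B"
    if r < p_triple + p_double:
        return "2B"
    return "1B"

# Hypothetical rates: 12% HR/Hit, 2% 3B/Hit, 20% 2B/Hit.
rng = random.Random(2)
hits = [resolve_hit_type(0.12, 0.02, 0.20, rng) for _ in range(200_000)]
double_share = hits.count("2B") / len(hits)
```

With this rescaling, the overall share of doubles among all hits stays near the original 2B/Hit rate even though the check happens after the HR branch.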
       PA Decision Tree – Matchup Weights

Addresses previous 50/50 assumption using League-Adjusted Variance to form batter and
   pitcher weights for each step:




              HBP/PA   BB/(PA-HBP)   H/AB   K/Out   HR/Hit   2B/Hit   3B/Hit
Pitcher%       47.8       43.5       46.7   45.6     39.7     15.2     11.6
Hitter%        52.2       56.5       53.3   54.4     60.3     84.8     88.4
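
The batting average example that uses these weights multiplies the batter input by 1.066 and the pitcher input by .934, which is twice the H/AB shares in the table (2 × .533 and 2 × .467). Assuming the same doubling applies to every step (an inference from that one example, not something the slides state), the multipliers can be derived as in this Python sketch; `multipliers` is a hypothetical helper.

```python
# Pitcher and hitter shares (percent) for each step, copied from the table above.
WEIGHTS = {
    "HBP/PA":      (47.8, 52.2),
    "BB/(PA-HBP)": (43.5, 56.5),
    "H/AB":        (46.7, 53.3),
    "K/Out":       (45.6, 54.4),
    "HR/Hit":      (39.7, 60.3),
    "2B/Hit":      (15.2, 84.8),
    "3B/Hit":      (11.6, 88.4),
}

def multipliers(pitcher_pct, hitter_pct):
    """Convert percentage shares into multipliers that average to 1 (assumed: 2 x share)."""
    return round(2 * pitcher_pct / 100, 3), round(2 * hitter_pct / 100, 3)

pitcher_mult, hitter_mult = multipliers(*WEIGHTS["H/AB"])
```

A 50/50 split maps to multipliers of 1.0 and 1.0, which recovers the unweighted case.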
     Matchup Weights: What does this mean?

• Batter always has more control (even with HBP and BB)

    – Makes final decision (Swing or not)
    – Dictates strike zone
    – Less consistent

• Doubles and Triples are (mostly) out of pitcher’s control (BABIP)

• Does not necessarily mean batting is more important

    – 9 vs. 1
    – Fewer pitcher outliers means elite pitchers are more valuable
        PA Decision Tree - Normalization
Batting Average Example using Matchup Weights:

           H/AB = ((1.066*AVG * .934*OAV) / LgAVG) /
                  ((1.066*AVG * .934*OAV) / LgAVG + (1.066 - 1.066*AVG)*(.934 - .934*OAV)/(1 - LgAVG))

                           Where, LgAVG = (.934*PLgAVG + 1.066*BLgAVG)/2

                         2000 Pedro vs. 1923 Ruth Example (with handedness):

           H/AB = ((1.066*.393 * .934*.167) / .2795) /
                  ((1.066*.393 * .934*.167) / .2795 + (1.066 - 1.066*.393)*(.934 - .934*.167)/(1 - .2795))

                          Where, LgAVG = (1.066*.283 + 0.934*.276)/2 or .2795

                                  Result * Handedness = .2502 * 1.045

                                           Final Result = .2614
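
The weighted formula and the handedness step can be traced in a short Python sketch. It follows the formula as printed on this slide, with PLgAVG = .276 and BLgAVG = .283 read off the LgAVG line; `weighted_log5` is a hypothetical name and the engine's actual implementation is not shown in the slides. With these rounded inputs the sketch gives about .2505 before handedness and about .262 after, close to the slide's .2502 and .2614; the gap is presumably down to rounding.

```python
def weighted_log5(avg, oav, p_lg_avg, b_lg_avg, w_bat=1.066, w_pit=0.934):
    """H/AB from the weighted formula on the slide, before any handedness adjustment."""
    lg_avg = (w_pit * p_lg_avg + w_bat * b_lg_avg) / 2
    hit = (w_bat * avg * w_pit * oav) / lg_avg
    no_hit = (w_bat - w_bat * avg) * (w_pit - w_pit * oav) / (1 - lg_avg)
    return hit / (hit + no_hit)

HANDEDNESS = 1.045  # multiplier shown on the slide for this matchup

base = weighted_log5(avg=0.393, oav=0.167, p_lg_avg=0.276, b_lg_avg=0.283)
final = base * HANDEDNESS
```

In the formula as printed, the product of the two multipliers scales both terms equally, so in this sketch the weights influence the result only through the weighted LgAVG.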

								