

										
     	
     	
2012 Presidential Election Poll

Philip Garland, Ph.D., VP Methodology
Dave Goldberg, CEO
Liana Epstein, Ph.D., Senior Methodologist
Annabell Suh

SURVEYMONKEY 2012 PRESIDENTIAL ELECTION POLL

SurveyMonkey surveyed roughly 1.2 million people from August 17th to November 2nd. Still,
skeptics will ask, “Can an internet poll really be successful at approximating voter turnout?”

Here is the map of actual voter turnout in 2008 by county…




This is the map of respondents to SurveyMonkey’s 2012 presidential election poll…




This report contains our newest wave of data, from the 600,000 people who responded to our
presidential election poll from October 3rd through November 2nd. Results are displayed in
two ways: first as popular vote percentages, and second as Electoral College distributions.
With this data, we seek to show that internet data is at least as good as phone data at
assessing public opinion.



	
a few notes about our data

Understand the data that we’re reporting.

  Why does all the data begin on 10/10 rather than 10/3?

             The data reported below begins at 10/10 because we chose to use a seven-day
             trailing sum. This was done for three main reasons. First, all publicly available polls
             report data using trailing sums as well. Matching their methodology in this way
             facilitates comparisons between SurveyMonkey and other polling firms, which provides a
             reality check on how well SurveyMonkey is measuring public opinion. Second,
             a trailing sum, rather than a daily measure, provides a statistic that is less swayed
             by any single day’s events. Essentially, averaging over a week’s worth of data smooths
             out an otherwise jagged curve. Lastly, for analyses at the state level, using more than
             one day of data gives us a larger sample, which increases the power and accuracy of our
             analyses.
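
The trailing-sum idea can be illustrated with a minimal sketch. It assumes the raw data is available as one count of responses per candidate per reported day; the function name, the data layout, and the example counts are hypothetical and are not SurveyMonkey's data or code. The window here counts entries in the list, so whether excluded days shorten the window is left open.

```python
def trailing_shares(daily_counts, window=7):
    """Pool counts over a trailing window and return candidate shares.

    daily_counts: list of dicts (candidate -> respondents), in date order.
    Returns a list of (day_index, shares) for each day with a full window.
    """
    results = []
    for end in range(window - 1, len(daily_counts)):
        pooled = {}
        # Sum the raw counts for the `window` days ending at `end`.
        for counts in daily_counts[end - window + 1 : end + 1]:
            for candidate, n in counts.items():
                pooled[candidate] = pooled.get(candidate, 0) + n
        total = sum(pooled.values())
        results.append((end, {c: n / total for c, n in pooled.items()}))
    return results


# Hypothetical counts for eight days, for illustration only.
example = [{"Obama": 50 + i, "Romney": 50 - i} for i in range(8)]
smoothed = trailing_shares(example)
```

With eight days of input and a seven-day window, only the last two days have a full window, which is why a series that starts collecting on one date begins reporting several days later.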




  Why is only weekday data reported?

             It is also important to note that all results reported below exclude weekend
             data. We did this because the graphs of our raw, daily data showed spikes every
             weekend that departed from the trend line and from publicly available polling data.
             We speculate that this is due to two main problems. First,
             our traffic volume is much lower on weekends, sinking as low as 15% of
             typical weekday traffic. This lower volume makes our results more susceptible to
             outliers. Second, we have found in prior studies of SurveyMonkey traffic that the
             people who take surveys on weekends are often not representative of the general U.S.
             population and, consequently, are qualitatively different from those who take surveys
             on weekdays.




	
model 1: a RAW look

This model provides a transparent view of the data.

  RATIONALE: We have included this model not because we think it will accurately predict what
             happens on Election Day, but because we want to be as transparent as possible
             about our methodology.
  WEIGHTS:           None. Other than excluding weekends and using a 7-day trailing sum, this is
                     purely raw data. No corrections. No weighting.

                                                                                         RESULTS
  As can be seen in the graph below, the raw results from our survey suggest that the two
  candidates’ standing in the Electoral College has flipped back and forth almost daily. This is
  strikingly different from all other polls, which have had Obama consistently ahead in the
  Electoral College for October. This inconsistency of Electoral College projections was the main
  reason that we pursued weighted models rather than merely reporting our raw data. As of Friday
  (11/2), Model #1 predicts: Obama, 266; Romney, 272.




  The above graph was created through a forced choice for each state between the candidates.
  Separating toss-up states provides a glimpse into why SurveyMonkey’s numbers show a tighter
  race. RCP (RealClearPolitics) uses a 5% margin of error to determine if a state is a clear win
  for either candidate. SurveyMonkey, on the other hand, uses a slimmer 3% margin of error.
  Overall, the graphs below show that SurveyMonkey has roughly half the number of toss-up states
  that RCP does, with more of these going to Romney than Obama. This accounts for why this model
  estimates a much higher number of electoral votes for Romney than other polls do.
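
The difference between a forced choice and a toss-up call can be sketched as follows. This is a minimal illustration that assumes a state counts as a toss-up when the leader's margin is smaller than the stated threshold; that reading of "margin of error", the function names, and the example percentages are assumptions, not details given in this report.

```python
def classify_state(obama_pct, romney_pct, threshold):
    """Return 'Obama', 'Romney', or 'tossup' for one state.

    Assumption: a lead smaller than `threshold` points is too close to call.
    """
    margin = obama_pct - romney_pct
    if abs(margin) < threshold:
        return "tossup"
    return "Obama" if margin > 0 else "Romney"


def forced_choice(obama_pct, romney_pct):
    """Assign the state to whoever leads, however small the lead.

    An exact tie goes to Romney here purely as an arbitrary convention.
    """
    return "Obama" if obama_pct > romney_pct else "Romney"


# Hypothetical state where Obama leads 49% to 45% (a 4-point margin).
wide_rule = classify_state(49.0, 45.0, threshold=5.0)    # stricter 5-point rule
narrow_rule = classify_state(49.0, 45.0, threshold=3.0)  # slimmer 3-point rule
forced = forced_choice(49.0, 45.0)
```

In this example the same state is a toss-up under the 5-point rule but a clear call under the 3-point rule, which is how a slimmer threshold produces fewer toss-up states.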
  




Although the Electoral College decides the election, the popular vote is also of interest. Because
we oversampled swing states to be able to conduct analyses at the state level, the proportions of
states in our sample relative to their representation in the population of American voters varied
wildly. Additionally, due to low traffic, some states were under-represented in our sample. For
example, the percentage of voters from Ohio was inflated because we directed more respondents
to our survey there, and the percentage of voters from North Dakota was lower because we directed
less traffic there. Thus, publicly available statistics were used to adjust the weights of the state
popular vote totals so that they accurately reflected the proportions of U.S. voter turnout by state
in 2008.
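
A minimal sketch of this kind of state-level reweighting is shown below. It assumes each state's candidate shares are computed within that state's sample and then combined using the state's share of 2008 turnout; the function name, the placeholder states "A" and "B", and all numbers are hypothetical.

```python
def weighted_popular_vote(state_shares, turnout_2008):
    """Combine per-state candidate shares into a national estimate.

    state_shares: state -> {candidate: share of that state's sample}
    turnout_2008: state -> votes cast in 2008 (used only as weights)
    """
    total_turnout = sum(turnout_2008[state] for state in state_shares)
    national = {}
    for state, shares in state_shares.items():
        weight = turnout_2008[state] / total_turnout
        for candidate, share in shares.items():
            national[candidate] = national.get(candidate, 0.0) + weight * share
    return national


# Two placeholder states with made-up numbers, for illustration only.
state_shares = {
    "A": {"Obama": 0.50, "Romney": 0.46},
    "B": {"Obama": 0.40, "Romney": 0.56},
}
turnout_2008 = {"A": 3_000_000, "B": 1_000_000}
national = weighted_popular_vote(state_shares, turnout_2008)
```

Because the weights come from turnout rather than from sample size, a state that was oversampled in the survey contributes no more to the national figure than its share of 2008 voters.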

Unsurprisingly, given that SurveyMonkey’s Electoral College projection shows a less consistent
margin of victory for Obama than other polls do, the SurveyMonkey popular vote total shows a
smaller margin for Obama than other polls have.




	
model 2: the “HOW” correction

This model corrects for sampling method.

  RATIONALE: The anonymity of internet polling is a blessing and a curse. Because the person
             being polled has anonymity, he or she is free to respond without feeling
             self-conscious. This minimizes the demand characteristics of phone polls, in which
             respondents change their answers in response to what they think the pollster wants
             to hear. When people answer surveys online, as opposed to on the phone, they are
             “talking” to a computer instead of a real, live person. This matters because
             research has shown that when speaking with a real, live person, respondents are
             more concerned about what that person thinks of them. This makes respondents
             less willing to say “I don’t know” when asked who they would vote for, because
             it would suggest that they haven’t thought about the election much.
             Unfortunately, this anonymity can also artificially inflate “don’t know” responses,
             making accurate predictions tougher to make. Moreover, anonymity can also lead
             to people not taking the survey seriously enough, randomly clicking responses or
             not thinking through the questions sufficiently.
  WEIGHTS:
             •   Leaning voters: The “don’t know” response percentage in the SurveyMonkey
                 dataset was much higher than that of the average phone poll (9% versus 5%).
                 Consequently, we used a question that asked what candidate voters were “leaning
                 towards” to add a small subset of otherwise undecided voters to the results.
             •   Volatility: Each day was compared to the previous day to compute a “volatility”
                 index. This weight was applied to the day’s average so that more consistent days
                 were weighted more heavily. This makes our averages less susceptible to random
                 error and “satisficers” (people who don’t take online surveys seriously).
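
The leaning-voters step can be sketched roughly as follows. This assumes each response is stored as a pair of answers (the first vote question and the follow-up "leaning towards" question) and that a "don't know" on the first question is replaced by the follow-up answer when one was given; the data layout and function name are hypothetical.

```python
def allocate_leaners(responses):
    """Tally votes, folding leaners into the candidate they lean towards.

    responses: list of (first_choice, lean) pairs. `lean` is None when the
    follow-up question was not asked or not answered.
    """
    tally = {}
    for first_choice, lean in responses:
        if first_choice == "Don't know" and lean not in (None, "Don't know"):
            choice = lean  # use the "leaning towards" answer instead
        else:
            choice = first_choice
        tally[choice] = tally.get(choice, 0) + 1
    return tally


# Made-up responses, for illustration only.
responses = [
    ("Obama", None),
    ("Romney", None),
    ("Don't know", "Obama"),
    ("Don't know", "Don't know"),
    ("Don't know", "Romney"),
]
tally = allocate_leaners(responses)
```

Only respondents who answer "don't know" to both questions remain undecided after this step.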
  



                                                                                      RESULTS
  Although RCP and Nate Silver’s “fivethirtyeight” blog have consistently predicted an Obama
  victory in the Electoral College by a fairly wide margin, Model #2 shows a much tighter race.
  As can be seen in the graph below, SurveyMonkey results suggest that if the election had been
  held anytime between 10/10 and 10/18, Mitt Romney would have won. Beginning on 10/18,
  however, all the way through Friday, Barack Obama has regained the edge in the Electoral
  College. As of Friday (11/2), Model #2 predicts: Obama, 272; Romney, 266.




Again, the above graph was created through a forced choice for each state between the
candidates. Separating toss up states provides a glimpse into why SurveyMonkey’s numbers
show a tighter race. Overall the graphs below show that Model #2 has roughly half the number of
toss up states that RCP does, with 50% of these going to Obama and 50% to Romney.




Although SurveyMonkey’s Electoral College projection shows a thinner margin of victory for
Obama than other polls do, the SurveyMonkey popular vote total shows a greater margin for
Obama than other polls have. Thus, while other polls indicate that Romney is ahead in
the popular vote, SurveyMonkey data indicates that Obama is actually in the lead. Model #2’s
estimate of the popular vote mirrors Nate Silver’s estimate more closely than RCP’s.




	
model 3: the “WHO” correction

This model corrects for sampling frame.

   RATIONALE: Whether you’re reaching people through their computer or their phone, having
              them answer your survey does not guarantee that they are going to show up at the
              polls on Election Day. The people who respond to surveys (whether on the
              internet or on the phone) and the people who show up to vote are not exactly the
              same set of people.
   WEIGHTS:
     • Party ID: Using voter turnout statistics from 2008, we adjusted the proportions of
        Democrats, Republicans, and Independents in our sample. A state was coded as too
        “blue” or too “red,” and the votes of Republicans or Democrats, respectively, were
        weighted more heavily to even out the percentage. This correction was applied within a
        5% margin of error, as this is the typical polling error.
     • Education: Having adjusted for party ID, we then performed a mathematical
        correction for the representation of educational level (see Appendix for the question
        options) in the population of U.S. voters.
     • Undecideds: Finally, we eliminated any voters who responded “don’t know” twice
        when asked who they would vote for. If a voter is not leaning towards any candidate only
        a few days before the election, chances are low that they will vote at all, and if they do,
        they should be equally split between the two candidates. Eliminating these truly
        undecided voters from our sample allowed for a more realistic estimate of the popular
        vote.
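
One way to read the party ID and undecided corrections is sketched below. It assumes a party is reweighted only when its sample share is more than 5 points away from the target share, with the weight set to target divided by observed, and that respondents who answer "don't know" to both vote questions are dropped. The exact rule used in this report is not spelled out, so the functions, the tolerance handling, and every number are hypothetical.

```python
def party_weights(sample_share, target_share, tolerance=0.05):
    """Return a weight per party; 1.0 means no correction is applied."""
    weights = {}
    for party, target in target_share.items():
        observed = sample_share[party]
        if abs(observed - target) > tolerance:
            weights[party] = target / observed  # pull the share toward the target
        else:
            weights[party] = 1.0  # within tolerance, leave as is
    return weights


def drop_truly_undecided(responses):
    """Remove respondents who said "Don't know" to both vote questions."""
    return [
        (first, lean)
        for first, lean in responses
        if not (first == "Don't know" and lean == "Don't know")
    ]


# Made-up shares for one state, for illustration only.
sample = {"Democrat": 0.45, "Republican": 0.25, "Independent": 0.30}
target = {"Democrat": 0.39, "Republican": 0.32, "Independent": 0.29}
weights = party_weights(sample, target)

kept = drop_truly_undecided(
    [("Obama", None), ("Don't know", "Romney"), ("Don't know", "Don't know")]
)
```

In this example the state's sample is too "blue", so Republican responses are weighted up and Democratic responses are weighted down, while Independents are left alone.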
  

                                                                                            RESULTS
   Model #3 predicts a consistent victory for Obama over the past month, even when he was
   trailing in the popular vote. Unlike Model #2, which is more conservative in its Electoral College
   estimations than both RCP and Nate Silver, Model #3 predicts a wider margin of victory than
   either. The electoral vote estimations of Model #3 mirror Nate Silver’s estimations more
   closely than RCP’s. Nevertheless, there is a striking difference in our graph for 10/22-10/25,
   which shows Romney briefly ahead in the Electoral College. As of Friday (11/2), Model #3
   predicts: Obama, 305; Romney, 233.




   Again, the above graph was created through a forced choice for each state between the
   candidates. Separating toss up states provides a glimpse into why SurveyMonkey’s numbers
   show a bigger lead for Obama. Overall the graphs below show that SurveyMonkey has roughly
   half the number of toss up states that RCP does, but the majority of these tossup states tend to be
   attributed to Obama in a forced-choice scenario, creating a wide lead for Obama.




Consistent with the wider Electoral College margin that Model #3 shows for Obama, the
SurveyMonkey popular vote total shows a greater margin for Obama than RCP polls have.
Thus, while RCP polls indicate that Romney is ahead in the popular vote, SurveyMonkey data
indicates that Obama is actually in the lead.




	
calling the race

Ultimately, each model is only as good as the calls it makes on the Electoral College and the
overall popular vote percentages. Below are the electoral map predictions for each model and the
estimations of the popular vote for each. Key differences in swing states are highlighted.

MODEL #1: RAW

  TALLY:         OBAMA 266    ROMNEY 272

  KEY TOSSUPS:   CO  IA  NH
                 FL  NC  NV  OH  VA

  POPULAR VOTE:  OBAMA 47.38%    ROMNEY 46.24%


                                                                                                                            	
  
                                                                                                                            	
  
MODEL #2: HOW

  TALLY:         OBAMA 272    ROMNEY 266

  KEY TOSSUPS:   CO  IA  NH  NV
                 FL  NC  OH  VA

  POPULAR VOTE:  OBAMA 48.33%    ROMNEY 47.11%


                                                                                                                               	
  
                                                                                                                               	
  
MODEL #3: WHO

  TALLY:         OBAMA 305    ROMNEY 233

  KEY TOSSUPS:   CO  IA  NC  NH  NV  OH
                 FL  VA

  POPULAR VOTE:  OBAMA 49.46%    ROMNEY 47.51%



MODEL SUMMARY: TOSSUPS

To provide the best possible prediction, we looked at our three models to determine which states
should be labeled definitively as “tossups”. If a state was predicted differently in different
models, or if the difference in Obama and Romney votes was less than 2% in any given state, we
determined that it was too close to call. This led to the following overall prediction…
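
The toss-up rule described above can be sketched as a small check. It assumes the 2% test is applied to each model's margin separately, so that a margin under 2 points in any one model is enough to mark the state as too close to call; that reading, the function name, and the example percentages are assumptions.

```python
def is_tossup(model_results, min_margin=2.0):
    """Decide whether a state is too close to call across models.

    model_results: list of (obama_pct, romney_pct), one pair per model.
    """
    winners = set()
    for obama_pct, romney_pct in model_results:
        if abs(obama_pct - romney_pct) < min_margin:
            return True  # at least one model has the state within the margin
        winners.add("Obama" if obama_pct > romney_pct else "Romney")
    return len(winners) > 1  # the models disagree on the winner


# Made-up results for three hypothetical states, for illustration only.
agree_clear = is_tossup([(50.0, 46.0), (51.0, 45.0), (52.0, 44.0)])
models_disagree = is_tossup([(50.0, 46.0), (47.0, 49.0), (52.0, 44.0)])
too_close = is_tossup([(50.0, 49.0), (51.0, 46.0), (52.0, 45.0)])
```

A state is called for a candidate only when all three models agree on the winner and none of them has the race within the 2-point margin.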




        ELECTORATE:    OBAMA 250    ROMNEY 220    TOSSUPS 68

        KEY TOSSUPS:   IA  NC  NV  OH  VA  WI

        POPULAR VOTE:  OBAMA 48.90%    ROMNEY 47.31%
                       (average of Model #2 & Model #3)

A final note on swing states: It is important to note that we do not consider Colorado, Florida,
New Hampshire, and Pennsylvania swing states. Our data has shown consistent advantages for
Obama in Colorado, New Hampshire, and Pennsylvania and a consistent advantage for Romney
in Florida. We have only six toss-up states, nearly half the number that RCP has. Only three
states vary among our three previous models, and they account for the differences in electoral
votes. Thus, regardless of which model is used, 48 of the 51 contests (the 50 states plus the
District of Columbia) stay consistent.
	
  

OUR PICK? MODEL #3

RATIONALE: Model #3 accounts for the difference between polled and actual voters without getting
caught up in the pros and cons of an internet sample in particular. It is similar, but not identical,
to what other pollsters are saying and has shown itself to be consistently ahead of the curve of
other polls for the past month.


Appendix – Questionnaire

Voting Registration.
    • Are you currently a registered and eligible voter, or not?
          Yes
          No

Zip Code.
    • What is the five-digit zip code for the address you registered to vote from, or if you’re
      not registered to vote, what is the zip code you would use?
          [open-ended]

Voting Likelihood.
    1. How much thought have you given to the upcoming election for president?
          Quite a lot
          Some
          Only a little
          None
          Don’t know
    2. Do you happen to know where people who live in your neighborhood go to vote?
          Yes
          No
          Don’t know
    3. Have you ever voted in your precinct or election district?
          Yes
          No
          Don’t know
    4. Do you, yourself, plan to vote in the election this November, or not?
          Yes
          No
          Don’t know
    5. How certain are you that you will vote?
          Absolutely certain
          Fairly certain
          Not certain
          Don’t know
    6. How likely are you to vote in November’s presidential election?
          Extremely likely
          Very likely
          Somewhat likely
          Slightly likely
          Not at all likely
    7. Thinking back to the elections held for Congress in November 2010, did things come up
       that kept you from voting, or did you happen to vote?
          Yes, voted
          No, did not vote
          Don’t know
    8. How important is the presidential election to you?
          Extremely important
          Very important
          Somewhat important
          Slightly important
          Not at all important
    9. If the election were held tomorrow, would you know where to go vote?
          Yes
          No
    10. How often would you say you vote – always, nearly always, part of the time, or seldom?
          Always
          Nearly always
          Part of the time
          Seldom
          Never
          Don’t know
    11. Thinking back to the elections held for Congress in November 2010, did you vote?
          Yes, voted
          No, did not vote

Voting Preference.
    • Suppose the presidential election were held today. Who would you be likely to vote for?
          Barack Obama
          Mitt Romney
          Don’t know / Other
    • Which candidate are you leaning towards?
          Barack Obama
          Mitt Romney
          Other
          Don’t know

Demographics.
    • Generally speaking do you usually think of yourself as a Republican, a Democrat, an
      Independent or something else?
          Democrat
          Republican
          Independent
          Something else
    • What is the highest level of school you have completed or the highest degree you have
      received?
          Less than high school degree
          High school degree or equivalent
          Some college but no degree
          Associate degree
          Bachelor degree
          Graduate degree

								