How can we predict election outcomes to help manage our outreach resources? During the 2012 Obama re-election campaign, we used our "Golden" program to run 62,000 simulations each day, correcting for polling biases and keeping our estimates as accurate as possible.
Campaign decision makers work in a data-rich environment. Internal metrics, consultant pollsters, and public polling inundate campaigns, but reconciling disparate data points into a comprehensive understanding of the political landscape is a challenge. Uncertainty cannot be a roadblock, however: in the absence of clear and consistent information, political decision makers must still act.
What We Did
Data scientists and analysts, now at Civis Analytics, developed an election forecasting algorithm (code named “Golden”) while working for Obama for America (OFA) during the 2012 election cycle. The algorithm ingests internal predictive modeling, consultant polling, and public pollster toplines, and in turn generates state-level estimates of candidate support, orderings of relative competitiveness, and an overall likelihood of victory on Election Day.
Why it Matters
While much of predictive analytics fixates on the micro-level, some of the most crucial choices decision-makers face are macro-level. In a presidential election, deciding which states in which to invest and the proper mix of resource allocation between them is often the difference between victory and defeat. Rigorous election forecasting helps make sense of divergent data points and leverages uncertainty for more informed decision making.
Reconciling Public Opinion
How it Worked
On October 4th, 2012, the day after the first Presidential debate, the Obama campaign leadership was faced with an unenviable question: after speculation and some unfavorable initial polling following the debate, was the President still on track for victory? Members of the internal Analytics Department, now data scientists and analysts at Civis Analytics, were tasked with answering this question.
Over the previous months, the analytics team had developed an election forecasting algorithm that could reconcile the large number of internal surveys the campaign was conducting with the polling results of external consultant and public pollsters. The “Golden” model had two major facets. First, for each pollster and poll, estimates of partisan bias and expected error were generated. When new polling results were released, the model could efficiently situate the new piece of information in a larger political context. For example, if a pollster with evidence of past Republican bias released a poll showing the President trailing, the model would automatically recognize that the poll may be pointing to a political reality less pessimistic than its topline suggests. As opposed to seeing each new piece of polling information as disjointed and discrete, this information could then contribute to building a holistic image of the race. Second, the “Golden” algorithm would use this base of reconciled public opinion to simulate the Election Day results 62,000 times each evening. These simulations were then interpreted to generate state-by-state estimates of support, rank competitiveness of states, and give an overall likelihood of victory.
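The two facets described above can be sketched in simplified form. Everything in this sketch is illustrative: the pollster bias values, state margins, expected errors, and electoral-vote baseline are invented placeholders, not the campaign's actual parameters or method.

```python
import random

# Facet 1: de-bias each poll using the pollster's estimated partisan lean.
# Margins are Dem-minus-Rep, in points; a positive bias means the pollster
# has historically overstated the Republican side. Values are hypothetical.
POLLSTER_BIAS = {"PollsterA": +2.0, "PollsterB": -1.0}

def adjusted_margin(pollster, raw_margin):
    """Shift a raw topline margin by the pollster's estimated bias."""
    return raw_margin + POLLSTER_BIAS.get(pollster, 0.0)

# Facet 2: simulate Election Day many times. Each state's "true" margin is
# drawn around its reconciled estimate, with that estimate's expected error.
STATES = {  # state: (electoral votes, estimated margin, expected error)
    "OH": (18, +2.0, 2.5),
    "FL": (29, +0.5, 2.5),
    "VA": (13, +1.5, 2.5),
}

def simulate(n_sims=62_000, base_evs=237, needed=270, seed=0):
    """Return the fraction of simulations in which the candidate wins."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        evs = base_evs  # electoral votes assumed safe outside battlegrounds
        for ev, margin, err in STATES.values():
            if rng.gauss(margin, err) > 0:
                evs += ev
        wins += evs >= needed
    return wins / n_sims
```

Ranking states by how often they flip the outcome across simulations, rather than by raw margin, is what turns a poll aggregate into a resource-allocation tool.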
This was crucial in October of 2012. After two nights of internal polling, the analytics team found support for the President stabilizing in a winning position. However, public pollsters continued to see declines in support each night their calls were in the field. The algorithm was able to confidently account for these pollster effects and reassure campaign management that the President's strategic position was strong and that there was no need for panic. The above comparison between OFA and Gallup's support estimates contrasts the smooth, stable, and actionable predictions from the analytics team with the noisy public polling toplines reported in the final weeks of the cycle.
How it was Used
Beyond being a tool for reconciling different measures of public opinion, "Golden" became a core driver of the campaign's resource allocation decisions. When 2012 campaign leadership decided to adopt the analytics method of resource allocation based on the "Golden" algorithm, a principal approached Dan Wagner, now CEO of Civis, and stressed "if we lose this election, a lot of that will be on your shoulders," but that "we're going to trust you, because there is reasoning behind this algorithm and that reasoning makes sense."¹ In the aftermath of President Obama's re-election victory, campaign manager Jim Messina reflected that the 62,000 simulations of the election run in "Golden" each night were "how I spent the $1 billion dollars."²
Getting It Right
In the first few days after the first Presidential debate (approximately one month before Election Day), the “Golden” algorithm showed stable levels of candidate support and accurately predicted the ultimate outcome of all 50 states. Those forecasts, refreshed and refined daily throughout the rest of the cycle, never presented an alternate view of the electoral landscape. The final “Golden” estimates in the days before the election were within 1.1% of the actual results in every battleground state.
When compared with other polling aggregation algorithms, the “Golden” algorithm consistently had a lower average error and smaller partisan bias (left). While the algorithm is robust to noisy results from public pollsters, a major driver of the model’s success was the internal polling and modeling conducted by the analytics department. For example, the final OFA analytics polling results had an average error of 1.6 points, as compared with Rasmussen’s average error of 4.6 points. Here, the ability to center the model’s estimate on high-quality polling and modeling proved crucial.
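The accuracy metrics referenced here, average error and partisan bias, can be computed in a few lines. The state margins below are placeholders for illustration, not the actual OFA or Rasmussen numbers:

```python
def average_error(predicted, actual):
    """Mean absolute miss, in margin points, across the states polled."""
    return sum(abs(predicted[s] - actual[s]) for s in predicted) / len(predicted)

def partisan_bias(predicted, actual):
    """Mean signed miss; a nonzero value indicates a systematic lean."""
    return sum(predicted[s] - actual[s] for s in predicted) / len(predicted)

# Placeholder Dem-minus-Rep margins, in points -- not real poll results.
predicted = {"OH": 2.0, "FL": 1.0, "VA": 4.0}
actual = {"OH": 3.0, "FL": 0.9, "VA": 3.9}
```

Average error measures how far off a pollster is on average; partisan bias measures whether those misses consistently favor one side, which is the signal "Golden" used to re-center noisy toplines.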
A Highly Generalizable Approach
Now at Civis Analytics, the analysts and data scientists responsible for creating the "Golden" forecasting algorithm have worked to adapt and generalize the tool. National political organizations can view their Civis modeling in the context of public polling and other opinion research metrics to help them distribute resources across Senate, gubernatorial, and congressional races. Statewide political groups can similarly fine-tune resource allocation across state legislative races. The same methods can be used to track and reconcile public opinion on a range of other issues, including attitudes towards climate change and education reform.
1. Civis Analytics's Dan Wagner on Data Solutions to Social Problems. (J. Brustein, Interviewer). Retrieved from http://www.businessweek.com/articles/2014-03-06/civis-analyticss-dan-wagner-on-data-solutions-to-social-problems#r=lr-sr
2. Obama Campaign Manager Jim Messina Talks Big Data at the Milken Institute's 2013 Global Conference. (M. I. Conference, Interviewer). Retrieved from http://www.youtube.com/watch?v=mZmcyHpG31A