Tuesday night all but officially certified Donald Trump as the Republican nominee for president. While Cruz and Kasich’s announcements were surprising to many (including us), they confirm what the data has been telling us since we started polling on the Republican primary in August of last year: Donald Trump was always going to be the nominee.
The time-series above shows estimated (more on methods below) candidate support week-by-week over the primary race. The trends largely bear out what our data showed The New York Times back in August: Donald Trump won’t fold.
Throughout more than eight months of our polling, Donald Trump led the field, with the sole exception of a six-week period in mid-fall of 2015 when Carson led Mr. Trump by, at his peak, 4 percentage points. At all other times, Trump earned more support, often by double-digit margins.
Even as Cruz picked up steam after winning in Wisconsin when we were conducting our final polls (which concluded on 4/17/2016), Trump was still leading Cruz, albeit narrowly. This does not even account for Trump’s subsequent surge in the last few weeks with big wins in New York and Pennsylvania.
While Carson was the only candidate able to top Trump, his support was fleeting in a way that Trump’s was not. By the end of 2015, Carson’s once-nationwide support had receded to only a few districts. And by March, when Carson dropped out, he held a plurality of support in only one Congressional district.
As we observed Mr. Carson’s bubble of support internally, we suspected it would burst. Our subgroup estimates showed Carson was buoyed by the same base that other candidates were drawing from: ideological conservatives and core Republicans. Because these voters had many similar candidates to choose from, their support was often transient. In contrast, Mr. Trump, we observed, drew together a demographically distinct coalition that he could rely on.
Moreover, because we can accurately estimate candidate support in the same geographies in which delegates are awarded, we were able not only to understand Trump’s support levels but also to calculate how many delegates he would win in each contest. While conventional wisdom suggested he would do worse in highly Democratic areas, our estimates showed the opposite. Because these “Blue Zones” represent a disproportionate share of delegates, his strong performance there allowed him to rack up a substantial, and ultimately insurmountable, delegate lead.
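To make the delegate arithmetic concrete, here is a minimal sketch of how district-level support estimates could be turned into delegate counts. It is not our production model. It assumes a simplified winner-take-all rule with three delegates per Congressional district (actual rules vary by state), and the candidates "A", "B", and "C", the voter counts, and the support shares are all invented for illustration.

```python
from dataclasses import dataclass

# Assumption: every district awards the same fixed number of delegates,
# winner-take-all. Real state rules differ; this is a simplification.
DELEGATES_PER_DISTRICT = 3


@dataclass
class District:
    name: str
    republican_voters: int      # hypothetical count of Republican voters
    support: dict[str, float]   # hypothetical estimated support shares


def allocate_winner_take_all(districts):
    """Give each district's delegates to its plurality winner."""
    totals = {}
    for d in districts:
        winner = max(d.support, key=d.support.get)
        totals[winner] = totals.get(winner, 0) + DELEGATES_PER_DISTRICT
    return totals


def delegates_per_100k_voters(district):
    """Delegates at stake per 100,000 Republican voters in a district."""
    return DELEGATES_PER_DISTRICT * 100_000 / district.republican_voters


# Illustrative, made-up districts (not real estimates).
districts = [
    District("heavily Democratic", 30_000, {"A": 0.45, "B": 0.35, "C": 0.20}),
    District("heavily Republican", 150_000, {"A": 0.34, "B": 0.40, "C": 0.26}),
    District("swing", 90_000, {"A": 0.38, "B": 0.36, "C": 0.26}),
]

delegate_totals = allocate_winner_take_all(districts)
for d in districts:
    print(d.name, delegates_per_100k_voters(d))
print(delegate_totals)
```

In this toy example the heavily Democratic district has one fifth as many Republican voters as the heavily Republican one, yet awards the same three delegates. A candidate who runs well in such districts collects delegates from far fewer votes, which is the sense in which those areas carry a disproportionate share.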
In the end, Trump’s unique coalition and a splintered field interacted with GOP delegate rules to benefit Trump in a way that our data science methods revealed early on.
While the first time-series above may look similar to aggregates of public polls, we take a different approach. First of all, we collect more data than any single poll. Since we first added the question, we have recorded conversations with over 20,000 self-identified Republicans among a broader sample of 70,000 adults in the United States.
Polling aggregation sites, however, average dozens of polls and tens of thousands of responses every month. In contrast, our methodology relied on only 2,375 interviews in March while generating smoother and more accurate results. By leveraging Civis’s proprietary data science tools, we can accurately forecast candidate support in real time using much smaller (and cheaper) surveys than traditional methods require.
As we described in December, we run tens of thousands of simulations using proprietary Bayesian algorithms that leverage all of that data to make estimates of survey responses. Previously, we used those to produce maps at the Congressional district level and the subgroup estimates that allowed us to understand Trump’s unique coalition. However, the same techniques allow us to generate week-by-week estimates with greater accuracy than relying on surveys alone. On average, our margins of error are just +/- 2.8 percentage points, 30% smaller than those of a standard survey of the same (weekly) size.
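For readers who want to check the margin-of-error comparison, the sketch below uses the textbook formula for a simple random sample at 95% confidence. It rests on assumptions that are ours for illustration only: that the 2,375 March interviews were spread evenly over four weeks, that the worst case of 50% support applies, and that there is no design effect. The function names are hypothetical, and this is not the Bayesian method described above.

```python
import math


def srs_margin_of_error(n, p=0.5, z=1.96):
    """Margin of error, in percentage points, for a simple random sample."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


def srs_sample_size(moe_points, p=0.5, z=1.96):
    """Simple-random-sample size needed to reach a given margin of error."""
    return p * (1 - p) * (z / (moe_points / 100)) ** 2


# Assumption: March interviews divided evenly across four weeks.
weekly_n = 2375 / 4

standard_moe = srs_margin_of_error(weekly_n)
modeled_moe = 2.8  # the average margin of error reported in this post
reduction = 1 - modeled_moe / standard_moe
equivalent_n = srs_sample_size(modeled_moe)

print(f"standard survey: +/- {standard_moe:.1f} points")
print(f"reduction: {reduction:.0%}")
print(f"equivalent simple random sample: {equivalent_n:.0f} respondents")
```

Under these assumptions, a standard weekly survey of about 594 respondents would carry a margin of error of about 4.0 points, so 2.8 points is roughly 30% smaller. Reaching 2.8 points with a simple random sample alone would take about 1,225 respondents, roughly twice the weekly sample.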
While the Republican primary may be over now, we will be waiting to see how this general election plays out.