What is the probability Harris wins? Building a Statistical Model.
By David Gros. Last Updated:

After Joe Biden made the historic decision to exit the presidential race, the country no longer has to choose between two octogenarians. It's a new exciting time with lots of uncertain questions. In this article we'll try to see just how uncertain we are, and build a probabilistic model of whether Kamala Harris will win if nominated.
There are lots of online election forecasts (e.g., from Silver et al., Morris et al., Gelman et al., among others). So why build another one? In a previous article I created an election model to help understand if Biden should drop out (focusing on Gretchen Whitmer as a replacement). That had a motivation and rhetorical purpose. I partially made a version for Harris, but didn't publish before Biden dropped out. So while this Harris model is now mostly recreational, it has some potential contributions for those interested in election modeling:
- Transparency - Each part tries to include links to corresponding source code
- Analysis and visualization of the amount of available polling (Section 1.2). This is to model the potentially limited initial Harris polling.
- Alternative approach/presentation of errors and poll movement (Section 2)
- Timeliness/impatience - At time of writing, prominent modelers like FiveThirtyEight, Silver, and the Economist haven't released a version for Harris yet. This gives one initial estimate.
For those just looking for a number, here is the top-line estimate up front:
This estimate should update ~daily.
We'll go through each of the steps to reach this estimate. The general approach follows that of other polls-based models. We create an average of polls in each swing state, estimate the expected polling miss, and then estimate how polls might move. Then we run thousands of random simulations to get the fraction in which each candidate wins.
Where are Polls Today
Gathering Data
The first step is gathering polling data (sourced via FiveThirtyEight).
Not all pollsters are equal, with some pollsters having a better track record. Thus, we weight each poll. Our weighting is intended to be scaled where 1.0 is the value of a poll from a top-rated pollster (eg, Siena/NYT, Emerson College, Marquette University, etc.) that interviewed their sample yesterday or sooner.
Less reliable/transparent pollsters are weighted as some fraction of 1.0. Additionally, older polls are weighted less: weight decays with an approximate half-life of 9 days, measured from a date between the poll's start and end dates, and decay only starts after 1.5 days. Polls before Biden dropped out carry ¼ weight. This function is only a heuristic estimate (see code for exact definition).
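As a rough illustration of the weighting described above, here is a minimal sketch. The 9-day half-life, the 1.5-day delay, and the ¼ factor for pre-dropout polls come from the text. The mapping from pollster rating to a quality multiplier (`rating / top_rating`) and all function and constant names are assumptions made for illustration; the linked code remains the exact definition.

```python
from datetime import date

HALF_LIFE_DAYS = 9.0        # from the text: approximate half-life
DECAY_DELAY_DAYS = 1.5      # from the text: wait 1.5 days before decaying
PRE_DROPOUT_FACTOR = 0.25   # from the text: pre-dropout polls carry 1/4 weight
DROPOUT_DATE = date(2024, 7, 21)  # the day Biden exited the race


def poll_weight(rating, poll_date, today, top_rating=3.0):
    """Heuristic weight where 1.0 is a fresh poll from a top-rated pollster."""
    age_days = (today - poll_date).days
    # Decay only starts once the poll is older than the delay.
    decaying_days = max(0.0, age_days - DECAY_DELAY_DAYS)
    recency = 0.5 ** (decaying_days / HALF_LIFE_DAYS)
    # ASSUMPTION: quality scales linearly with the pollster rating.
    quality = min(1.0, rating / top_rating)
    weight = quality * recency
    if poll_date < DROPOUT_DATE:
        weight *= PRE_DROPOUT_FACTOR
    return weight


if __name__ == "__main__":
    today = date(2024, 7, 29)
    print(poll_weight(3.0, date(2024, 7, 28), today))  # fresh, top-rated
    print(poll_weight(1.8, date(2024, 7, 22), today))  # older, lower-rated
    print(poll_weight(2.9, date(2024, 7, 13), today))  # pre-dropout
```

The delay means a poll finished yesterday still counts in full, which matches the stated scale where 1.0 is a top-rated poll from "yesterday or sooner".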
If a pollster reports multiple numbers (eg, with or without RFK Jr., registered voters or likely voters, etc), we use the version with the largest sum covered by the Democrat and Republican.
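The "Harris Share" column in the tables is consistent with a two-party share, that is, Harris's percentage divided by the combined Harris and Trump percentage. A small sketch of that calculation and of the "largest sum" rule for picking among versions of the same poll follows. The function names are hypothetical.

```python
def two_party_share(dem_pct, rep_pct):
    """Share of the two-party vote going to the Democrat, in percent."""
    return 100.0 * dem_pct / (dem_pct + rep_pct)


def pick_variant(variants):
    """Pick the (dem_pct, rep_pct) version with the largest combined total.

    `variants` is a list of (dem_pct, rep_pct) tuples reported by one poll,
    e.g. with and without third-party candidates.
    """
    return max(variants, key=lambda v: v[0] + v[1])


if __name__ == "__main__":
    chosen = pick_variant([(44, 46), (47, 48)])
    print(chosen, round(two_party_share(*chosen), 1))
```

Picking the version with the largest combined total leaves the fewest undecided or third-party respondents outside the two-party share.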
Weight | Pollster (rating) | Dates | Harris : Trump | Harris Share
---|---|---|---|---
0.73 | Siena/NYT (3.0) | 07/22- | 47% : 48% | 49.5
0.72 | AtlasIntel (2.7) | 07/23- | 48% : 50% | 48.9
0.69 | YouGov (2.9) | 07/22- | 44% : 46% | 48.9
0.64 | Ipsos (2.8) | 07/22- | 44% : 42% | 51.2
0.62 | Marist (2.9) | 07/22- | 45% : 46% | 49.5
0.45 | RMG Research (2.3) | 07/22- | 46% : 48% | 48.9
0.42 | Morning Consult (1.8) | 07/26- | 47% : 46% | 50.5
0.39 | Morning Consult (1.8) | 07/25- | 47% : 45% | 51.1
0.37 | Angus Reid (2.0) | 07/23- | 44% : 42% | 51.2
0.36 | Morning Consult (1.8) | 07/24- | 46% : 45% | 50.5
0.33 | Morning Consult (1.8) | 07/23- | 46% : 45% | 50.5
0.31 | Morning Consult (1.8) | 07/22- | 46% : 45% | 50.5
0.29 | SurveyMonkey (1.9) | 07/22- | 38% : 39% | 49.4
0.26 | HarrisX (1.6) | 07/22- | 49% : 51% | 49.0
0.24 | Fabrizio/GBAO (?) | 07/23- | 47% : 49% | 49.0
0.22 | Big Village (1.6) | 07/22- | 43% : 44% | 49.1
0.17 | YouGov (2.9) | 07/21- | 41% : 44% | 48.2
0.16 | Change Research (1.4) | 07/22- | 44% : 43% | 50.6
0.15 | YouGov (2.9) | 07/19- | 46% : 46% | 50.0
0.13 | Echelon Insights (2.7) | 07/19- | 47% : 49% | 49.0
0.12 | Quinnipiac (2.8) | 07/19- | 47% : 49% | 49.0
0.11 | YouGov (2.9) | 07/16- | 48% : 51% | 48.5
0.10 | YouGov (2.9) | 07/13- | 39% : 44% | 47.0
0.09 | Ipsos (2.8) | 07/15- | 44% : 44% | 50.0
0.09 | SurveyUSA (2.8) | 07/12- | 42% : 45% | 48.3
0.07 | Morning Consult (1.8) | 07/21- | 45% : 47% | 48.9
0.06 | Marist (2.9) | 07/09- | 50% : 49% | 50.5
0.06 | Beacon/Shaw (2.8) | 07/07- | 48% : 49% | 49.5
0.06 | YouGov (2.9) | 07/07- | 38% : 42% | 47.5
0.05 | Emerson (2.9) | 07/07- | 43% : 49% | 46.6
0.05 | ActiVote (?) | 07/21- | 50% : 50% | 49.5
0.05 | Ipsos (2.8) | 07/05- | 49% : 47% | 51.0
0.05 | HarrisX (1.6) | 07/19- | 47% : 53% | 47.0
0.05 | Hart/POS (2.6) | 07/07- | 45% : 47% | 48.9
0.04 | Noble Predictive Insights (2.4) | 07/08- | 44% : 48% | 47.8
0.04 | SoCal Research (?) | 07/21- | 43% : 51% | 45.7
0.04 | Morning Consult (1.8) | 07/15- | 45% : 46% | 49.5
0.04 | Florida Atlantic University/Mainstreet Research (?) | 07/19- | 44% : 49% | 47.1
0.03 | SoCal Research (?) | 07/17- | 44% : 52% | 45.8
0.03 | Ipsos (2.8) | 07/01- | 42% : 43% | 49.4
0.03 | HarrisX (1.6) | 07/13- | 48% : 52% | 48.0
0.03 | YouGov (2.9) | 06/28- | 45% : 47% | 48.9
0.03 | Data for Progress (2.7) | 06/28- | 45% : 48% | 48.4
0.02 | Big Village (1.6) | 07/12- | 37% : 42% | 47.3
0.02 | Manhattan Institute (?) | 07/07- | 46% : 48% | 48.9
0.02 | Redfield & Wilton Strategies (1.8) | 07/08- | 37% : 44% | 45.7
0.01 | J.L. Partners (1.6) | 07/01- | 38% : 49% | 43.7
0.01 | Split Ticket/Data for Progress (?) | 07/01- | 46% : 46% | 50.0
0.01 | HarrisX (1.6) | 06/28- | 47% : 53% | 47.0
0.01 | CNN/SSRS (?) | 06/28- | 45% : 47% | 48.9
0.01 | Bendixen & Amandi International (1.0) | 07/02- | 42% : 41% | 50.6
Sum 9.1 | Total | | | Avg 49.6

Weight | Pollster (rating) | Dates | Harris : Trump | Harris Share
---|---|---|---|---
0.92 | From Natl. Avg. (0.91⋅x + 3.70) | | | 48.7
0.73 | Beacon/Shaw (2.8) | 07/22- | 49% : 49% | 50.0
0.67 | Emerson (2.9) | 07/22- | 49% : 51% | 48.9
0.25 | Redfield & Wilton Strategies (1.8) | 07/22- | 42% : 46% | 47.7
0.23 | Bullfinch (?) | 07/23- | 48% : 47% | 50.5
0.07 | Siena/NYT (3.0) | 07/09- | 47% : 48% | 49.5
0.06 | Civiqs (2.5) | 07/13- | 44% : 46% | 48.9
0.05 | InsiderAdvantage (2.0) | 07/15- | 40% : 47% | 46.0
0.04 | SoCal Research (?) | 07/20- | 46% : 50% | 47.9
0.03 | North Star Opinion Research (1.2) | 07/20- | 45% : 47% | 48.9
0.03 | PPP (1.4) | 07/17- | 43% : 45% | 48.9
0.02 | PPP (1.4) | 07/11- | 45% : 51% | 46.9
Sum 3.1 | Total | | | Avg 49.1

Weight | Pollster (rating) | Dates | Harris : Trump | Harris Share
---|---|---|---|---
0.81 | From Natl. Avg. (0.91⋅x + 3.80) | | | 48.8
0.73 | Beacon/Shaw (2.8) | 07/22- | 49% : 50% | 49.5
0.67 | Emerson (2.9) | 07/22- | 51% : 49% | 50.6
0.22 | Redfield & Wilton Strategies (1.8) | 07/22- | 44% : 44% | 50.0
0.06 | Civiqs (2.5) | 07/13- | 48% : 48% | 50.0
0.02 | PPP (1.4) | 07/10- | 48% : 49% | 49.5
0.01 | North Star Opinion Research (1.2) | 07/06- | 47% : 48% | 49.5
Sum 2.5 | Total | | | Avg 49.6

Weight | Pollster (rating) | Dates | Harris : Trump | Harris Share
---|---|---|---|---
0.91 | From Natl. Avg. (0.71⋅x + 11.45) | | | 46.5
0.66 | Emerson (2.9) | 07/22- | 49% : 51% | 48.9
0.31 | Landmark Communications (2.1) | 07/22- | 47% : 48% | 49.3
0.26 | Redfield & Wilton Strategies (1.8) | 07/22- | 42% : 47% | 47.2
0.22 | SoCal Research (?) | 07/25- | 46% : 50% | 48.2
0.05 | U. Georgia SPIA (2.2) | 07/09- | 46% : 50% | 47.6
0.05 | InsiderAdvantage (2.0) | 07/15- | 37% : 47% | 43.8
0.02 | Florida Atlantic University/Mainstreet Research (?) | 07/14- | 44% : 49% | 47.3
0.02 | Florida Atlantic University/Mainstreet Research (?) | 07/12- | 43% : 49% | 46.7
Sum 2.5 | Total | | | Avg 47.7

Weight | Pollster (rating) | Dates | Harris : Trump | Harris Share
---|---|---|---|---
0.90 | From Natl. Avg. (1.13⋅x - 7.24) | | | 48.8
0.73 | Beacon/Shaw (2.8) | 07/22- | 49% : 49% | 50.0
0.66 | Emerson (2.9) | 07/22- | 49% : 51% | 49.1
0.22 | SoCal Research (?) | 07/25- | 46% : 49% | 48.4
0.21 | Redfield & Wilton Strategies (1.8) | 07/22- | 41% : 44% | 48.2
0.20 | Glengariff Group Inc. (1.5) | 07/22- | 42% : 41% | 50.2
0.06 | Civiqs (2.5) | 07/13- | 46% : 46% | 50.0
0.03 | PPP (1.4) | 07/17- | 41% : 46% | 47.1
0.02 | PPP (1.4) | 07/11- | 46% : 48% | 48.9
Sum 3.0 | Total | | | Avg 49.2

Weight | Pollster (rating) | Dates | Harris : Trump | Harris Share
---|---|---|---|---
0.91 | From Natl. Avg. (1.08⋅x - 7.22) | | | 46.3
0.22 | Redfield & Wilton Strategies (1.8) | 07/22- | 43% : 46% | 48.3
0.03 | PPP (1.4) | 07/19- | 44% : 48% | 47.8
Sum 1.2 | Total | | | Avg 46.7

Weight | Pollster (rating) | Dates | Harris : Trump | Harris Share
---|---|---|---|---
0.94 | From Natl. Avg. (0.99⋅x - 1.75) | | | 47.3
0.66 | Emerson (2.9) | 07/22- | 47% : 53% | 47.4
0.21 | Redfield & Wilton Strategies (1.8) | 07/22- | 43% : 46% | 48.3
0.05 | InsiderAdvantage (2.0) | 07/15- | 42% : 48% | 46.7
0.03 | PPP (1.4) | 07/19- | 40% : 46% | 46.5
0.02 | PPP (1.4) | 07/10- | 44% : 52% | 45.8
Sum 1.9 | Total | | | Avg 47.4

Weight | Pollster (rating) | Dates | Harris : Trump | Harris Share
---|---|---|---|---
0.94 | From Natl. Avg. (1.52⋅x - 28.61) | | | 46.7
0.21 | Redfield & Wilton Strategies (1.8) | 07/22- | 43% : 45% | 48.9
0.05 | InsiderAdvantage (2.0) | 07/15- | 40% : 50% | 44.4
Sum 1.2 | Total | | | Avg 47.0

Weight | Pollster (rating) | Dates | Harris : Trump | Harris Share
---|---|---|---|---
0.93 | From Natl. Avg. (1.73⋅x - 41.30) | | | 44.2
0.22 | Redfield & Wilton Strategies (1.8) | 07/22- | 39% : 47% | 45.3
0.05 | InsiderAdvantage (2.0) | 07/15- | 39% : 49% | 44.2
Sum 1.2 | Total | | | Avg 44.4
Estimating Poll Miss
Morris (2024) at FiveThirtyEight reports that the polling average typically misses the actual swing state result by about 2 points for a given candidate (or about 3.8 points for the margin). This is pretty remarkable. Even combining dozens of pollsters each asking thousands of people their vote right before the election, we still expect to be several points off. Elections are hard to predict.
Our current situation is even more uncertain. The ~2 points of miss is for a typical candidate on election day. With Harris we start in a time with potentially limited polling data. To estimate this we look at the weighted count of polls.
Right now we estimate we have the equivalent of 10.4 top-quality national polls for Harris. For comparison, we estimate we had 21.5 top-quality national polls for Biden the day before he dropped out and 58.4 top-quality polls for Biden on 2020 Election Day.
For swing state polls we apply the same weighting. To fill in gaps in swing state polling, we also combine with national polling. Each state has a different relationship to national polls. We fit a linear function (ie, a slope and an intercept) going from our custom national polling average to 538's state polling average for Biden in 2020 and 2024. We average this mapped value with available polls (its weight is somewhat arbitrarily defined as the R² of the linear fit). We highlight that the national polling average was highly predictive of 538's swing state polling averages (avg ).
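To make the blending step concrete, here is a minimal sketch that maps a national average through a state's linear fit and then takes a weighted average with that state's polls. The numbers in the usage example are copied from one of the state tables in this article (the row "From Natl. Avg. (1.08⋅x + -7.22)" and its two polls). The function names are hypothetical, and this is only an illustration of the idea, not the article's code.

```python
def map_national_to_state(national_avg, slope, intercept):
    """Apply a state's fitted line to the national polling average."""
    return slope * national_avg + intercept


def blend_state_average(mapped_value, fit_weight, polls):
    """Weighted average of the mapped national value and the state polls.

    `polls` is a list of (weight, harris_share) tuples. `fit_weight` is the
    weight given to the mapped value (the text ties it to the quality of
    the linear fit).
    """
    total_weight = fit_weight + sum(w for w, _ in polls)
    weighted_sum = fit_weight * mapped_value + sum(w * s for w, s in polls)
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Values as displayed in the table: mapped value 46.3 with weight 0.91,
    # plus two state polls.
    state_polls = [(0.22, 48.3), (0.03, 47.8)]
    print(round(blend_state_average(46.3, 0.91, state_polls), 1))
    print(round(map_national_to_state(49.6, 1.08, -7.22), 1))
```

With these inputs the blend comes out at about 46.7, which agrees with the "Avg" shown for that table. In a state with few polls the mapped national value dominates, which is the intended gap-filling behavior.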
In Figure 1 we show both the weighted count of polls and the square root of the weighted count. Probability theory tells us that the precision gained from sampling does not grow linearly with sample size. As an example, say we had a poll of 1000 people estimating a vote of 45% with some amount of error. If we repeated that poll and now had 2000 people, we would not be twice as confident of our estimate. Instead, we would need roughly 4000 people to be 2x as confident. By the same square-root logic, having 10.4 polls now compared to 58.4 polls on election day 2020 is like having roughly √(10.4/58.4) ≈ 42% as much polling information.
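The square-root relationship can be checked directly with the standard error of a sampled proportion, sqrt(p(1-p)/n). The sketch below uses the 45% and 1000/4000-person figures from the example above; the helper names are hypothetical, and treating weighted poll counts like sample sizes is a simplification.

```python
import math


def standard_error(p, n):
    """Standard error of a sampled proportion p with n respondents."""
    return math.sqrt(p * (1.0 - p) / n)


def relative_information(n_now, n_ref):
    """Square-root ratio of two (weighted) poll counts."""
    return math.sqrt(n_now / n_ref)


if __name__ == "__main__":
    se_1000 = standard_error(0.45, 1000)
    se_4000 = standard_error(0.45, 4000)
    print(se_1000 / se_4000)                  # quadrupling n halves the error
    print(relative_information(10.4, 58.4))   # Harris now vs Biden 2020
```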
We make the assumption that the amount of swing state polls we had on election day 2020 was enough for the typical ~2 point average miss. We then estimate expected error in each swing state given how many polls we currently have there. We assume that only half of the average miss (ie, ~1 point) could be reduced with more polling. The other half of the miss is some unrecoverable error (for example, due to an inability to contact subsets of voters, or industry-wide methodology flaws). This half fraction is purely heuristic; more rigorous work could try to estimate it empirically. See code for precise details.
Using this method, we estimate that the average swing state polling miss is currently 3.7 points.
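One plausible way to turn the assumptions above into a formula is sketched below. The 2-point typical miss and the half/half split come from the text. The specific functional form, in which the reducible half grows with the square root of the ratio of reference polls to current polls, is an assumption made for illustration and may differ from the article's linked code, which remains the precise definition.

```python
import math


def expected_miss(n_now, n_ref, typical_miss=2.0, reducible_fraction=0.5):
    """Expected average polling miss given the current weighted poll count.

    n_now: weighted count of polls available now.
    n_ref: weighted count assumed sufficient for `typical_miss`
           (the text uses election day 2020 as the reference).
    ASSUMPTION: the reducible part scales like 1 / sqrt(number of polls).
    """
    irreducible = typical_miss * (1.0 - reducible_fraction)
    reducible = typical_miss * reducible_fraction
    return irreducible + reducible * math.sqrt(n_ref / n_now)


if __name__ == "__main__":
    print(expected_miss(40.0, 40.0))  # as many polls as the reference
    print(expected_miss(10.0, 40.0))  # a quarter as many polls
```

Under this form the miss never drops below the irreducible half, no matter how many polls are added.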
We emphasize the average miss here is just an average across thousands of simulations. The actual miss can be higher or lower in either direction. Following Morris (2024), we model the distribution of these errors as a t-distribution with five degrees of freedom.
Additionally, we assume poll misses are correlated between states. Similar to our previous Whitmer model, we use state correlations from the 2020 538 and Economist models (via Pearce (2020)). A more pure version of this model would try to reestimate this from data.
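The two modeling choices above (t-distributed misses with five degrees of freedom, and correlation between states) can be combined as in the sketch below. It draws correlated normals through a Cholesky factor and divides by a shared chi-square term, which is one standard construction of a multivariate t; whether the article's code uses this exact construction is an assumption. The 3x3 correlation matrix is a toy placeholder, not the 538/Economist values.

```python
import math
import random


def mean_abs_t(df):
    """E|T| for a standard Student-t with df > 1 degrees of freedom."""
    return (2.0 * math.sqrt(df) * math.gamma((df + 1) / 2)
            / (math.sqrt(math.pi) * (df - 1) * math.gamma(df / 2)))


def cholesky(matrix):
    """Lower-triangular factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_correlated_misses(lower, avg_miss, rng, df=5):
    """One draw of per-state misses whose mean absolute value is avg_miss."""
    n = len(lower)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    correlated = [sum(lower[i][k] * z[k] for k in range(i + 1))
                  for i in range(n)]
    # A chi-square(df) draw is Gamma(shape=df/2, scale=2).
    divisor = math.sqrt(rng.gammavariate(df / 2, 2.0) / df)
    scale = avg_miss / mean_abs_t(df)
    return [scale * c / divisor for c in correlated]


# Hypothetical correlations between three unnamed states.
TOY_CORR = [[1.0, 0.8, 0.8],
            [0.8, 1.0, 0.8],
            [0.8, 0.8, 1.0]]

if __name__ == "__main__":
    rng = random.Random(0)
    lower = cholesky(TOY_CORR)
    print(sample_correlated_misses(lower, 3.7, rng))
```

Scaling by `avg_miss / mean_abs_t(df)` makes the average absolute miss match the target, since the text describes the miss as an average magnitude rather than a standard deviation.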
Estimates For Today
If we use this expected miss and pretend the election was today, we would estimate a 37% chance Harris would win.
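A stripped-down version of the "election today" simulation is sketched below. Each simulation perturbs every state's polling average by a random miss and counts electoral votes, and the win probability is the fraction of simulations reaching 270. The state names, shares, electoral votes, `base_ev`, and the single shared-factor correlation (`shared`) are all hypothetical placeholders; the article's model uses its own averages and an externally sourced correlation matrix.

```python
import math
import random

EV_TO_WIN = 270  # electoral votes needed for a majority


def mean_abs_t(df):
    """E|T| for a standard Student-t with df > 1 degrees of freedom."""
    return (2.0 * math.sqrt(df) * math.gamma((df + 1) / 2)
            / (math.sqrt(math.pi) * (df - 1) * math.gamma(df / 2)))


def win_probability(states, base_ev, avg_miss, shared=0.8, df=5,
                    n_sims=10_000, seed=0):
    """Fraction of simulations in which the candidate reaches 270.

    states: dict name -> (two_party_share, electoral_votes)
    base_ev: electoral votes assumed safe outside the modeled states
    shared: ASSUMED fraction of miss variance common to all states
    """
    rng = random.Random(seed)
    scale = avg_miss / mean_abs_t(df)
    wins = 0
    for _ in range(n_sims):
        common = rng.gauss(0.0, 1.0)
        # Shared chi-square divisor gives heavy (t-like) tails.
        divisor = math.sqrt(rng.gammavariate(df / 2, 2.0) / df)
        ev = base_ev
        for share, votes in states.values():
            own = rng.gauss(0.0, 1.0)
            z = math.sqrt(shared) * common + math.sqrt(1.0 - shared) * own
            if share + scale * z / divisor > 50.0:
                ev += votes
        if ev >= EV_TO_WIN:
            wins += 1
    return wins / n_sims


if __name__ == "__main__":
    # Hypothetical states and numbers, for illustration only.
    toy_states = {"State A": (49.2, 19), "State B": (47.7, 16),
                  "State C": (49.6, 10)}
    print(win_probability(toy_states, base_ev=240, avg_miss=3.7))
```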
Where will Polls be in 98 Days
If a candidate is behind, they would hope for more variance in outcomes. If they are ahead, less variance is better. In addition to typical poll misses, we would expect some variance from movement in the polls over the next 98 days until Election Day.
Here we show the trends in 2020 and 2024. Dem Share is the share of the vote counting just the Democrats and Republicans, ie, Dem / (Dem + Rep). Keep in mind, the magnitude of shifts in this value is half that of shifts in margin.
The average 98-day movement in 2020 was 1.09 points and the average 98-day movement in Biden 2024 was 0.62 points (movements are absolute values; the move could be up or down). The largest movement observed is 4.13 points in Michigan between March 14, 2020 and June 20, 2020. During this time Covid went from an abstract concept for most Americans to over 100,000 Americans dead, while Trump mused about injecting people with disinfectants and slowing testing to make himself look better.
We attempt to estimate the mean expected move for Harris based only on the limited polling so far. We do this via a rough process of random walks sampled from her movement so far (src). This gives an average expected move of 8.24 points using just Harris 2024 data.
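The random-walk idea can be sketched as a simple bootstrap: resample observed day-to-day changes in the polling average with replacement, add them up over the horizon, and average the absolute end-to-end move. This is an assumed reading of "random walks sampled from her movement so far"; the linked source may differ in details, and the daily changes in the usage example are made-up placeholders.

```python
import random


def expected_move_from_random_walks(daily_changes, horizon_days,
                                    n_walks=5000, seed=0):
    """Mean absolute move after `horizon_days` of resampled daily changes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        walk = sum(rng.choice(daily_changes) for _ in range(horizon_days))
        total += abs(walk)
    return total / n_walks


if __name__ == "__main__":
    # Hypothetical daily changes in the polling average, in points.
    toy_changes = [0.3, -0.1, 0.2, 0.4, -0.2, 0.1]
    print(expected_move_from_random_walks(toy_changes, horizon_days=98))
```

With only a few days of data, any short-term drift in the observed changes gets compounded over the whole horizon, which is one way such an estimate can come out much larger than the historical moves.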
To get a final estimate of Harris's expected move in each state, we average together Biden 2020, Biden 2024, and Harris 2024, and blend data for a given state with the national moves. Please refer to the code for a precise definition. This process estimates that the average expected move across the 8 swing states is 3.41 points using all data.
Using this expected average, we model the distribution of movements as a t-distribution with 5 degrees of freedom.
Results with Movement
Here we show the estimated win probability taking into account both the average 3.7 point polling miss and the expected 3.41 point average poll movement.
Thus, allowing variance from movement slightly changes the odds for Harris. One might intuitively expect a larger change. However, we must remember that under this model the polling miss and poll movement are assumed to be independent. As an example, we could sample a 2 point move up in polls, but this could be cancelled out by sampling a -3 point poll miss. The average combined poll miss is 5.2 points.
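Because the two sources of error are independent, their average magnitudes do not simply add (3.7 + 3.41 would be about 7.1). A quick Monte Carlo check of this is sketched below, drawing both the miss and the movement from scaled t-distributions with 5 degrees of freedom. The helper names are hypothetical, and this is a single-state illustration rather than the article's full simulation.

```python
import math
import random


def mean_abs_t(df):
    """E|T| for a standard Student-t with df > 1 degrees of freedom."""
    return (2.0 * math.sqrt(df) * math.gamma((df + 1) / 2)
            / (math.sqrt(math.pi) * (df - 1) * math.gamma(df / 2)))


def sample_scaled_t(rng, mean_abs, df=5):
    """Draw from a t-distribution rescaled to have the given mean |value|."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2, 2.0)  # chi-square(df) draw
    return (z / math.sqrt(chi2 / df)) * mean_abs / mean_abs_t(df)


def combined_mean_abs(avg_miss, avg_move, n=100_000, seed=0):
    """Average |miss + move| when the two are sampled independently."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += abs(sample_scaled_t(rng, avg_miss)
                     + sample_scaled_t(rng, avg_move))
    return total / n


if __name__ == "__main__":
    print(combined_mean_abs(3.7, 3.41))
```

The result should land in the neighborhood of 5 points, well below the 7.1 that direct addition would suggest, which is why adding movement changes the odds less than one might expect.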
Model Limitations
There are several limitations of this model:
Polling is a limited tool: As mentioned earlier, even with data from dozens of pollsters right before Election Day, polls typically miss by several points. One thing I took away from building this model is an encouragement to just care about polls less. We are unlikely to end up in a world on election day where polls tell us much different than a coinflip.
No mean-reversion or trends: This model assumes movement in either direction is equally probable. However, in reality we should expect moves far from the historical mean to be less likely (as these voters become increasingly partisan).
Poll uncertainty quantification not empirically validated: We make assumptions about the behavior of poll uncertainty for a given number of polls that might not be valid (in particular the fraction of aleatoric uncertainty is unclear). As mentioned, stronger work would better estimate this using data from past races.
Not all states: We only model 8 swing states. This is because in situations where a state like Texas or Iowa goes blue, the election is almost certainly already decided elsewhere. We also don't model the atypical way Nebraska and Maine distribute their electors.
Ignoring 3rd party votes: Better factoring in RFK Jr. might slightly change things. Also, undecided voters are essentially assumed to split equally rather than more complex schemes.
Correlations simple: This model is not completely pure, as it uses state correlations extracted from other models. Additionally, we only consider state-level correlations, rather than more complex schemes.
Many assumptions of this model are rough and not particularly principled.
Conclusion
The race is highly uncertain. Harris is likely slightly behind, but our "Election Today" results imply there's a fairly high chance that is polling error. Making confident predictions either way is unwise.
As a reference, using polling data from July 21 this model would have estimated that Biden had a 27% chance of winning when he dropped out with 107 days to the election. Thus, with Harris estimated at 43% today, we'd estimate she so far has improved odds over Biden then. There are potentially exciting times ahead.