We started our Twitterendum series by asking what would happen if only Twitter users voted in the referendum. We concluded that post by saying:

“Of course, tweets are not votes. Twitter users do not reflect the UK population as a whole. Twitter users account for roughly a quarter of the population (23%) and tend to skew young and urban.”

While we were well aware of the limitations of the Twitter dataset, we were equally curious to see what it could tell us about voting intentions. So, now that the UK has voted and the results are in, how did the Twitter model fare?

WHAT WORKED
All in all, our Twitter analysis accurately forecast the direction of the vote (whether the location skewed ‘stay’ or ‘leave’) in 248 out of 381 Local Authority Districts (LADs). It forecast the direction inaccurately in 91 LADs, and a further 39 LADs lacked sufficient data to be placed in either camp.
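As a rough illustration of how a per-LAD direction tally of this kind works, here is a minimal Python sketch. The field names, the volume threshold and the toy figures are assumptions for illustration, not our production pipeline.

```python
# Minimal sketch of a per-LAD direction tally, assuming we hold a predicted
# and an actual 'leave' share for each Local Authority District. Field names,
# the volume threshold and the toy figures are illustrative assumptions.

def direction(leave_share, volume, min_volume):
    """Classify a LAD as 'leave'/'remain', or None when data is too thin."""
    if volume < min_volume:
        return None  # not enough tweets to place the LAD in either camp
    return "leave" if leave_share > 0.5 else "remain"

def tally(lads, min_volume=100):
    correct = incorrect = no_data = 0
    for lad in lads:
        predicted = direction(lad["predicted_leave"], lad["tweet_volume"], min_volume)
        actual = "leave" if lad["actual_leave"] > 0.5 else "remain"
        if predicted is None:
            no_data += 1
        elif predicted == actual:
            correct += 1
        else:
            incorrect += 1
    return correct, incorrect, no_data

# Toy data only -- not real referendum figures:
sample = [
    {"predicted_leave": 0.38, "actual_leave": 0.26, "tweet_volume": 5400},
    {"predicted_leave": 0.47, "actual_leave": 0.70, "tweet_volume": 900},
    {"predicted_leave": 0.55, "actual_leave": 0.61, "tweet_volume": 40},
]
print(tally(sample))  # -> (1, 1, 1): one correct, one wrong, one with too little data
```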

The frontrunners in the ‘remain’ camp were predicted very accurately. Cambridge, Oxford, Exeter, Cardiff, Brighton and Hove, Glasgow, Edinburgh and parts of London all led the referendum’s ‘remain’ category with a margin of 20% or more. The only exception was Ceredigion, which the model forecast as a ‘remain’ frontrunner but which voted ‘remain’ by a margin of only 10%. The model accurately predicted the top Bremain locations in England, fared slightly worse in Scotland (it didn’t predict as much intensity for ‘remain’ in places like East Dunbartonshire), and obviously didn’t include places like Gibraltar (where an impressive 96% voted ‘remain’).

The ‘leave’ camp was marginally less well represented. All leading members of ‘leave’ in the Twitter model were accounted for in the final vote tally, but not with the same level of intensity. Burnley, Hartlepool, Kingston upon Hull and Wakefield scored margins of above 35% in favour of ‘leave’, and showed similar levels in the model. Predictions for Eastbourne and Oldham were also broadly in line with voting outcomes, albeit less so. However, the strongest ‘leave’ areas in the final vote didn’t line up with the Twitter frontrunners. Boston, South Holland, Castle Point, Thurrock, Great Yarmouth and Fenland voted for Brexit with margins of 40% and above, yet had much lower ratios in the Twitter model. A large part of this was due to the fact that our model de-emphasised areas with low Twitter handle representation, a factor in all six locations. Havering was easily the model’s worst call: it was placed slightly in the ‘stay’ camp, but the final referendum result showed Havering radically in favour of Brexit, with 70% of voters there backing ‘leave’.
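To give a feel for the kind of de-emphasis at play, here is a hedged sketch that shrinks a thin sample’s margin towards zero. The account floor and the linear shrinkage rule are assumptions, not the exact weighting our model used.

```python
# Hypothetical sketch of de-emphasising thin samples: shrink a LAD's observed
# 'leave' margin towards zero as its unique-account count falls below a floor.
# The floor and the linear shrinkage are assumptions, not our actual formula.

def shrunk_margin(raw_margin, unique_accounts, floor=500):
    """Scale the observed margin by the fraction of the account floor seen."""
    weight = min(unique_accounts / floor, 1.0)
    return raw_margin * weight

# A Boston-like case: a strong observed 'leave' margin, but on few accounts
print(shrunk_margin(0.40, 120))   # -> 0.096, the margin is largely suppressed
print(shrunk_margin(0.40, 2000))  # -> 0.4, the full margin is retained
```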

WHAT DIDN’T
It wasn’t so much the direction of the vote in LADs that was erroneous, but the extent of the vote. The referendum is a total voting tally, called when either ‘Leave’ or ‘Remain’ passes the winning post of 50% of ballots cast plus at least one vote, so the actual margin in each area is extremely important. That is to say, rather than victory being determined on a per-LAD basis, the overall number of votes is what matters. Our model was constructed primarily using the unique accounts backing either camp in each LAD, and the percentage of the population they represented. In almost all LADs, the Twitter results overestimated the margin in favour of the ‘remain’ camp, overemphasising victory margins and downplaying the losses, pointing to a firm ‘remain’ victory. London, which we estimated as a single entity rather than breaking it into multiple zones, was especially problematic. This effect was greatest in outer London areas, and proved completely inaccurate in forecasting the result in the aforementioned Havering.
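A back-of-envelope version of that per-LAD construction could look like the sketch below. The combination of account counts and population coverage is illustrative only, and the numbers are invented.

```python
# Rough sketch of a per-LAD score built from unique accounts per camp and the
# share of the local population those accounts represent. The names, the
# combination rule and the numbers are assumptions for illustration only.

def lad_score(remain_accounts, leave_accounts, population):
    total = remain_accounts + leave_accounts
    if total == 0:
        return None  # no signal at all for this LAD
    margin = (remain_accounts - leave_accounts) / total  # positive => 'remain' lean
    coverage = total / population                        # share of residents observed
    return {"margin": margin, "coverage": coverage}

print(lad_score(remain_accounts=6200, leave_accounts=4100, population=280_000))
# -> {'margin': 0.2038..., 'coverage': 0.0368...}
```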

YOUNG VERSUS OLD
No matter the size of the sample, uncontrolled bias skewed the results. Age was a major determining factor in the model’s shortcomings. Simply put, people aged 45 and above were scarcely represented.

Twitter skews young and urban, and accounts for only approximately 23% of the UK population (30% of UK internet users). Our model forecast a decisive ‘remain’ victory, and under the above conditions it was fairly predictive. However, not only did voters aged 18-34 account for only a fraction of the population, they also appear to have voted far less (only 36% of 18-24s and 58% of 25-34s voted, according to Sky Data).

As a result, the model’s forecasting was generally biased towards a group that leaned towards ‘remain’ (75% of 18-24s voted to ‘remain’, according to YouGov). While this explains the underlying lean toward Bremain, it also made predictions for areas with a disproportionate number of older voters inherently less accurate. For the majority of areas with a disproportionate number of people aged 45 and above, the model’s predictions were completely inaccurate.
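A quick turnout-weighted calculation shows how an age skew of this size moves the headline number. The 18-24 and 25-34 turnout figures and the 75% ‘remain’ share for 18-24s reuse the Sky Data and YouGov numbers quoted above; the population shares and all other values are assumed purely for illustration.

```python
# Back-of-envelope turnout weighting. The 18-24 and 25-34 turnout figures and
# the 75% 'remain' share for 18-24s reuse the Sky Data and YouGov numbers
# quoted above; every other value here is an assumed placeholder.

bands = [
    # (label, population share [assumed], turnout, 'remain' share)
    ("18-24", 0.12, 0.36, 0.75),  # turnout per Sky Data, share per YouGov
    ("25-34", 0.17, 0.58, 0.60),  # 'remain' share assumed
    ("45+",   0.45, 0.80, 0.40),  # turnout and 'remain' share both assumed
]

votes = sum(pop * turnout for _, pop, turnout, _ in bands)
remain = sum(pop * turnout * share for _, pop, turnout, share in bands)
print(f"turnout-weighted 'remain' share: {remain / votes:.1%}")  # ~46.9%

# An unweighted 18-34 sample -- roughly what Twitter sees -- reads very differently:
young_remain = (0.12 * 0.75 + 0.17 * 0.60) / (0.12 + 0.17)
print(f"'remain' share among 18-34s only: {young_remain:.1%}")   # ~66.2%
```

Even with these placeholder figures, the gap between the two prints illustrates the mechanism: a young-heavy sample can read comfortably ‘remain’ while the turnout-weighted electorate sits below 50%.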

URBAN BIAS
The same bias was even more pronounced in urban centres, where data was much more concentrated.

When choosing the source of a tweet, we assigned a location based on the self-reported location in each Twitter handle’s bio. Overall, Twitter users reporting their location were far more likely to identify a major city than a rural place, even if they were only peripherally attached to it. The result was that LADs containing a major city had a disproportionate amount of content, a higher percentage of representation and thus higher scores in our model.
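The sketch below shows the sort of bio-based lookup involved, and why it over-assigns to big cities. The gazetteer and the substring matching are toy stand-ins for the real resolution step.

```python
# Hedged sketch of bio-based location assignment: match the free-text bio
# location against a small gazetteer of place names. The gazetteer and the
# substring matching are toy stand-ins for the real resolution step.

GAZETTEER = {
    "manchester": "Manchester",
    "leeds": "Leeds",
    "fenland": "Fenland",
}

def assign_lad(bio_location):
    """Return a LAD name for a bio location string, or None if unresolved."""
    if not bio_location:
        return None
    text = bio_location.lower()
    for key, lad in GAZETTEER.items():
        if key in text:
            return lad
    return None

# A peripheral resident still resolves to the big city, inflating its LAD:
print(assign_lad("Greater Manchester"))    # -> 'Manchester'
print(assign_lad("a village near Leeds"))  # -> 'Leeds'
print(assign_lad("rural Lincolnshire"))    # -> None: rural users often drop out
```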

Ultimately, cities likely leaned even further towards ‘stay’ due to their high proportion of younger people, combined with the higher scores produced by a larger number of unique accounts.

OTHER MISCELLANEA
We calibrated the model to only include people living in the UK by analysing self-reported city of origin and usage of the English language. While this is not necessarily a bad way of identifying people who live in the UK, there were no doubt a large number of non-voting migrants in our model. According to a House of Commons Briefing Paper on migration, 5.3 million migrants lived in the UK, 2.9m of whom came from the other 27 EU member states. It’s not unreasonable to imagine that this population would have been very active in the run-up to the vote. This lends further bias to the ‘remain’ camp, as reflected in Figure 3 above.
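For illustration, a residency filter along those lines might look like the following. The place set and function name are hypothetical, and, as the example shows, such a filter cannot tell voters from non-voting migrants.

```python
# Illustrative sketch of the residency filter: keep an account when its
# self-reported location resolves to a UK place and its tweets are in English.
# The place set and function name are hypothetical; note that the filter
# cannot distinguish voters from non-voting migrants, which is the leak
# described above.

UK_PLACES = {"london", "cardiff", "glasgow", "hull", "burnley"}

def looks_uk_resident(bio_location, tweet_lang):
    if tweet_lang != "en" or not bio_location:
        return False
    return any(place in bio_location.lower() for place in UK_PLACES)

print(looks_uk_resident("Hull, UK", "en"))  # True
print(looks_uk_resident("Kraków", "pl"))    # False
print(looks_uk_resident("London", "en"))    # True -- but possibly a non-voting
                                            # EU migrant, per the figures above
```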

IN CONCLUSION
So, Twitter is useful for understanding a very specific audience in a very specific context, but we must be wary of stretching it further. A referendum is a truly seismic event, with all walks of life represented in the electorate. While public and accessible to most, Twitter is far from representative of the UK population as a whole. The value of its data lies in illuminating a particular part of society, and in the ease of access to and quantity of that data. Sample size is certainly important (the number of unique accounts observed is much higher than that accessible through traditional polling), but such heavy bias ultimately overpowers the model’s ability to make predictions about events encompassing the totality of society. The Brexit result was surprising to many, and shows how easily we’re caught in our own echo chamber, surrounded by like-minded people and unable to fathom the full spectrum of opinion. Looking at Brexit through Twitter underscores this phenomenon, and the importance of a balanced dataset if one is to make observations of any kind.

Predictions using data fare well when the underlying elements follow specific rules. Politics can be hard to predict because those rules tend to be opaque, or only sporadically followed. Recent failures in data-driven models of the political landscape (the Brexit result or Trump’s nomination, for example) could be down to the fact that the electorate is ultimately changing the underlying rules on what it will vote for. Making sense of the world using data is an important advantage and a cornerstone of making better predictions. Nonetheless, when making predictions with data (irrespective of the robustness of the model and its accompanying data), it is good to remember that the world remains an uncertain place, and to approach predicting it with a healthy dose of humility and scepticism.

By Lucas Galan