Predicting the future from the social media swamp


About 20 years ago my techno-utopian friends in California began to have a strange glint in their eye. It’s the sort of glint people acquire after joining a religious cult, and they were obsessed with a new idea. What they believed was that with so many people now communicating via the internet, what they called a “Hive Mind” was beginning to emerge. And from this we could derive people’s thoughts and intentions.

Some even hoped it would help replace today’s creaking democratic institutions with whizzy new virtual ones.

From the late 1990s, Harvard University’s new Berkman Centre, a part of its law school that was once memorably described as a “New Age crank tank”, devoted considerable energy to this idea, which it called “emergent democracy”.

Less surprisingly, Google’s founders also loved the idea – particularly after their new search engine was flatteringly described as a “database of intentions”. 

However, many more of us found both the Hive Mind and the people advancing it to be sinister and creepy. They seemed unhealthily keen on global government, and of course, they envisaged themselves as the Mind’s chief soothsayers. It was a naked attempt at social climbing, and thankfully, the phrase fell out of use.

Yet today, billions of people are recording their thoughts in public. These are now routinely sampled by marketing companies and political consultancies to gauge the public’s feelings – something called sentiment analysis.

So it seems reasonable to ask, with so much data being produced in real-time, does this vast corpus have any predictive power, too? And if so, what hoops must a serious forecaster jump through to see useful patterns in the deluge of rants and memes? In an era when pundits and pollsters consistently get the big calls wrong, and the Bank of England can’t even predict inflation, this seems worth exploring, at least.

The most celebrated success in post-war forecasting was attributed to the Royal Dutch/Shell group, where Peter Schwartz and his team predicted the 1973 energy crisis. Schwartz eschewed the F-word (forecasting), preferring to call it “scenario planning”, a term that had been devised by the architect of the US nuclear strategy, Herman Kahn – the original model for Kubrick’s Dr Strangelove.

You’ll note the weaselly distinction between a forecast and what may be a range of possible outcomes. Schwartz couldn’t repeat the trick.

The most notorious of all the model oracles was the one assembled for a group of industrialists called the Club of Rome, in an exercise described as “Malthus with a computer”. But the doom-mongers failed to take note of innovation, in particular the revolution in agricultural productivity, led by Nobel Prize winner Norman Borlaug, that was taking place before their eyes.

“There was a golden age of forecasting, but it was quickly demolished,” notes Prof James Woudhuysen, a professor of forecasting and innovation at South Bank University. The idea of using the toxic swamp of social media to make successful predictions today would therefore seem to be a fool’s errand. But here there is an unlikely success story to report.

In 2016 a former McKinsey consultant who had also been a senior member of BT’s strategy group, Alan Patrick, was examining social media sentiment as the EU membership referendum debate unfolded.

He blogged about a model he and a small team had devised at his consultancy, Broadsight. He described which trends were relevant, and which were not, and the tweaks the team made as he went along, all with an engineer’s scepticism.

Patrick correctly predicted Leave would win, and the team repeated the success with the 2016 US Presidential Election, and then six months later, Macron’s victory in France. That was the hardest to call, Patrick notes, as votes dispersed from many candidates to just two, and voters decided only very late in the day.

Broadsight became DataSwarm, and in 2019 it won wider recognition in a Geopolitical Forecasting competition overseen by IARPA. This is the US intelligence research agency whose competitions, devised with the aid of Prof Philip Tetlock, revived interest in forecasting.

Contestants must make forecasts across 350 economic, social and financial data points. Despite the hype around “superforecasting”, a term Tetlock popularised, IARPA is really concerned with making predictions a little better, and a little more predictable, and seeing what might work.

Patrick describes the social media swamp as a very imperfect mirror to social reality, one that is heavily distorted. We know that, because so many Labour supporters were convinced from Twitter that they were going to win the 2015 General Election – and even the 2019 vote.

What matters is removing one’s own biases. The experts didn’t predict Brexit, Patrick suggests, because they lived in an echo chamber: “The BBC were mainly talking to Remainers,” he says. One needs to look past the bots, the superfans, and the “superhaters” and discern which signals are valuable, and the velocity at which they are moving.

To the Panglossian techno utopians two decades ago, the Hive Mind was rather like Jung’s collective unconscious, but without Jung’s corresponding idea of the Shadow – the dark side that every individual and society must confront.

Perhaps DataSwarm’s relative success is because, unlike the utopians, it is prepared to confront it. As Patrick says: “There’s what people believe humans are like, and what they are really like. The world is not beautifully happy people. It is what it is, not what you want to see.”
