News mining might have predicted Arab Spring

Published online 21 September 2011

Signs of impending social and political change may lie hidden in a sea of news reports.

Philip Ball for Nature magazine

A close reading of the news over the past 10 years could have predicted the Arab Spring.

You could have foreseen the Arab Spring if only you'd been paying enough attention to the news. That's the claim of a new study that shows how data mining of news reportage can reveal the possibility of future crises well before they happen.

Computer scientist Kalev Leetaru at the University of Illinois at Urbana-Champaign has trawled through a vast collection of news reporting and examined the 'tone' of the news about Tunisia, Egypt and Libya, where long-established dictatorial political leaders have been deposed by public uprisings known collectively as the Arab Spring. In all cases, he says, there was a clear, steady trend towards a negative tone for about a decade before the revolts [1].

Although this trend doesn't predict either the course or the timing of the events that took place earlier this year, Leetaru argues that it provides a clear indicator of an impending crisis. "The value of this class of work lies in warning of changing moods and environments, and increased vulnerability to a sudden shock," he explains.

Erez Lieberman Aiden of Harvard University in Cambridge, Massachusetts, who has explored the mining of digitized literary texts for linguistic and historical trends, agrees. "Leetaru's work is interesting not so much because it makes predictions, but because it points to the power and the opportunity latent in new ways of analyzing large-scale news databases," he says.

Political scientist Thomas Chadefaux of the Swiss Federal Institute of Technology (ETH) in Zurich calls the paper "a welcome addition to a field — political science — that has cared very little about finding early warning signals for war, or making predictions at all".

The long view

Long-term trends can be subtle and hard to spot by subjective and partial monitoring of the news. But they might presage crises more reliably than a focus on the short term. For example, although there was talk during March 2011 of the possibility of similar public uprisings in Saudi Arabia, reflected in a rather negative tone in the news there during that month, the long-term data showed that spell to be no worse than other fluctuations in recent years — there was no worsening trend. On this basis, one would have predicted the failure of the Arab Spring to unseat the Saudi rulers.
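
As a rough illustration of that distinction, the sketch below separates a sustained decline from a single bad month. The article does not say how the study tested for trends, so this is a generic least-squares slope plus a z-score rather than Leetaru's method; the function names and the tone values are invented for the example.

```python
from statistics import mean, pstdev


def trend_slope(series):
    """Least-squares slope of the series against its index (change per period)."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    x_bar = (n - 1) / 2
    y_bar = mean(series)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(series))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def latest_z_score(series):
    """How unusual the last value is, measured against the earlier history."""
    history = series[:-1]
    spread = pstdev(history)
    return (series[-1] - mean(history)) / spread if spread else 0.0


# Made-up monthly tone values, for illustration only.
steady_decline = [0.0, -0.5, -1.0, -1.5, -2.0, -2.5]
one_bad_month = [0.5, -0.5, 0.5, -0.5, 0.5, -1.0]

for name, series in [("decline", steady_decline), ("spike", one_bad_month)]:
    print(name, trend_slope(series), latest_z_score(series))
```

A strongly negative slope over many years would correspond to the worsening trend reported for Egypt, Tunisia and Libya, whereas a low latest value with a near-flat slope would resemble the Saudi case described above.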

"If we think of the vast array of digital information around us today as an ocean of information, up to this point we've largely been studying the surface," says Leetaru. He thinks that automated news analysis that looks for information about mood, tone or spatial references could supply something like a weather forecast for political events, perhaps "offering updated assessments every few minutes for the entire planet and pointing out emerging patterns that might warrant further investigation", he suggests.

Leetaru used the immense collection of news reports in the Summary of World Broadcasts, a monitoring service set up by the British intelligence service just before the Second World War to assess world opinion. That collection now includes newspaper articles, television and radio broadcasts, periodicals and a variety of other online resources from more than 130 countries.

Previous efforts to extract 'buried' information from vast literary resources (an approach known as 'culturomics') have tended to focus on quantifying the occurrence of certain key words or phrases [2]. By contrast, Leetaru conducted 'sentiment mining' of the sources by assessing their positive or negative tone, looking for evaluation words such as 'terrible', 'awful' or 'good'. He used computer algorithms to convert these data into a single parameter that quantifies the tone of the news, normalized so that the long-term average value is zero.
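
The general idea of scoring tone and centring it on zero can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's algorithm: the word lists are hypothetical stand-ins (only 'terrible', 'awful' and 'good' come from the article), and the scoring rule, positive minus negative words as a share of all words, is one simple choice among many.

```python
import re
from statistics import mean

# Hypothetical word lists; the lexicons actually used in the study are not given here.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"terrible", "awful", "bad"}


def raw_tone(text):
    """Positive minus negative evaluation words, as a share of all words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def normalized_tone(texts):
    """Shift the raw scores so their average over the whole collection is zero."""
    scores = [raw_tone(t) for t in texts]
    baseline = mean(scores)
    return [s - baseline for s in scores]


# Invented snippets, for illustration only.
reports = ["Good harvest expected", "A terrible, awful day", "Ministers met on Tuesday"]
print(normalized_tone(reports))
```

After normalization, a value below zero means a report is more negative than the collection's average, which is the sense in which the article speaks of the tone falling to a negative value.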

Tracking the tone

For Egypt, the tone in early 2011 fell to a negative value seen only once before in the past three decades. What's more, the tone of the coverage specifically mentioning Hosni Mubarak, Egypt's now-deposed president, who ruled for almost 30 years, reached its lowest-ever level in early 2011. Similar sharp falls in tone were found for Tunisia and Libya.

This didn't in itself predict when those crises would happen. It seems likely, for example, that rocketing food prices helped to trigger the Arab Spring revolts [3]. But such trends might reveal when a region or state is ripe for unrest. Dirk Helbing of ETH Zurich, a specialist in modelling social systems, compares it to the case of traffic flow: computer models can help to spot when traffic is in a potentially unstable state, but the actual triggers for jams may be random and unpredictable.

It therefore remains to be seen whether this approach can spot signs of trouble in advance, rather than retrospectively finding them foreshadowed in the media. "It is obviously much easier to find precursory signs when you know where to look than to do it blindly," says Chadefaux.

But if news mining does turn out to offer a crystal ball, "The question is what kinds of use we'll make of this information," says Helbing. "Will governments act in a responsive way to avoid crises, say by improving people's living conditions, or will they use it to police dissatisfied people in a preventative way?"

This article is reproduced with permission from Nature 13 September 2011 doi: 10.1038/news.2011.532


  1. Leetaru, K. First Monday 16, 9 (2011).
  2. Michel, J.-B. et al. Science 331, 176-182 (2010).
  3. Lagi, M., Bertrand, K. Z. & Bar-Yam, Y. preprint at (2011).