Quantifying political disruption

José Luengo-Cabrera
Mar 29, 2021

Digestible information for crisis situations

Data is another word for stored information. It can be organized and categorized in ways that make large quantities of it easier to analyze and visualize.

When coded in numbers and geolocated with coordinates, quantitative data has the power to provide a better sense of the magnitude, frequency and geographic distribution of (measurable) phenomena across time and space.

But when dealing with issues that require contextual knowledge, data can only tell us so much. For this reason, data analysis should be envisaged as a complementary tool to qualitative analysis.

This is particularly true in crisis-affected countries, where politically motivated (violent and non-violent) disruption can cause much confusion due to the multiplicity of events taking place simultaneously, some of which cannot necessarily be quantified.

Measuring the measurable

Political disruption (actual or anticipated) can create, inter alia, significant economic and social distortions. Politically motivated violent incidents incur costs in the form of property damage, physical injury or psychological trauma. Political mobilization protesting a given territory’s (adverse) predicament can alter economic behavior, primarily by perturbing consumption patterns but also by diverting public and private resources away from productive activities and towards protective measures.

Combined, they generate significant welfare losses in the form of productivity shortfalls, foregone earnings, and suboptimal expenditure — all of which affect the price of goods and services.

Measuring the scale and cost of political disruption has, therefore, important implications for assessing the effects it has on economic activity and societal welfare. It is, however, difficult to estimate the wide-ranging socio-economic externalities generated by political disruption.

Consequently, the scope for quantifying it is restricted to what is tangibly measurable and for which reliable data exists: in other words, the estimated monetary costs engendered by (non)violent political disruption and the financial expenditure allocated to contain or prevent it.

Although it provides an incomplete diagnostic of the multidimensional impact of political disruption, calculating the costs associated with it is important for gauging the magnitude of the problem. However, this remains an elusive (and even contested) research endeavour that is already the subject of numerous studies and which is beyond the scope of this analysis.

The focus here will be placed on the added value of systematically gathering data on political disruption to understand the spatio-temporal patterns of crisis-related events, and on its ostensible utility to policymakers.

Impact quantification

Because of the high-frequency nature of crisis situations, attempting to quantify their drivers can provide a more tangible way of gauging their spatiotemporal nature, notably so as to identify trends and patterns — which can, in turn, provide useful information to design and implement operational responses.

With the use of statistics becoming increasingly popular, what has been dubbed ‘evidence-based policymaking’ has gained salience, resulting in a greater demand for ‘data-driven’ analyses.

This has led to a burgeoning number of research outlets providing quantitative data on a wide array of crisis-related phenomena, progressively allowing researchers to analyze this data and communicate results via visualizations.

Analyzing available data on events with a politically disruptive impact provides a more practical way to diagnose the temporal variance and geographic nature of phenomena like civil disobedience, armed violence, food insecurity or population displacement.

In turn, data visualization can be a valuable communication tool, one that provides a general audience with digestible snapshots of how politically disruptive events are impacting different sectors of society across time and space.

Data caveats

While the quantitative approach has notable advantages, data is often taken at face value, sometimes creating the illusion that statistics are unquestionably reliable.

However, because of the hazardous — and indeed oftentimes politicized — nature of collecting and disseminating information in countries undergoing crisis situations, data must always be consumed with caution, especially when considering the methodological differences and varying statistical outputs across different datasets.

When dealing with issues requiring contextual knowledge, data has its limitations. The high rate at which information is generated by different events taking place across different locations makes it hard to capture the ‘relevant’ data.

When analyzing datasets, one has to take into careful consideration the caveats that data providers tend to identify regarding the process of collecting and coding information on events of interest. This is especially important when the sourcing of the information is incomplete or even contested.

The politics of numbers

In crisis-affected countries, the politicization of numbers is of particular interest. Certain political actors might have an interest in under/over-estimating (or simply omitting) certain figures for political ends.

This is particularly pertinent with regard to reporting on politically sensitive issues like misconduct by public officials, heavy-handed military operations or violations of human rights, among others. Oftentimes, information on public misconduct is restricted, creating distortions.

Moreover, when events take place in remote areas, information on politically sensitive incidents can face serious limitations. This is not only due to difficulties in verifying information across limited sources, but also to the fact that these events may be underreported — or reported with a lag.

Although we are currently witnessing a ‘new media’ wave in the form of independently shared online (Twitter/Telegram/WhatsApp) information on events of interest in areas where events tend to be underreported, cross-checking the veracity of this information is often difficult. Citizens may also face sanctions for sharing information online that is considered politically sensitive or for being critical of government action.

In a constellation of actors with diverging political interests, access to information tends to be imbalanced, especially when certain actors have a greater (and sometimes even coercive) power to monopolize it.

In addition, when information is sourced from news outlets, a degree of ‘media bias’ has to be taken into account, especially when (arguably) more ‘sensationalist’ events like armed attacks or urban unrest tend to be more heavily reported.

Data providers

The Uppsala Conflict Data Program (UCDP) and the Armed Conflict Location & Event Data Project (ACLED) are two popular data providers in the realm of political disruption.

Given that their data is geo-referenced and publicly available, these data providers are making it easier to identify not only the timing, location and intensity of politically disruptive events but also the actors involved and affected by them.

This ‘disaggregated’ approach to collecting and coding data is providing an avenue for generating digestible diagnostics on conflict-related dynamics at the subnational level. Its appeal emanates from the provision of numerical estimates that serve as a tool for communicating evolving patterns of political disruption — across locations and over time.

Data providers are organizations that essentially collect, aggregate, categorize and code information provided by a wide array of different sources. The way in which this information is stored and categorized matters for how one can go about analyzing and visualizing it.

UCDP and ACLED have gained prominence as the go-to sources for quantifying political disruption. At the same time, differences in how they define and code events have prompted a growing literature on the pros and cons of these (primarily) media-sourced event datasets, with notable discussions on comparability and reporting bias.

For any research project, one can assess the comparative advantage of using one dataset or the other; it all depends on the sort of events one is interested in. In the realm of political disruption, it is arguably important to get a sense of both fatal and non-fatal events.

UCDP, for example, only counts events where at least one fatality has been reported. ACLED goes a step further and counts lethal as well as non-lethal events. This results in ACLED providing information on a wider range of events than UCDP.
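The practical consequence of these definitions is easy to see with event-level data: a UCDP-style fatality threshold amounts to a one-line filter. A minimal sketch in Python with pandas, using hypothetical values and illustrative column names rather than either project's actual schema:

```python
import pandas as pd

# Hypothetical event-level records; the column names are illustrative,
# not UCDP's or ACLED's actual schema.
events = pd.DataFrame({
    "event_type": ["Protests", "Battles", "Riots", "Battles"],
    "fatalities": [0, 12, 1, 3],
})

# A UCDP-style view keeps only events with at least one reported death;
# an ACLED-style view keeps fatal and non-fatal events alike.
fatal_only = events[events["fatalities"] >= 1]
```

Note what drops out: the non-fatal protest disappears entirely from the filtered view, which is precisely the kind of precursor event that broader coverage retains.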

Arguably, including non-fatal events that are generally considered precursors to larger-scale violence gives ACLED an additional edge for researchers and organizations interested in analyzing the drivers of deadly political disruption.

These range from protests and riots to non-fatal campaigns of intimidation and violence by state forces and non-state armed groups — as well as attacks on public infrastructure (e.g. bridges and roads) or looting of commercial sites (e.g. mines and oil wells).

In addition, ACLED updates its database on a weekly basis — UCDP once a year. For this reason, ACLED has seemingly become the preferred source for audiences wanting to get a quantitative sense of trends and patterns of political disruption in (quasi) real-time.

While there are notable caveats to be taken into account when using ACLED data, particularly with regards to the precision of numbers when they remain provisional or contested, there are two main advantages to the use of ACLED data to study political disruption.

The first is that it reduces the time usually taken for analysts to process (and look back at) coded information on events of interest. When data like ACLED’s is structured in the standardized ‘tidy data’ format (one observation per row, one variable per column), it is easier to analyze and visualize.
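As an illustration, a tidy event table makes period-by-period and region-by-region summaries near-trivial. A sketch in Python with pandas, using hypothetical values and ACLED-like (but not verbatim) column names:

```python
import pandas as pd

# Hypothetical rows modeled loosely on an ACLED-style schema:
# one event per row, one variable per column.
events = pd.DataFrame({
    "event_date": pd.to_datetime(
        ["2021-01-04", "2021-01-18", "2021-02-02", "2021-02-20"]),
    "event_type": ["Protests", "Riots", "Protests", "Battles"],
    "admin1":     ["Mopti", "Bamako", "Mopti", "Gao"],
    "fatalities": [0, 2, 0, 5],
})

# Because each row is a single event, summaries are one-liners:
# monthly event counts by type...
monthly = (events
           .groupby([events["event_date"].dt.to_period("M"), "event_type"])
           .size()
           .rename("n_events")
           .reset_index())

# ...and total fatalities by region.
by_region = events.groupby("admin1")["fatalities"].sum()
```

The same structure is what makes quick visualization possible: each grouped summary maps directly onto a time series or a choropleth layer.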

Second, datasets like ACLED make it easier to verify the quality of the data — particularly as ACLED provides details on its sources and includes written text describing the context and nature of each reported event.

In essence, ACLED is a sort of ledger that allows researchers to evaluate the veracity of reported information on (violent) political disruption. Moreover, ACLED also provides a measure of the accuracy of numerical estimates for different types of events, largely based on the number of different sources reporting information that can be cross-checked with some degree of confidence.

This facilitates the task of organizations seeking to aggregate different datasets so that they can be analyzed and visualized simultaneously. The more data is coded in ‘tidy data’ format, the easier it is to get a sense of the overall picture across datasets measuring different things.

Data aggregators

Data collection tends to be a rather fragmented process in crisis-affected countries. When otherwise isolated developments become more intertwined across time and space, it becomes increasingly important to harmonize the collection and publication of data.

This expedites and facilitates the process of merging datasets containing valuable (especially time-series and geo-located) information that is reflective of the different ways in which any given crisis is unfolding — and how different events are impacting one another.

But data collecting organizations seldom gather data on the wide array of indicators necessary to understand a multidimensional crisis. This makes it harder for anyone wanting to evaluate the flurry of factors driving any given crisis situation, especially those that warrant rapid responses — notably whenever people’s lives are at risk.

Time is often lost gathering (and cleaning) different datasets provided by different organizations, with varying download options across platforms and variations in how the datasets are structured.

This is why OCHA's Humanitarian Data Exchange platform is so important. Here, you can find all sorts of cross-indicator data in all sorts of compatible file formats relevant to understanding countries experiencing humanitarian crises. Because HDX undergoes a quality-control and review process, the datasets made available enjoy a degree of trustworthiness.

Because HDX partners with different data providers and makes their data available on a single platform, researchers can access and analyze a wide number of standardized datasets of interest, making it easier (and faster) to summarise key results. This is particularly useful for generating visuals mapping multiple datasets at the same frequency with which the underlying datasets are updated.

The European Commission's INFORM Risk Index is also a valuable resource. Like all composite indices, it standardizes different indicators by transforming their values into a common numerical scale. This allows anyone to compare indicators that use different measures.
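The core of that standardization is min-max rescaling: each indicator is mapped onto a common range (INFORM reports scores on a 0–10 scale) so that values measured in different units can be compared and averaged. A minimal sketch, not INFORM's actual implementation:

```python
def rescale(values, lo=None, hi=None, scale=10.0):
    """Min-max rescale raw indicator values onto a common 0-to-`scale`
    range, the standard step in building composite indices."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    return [scale * (v - lo) / (hi - lo) for v in values]

# Two hypothetical indicators on incompatible units become comparable:
fatality_scores = rescale([0.2, 1.5, 8.0])    # -> [0.0, ~1.67, 10.0]
displacement_scores = rescale([0.01, 0.25, 0.40])
```

In practice the minimum and maximum are often fixed from theory or historical data rather than taken from the sample, which the `lo` and `hi` arguments allow; otherwise a single new extreme value shifts every country's score.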

By using the index’s dataset, researchers can evaluate the variability of available data relating to countries undergoing humanitarian crises and use index scores to understand their intensity, sometimes even at the subnational level.

The European Commission’s Joint Research Centre also provides sufficient support material to understand the methodology behind the construction of the INFORM Risk Index, opening the possibility for researchers to replicate other composite indices of interest using additional datasets.

While imperfect, these are the low-hanging fruits for any researcher or organization wanting to get a sense of what is available in terms of quantitative data in crisis-affected countries. After all, stored data can only show you things that can be tracked and coded in a way that can be read by statistical software.

At the very least, publicly available (and downloadable) datasets can give us an approximative diagnostic of what is going on. As public goods, datasets’ codebooks also provide researchers with access to their methodology, which facilitates open discussions on possible shortfalls and improvements of any given dataset.

Data, however, cannot fully replace the more time-consuming — but oftentimes necessary — understanding that (for now) comes primarily from written or spoken analysis.

Data literacy

The ability to analyze data is seemingly not as widespread as the ability to gather and disseminate qualitative information. Written articles are (arguably) more abundant than data-driven analyses, even if, encouragingly, a growing number of written analyses are complemented by quantitative analysis.

This may be because more people are able and willing to report on crisis situations using words or speech. It is also possible that data on countries in crisis situations is harder to get hold of: one only has to look at the number of (updated) cross-indicator datasets made available by national statistics offices or the multiple (inter)national agencies in different countries to evaluate the degree of data availability.

Data paucity remains an issue that makes it relatively more difficult to generate quantitative diagnostics of continually evolving developments in crisis situations. With this in mind, there is an (arguably) relative shortage of researchers with the quantitative skills required to analyze and visualize data on countries undergoing crisis situations — at least when one compares this to the share of data analysts in the technology sector, for example.

Increasing the financial appeal for more data analysts to focus on countries undergoing crisis situations is warranted, especially as hiring data analysts can be considered a cost-effective investment for organizations seeking to provide quantitative diagnostics as crises (that are tracked) are evolving.

Luckily, there are sufficient open-source platforms and free software that are supported by a plethora of training material, especially video tutorials. The data community is also global and full of kindhearted denizens willing to share their knowledge on the different methods required to use computer software that can read, analyze and visualize quantitative data.

Statistical environments like RStudio, for example, can be used to wrangle and merge different datasets in order to carry out correlation analyses (e.g. correlating armed violence with population displacement). Identifying the geographic covariance of different types of phenomena could prove useful when trying to design and implement operational responses to address the varying nature of political disruption.
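Although the text points to R, the same wrangle-merge-correlate workflow can be sketched in a few lines of Python with pandas; every value and column name below is hypothetical:

```python
import pandas as pd

# Two hypothetical datasets with monthly counts per region.
violence = pd.DataFrame({
    "admin1": ["Mopti", "Mopti", "Gao", "Gao"],
    "month":  ["2021-01", "2021-02", "2021-01", "2021-02"],
    "violent_events": [3, 7, 1, 4],
})
displacement = pd.DataFrame({
    "admin1": ["Mopti", "Mopti", "Gao", "Gao"],
    "month":  ["2021-01", "2021-02", "2021-01", "2021-02"],
    "new_idps": [120, 300, 40, 180],
})

# Merge on the shared region/month keys, then correlate the two series.
merged = violence.merge(displacement, on=["admin1", "month"])
r = merged["violent_events"].corr(merged["new_idps"])
```

With real data, the merge keys would be whatever region and period identifiers the two datasets share, and the resulting correlation would be read with the caveats about reporting bias discussed earlier.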

In this vein, the R community is particularly relevant. R users are known for being generally supportive of their peers, as reflected by the number of people from all walks of life helping others solve their queries on how to use the R console. These sorts of online communities provide enough training resources for anyone with an internet connection to learn how to analyze and visualize data independently.

For those interested in data mapping, QGIS is the go-to geographic information system (GIS) application, supporting the viewing, editing and merging of geospatial data. Free of cost, it is user-friendly software with sufficient functionality to format and generate high-resolution maps.

Finally, nothing beats YouTube as a pedagogical resource — once you filter through the noise. Anything an aspiring autodidactic data analyst needs in the realm of crisis quantification is available to anyone with an (un)restricted internet connection.

Providing internet access to as many people as possible might help in increasing the availability of information that can be analyzed independently by a growing number of people, especially those directly involved in or affected by crises.

Researchers tracking political violence across West Africa provide a good example of how incentivizing individuals from different organizations to ‘combine forces’ can generate economies of scale. This could, potentially, reduce the cost and time otherwise needed to understand how crises are evolving and why.

Having people with mixed research skills, different cultural backgrounds and advanced contextual knowledge working together could result in improving how trends and patterns in crisis situations across West Africa are communicated.

Understanding the Malian context could benefit from having a statistician like Ahmadou Dicko work more closely with a researcher like Ornella Moderan or a policy advisor like Ibrahim Maïga.

This would be the case in Burkina Faso with Sahel Security and Mahamadou Savadogo; Gilles Yabi with Amadou Ndong for coastal West Africa, Héni Nsaiba with Ibrahim Yahaya Ibrahim and Sarah Lawan for Niger or Nnamdi Obasi and Comfort Ero with Chinemelu Okafor for Nigeria, for example.

Putting more financial and human resources into mixing quantitative data with qualitative knowledge for understanding the multiple sources of instability in the West African region (and beyond) would be a welcomed start.

This piece is being updated as new (suggested) information comes to the fore.

Visual examples of how cross-dataset visualizations can be helpful for identifying spatiotemporal patterns. Any errors are the author’s own.