The challenges of collecting terrorism data

- August 6, 2014

Robert Pape and his colleagues at the Chicago Project on Security and Terrorism (CPOST) recently reviewed the Global Terrorism Database (GTD) collected and maintained by the National Consortium for the Study of Terrorism and Responses to Terrorism (START) and concluded that “government data exaggerate the increase in terrorist attacks.” They asserted that CPOST’s Suicide Attack Database (SAD) is therefore preferable for measuring worldwide trends.
We were surprised to see the GTD described as “government data” given that the GTD is compiled and maintained through research grants awarded to the University of Maryland — in the same way that the National Science Foundation awards grants to university research teams. In the 12-year history of the GTD, our government funders at the Department of Homeland Security, the Department of State, and the National Institute of Justice have never influenced data entry decisions or encouraged a particular interpretation of the data.
Pape and colleagues argue that the trends in suicide attacks reported by GTD and SAD are extremely dissimilar. We encourage readers to look closely at the trends in each dataset:

Considering that the two series were produced by independent data collection efforts, we would argue that in fact they are remarkably similar (with a correlation of 0.88 from 1983 to 2013, and 0.94 for 2003 through 2011). The similarity is even greater when we take into account the different inclusion standards of GTD and SAD. The SAD includes both terrorist and non-terrorist suicide attacks while the GTD only includes suicide attacks that meet our definition of terrorism; SAD is limited to “successful” suicide attacks in which perpetrators are killed, whereas the GTD also includes attacks that were attempted but foiled before perpetrators could detonate their explosives; and SAD requires two sources per event while the GTD requires only one.
When we standardize the data from the two collections (to the extent that it is possible), the trend lines are even more similar, especially in 2012 and 2013, the years in which the authors claim the GTD dramatically over-reports the number of suicide attacks. The truth is that when compared carefully, the GTD and the SAD paint a surprisingly similar picture of a complex and growing threat. This assessment is also consistent with recent accounts of terrorism trends by independent analysts.
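For readers who want to run this kind of comparison themselves, the following is a minimal sketch of how a correlation between two annual count series might be computed over a chosen range of years. It is not the procedure used to produce the figures above. The function names (`pearson`, `correlation_over_window`) are hypothetical, and the counts in `toy_a` and `toy_b` are invented placeholders, not GTD or SAD data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlation_over_window(series_a, series_b, start_year, end_year):
    """Correlate two {year: count} series over the years both of them cover."""
    years = [
        year
        for year in range(start_year, end_year + 1)
        if year in series_a and year in series_b
    ]
    return pearson([series_a[y] for y in years], [series_b[y] for y in years])


# Invented placeholder counts for illustration only (NOT real GTD or SAD data)
toy_a = {2003: 10, 2004: 20, 2005: 30, 2006: 25}
toy_b = {2003: 12, 2004: 18, 2005: 33, 2006: 27}

r = correlation_over_window(toy_a, toy_b, 2003, 2006)
print(f"correlation over 2003-2006: {r:.2f}")
```

Restricting both series to the same years, and ideally to events that satisfy the same inclusion rules, before computing the coefficient is what makes the comparison meaningful; a high correlation still says only that the two series rise and fall together, not that their counts are equal.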
In contrast to Pape and colleagues, we would argue that presenting aggregated global trends in suicide attacks as an indicator of worldwide terrorism is in fact a much more troubling source of misinformation. Given that 84 percent of suicide terrorist attacks in 2013 took place in just four of the 91 countries that experienced terrorism (Iraq, Afghanistan, Pakistan, and Syria), characterizing this phenomenon as a global threat is certainly an exaggeration. Moreover, such destructive terrorist organizations as the FARC in Colombia and the New People’s Army in the Philippines have never seriously adopted suicide terrorism as a strategy. Surely an analytical strategy that only considers a type of terrorism widely practiced in the Middle East and South Asia but rarely observed in other parts of the world is fundamentally misleading as a general measure of terrorism.
Given the widespread reach of the GTD, we feel a tremendous responsibility to be transparent about the strengths and limitations of the data. Contrary to Pape and colleagues, we believe that the GTD can be used to look at trends over time, but that the results must be interpreted thoughtfully and in concert with other sources of information. This practice should be non-controversial, as it is fundamental to science. In fact, we find it unfortunate that the CPOST team views the GTD and the SAD as competitors. We believe it is counter-productive to advocate for a single data source, and instead strongly argue that independent data collection efforts are essential to advancing science.
Thoughtful use of data is hardly as simple as “the news media keeping a critical eye on authorities” but instead is everyone’s responsibility. We make available to the public a comprehensive codebook and website that detail the GTD’s methodology and history. We encourage awareness among users of the GTD, and train scores of researchers, students, and practitioners on the perils of the incorrect or uninformed use of data. We organize panels at professional academic meetings to discuss the challenges of collecting event data. We consult with end-users working on particular analyses, and we always encourage our media contacts to resist the draw of a dramatic headline—or worse, an eye-catching tweet.
We lament that not all GTD users heed our warnings about sensationalizing trends in the data. This is an issue that vexes all researchers who make their data publicly available. The reality is that data often seem most reliable to those least familiar with them. The examples are legion. When the US State Department released its Patterns of Global Terrorism 2003 report, its authors concluded that worldwide terrorism had dropped by 45 percent between 2001 and 2003. This statement and, more generally, the data on which it was based prompted a flurry of criticism from policy makers and researchers who identified serious problems with the State Department’s data collection efforts. Pitfalls of economic data are well documented, and the Federal Bureau of Investigation’s 84-year-old Uniform Crime Report is so plagued by reporting inconsistency that the FBI strongly cautions against drawing conclusions from direct comparisons between cities.
Unfortunately, such warnings rarely stop media outlets and politicians from making sweeping claims. We can only hope that, as we train a new generation in the collection and analysis of public data, we will help create a better-informed citizenry.
Pape and colleagues argue that the GTD cannot be trusted because it has been driven by inconsistent data collection methodology. In fact, although different organizations have been responsible for GTD primary data collection, the quality control function of GTD data collection has been continuous. GTD researchers, past and present, have ensured that the entire database uses the same standards for inclusion and is as comprehensive as possible.
However, our critics would be on more solid ground if they had instead argued that the availability of source articles from around the globe has gone through transformative changes in recent years. With the advent of the Internet, the expansion of the 24-hour news cycle, and the birth of social media, it is easier than ever for researchers to learn about events in the most remote pockets of the world. This is a beneficial development for data collectors, but it comes with notable drawbacks, including the possibility that expanding access to information can affect trends in data over time. Although we continually work to improve and supplement the historical GTD data, exhaustive coverage is impossible because data sources erode over time. This poses a fascinating but complex challenge for those interested in longitudinal data. However, this problem is not unique to the GTD; it is true of all longitudinal data collections, including the one compiled by CPOST.
This post was authored by Michael Distler, Omi Hodwitz, Michael Jensen, Gary LaFree, Erin Miller, and Aaron Safer-Lichtenstein, all of whom are senior staff with the Global Terrorism Database.