
Using clustering algorithms to organize edited volumes

- April 16, 2009

I’ve just gotten a copy of Gary King, Kay Lehman Schlozman and Norman Nie’s “The Future of Political Science: 100 Perspectives”:http://www.amazon.com/gp/product/0415997011?ie=UTF8&tag=henryfarrell-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=0415997011 . I will likely be dipping into it over the next few weeks and responding to some of the short chapters in blog posts, but for starters, I thought the way the editors decided to organize the 100-odd short chapters was interesting.

bq. With 100! possibilities, placing these essays in some kind of order posed an obvious challenge … resisted the suggestions of friendly readers and the importunes of editors, each of whom posed a perfectly reasonable – but utterly different – organizing scheme … tried sitting on the floor and putting the essays in logical piles, but this too was uncertain. We turned instead to an automated procedure to assist us in locating an organizing scheme to present here. After trying various existing clustering algorithms that work on unstructured text, we ultimately developed one that seemed most useful for our purposes. … any essay within a cluster has more in common with the essay at the center of its cluster than with essays outside its cluster … We developed a related algorithm to order the essays to print in this book, and then hand-tooled it to reflect our understanding of the content of the essays. … [after performing a test with pairs of essays matched in various ways by humans and otherwise, and then asking RAs to evaluate the fit of the matches] [w]e were interested to learn that the clustering algorithm performed best in matching pairs of related essays, followed in descending order by the essay-by-essay approach, the political science categories, and random selection.
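
For readers curious what "the essay at the center of its cluster" might look like in practice, here is a minimal sketch of exemplar-based clustering on raw text. It is not the editors' algorithm, which the passage above does not spell out; it is a generic k-medoids-style procedure, and the function names, the toy essays, and the choice of word-count cosine similarity are all assumptions made for illustration.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for real text preprocessing."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def assign(vecs, exemplars):
    """Label each essay with the index of its most similar exemplar."""
    return [max(exemplars, key=lambda e: cosine(v, vecs[e])) for v in vecs]

def cluster_by_exemplar(texts, exemplars, rounds=10):
    """Hypothetical k-medoids-style loop: assign essays, then re-pick centers."""
    vecs = [bag_of_words(t) for t in texts]
    exemplars = list(exemplars)
    for _ in range(rounds):
        labels = assign(vecs, exemplars)
        updated = []
        for e in exemplars:
            members = [i for i, lab in enumerate(labels) if lab == e]
            if not members:          # keep the old center if its cluster emptied
                updated.append(e)
                continue
            # New center: the member most similar, in total, to its cluster-mates.
            updated.append(max(
                members,
                key=lambda m: sum(cosine(vecs[m], vecs[j]) for j in members)))
        if updated == exemplars:     # converged
            break
        exemplars = updated
    return assign(vecs, exemplars), exemplars

# Toy "essays" (invented for this example, not taken from the book).
essays = [
    "voting elections turnout voters",
    "elections voters campaigns turnout",
    "war treaties states conflict",
    "conflict war alliances states",
]
labels, centers = cluster_by_exemplar(essays, [0, 2])
print(labels, centers)
```

Each essay ends up labeled with the index of its cluster's central essay, so every essay is at least as similar to its own center as to any other center. That is a weaker guarantee than the one quoted above, which compares each essay with all essays outside its cluster, but it conveys the general idea.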
