Reproducible Research

This week’s readings concerned databases and the elegant, reproducible methods of getting historical knowledge out of them. Many of the themes were very familiar to me and echoed a graduate seminar on local archives I’d taken in the past. Questions of authority, of how to extract knowledge, and of the role and actual work of the historian are at play in archives both digital and physical.

Tim Hancock’s chapter covered a number of interesting topics, but I fixated on his casting of the historian as an archives “expert.” The historian would toil at the archive, expertly pulling substantive examples to support their work from a huge collection of irrelevant material. The work would then be published, allowing others to build upon the nuggets of information mined from the opaque archive. In the digital archive, anyone can search and unearth their own information, democratizing the historical process (which most agree is one benefit of the digital).

But I wondered where this leaves the professional historian. If the public doesn’t require someone with the peculiar expertise to determine the “correct” evidence, what is the point of our discipline and its requisite training? This was answered by Lara Putnam in “The Transnational and the Text Searchable.” Putnam shows that just as historians had to search in the liminal spaces for their research topics, reading “against the grain” of the “historia patria” at the heart of many national archives’ missions, they must still do so in their digital searches. Searching online can provide you with the documents, but it often doesn’t provide the context or the local knowledge of a physical archive.

The hazards of new digital methods were also echoed in McDaniel, Mussel, Nicholson, and Spedding. This is where the professional historian can provide value over the amateur. By gleaning the meaning hidden in the cracks of the databases, understanding the technologies underlying the documents (the limits of OCR, for example), and providing reproducible research methods, a professional historian still provides value in the pursuit of the past.

One of McDaniel’s points about reproducible research methods was to cite the electronic version of a document if you actually used the electronic version. In her 2014 article, Lara Putnam does so, but still has broken links. Here the link is broken only because she mistyped the link to this page, but the reader still might not be able to find the object without conducting their own search. This speaks to one of the major problems with citing electronic objects: they might not exist in that spot in perpetuity, the way a physical book with an LC call number does. So this is the other place where professional historians will need to evolve our methods to provide truly reproducible research, ensuring that our peers or the public can follow us through the evidence we find.

But the readings don’t quite address a reproducible workflow from start to finish, a set of instructions along the lines of: 1. go to X database; 2. use Y search term; 3. organize the data in Z way. One of the strengths of computers is that they run a set of instructions the same way given the same parameters and inputs, so digital history should leverage this ability. It would definitely look more like William Turkel’s vision of history and be barely recognizable to those who rely exclusively on physical documents in archives for their research. Yet the databases *are* different, and we need to recognize that fact and leverage it where it’s useful: to search across traditional collections, as in Nicholson’s media culture history, while staying wary of the possible pitfalls of divorcing documents from their sense of place, as Putnam points out. Manovich is correct that databases utilize a sense of narrative that is distinct from traditional history, so the discipline can’t continue to pretend we work the same way we always have.
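
To make the three-step idea concrete, here is a minimal sketch in Python of what such a set of instructions might look like. Everything in it is an assumption for illustration: the `Workflow` class, the `run` and `fingerprint` functions, the database name, and the sample records are all hypothetical, and the "database" is just a small in-memory list rather than a real archive. The point is only that the same instructions applied to the same inputs give the same result, and that the result can be fingerprinted so someone else can check their rerun against mine.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Workflow:
    database: str      # step 1: which database was searched (hypothetical name)
    search_term: str   # step 2: the exact search term used
    sort_key: str      # step 3: the field used to organize the results


def run(workflow, records):
    """Keep records whose text contains the search term, then sort them."""
    term = workflow.search_term.lower()
    hits = [r for r in records if term in r["text"].lower()]
    # Sort by the chosen key, breaking ties by id so the order is always the same.
    return sorted(hits, key=lambda r: (r[workflow.sort_key], r["id"]))


def fingerprint(workflow, results):
    """Hash the instructions plus the results so a rerun can be compared."""
    payload = json.dumps(
        {"workflow": asdict(workflow), "results": results}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Made-up stand-in for a database export; not real archival data.
SAMPLE_RECORDS = [
    {"id": 3, "date": "1888-05-02", "text": "Report on the harbour strike"},
    {"id": 1, "date": "1887-11-13", "text": "Strike meeting at the docks"},
    {"id": 2, "date": "1888-01-20", "text": "Notes on a local fair"},
]

workflow = Workflow(
    database="example-newspaper-db", search_term="strike", sort_key="date"
)
results = run(workflow, SAMPLE_RECORDS)
print([r["id"] for r in results])  # prints [1, 3]
print(fingerprint(workflow, results))
```

Publishing the `Workflow` values and the fingerprint alongside an article would let a reader repeat the search and see whether they get the same set of documents. It does nothing, of course, about the harder problems above: a database whose contents change over time, or the context a search strips away.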

