Introduction to Digital Humanities

Working with Big Data

In addition to constructing data, digital humanists frequently work with "large or dense cultural datasets, which call for new processing and interpretation methods" (Kaplan, 2015). Because these datasets are messy, our first task is to learn how to clean them.

This week we will discuss some of the opportunities and problems of big data research in the humanities, and you will learn how to clean a large dataset using OpenRefine.

Annotation #5

1. Guldi, Jo, and David Armitage. “Big Questions, Big Data.” In The History Manifesto, Cambridge: Cambridge University Press, 2014. Hypothesis link.

2. Wueste, Elizabeth. “Big Data, Big Problems.” Eidolon, December 18, 2017. Hypothesis link.

3. Schöch, Christof. “Big? Smart? Clean? Messy? Data in the Humanities.” Journal of Digital Humanities, November 22, 2013. Hypothesis link.

Assignment #5

At home, complete Seth Van Hooland, Ruben Verborgh, and Max De Wilde's tutorial "Cleaning Data with OpenRefine," The Programming Historian 2 (2013). When you are done, export and save your cleaned dataset so that you can use it in next week's assignment. Then, compose a written reflection on the tutorial for your "Assignment #5" page. In addition to documenting your experiences, examine the relationship between our discussions of constructing data and working with big data, especially in light of Schöch's call for the creation of "smart big data." Be sure to follow the instructions on the "Assignment" page of our workbook so that your reflection shows up in the contents of both your personal page and the "Assignment #5" page.


