Manipulating Data
http://miriamposner.com/classes/dh101f17/tutorials-guides/data-manipulation/
https://ryancordell.org/research/teachingHDA/
'Data' can be a difficult term for humanists. As Miriam Posner of the Department of Information Studies at UCLA explains in "Humanities Data: A Necessary Contradiction":
When you call something data, you imply that it exists in discrete fungible units; that it is computationally tractable; that its meaningful qualities can be enumerated in a finite list; that someone else performing the same operations on the same data will come up with the same results. This is not how humanists think of the material they work with.
Despite discomfort with the term, humanists today engage with data on a regular basis. The data that shapes our professional lives can be defined as "a digital, selectively constructed, machine-actionable abstraction representing some aspects of a given object of humanistic inquiry" [1]. As this definition suggests, the state of our data - and its utility for research - depends on the construction process. For analogue objects, the process begins with digitization. From there, both digitized and born-digital objects need to be curated, structured and/or annotated to facilitate human and computational analysis.
In the digital humanities, there are different approaches to working with data. One approach focuses on constructing 'small' datasets that critically engage with - and frequently challenge - traditional classification systems, editorial practices, archives, or canons. Another is rooted in the field of big data research. Oriented towards the social sciences, big data research in the digital humanities focuses on "large or dense cultural datasets, which call for new processing and interpretation methods" [2]. Whereas the first approach uses web-based technologies to publicly redress absences and biases in "how people process and document human cultures and ideas," the second uses computational methods to perform macro-level analyses [3].
Building on last week's discussion of metadata, this lesson examines the big data approach.