The Computational Literary Studies Debate: Thoughts from the Fall 2019 Class, Writing in Digital Humanities

The Future of Literary Analysis: Arithmetical or Subjective?

When we think about literature, our minds do not typically conjure images of computers. Traditionally, literary analysis has been understood to consist mostly of human reasoning, with little to no algorithmic or mathematical involvement. Today, however, computational tools have become more prominent in literary studies. One example is the subfield of Digital Humanities that uses algorithms and computation to approach different aspects of literature: computational literary studies, or CLS. CLS can also be defined as “the statistical representation of patterns discovered in text mining fitted to currently existing knowledge about literature, literary history, and textual production…” (Da 602). Historically, literary studies have been intertwined with humanistic thinking. Darwinian literary criticism is an example of how literature and the human condition are not, and cannot be, separated. This branch of literary studies claims that “characters in literary representations...reflect the notions and beliefs and purposes of individual authors” (Carroll 76). I would argue that, in this sense, only other beings that also have individual notions, beliefs, and purposes can fully understand the underlying meaning embedded in literary works. While humans can carry out the complex thinking involved in adopting beliefs and purposes, today’s computers do not run algorithms that can imitate these processes. Because computational approaches lack humanistic thinking, computational literary analysis leads to inaccurate and insubstantial work. To provide a critical, holistic, and in-depth inquiry into the faulty nature of computational literary analysis, I will examine the supporting arguments and counterarguments in the controversy around it while engaging with other scholarly opinions.

Literary computational methods, such as text mining, regardless of how much more they can process, can ultimately lead to a “meaningful reduction of ‘[literary-historical and linguistic]’ complexity” (Da 629, 638).

One of the arguments in support of CLS that I would like to address is the appeal of speed: computational methods can process literature far faster than a human reader can. Scholars such as Da have rebutted this argument, claiming that literary computational methods, such as text mining, regardless of how much more they can process, can ultimately lead to a “meaningful reduction of ‘[literary-historical and linguistic]’ complexity” (629, 638). I would like to place emphasis on the use of “meaningful” in Da’s statement. I believe that a reasonable amount of reduction is common when analyzing literature. What Da implies, however, is that computer programs are not able to capture the full extent of the elaborate intricacies that pieces of literature may contain. A “meaningful reduction” of the historical context of a text can produce a completely inaccurate and misdirected analysis. The loss of such a substantial amount of information may also be the price paid for “quicker, more intuitive, noncontingent calculations” (629). With this in mind, scholars can ask which is more important: reducing the time it takes to complete literary analyses, or ensuring that the analyses we produce accurately capture the historical and linguistic context of texts. Da clearly believes that the latter is more essential. Many studies have attempted to prove that computational literary studies can successfully conduct complex analyses of texts. In her journal article, Da demonstrates that, regardless of the results presented by statistical analysis and quantitative data, computational approaches do not succeed in drawing purposeful conclusions that are grounded in the relevant literary context.
To expose the flaws of computational literary analysis, I would like to examine Da’s critiques of statistical studies conducted by several scholars, including Matthew Jockers, Gabi Kirilloff, and Andrew Piper.

“Theoretical approaches” are not always successful at “producing interpretations that are intentional, that have meaning and insight defined with respect to the given field” (Da 621).

I agree with Nan Da’s view that the statistically significant results in these scholars’ studies do not prove that statistical methodologies can competently analyze literature. Regarding the research conducted by Jockers and Kirilloff on the correlation between verbs and gendered pronouns in a dataset, Da claims that it may have yielded significant results because of its sample size. When critiquing Piper’s study of Augustine’s Confessions, Da states that it may have yielded significance because of an evident shift of contextual focus (610, 612). She makes a solid argument when stating that “computational literary criticism...often places itself in a position of making claims based purely on word frequencies without regard to position, syntax, context, and semantics” (611). Computational tools such as word-frequency counts therefore present limitations when analyzing texts. Ted Underwood is another scholar Da mentions in her essay; he used word frequency to see whether genres evolve over time (607). Da claims that these “theoretical approaches” are not always successful at “producing interpretations that are intentional, that have meaning and insight defined with respect to the given field” (621). These methods produce fallacious conclusions rather than meaningful analyses. Literary analysis is the main component of computational literary studies, but it is not the only one. As defined by scholar Amy Earhart, digital literary studies are composed of “four dominant areas of work...: digital edition form, the digital archive form, cultural studies approaches, and literary data approaches” (7). I would like to critique the rise of the digital edition form. Computational methods have proven to produce futile work when analyzing literature, and I propose that digitized approaches to editing literature likewise lead to low-quality editions.

“Rejection of digital literary studies has occurred because of the legacy of associating edition building with...the charge of uncomplicated, simplistic, and mechanistic digital literary studies work” (Earhart 16).

Textual studies and editing have traditionally been carried out on printed copies. The emergence of digital forms of textual editing has raised a substantial amount of controversy. In her chapter “The Rationale of Holism: Textual Studies, the Edition, and the Legacy of the Text Entire,” Earhart “examines...the distrust of the digital environment, the holistic text, and the desire for editorial control of the text” (7). She argues that the “rejection of digital literary studies has occurred because of the legacy of associating edition building with...the charge of uncomplicated, simplistic, and mechanistic digital literary studies work” (16). Her claim adds to the argument that computational methods are unsuccessful at capturing the complex nature of certain texts. Here, digital humanists must consider again the question I posed earlier: if simplistic work is the result, then why bother using computational methods? Digital editing may undermine the standards that individual editors set, because it oversimplifies the elaborate approaches they take when editing a text. Human editors capture minute details because they are capable of acknowledging a text’s complexity. When the humanistic aspect of editing is removed, not only is the text affected, but so are the editors who are replaced by a computer.

“The logic of [digitizing edition] would be to de-skill and demote the very individuals, text-editors, and text-theorists, whose interests it is supposed to promote” (Earhart 16).

In her chapter, Earhart discusses the contrast in the role and power of an editor in digital versus traditional textual studies, highlighting why CLS would lack appeal to textual critics (16). One of the scholars that Earhart mentions is Thomas Tanselle, who feared that digitizing texts could cause a “loss of editorial control” (Earhart 13, 14). This concern may make digitized editing seem like a threat to people who use their own literary knowledge and experience to make thought-out editorial decisions when approaching texts. Earhart also states that “the logic of such a move would be to de-skill and demote the very individuals, text-editors, and text-theorists, whose interests it is supposed to promote” (16). By stating this, she emphasizes that the power dynamic between a text and an editor is affected by digitization. From this perspective, the editor becomes less powerful and even less effective when the medium becomes digital. I believe that editing is more than the simple recognition of errors in syntax and grammar. Textual editing also includes contextual editing, which is too complex for computers to comprehend holistically. Literary context is composed of humanistic elements that computers cannot understand from a simple algorithm. Furthermore, a computer may decide to change a component of a text that is purposefully the way it is because of the author's own stylistic choice or purpose. It could take that stylistic choice for a mistake because it is only following its algorithm. The computer would not know any better. Even simple editorial changes like these could diminish the author's message.

After analyzing how computational methods lack a holistic approach to literature, I have come to the conclusion that the combined implementation of computational analysis and digital editing would lead to fruitless work. Da and Earhart demonstrate that computational methods result in literary reduction, misinterpretations of quantitative results, a loss of tradition, and a diminished role for humanistic thinking in literary work.

Works Cited

Carroll, Joseph. “Human Nature and Literary Meaning: A Theoretical Model Illustrated with a Critique of Pride and Prejudice.” The Literary Animal: Evolution and the Nature of Narrative, 1st ed., Northwestern University Press, 2005, p. 76. muse.jhu.edu/book/5333.

Da, Nan Z. “The Computational Case against Computational Literary Studies.” Critical Inquiry, vol. 45, 2019, pp. 602-638. https://doi.org/10.1086/702594

Earhart, Amy E. Traces of the Old, Uses of the New: The Emergence of Digital Literary Studies. University of Michigan Press, 2015, pp. 2-17. https://books.google.com/books?id=KE3gCgAAQBAJ&printsec=frontcover&dq=isbn:0472052780&hl=en&cd=1&source=gbs_api#v=onepage&q&f=false
