The Computational Literary Studies Debate: Thoughts from the Fall 2019 Class, Writing in Digital Humanities

Figure 3: MDS of language similarities among 13 books of Augustine's Confessions by Nan Da

Then and Now and Then What?
Olarte, Marcus Christopher O.
Professor Grant Glass
ENGL 105i-037: English Composition and Rhetoric (Interdisciplinary)
27 September 2019
Then and Now and Then What?: Applications of New Developments in Digital Humanities
In Digital Humanities (DH), emerging fields such as computational linguistics (CL) and computational literary studies (CLS) apply statistical methods to the analysis of literary texts. These studies take “the statistical representation of patterns discovered in text mining” and fit it to “currently existing knowledge about literature, literary history, and textual production” (Da 602). Statistical patterns, such as word frequencies and point-of-view consistencies, are used to draw quantitative conclusions through computational approaches. These conclusions, however, are subject to higher rates of error than traditional literary analysis. Although literary scholars postulate that DH cannot develop significant quantitative conclusions due to the inherent complexity of literature, recent academic contributions highlight new developments that not only provide foundational knowledge but also improve upon current methodological practices (Kuhn 29).
In this paper, I will demonstrate how new developments can address the supposed insignificance of CLS results by reviewing past criticisms of the research, current flaws in computational methods, and prospects for correcting those flaws.
Then: Da and Kuhn on Digital Humanities
Nan Z. Da, a literature professor at the University of Notre Dame, published an academic article commenting on DH. In “The Computational Case against Computational Literary Studies,” she investigates the applications of computational methods in literature within the emerging subfield of CLS. Through an extensive variety of data mining examples and repeated statistical measures, Da argues that there is a “fundamental mismatch between the statistical tools used” and subsequent quantitative results (601). Given their limitations, word frequency studies neither properly analyze literary corpora nor support apt conclusions about a given text, because they fail to account for the literary complexities of ambiguity and emotion.
At its core, the article illustrates the current deficiencies of CLS through its lack of both statistical accuracy and precision (Da 610). Multiple examples, primarily from studies conducted by other digital humanists or recreated by Da herself, highlight how the results do not correlate with traditional interpretations of literary corpora and their human context and culture. From Hoyt Long and Richard Jean So’s predictive algorithms for classifying literary works to Andrew Piper and Mark Algee-Hewitt’s n-gram probabilities, these studies lack both the significance and the reproducibility expected under current statistical standards. Additionally, the treatment of their data points creates a problematic tendency to make generalized claims about literary history that not only can be disputed but also have no statistical backing (Da 610).
A few months before Da, Jonas Kuhn published an academic article on DH that further exposes its computational shortcomings. In “Computational text analysis within the Humanities: How to combine working practices from contributing fields?,” Kuhn examines the issues surrounding the collaborative research process between DH and computational social sciences. While not directly criticizing the value of CL’s objective research efforts, he does identify two major methodological problems that he refers to as the “scheduling dilemma” and “the subjectivity problem” (Kuhn 1). The article summarizes that:
Specifically, we can identify a scheduling dilemma that makes it hard to deploy sophisticated computational analysis chains in specialized hermeneutic studies, and the subjectivity problem. The latter originates from the constraint that in standard data-driven modeling, gold-standard annotations have to be grounded in operationalized categories leading to high inter-annotator agreement (Kuhn 37).
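The "high inter-annotator agreement" that Kuhn mentions can be made concrete with a small calculation. The sketch below uses Cohen's kappa, one common agreement measure; the source does not say which measure Kuhn has in mind, and the two annotators and their labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Share of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical annotations: is the character perspective stable or shifting?
annotator_1 = ["stable", "stable", "shifting", "shifting", "stable", "shifting"]
annotator_2 = ["stable", "shifting", "shifting", "shifting", "stable", "stable"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

A value near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance. Subjective literary categories tend to be hard to annotate consistently, which is the constraint the quoted passage describes.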
Through the presentation of two scenarios (the first outlining the complexities of analytical approaches between humanistic disciplines; the second annotating experiments regarding stable or shifting character perspectives), the scheduling dilemma and subjectivity problem illustrate the current incompatibility of CL and traditional literary analysis (Kuhn 2). Despite these problems, however, Kuhn subsequently suggests other frameworks to improve computational strategies.
This article provides an in-depth investigation that discusses new hermeneutical solutions by visualizing the complexities surrounding large-scale literary interpretation. For example, Kuhn shows how one of his recommended frameworks, the rapid probing model, aids in reducing departmental conflicts and varying subjectivity throughout the transdisciplinary research process (2). Overall, the technique shifts focus toward identifying hermeneutic problems early, which addresses computational social science’s hesitation to commit early to unrefined models, while narrowing the large scale of ambiguity that DH struggles to reconcile with its input-output methodology (Kuhn 28). As a result, Kuhn introduces the greater potential for a “transdisciplinary integration” of literature and computing technologies as the basis of CL (7).
Now: 99 Problems but Glitch Ain’t One
Although Da and Kuhn address subfields in DH with distinct approaches, both articles acknowledge the high error rates of computational models, and each serves to address the methodological shortcomings of CLS and CL. Both authors criticize the statistical significance and research practices of DH in order to challenge its claim to scientific objectivity. Da’s and Kuhn’s criticisms, however, differ thematically, concentrating on empirical discrepancies and on proposals for bridging the interdisciplinary gap, respectively. Da delves deeper into the redundancy of word frequency studies by digital humanists that target similar categories with similar approaches and result in similar, statistically insignificant conclusions (607). In contrast, Kuhn seeks to remedy issues in CL’s research workflow in order to promote the currently undervalued potential of a new field within a modernized humanistic discipline (4).
Notwithstanding their opposing stances in the debate on DH, neither author necessarily provides practical applications that can be carried into the future of computational analysis in their respective subfields. As Kuhn states, “there is still no best-practice recipe for teams of interdisciplinary collaborators to follow” (5). Thus, DH requires more foundational and inquisitive knowledge to answer the root questions of why (Da 602). Does data from CLS belong in literature? Are computational studies worthy of integration with literary analysis?
The academic discussion should proceed to investigate other approaches beyond word frequency studies and hermeneutic strategies. In this neo-empirical domain, alternative visualization techniques can serve as a reduction strategy that also addresses the “polyvalence” of literary works (Kuhn 4). Additionally, this multifaceted nature of literature can undergo a form of quality control that addresses the “underexploitation” of CLS practices and their future potential for “interdisciplinary integration” (Kuhn 5-6).
Then What?: If It Is Broke, Do Fix It
Da states that statistics serve to determine “a higher-order structure in quantifiable data” (629). Whether by indications of p-value significance or probability measures, statistics attempt to uncover representative inferences from large collections of sample data. With regard to CLS, visualization is an easy method to establish some form of higher-order structure out of a vast ocean of literary corpora. For example, multidimensional scaling (MDS) serves to organize results containing multiple, interconnected relationships down to a two-dimensional graph (Da 606). Although it achieves the simplification objective, Da illustrates that changes in framing completely alter the results of graphing word frequencies. For instance, she recreated a “corrected” version (Figure 3) of Piper’s plotting of language similarities among Augustine’s thirteen books of Confessions (Fig. 2).
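A toy calculation can show how a framing choice changes which texts appear similar before any plotting takes place. The sketch below is only an illustration: the three token lists and the choice between raw counts and relative frequencies are hypothetical, and they are not Da's or Piper's data or procedure.

```python
from collections import Counter
from math import dist

def frequency_vector(tokens, vocabulary, relative):
    """Word counts over a fixed vocabulary, optionally divided by text length."""
    counts = Counter(tokens)
    total = len(tokens) if relative else 1
    return [counts[word] / total for word in vocabulary]

def nearest_neighbour(name, texts, relative):
    """Name of the text whose frequency vector is closest (Euclidean) to `name`."""
    vocabulary = sorted({word for tokens in texts.values() for word in tokens})
    vectors = {key: frequency_vector(tokens, vocabulary, relative)
               for key, tokens in texts.items()}
    others = [key for key in texts if key != name]
    return min(others, key=lambda key: dist(vectors[name], vectors[key]))

# Hypothetical "books": B repeats A three times, C has a different word mix.
texts = {
    "A": "god god god soul".split(),
    "B": ("god god god soul " * 3).split(),
    "C": "god god soul soul".split(),
}
raw_choice = nearest_neighbour("A", texts, relative=False)
rel_choice = nearest_neighbour("A", texts, relative=True)
print(raw_choice, rel_choice)
```

With raw counts, text A sits closest to C because both are short. With relative frequencies, A sits closest to B because they share the same proportions. Any MDS plot built on these distances inherits that choice.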
An alternative visualization technique would be to engineer a program that visualizes these results on a three-dimensional graph. The topological relations would then resemble a web, which is appropriate for visually representing multiple, interconnected relationships.
Another current measure for visualization is factorization, a method that expands numerical components of data into multiple dimensions in order to analyze variation; however, the resulting “high-dimensional data” from factorization becomes counterintuitive in searching for similarities among overall variance (Da 620-621). As a result, many CLS conclusions lack statistically significant factors that are similar to one another in comparison to general dissimilarity. Therefore, a possible reduction strategy of “defactorization” can aid in determining literary homology in variance. Not only does this method decrease the polyvalence of the given literature, as it serves to reduce overall complexity and ambiguity, but it also allows the computers handling the literary corpora to learn from low-dimensional data, building foundational programming that contributes to interdisciplinary practices (Kuhn 6).
When utilized together, the extension of MDS from two to three dimensions and defactorization assist in improving the generally low accuracy and precision of CLS results. These strategic adaptations translate into an increased ability to identify statistically significant information among insignificant results. Furthermore, greater reproducibility can help in identifying proper claims obtained from literary analyses by making it possible to replicate evidence that supports perspectives or criticisms on literature, history, and/or culture (Da 604).
On another note, quality control is necessary to serve the interests of both computational social sciences and DH, given the shift in focus toward early refinement. Kuhn describes that:
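The difference between a two-dimensional and a three-dimensional plot can be measured with a small sketch of classical (Torgerson) MDS. This variant is assumed here for simplicity, since the source does not name the MDS procedure behind the figures. The five sample points, the power-iteration eigen-solver, and the `stress` measure are illustrative choices, not part of either article.

```python
import random
from math import dist, sqrt

def top_eigenpair(matrix, rng, iterations=500):
    """Largest eigenvalue/eigenvector of a symmetric PSD matrix (power iteration)."""
    n = len(matrix)
    vector = [rng.random() for _ in range(n)]
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in product))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        vector = [x / norm for x in product]
        eigenvalue = norm
    return eigenvalue, vector

def classical_mds(distances, dimensions, seed=0):
    """Embed items in `dimensions` coordinates from a Euclidean distance matrix."""
    n = len(distances)
    squared = [[d * d for d in row] for row in distances]
    row_mean = [sum(row) / n for row in squared]
    grand_mean = sum(row_mean) / n
    # Double centring turns squared distances into an inner-product matrix.
    gram = [[-0.5 * (squared[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[] for _ in range(n)]
    for _ in range(dimensions):
        value, vector = top_eigenpair(gram, rng)
        for i in range(n):
            coords[i].append(sqrt(value) * vector[i])
        # Deflate: remove the component just found before the next axis.
        gram = [[gram[i][j] - value * vector[i] * vector[j] for j in range(n)]
                for i in range(n)]
    return coords

def stress(distances, coords):
    """Relative mismatch between original and embedded distances (0 is perfect)."""
    pairs = [(i, j) for i in range(len(distances)) for j in range(i + 1, len(distances))]
    error = sum((distances[i][j] - dist(coords[i], coords[j])) ** 2 for i, j in pairs)
    return sqrt(error / sum(distances[i][j] ** 2 for i, j in pairs))

# Hypothetical items whose true relationships need three dimensions.
points = [(0, 0, 0), (4, 0, 0), (0, 2, 0), (0, 0, 1), (4, 2, 1)]
distances = [[dist(p, q) for q in points] for p in points]
stress_2d = stress(distances, classical_mds(distances, 2))
stress_3d = stress(distances, classical_mds(distances, 3))
print(stress_2d, stress_3d)
```

In this constructed case the third axis removes the remaining distortion. With real word-frequency data the gain from a third dimension may be much smaller, so the stress values would need to be compared before choosing a plot.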
Such an informed model application can be achieved with a conceptually simple procedure, which does however take some extra effort: whenever one plans to apply some analysis system to a new type of text data, a prior step of reference data-based quality assessment has to be performed . . . it is definitely methodologically superior to adhere to prior manual annotation of test data for evaluating quality (10).
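The "reference data-based quality assessment" that Kuhn describes can be sketched as a comparison between a system's output and manually annotated test data. The labels below are hypothetical, and precision, recall, and F1 are assumed here as standard evaluation measures; the quoted passage does not specify which measures to use.

```python
def evaluate(predicted, gold, positive):
    """Precision, recall and F1 of `predicted` against manual `gold` labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same items")
    pairs = list(zip(predicted, gold))
    true_pos = sum(p == positive and g == positive for p, g in pairs)
    false_pos = sum(p == positive and g != positive for p, g in pairs)
    false_neg = sum(p != positive and g == positive for p, g in pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical manual annotations and system output for eight passages.
gold = ["shifting", "stable", "shifting", "stable",
        "shifting", "stable", "stable", "shifting"]
system = ["shifting", "shifting", "shifting", "stable",
          "stable", "stable", "stable", "shifting"]
precision, recall, f1 = evaluate(system, gold, positive="shifting")
print(precision, recall, f1)
```

Running such a check on a small annotated sample before applying a tool to a new type of text gives an early estimate of how far its output can be trusted.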
It is critical, especially for the input-output fashion of CLS, that changes in the quality of both the methodology and literary works are made early. As with any research study or clinical trial, certain limitations or boundaries are set to ensure the production of quality results while not intentionally skewing towards desired outcomes. Therefore, the primary reliance on word frequency studies can be replaced with different approaches (such as predictive context algorithms like word2vec). As the present catches up to the future, a growing array of increasingly reliable computational tools becomes available in the inventory of analytical models; however, digital humanists cannot rely solely on this inventory without first adjusting their research frameworks (Kuhn 19-20).
The application of unaltered analytical tools that produce low-quality research only perpetuates the underexploitation problem, stifling the potential for CLS to develop novel methodologies (Kuhn 9). Therefore, the utilization of these quality control measures provides solutions to address major issues in DH, similar to how the aforementioned rapid probing model addressed the scheduling dilemma and subjectivity problem. Increased experimentation, furthermore, identifies the agathokakological nature of these novel methodologies, allowing future digital humanists to build upon previous victories and learn from previous mistakes (Kuhn 37).
Quality control measures also benefit the significance of results, since improvement at the start of CLS research should reduce the chances of empirical discrepancy at later stages. As a result, the mismatch Da identifies between the conclusions that are claimed and those the evidence supports can begin to reverse toward statistically sound results (601).
This paper reviewed previous criticisms of the methodologies and significance of research in DH. From Da’s focus on word frequency redundancies in CLS to Kuhn’s conditional provisions towards unity between humanistic disciplines in CL, the computational flaws of today are given solutions for future improvement by digital humanists. Altered practices in data visualization and control over quality are just two examples of how to address and amend the current problems underlying CLS. Proponents of DH assume the ingenuity, comprehensiveness, and unbiasedness provided by computational tools can sufficiently handle the literary breadth of analytical demands (Da 638). However, strides still need to be made in order to overcome the pitfall of insignificance in CLS, and the applications of new developments will help digital humanities take the first step.

Works Cited
Da, Nan Z. “The Computational Case against Computational Literary Studies.” Critical Inquiry, vol. 45, no. 3, The University of Chicago Press, 2019, pp. 601-639. https://doi.org/10.1086/702594.
Kuhn, Jonas. “Computational text analysis within the Humanities: How to combine working practices from contributing fields?” Language Resources and Evaluation, Springer Netherlands, 2019, pp. 1-38. https://doi.org/10.1007/s10579-019-09459-3.