In the Margins is a Scalar book that serves as a companion for Lexomic research and the Lexos literary text analysis software. The online version of the Lexos software is available at http://lexos.wheatoncollege.edu. Our passion for tool-building has intersected with our interest in two questions:
How can we explore the growing impact that quantitative and algorithmic approaches are having on the Humanities?
How can we make the discussion part of the tool and the tool part of the discussion?
Lexomics is our name for certain methods of stylistic analysis (sometimes called stylometry). This type of analysis harnesses the power of modern computing and statistical techniques to investigate Humanities-based questions such as authorship attribution or textual lineage. Lexomic methods complement traditional Humanities methods of literary interpretation rather than replacing them. We note that our small but spirited team exists within a much larger community of scholars who continue to influence our work greatly (cf. Eder, Craig, Jockers, Hoover, Liu, Sinclair and Rockwell, et al.).
The role of Lexos is to help readers of literature identify and explore patterns in texts, thereby opening up new questions and new avenues of research. To that end, Lexos provides students and scholars of literature with an integrated workflow of pre-processing, analytical, and visualization tools. Lexos is freely available for use online (perhaps the best choice for first-time and occasional users), and it may also be downloaded and installed locally for better performance (installation instructions are available here).
The aim of Lexos is to create an entry-level environment for Lexomic scholarship, one simple enough to be used easily by the casual student but powerful enough for the advanced professor to use in creating new knowledge and insight. Lexos was created for use with small to medium-sized collections of texts (rather than large text corpora or "big data"), and for use with languages that have non-standard or non-Latin-based spelling systems. Most of the early Lexomic research was done on medieval English texts. Statistical analysis of texts of these types creates certain challenges, both theoretical and practical, and Lexos was developed as a way to explore them.
These issues form part of a wider set of questions we can ask about how computational tools can be used in the Humanities: where are the opportunities, what are the effective practices, and what are the limitations? These questions are not new with us, of course, and the wider field is too large to cite here, but In the Margins is our effort to bring methodological choices, and the discussion about them, to the fore. It exists not only as a "how to" guide for using Lexos but also as a means to elicit community commentary on effective practices for the many decisions made during the workflow (e.g., how to handle punctuation, count words, and select metrics). In the Margins can be explored directly from its Scalar website, but we also make use of Scalar's Application Programming Interface (API) to embed In the Margins content directly in Lexos. We think it is important that Lexos not become a "black box" into which users feed their texts and from which they obtain results uncritically. By making the discussion part of the tool and the tool part of the discussion, we aim to make Lexos a more rigorous and powerful tool, one in which we can explore more generally the growing impact that quantitative and algorithmic approaches are having on the Humanities.
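To make the point about workflow decisions concrete, here is a minimal Python sketch of how a single choice (whether to strip punctuation before counting words) changes what counts as a "word". This is an illustration only, not how Lexos itself is implemented: the function name `count_words`, its parameters, and the sample line are all assumptions introduced for this example.

```python
import re
from collections import Counter


def count_words(text, strip_punctuation=True, lowercase=True):
    """Count whitespace-separated tokens under one set of pre-processing choices.

    Hypothetical helper for illustration; not part of Lexos.
    """
    if lowercase:
        text = text.lower()
    if strip_punctuation:
        # Delete every character that is neither a word character nor whitespace.
        # In Python 3, \w is Unicode-aware, so letters such as æ and þ are kept.
        text = re.sub(r"[^\w\s]", "", text)
    return Counter(text.split())


# Sample line in Old English, chosen because early Lexomic work used medieval texts.
sample = "Hwæt! We Gardena in geardagum, þeodcyninga, þrym gefrunon."

with_punct = count_words(sample, strip_punctuation=False)
without_punct = count_words(sample, strip_punctuation=True)

print(sorted(with_punct))
print(sorted(without_punct))
```

With punctuation left in place, "hwæt!" and "geardagum," are treated as distinct from "hwæt" and "geardagum", which would distort any frequency-based comparison between texts. Neither setting is correct in every case; the sketch simply shows why such choices deserve explicit discussion.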