In the Margins

The Lexos Workflow

Lexos is an integrated workflow of tools that facilitates the computational analysis of digitized texts. By making each decision explicit in its interface, Lexos highlights the small but important steps involved in preprocessing and preparing texts for analysis. Feel free to walk through the workflow paths below in order or to jump around. The Lexos tools themselves also link back to the relevant parts of this discussion of effective practices for each step in the workflow.
 
For more specific information, click the highlighted links. 
 
In the Margins is our attempt to position the process of computational
literary text analysis side by side with its product, whether it be the tool
used for or the results obtained from such analysis. This is particularly
important for entry-level users and those whose training has not explored the
issues raised by computational methods of studying literature. Lexos is designed for use by newcomers to the field while empowering them to do sophisticated work in relatively little time. 
 
The Lexos Workflow is a series of steps for working with files and corpora of digitized texts.
 
Speed Tour:
Import files with the Upload tool, then use the Select tool to choose the files that will be active for subsequent operations. To prepare the files, Scrub the selected files to, for example, remove punctuation while keeping internal apostrophes. Next, Cut a text to divide it into segments. Finally, Visualize results as word and topic clouds, download matrix files of word counts for use with other tools, or Analyze subsets of your documents and document segments by varying the metrics, the number of top-most words, the n-gram type, and the culling options. For example, you can produce a dendrogram from a hierarchical cluster analysis using the Euclidean Distance metric with Average Linkage on the top-100, 1-gram word counts.
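
To make the last example concrete, the sketch below shows one way the input to such a cluster analysis could be computed by hand: raw 1-gram counts restricted to the most frequent words across all segments, followed by pairwise Euclidean distances. It is an illustration only, not Lexos's own code. The function names (top_word_vectors, euclidean, distance_table) and the sample segments are hypothetical, and the use of raw counts rather than proportions is an assumption.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def top_word_vectors(docs, top_n=100):
    """Build one count vector per document over the top_n most frequent words.

    docs maps a (hypothetical) segment name to its already scrubbed text.
    """
    counts = {name: Counter(text.split()) for name, text in docs.items()}
    totals = Counter()
    for c in counts.values():
        totals.update(c)
    # Vocabulary: the top_n words by total frequency across all documents.
    vocab = [word for word, _ in totals.most_common(top_n)]
    # A Counter returns 0 for words a document does not contain.
    vectors = {name: [c[word] for word in vocab] for name, c in counts.items()}
    return vocab, vectors


def euclidean(u, v):
    """Euclidean distance between two equal-length count vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_table(vectors):
    """Pairwise distances keyed by (name_a, name_b), names in sorted order."""
    return {
        (a, b): euclidean(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }


docs = {
    "seg1": "the cat sat on the mat",
    "seg2": "the cat sat on the cat",
    "seg3": "a dog ran in a park",
}
vocab, vectors = top_word_vectors(docs, top_n=100)
distances = distance_table(vectors)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A hierarchical clustering routine with average linkage would take distances like these as its input; segments with similar vocabulary (seg1 and seg2 here) end up closer together than segments with none in common.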

The Lexos suite is divided into several sections: Manage, Prepare, Visualize, and Analyze. Each section contains tools used in Lexomic methods of linguistic research.
 
Manage
By default, the Lexos website opens in the Upload tool under the Manage tab. Here, you can choose the text you wish to analyze by dropping it inside the dotted lines or by navigating to the file's location with the browse button. Multiple files may be uploaded simultaneously, although upload times may increase.

Once files have been uploaded, head to the Select tool from the Manage drop-down menu. Here, you may choose which of your uploaded texts you wish to analyze.

After selecting your texts, move to the Scrub tool under the Prepare tab. This tool allows you to scrub your text by converting capital letters to lowercase and removing punctuation and numbers, boiling a text down to its essential elements: words. The Scrub tool also allows you to Lemmatize your text, as well as input Stopwords, Consolidations, and Special Characters.
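
As a rough illustration of what these scrubbing choices do to a text, here is a minimal sketch in Python. It is not the Lexos implementation: the function name scrub and its parameters are hypothetical, and it assumes straight (ASCII) apostrophes and treats an apostrophe as "internal" only when it has a word character on both sides.

```python
import re


def scrub(text, lowercase=True, remove_digits=True, remove_punctuation=True):
    """Hypothetical scrubbing helper; each option can be switched off."""
    if lowercase:
        text = text.lower()
    if remove_digits:
        text = re.sub(r"\d+", "", text)
    if remove_punctuation:
        # Remove apostrophes at the edge of a word, plus every other
        # character that is neither a word character nor whitespace.
        # Apostrophes with a word character on both sides are kept.
        # Note: \w includes the underscore, so underscores survive.
        text = re.sub(r"(?<!\w)'|'(?!\w)|[^\w\s']", "", text)
    # Collapse the runs of whitespace left behind by the removals.
    return " ".join(text.split())


sample = "Don't stop! 'Tis 1 o'clock, the dogs' bowls."
print(scrub(sample))
```

With all options on, the sample keeps the apostrophes in "don't" and "o'clock" but loses the leading apostrophe of "'Tis" and the trailing one of "dogs'". Whether a trailing possessive apostrophe should be kept is exactly the kind of decision the Scrub tool asks you to make explicitly.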

Cutting, also located under the Prepare tab, follows Scrubbing. Cut divides your text into different segments, allowing you to analyze variations between different sections.
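
The idea of cutting can be sketched in a few lines of Python. This example divides a text into segments of a fixed number of words. The function name cut_by_words is hypothetical, and leaving the final, shorter segment as its own piece is an assumption made for illustration rather than a description of how Lexos treats the last segment.

```python
def cut_by_words(text, words_per_segment):
    """Split text into consecutive segments of words_per_segment words.

    The last segment may be shorter than the others.
    """
    if words_per_segment < 1:
        raise ValueError("words_per_segment must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + words_per_segment])
        for i in range(0, len(words), words_per_segment)
    ]


segments = cut_by_words("one two three four five", 2)
print(segments)
```

Each resulting segment can then be counted and compared on its own, which is what makes variation between sections of a single text visible.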

The Tokenize/Count tool, under the Prepare tab and directly after Cut, provides broad information about your text in the form of frequency counts for words and characters.
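
For readers who want to see what such frequency counts look like, the following sketch counts word or character n-grams with Python's standard library. The function name ngram_counts and its parameters are hypothetical, and splitting on whitespace is a simplifying assumption; it is not the tokenizer Lexos uses.

```python
from collections import Counter


def ngram_counts(text, n=1, by_character=False):
    """Count n-grams of words (default) or of characters in text."""
    if n < 1:
        raise ValueError("n must be at least 1")
    units = list(text) if by_character else text.split()
    joiner = "" if by_character else " "
    # Slide a window of n units across the text and count each window.
    return Counter(
        joiner.join(units[i:i + n]) for i in range(len(units) - n + 1)
    )


text = "the cat and the cat"
print(ngram_counts(text).most_common(3))        # single words (1-grams)
print(ngram_counts(text, n=2).most_common(3))   # word pairs (2-grams)
print(ngram_counts("banana", n=2, by_character=True).most_common(2))
```

A table of such counts, one row per document or segment, is the kind of matrix that later analysis steps work from.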

Once you have Cut and Scrubbed your text, it is time to move on to analysis.

This page has paths:

  1. Lexos Mark D. LeBlanc