In the Margins

The Rolling Windows Tool

Rolling window analysis is a method for tracing the frequency of terms within a designated window of tokens over the course of a document. It can be used to identify small- and large-scale patterns of individual features or to compare these patterns for multiple features. Rolling window analysis tabulates term frequency as a continuously moving metric rather than in discrete segments. Beginning with the selection of a window size, say 100 tokens, rolling window analysis traces the frequency of a term's occurrence first within tokens 1-100, then 2-101, then 3-102, and so on until the end of the document is reached. The result can be plotted as a line graph so that it is possible to observe gradual changes in a term's frequency as the text progresses. Plotting different terms on the same graph allows us to compare their frequencies.
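The sliding procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the Lexos implementation: the function name `rolling_average`, the whitespace tokenization, and the sample sentence are all assumptions made for the example, and the window is 4 tokens rather than 100 so that the output is easy to check by hand.

```python
def rolling_average(tokens, term, window):
    """Return the relative frequency of `term` in each successive window.

    Hypothetical helper for illustration; Lexos may compute this differently.
    """
    if window <= 0 or window > len(tokens):
        raise ValueError("window must be between 1 and the number of tokens")

    # Count occurrences in the first window (tokens 1 to `window`).
    count = sum(1 for token in tokens[:window] if token == term)
    averages = [count / window]

    # Slide the window forward one token at a time: drop the token that
    # leaves on the left and add the token that enters on the right.
    for start in range(1, len(tokens) - window + 1):
        if tokens[start - 1] == term:
            count -= 1
        if tokens[start + window - 1] == term:
            count += 1
        averages.append(count / window)
    return averages


# Assumed sample text, split on whitespace into 8 tokens.
tokens = "the cat and the dog and the bird".split()
averages = rolling_average(tokens, "the", 4)
print(averages)  # [0.5, 0.25, 0.25, 0.5, 0.25]
```

An 8-token text with a 4-token window yields 5 overlapping windows, one value per window. Plotting these values against the window's starting position gives the kind of line graph described above.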

The Lexos Rolling Windows Tool has numerous options which are best understood as part of a workflow. In the Lexos interface, the steps of this workflow are numbered 1-6. Each of these steps is discussed below.

  1. Select Active Document: Lexos performs rolling windows analysis on a single active document at a time. Use the radio buttons to select which document you would like to examine.
  2. Select Calculation Type: Lexos will plot either the average term frequency in each window (Rolling Average) or the ratio of term frequencies if you are examining multiple terms (Rolling Ratio). If you choose Rolling Ratio, you must enter two terms (top / bottom); the plot will show the ratio of these terms in each window.
  3. Enter Search Terms: These are the terms you wish to plot from your document. Enter up to 6 terms, separated by commas. When Lexos searches your document for these terms, it uses the document text, rather than the Document-Term Matrix (DTM), as its starting point. This means that you can choose to search for strings of text, individual words or terms (separated by spaces), or regular expressions (regex). A basic tutorial for using regex can be found at https://regexone.com/.  Note: When searching for multiple strings, separate them with a comma only, with no whitespace. For example, to search for "il" and "on", enter:  il,on  (that is, do not put a space after the comma). Any whitespace you enter will be included in the search, which is probably not what you intended.
  4. Define Window: This is where you set the size of the window you want to use. It can consist of any number of characters, tokens (separated by spaces), or lines (separated by line breaks in the text). If your document contains milestones, click the checkbox, enter your milestone, and the location of each milestone will be indicated on the rolling window graph by a vertical line.
  5. Choose Display Options: The Show Individual Points option is turned off by default, which produces an uninterrupted line graph. Turning this option on will show the points where each term occurs in the document. Mousing over a point will display the location of the term in the token sequence (starting from 0), along with the average or ratio at that point in the window. The Black and White only option produces a non-color version of the graph that is suitable for downloading and publishing in journals.
  6. Download the graph: Click the Get Graph button to generate the Rolling Windows graph. When you move the mouse over the graph, a menu appears above it. Selecting the camera icon (the left-most icon) will download the graph image as a .png file.
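The whitespace warning in step 3 is easier to see with a small example. The sketch below is not Lexos code: it assumes, for illustration only, that a comma-separated entry is split on commas without trimming, and the function name `count_strings` and the sample text are hypothetical.

```python
def count_strings(text, raw_terms):
    """Count each comma-separated string in `text`.

    Hypothetical helper: terms are split on commas and NOT stripped,
    so any whitespace typed after a comma becomes part of the term.
    """
    terms = raw_terms.split(",")
    return {term: text.count(term) for term in terms}


text = "million onions"  # assumed sample text

without_space = count_strings(text, "il,on")
with_space = count_strings(text, "il, on")

print(without_space)  # {'il': 1, 'on': 3}
print(with_space)     # {'il': 1, ' on': 1}
```

With the extra space, the second term becomes " on" (space included), which matches only where "on" directly follows a space. The counts for "on" inside "million" and in the middle of "onions" are lost.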
