In the Margins

The Content Analysis Tool

The Lexos Content Analysis Tool lets you count how often the words in one or more dictionaries (lists of words) occur in your uploaded documents. The words in each dictionary are tallied for each document. You can then build your own equation to compute a single score for each document from those tallies. For example, if you have previously uploaded a document containing a concatenated list of tweets, you can use this tool to (a) upload two dictionaries, happyWords.txt and sadWords.txt, (b) build an equation such as [happy] - [sad] (for each document, the total number of happy words found minus the total number of sad words found), and (c) click the Analyze button. The resulting score gives a "happiness" measure for your collection of tweets.

Applications of the Content Analysis Tool include opinion mining, sentiment analysis, and determining organizational hardiness in stockbroker reports.

Usage

(1) Upload the documents you wish to explore using Lexos' Upload page, as usual.
(2) On this Content Analysis page, upload your dictionary file(s).
(3) Build a formula that relates the counts obtained from each dictionary, for example [happy] - [sad].
(4) Click on the Analyze button.
(5) Review the right-most column in the initial result table to see your "score".
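
To make step (3) concrete, the following Python sketch shows one way a formula such as [happy] - [sad] could be evaluated once the per-dictionary counts for a document are known. It is an illustration only, not Lexos code: the function name evaluate_formula, the supported operators (+, -, *, /), and the example counts are all assumptions.

```python
import ast
import operator
import re

# Hypothetical helper (not part of Lexos): evaluate a formula such as
# "[happy] - [sad]" given the per-dictionary counts for one document.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate_formula(formula, counts):
    # Replace each [label] with that dictionary's count for this document.
    expr = re.sub(r"\[(\w+)\]", lambda m: str(counts[m.group(1)]), formula)

    def walk(node):
        # Only numbers and basic arithmetic are accepted; anything else is rejected.
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported formula element")

    return walk(ast.parse(expr, mode="eval"))

counts = {"happy": 12, "sad": 5}  # made-up tallies for one document
result = evaluate_formula("[happy] - [sad]", counts)
print(result)
```

The sketch walks the parsed expression rather than calling eval, so that only arithmetic on the substituted counts is ever executed.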

Example: Sentiment Analysis
After uploading and preparing one or more text files (e.g., a novel), upload at least one user-defined dictionary containing key words that you associate with a particular feeling. A "dictionary" is simply another text file, uploaded under the dictionaries menu on the Content Analysis page. Then enter a formula using the provided calculator to determine the "sentiment" of the text. For instance, you might upload a novel and two user-defined dictionaries in order to determine the tone of the literary text.

Trivial example:
Text File: happy very happy happy very sad sad happy happy happy sad
Dictionary 1: happy, very happy
Dictionary 2: sad, very sad

Enter the formula [happy] - [sad] into the formula box. After clicking Analyze, you should see that the text appears to have a happier tone, since it uses more phrases from Dictionary 1 (the "happy" dictionary) than from Dictionary 2 (the "sad" dictionary).
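
The trivial example can be reproduced with a short Python sketch. It assumes that longer phrases are matched first and that a matched phrase is not counted again as its component words; this matching rule and the helper name count_terms are assumptions made for illustration, not a description of how Lexos is implemented.

```python
import re

def count_terms(text, terms):
    """Count each term, matching longer phrases first (assumed rule)."""
    counts = {}
    remaining = text
    for term in sorted(terms, key=lambda t: len(t.split()), reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, remaining))
        # Blank out matched phrases so their words are not counted again.
        remaining = re.sub(pattern, "#", remaining)
    return counts

text = "happy very happy happy very sad sad happy happy happy sad"
happy_counts = count_terms(text, ["happy", "very happy"])  # Dictionary 1
sad_counts = count_terms(text, ["sad", "very sad"])        # Dictionary 2

happy_total = sum(happy_counts.values())  # 5 x "happy" + 1 x "very happy" = 6
sad_total = sum(sad_counts.values())      # 2 x "sad" + 1 x "very sad" = 3
formula_value = happy_total - sad_total   # [happy] - [sad] = 3
print(happy_counts, sad_counts, formula_value)
```

If overlapping matches were counted instead (so that "very happy" also counts as "happy"), the totals would be 7 and 4, and the formula value would still be 3.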
 

Interpreting the Results

Three groups of tables will be displayed as results:

Assuming you are using positive and negative dictionaries, the first table displays, for each document, the number of positive and negative terms found, the computed formula value, the total word count, and the score (the formula value divided by the total word count). The table also includes the average of each of these columns.
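
As a rough illustration of how the columns of the first table relate to one another, the Python sketch below computes the formula value, the score, and the column averages for two documents. The document names and tallies are hypothetical, and the formula is assumed to be [positive] - [negative].

```python
from statistics import mean

# Hypothetical per-document tallies:
# (name, positive count, negative count, total word count)
documents = [
    ("doc_a", 6, 3, 11),
    ("doc_b", 2, 4, 20),
]

rows = []
for name, positive, negative, total_words in documents:
    formula = positive - negative   # assumed formula: [positive] - [negative]
    score = formula / total_words   # formula value divided by total word count
    rows.append((name, positive, negative, formula, total_words, score))

# Average of every numeric column (the document name is skipped).
averages = tuple(mean(column) for column in list(zip(*rows))[1:])

for row in rows:
    print(row)
print("average", averages)
```

Dividing by the total word count makes scores comparable across documents of different lengths: doc_b contains more negative than positive terms, so its score is negative even though it is the longer document.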

The second table displays a ranking of the most frequently occurring dictionary terms for the entire corpus (the entire collection of active documents). For each dictionary term, the columns show the dictionary holding the term, the dictionary term, and the raw count of the number of times that term appears in the entire corpus. Each user-defined dictionary is color-coded for convenience.

The third table displays a ranking of the most frequently occurring dictionary terms for each document in the set of active documents. For each dictionary term, the columns show the dictionary holding the term, the dictionary term, and the raw count of the number of times that term appears in the document.
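
The rankings in the second and third tables can be pictured as sorted term counts. The Python sketch below builds both from hypothetical per-document counts keyed by (dictionary, term); the data and variable names are invented for illustration and do not reflect Lexos internals.

```python
from collections import Counter

# Hypothetical per-document counts, keyed by (dictionary, term).
per_document = {
    "doc_a": Counter({
        ("happy", "happy"): 5,
        ("happy", "very happy"): 1,
        ("sad", "sad"): 2,
        ("sad", "very sad"): 1,
    }),
    "doc_b": Counter({
        ("happy", "happy"): 2,
        ("sad", "sad"): 4,
    }),
}

# Second table: term counts summed over the whole corpus, most frequent first.
corpus = Counter()
for counts in per_document.values():
    corpus.update(counts)
corpus_ranking = corpus.most_common()

# Third table: the same ranking, computed separately for each document.
document_rankings = {
    name: counts.most_common() for name, counts in per_document.items()
}

for (dictionary, term), count in corpus_ranking:
    print(dictionary, term, count)
```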
