In the Margins

The Content Analysis Tool

The Lexos Content Analysis Tool counts how often the words in one or more dictionaries (lists of words) occur in your uploaded documents. The words in each dictionary are tallied separately for each document. You can then build your own formula to compute a single score for each document from those tallies. For example, if you have previously uploaded a document containing a concatenated list of tweets, you can use this tool to (a) upload two dictionaries, happyWords.txt and sadWords.txt, (b) build a formula such as [happy] - [sad] (for each document, the total number of happy words found minus the total number of sad words found), and (c) click the Analyze button.

The Content Analysis Tool has various applications, such as opinion mining, determining organizational hardiness in stock broker reports, and sentiment analysis.

Usage

(1) Upload the documents you wish to explore using Lexos' Upload page, as usual.
(2) Upload your dictionary file(s).
(3) Build a formula that relates the scores obtained from each dictionary, for example [happy] - [sad].
(4) Click on the Analyze button.
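
Step 3 can be pictured as replacing each bracketed label with that dictionary's tally and then evaluating the arithmetic. The following is a minimal Python sketch of that idea, not the Lexos implementation. It assumes labels are written in square brackets and that only +, -, *, / and parentheses are allowed; evaluate_formula is a hypothetical helper name.

```python
import ast
import operator
import re

# Operators permitted in a formula (an assumption made for this sketch).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node):
    """Evaluate a parsed arithmetic expression, rejecting anything else."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


def evaluate_formula(formula, counts):
    """Substitute each [label] with its tally, then evaluate the result."""
    expression = re.sub(
        r"\[([^\]]+)\]",
        lambda match: str(counts[match.group(1)]),
        formula,
    )
    return _eval(ast.parse(expression, mode="eval").body)


# Hypothetical tallies for one document.
print(evaluate_formula("[happy] - [sad]", {"happy": 6, "sad": 3}))
```

Parsing the expression rather than passing it to eval keeps the sketch limited to plain arithmetic.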

Sentiment Analysis
After uploading and preparing one or more text files, you must upload at least one user-defined dictionary containing key words that you associate with a particular feeling. A “dictionary” can simply be a plain text file, uploaded through the dictionaries menu on the Content Analysis page. You must then enter a formula using the provided calculator to determine the “sentiment” of the text. For instance, you might upload a literary text and two user-defined dictionaries to determine the tone of that text.

Example:
Text File: happy very happy happy very sad sad happy happy happy sad
Dictionary 1: happy, very happy
Dictionary 2: sad, very sad

Enter the formula [happy] - [sad] into the formula box. After clicking Analyze, you should see that the text appears to have a happier tone, since it uses more phrases from Dictionary 1 (the “happy” dictionary) than from Dictionary 2 (the “sad” dictionary).
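
The example above can be worked through by hand or with a short script. The Python sketch below is illustrative only. It assumes that text is split on whitespace, that matching is case-insensitive, and that longer phrases are matched first so that “very happy” is not also counted as “happy”; whether Lexos counts overlapping phrases this way is an assumption here. count_phrases is a hypothetical helper.

```python
def count_phrases(text, phrases):
    """Count each phrase in text, matching longer phrases first.

    Tokens claimed by a longer phrase are not counted again for a
    shorter one (an assumption of this sketch).
    """
    tokens = text.lower().split()
    used = [False] * len(tokens)
    counts = {}
    for phrase in sorted(phrases, key=lambda p: len(p.split()), reverse=True):
        words = phrase.lower().split()
        if not words:
            continue  # skip empty dictionary entries
        n = len(words)
        counts[phrase] = 0
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words and not any(used[i:i + n]):
                used[i:i + n] = [True] * n
                counts[phrase] += 1
    return counts


text = "happy very happy happy very sad sad happy happy happy sad"
happy = count_phrases(text, ["happy", "very happy"])
sad = count_phrases(text, ["sad", "very sad"])

result = sum(happy.values()) - sum(sad.values())
print(happy)   # {'very happy': 1, 'happy': 5}
print(sad)     # {'very sad': 1, 'sad': 2}
print(result)  # 3
```

Under these assumptions the “happy” dictionary matches 6 times and the “sad” dictionary 3 times, so [happy] - [sad] is positive.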

Interpreting the Results

Three groups of tables will be displayed as results:

Assuming you are using positive and negative dictionaries, the first table displays, for each document, the number of positive and negative terms found, the formula result, the total word count, and the score, which is the formula result divided by the total word count. The table also includes the average of each of these columns.
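
The columns of the first table can be sketched in a few lines of Python. This is a rough illustration rather than the Lexos code. It assumes the formula [positive] - [negative], and the document names and tallies are hypothetical values invented for the example; summarize is a hypothetical helper.

```python
from statistics import mean


def summarize(documents):
    """Add formula result and score to each row, then average the columns."""
    rows = []
    for doc in documents:
        formula = doc["positive"] - doc["negative"]
        # Score is the formula result divided by the total word count.
        score = formula / doc["word_count"] if doc["word_count"] else 0.0
        rows.append({**doc, "formula": formula, "score": score})
    columns = ("positive", "negative", "formula", "word_count", "score")
    averages = {col: mean(row[col] for row in rows) for col in columns}
    return rows, averages


# Hypothetical tallies for two documents.
rows, averages = summarize([
    {"name": "doc1.txt", "positive": 6, "negative": 3, "word_count": 11},
    {"name": "doc2.txt", "positive": 1, "negative": 3, "word_count": 8},
])
for row in rows:
    print(row["name"], row["formula"], round(row["score"], 4))
print("averages:", averages)
```

Dividing by the word count lets documents of different lengths be compared on the same scale.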

The second table lists each phrase, its count across the entire corpus of texts, and the user-defined dictionary it belongs to. Each user-defined dictionary is color-coded for convenience.

The third table lists each phrase, its count within a specific text, and the user-defined dictionary it belongs to.
