In the Margins, by Scott Kleinman, Mark D. LeBlanc, and Michael Drout
The Content Analysis Tool
Mark D. LeBlanc (created 2018-06-05, last updated 2019-05-28)
The Lexos Content Analysis Tool lets you count how often the words in multiple dictionaries (lists of words) occur in your uploaded documents. The words in each dictionary are tallied for each document. You can then build your own equation to compute a single score for each document based on the words found in each dictionary. For example, if you have previously uploaded a document containing a concatenated list of tweets, you can use this tool to upload two dictionaries (e.g. happyWords.txt and sadWords.txt), then build an equation such as [happy] - [sad]. Given this equation, the tool computes, for each document, the total number of happy words found minus the total number of sad words found. Once you click the Analyze button, the resulting score gives a "happiness" measure for your collection of tweets.
Applications of the Content Analysis tool include opinion mining, sentiment analysis, and determining organizational hardiness in stockbroker reports.
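The [happy] - [sad] equation described above can be illustrated with a minimal Python sketch. This is not the Lexos implementation: the tokenization rule, the dictionary contents, the sample text, and the function name count_dictionary_hits are all assumptions made for illustration, and only single-word dictionary entries are handled.

```python
import re
from collections import Counter

def count_dictionary_hits(text, dictionary):
    """Total occurrences in `text` of the single words listed in `dictionary`."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return sum(counts[word] for word in dictionary)

# Hypothetical dictionary contents and document text.
happy_words = {"happy", "joy", "glad"}
sad_words = {"sad", "gloomy"}
tweets = "So happy today! Joy all around. A bit sad about the rain, but happy overall."

happy = count_dictionary_hits(tweets, happy_words)  # "happy" twice, "joy" once
sad = count_dictionary_hits(tweets, sad_words)      # "sad" once
formula_value = happy - sad                         # the equation [happy] - [sad]
print(happy, sad, formula_value)
```

A positive value means the document contains more happy-dictionary words than sad-dictionary words.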
Usage
Upload the documents you wish to explore using Lexos' Upload page, as usual, and scrub the documents if necessary.
On the Content Analysis page, upload your dictionary file(s).
Build a formula to relate the scores obtained via each dictionary (e.g. [happy] - [sad]).
Click on the Analyze button.
Review the right-most column in the initial result table to see your "score".
Interpreting the Results
Three groups of tables will be displayed as results:
Assuming happy and sad dictionaries, the first table displays, for each document, the number of happy and sad terms present, the final formula value computed, the total word count, and the score, which is the formula value divided by the total word count. The table also includes the average for each of these categories.
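As a rough sketch of how the columns of the first table relate to one another, the following Python snippet derives the formula value, the score, and the averages from per-document counts. The document names and counts are hypothetical, and computing each average as a plain mean over documents is an assumption rather than confirmed Lexos behavior.

```python
from statistics import mean

# Hypothetical per-document tallies (dictionary hits and total word counts).
rows = [
    {"document": "doc1", "happy": 6, "sad": 3, "word_count": 11},
    {"document": "doc2", "happy": 2, "sad": 4, "word_count": 20},
]

for row in rows:
    # Formula [happy] - [sad], then normalize by document length.
    row["formula"] = row["happy"] - row["sad"]
    row["score"] = row["formula"] / row["word_count"]

# Assumed: each average is the plain mean of that column over all documents.
columns = ("happy", "sad", "formula", "word_count", "score")
averages = {col: mean(row[col] for row in rows) for col in columns}

for row in rows:
    print(row)
print(averages)
```

Dividing by the total word count makes scores comparable between documents of different lengths, which the raw formula value is not.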
The second table displays a ranking of the most frequently occurring dictionary terms for the entire corpus (the entire collection of active documents). For each dictionary term, the columns show the dictionary holding the term, the dictionary term, and the raw count of the number of times that term appears in the entire corpus. Each user-defined dictionary is color-coded for convenience.
The third table displays a ranking of the most frequently occurring dictionary terms for each document in the set of active documents. For each dictionary term, the columns show the dictionary holding the term, the dictionary term, and the raw count of the number of times that term appears in the document.
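The rankings in the second and third tables can be approximated as follows. This is only a sketch under stated assumptions: the documents and dictionaries are invented, only single-word terms are counted, and ties are broken alphabetically, which may differ from what Lexos does.

```python
import re
from collections import Counter

# Hypothetical dictionaries and active documents.
dictionaries = {"happy": ["happy", "glad"], "sad": ["sad", "gloomy"]}
documents = {
    "doc1": "happy happy sad glad",
    "doc2": "gloomy sad sad happy",
}

def term_counts(text):
    """Rows of (dictionary, term, raw count) for terms that occur in `text`."""
    tokens = Counter(re.findall(r"[a-z']+", text.lower()))
    return [
        (label, term, tokens[term])
        for label, terms in dictionaries.items()
        for term in terms
        if tokens[term] > 0
    ]

def ranked(rows):
    """Sort by descending count; ties broken alphabetically (an assumption)."""
    return sorted(rows, key=lambda row: (-row[2], row[1]))

# Third table: one ranking per document.
per_document = {name: ranked(term_counts(text)) for name, text in documents.items()}

# Second table: counts summed over the whole corpus, then ranked.
corpus_totals = Counter()
for rows in per_document.values():
    for label, term, count in rows:
        corpus_totals[(label, term)] += count
corpus_ranking = ranked(
    [(label, term, count) for (label, term), count in corpus_totals.items()]
)

print(corpus_ranking)
print(per_document)
```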
Examples
For step-by-step examples, visit our public repository on GitHub, WheatonCS/Lexos/Content_Analysis. From there, you may select whichever experiment you wish to carry out. Each folder contains a README.md file with instructions on how to execute the tests, a FilesToUse folder containing all the files you will need, and a ResultsToExpect folder containing a PDF file with the anticipated results from the analysis.
Example: Sentiment Analysis
After uploading and preparing one or more text files (e.g., a novel), upload at least one user-defined dictionary containing keywords that you associate with a particular feeling. Each "dictionary" is another text file, uploaded under the dictionaries menu on the Content Analysis page. Then enter a formula using the provided calculator to determine the "sentiment" of the text. For instance, you might upload a novel and two user-defined dictionaries to determine the tone of the literary text.
Text File: happy very happy happy very sad sad happy happy happy sad
Dictionary 1: happy, very happy
Dictionary 2: sad, very sad
Formula: [happy] - [sad]
After clicking Analyze, you might see that the text appears to have a happier tone, since it uses more phrases from Dictionary 1 (the "happy" dictionary) than from Dictionary 2 (the "sad" dictionary).
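The sentiment example above can be traced by hand with a short Python sketch. The matching rule used here is an assumption, not confirmed Lexos behavior: at each position the longest dictionary phrase is tried first and matched words are not counted again, so "very happy" is one hit for the happy dictionary rather than also counting as "happy". The function name count_phrases is hypothetical.

```python
def count_phrases(text, dictionaries):
    """Count non-overlapping dictionary phrases, preferring longer phrases."""
    tokens = text.lower().split()
    # Every phrase as a tuple of words, longest first (assumed matching order).
    phrases = sorted(
        ((tuple(phrase.split()), label)
         for label, entries in dictionaries.items()
         for phrase in entries),
        key=lambda item: -len(item[0]),
    )
    totals = {label: 0 for label in dictionaries}
    i = 0
    while i < len(tokens):
        for phrase, label in phrases:
            if tuple(tokens[i:i + len(phrase)]) == phrase:
                totals[label] += 1
                i += len(phrase)  # skip past the matched words
                break
        else:
            i += 1  # no dictionary phrase starts at this word
    return totals, len(tokens)

text = "happy very happy happy very sad sad happy happy happy sad"
dictionaries = {"happy": ["happy", "very happy"], "sad": ["sad", "very sad"]}

totals, word_count = count_phrases(text, dictionaries)
formula_value = totals["happy"] - totals["sad"]  # [happy] - [sad]
score = formula_value / word_count
print(totals, word_count, formula_value, score)
```

Under this rule the happy dictionary has 6 hits and the sad dictionary 3 in the 11-word text, so the formula value is 3. If "very happy" and "very sad" were each also counted as "happy" and "sad", both totals would rise by one and the formula value would still be 3.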