Procedures for moving from debates to actors on the web.
Meta-actor categorization:
All working URLs comprising the corpus (approximately 1,200) were individually classified into one of seven meta-actor categories: Business, Journalism, Not-for-Profit, Scientific Community, Government, QUANGO, and Other. Each URL was categorized according to the organization hosting the website, which was treated as the primary actor.
Identification of main actors in the corpus:
The entire corpus, comprising the text from all 1,100 URLs, was uploaded into Voyant. The “document terms” tool was used to generate a list of the most common terms in the corpus. This list was manually analyzed to identify terms that might represent an actor: names, organizations, strings of letters that could indicate an acronym or unfamiliar name, words that could double as names (e.g. White, Mills, Brook), and words that seemed unfamiliar or nonsensical and could therefore be the name of an actor in a foreign language. All potential actors with 35 or more counts in the corpus were compiled into a “potential actor list” (over 400 entries). Each term on the list was then investigated using a combination of the “context” and “collocates” tools in Voyant, together with a basic Google search, to validate or invalidate it as a “potential actor”.
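The frequency-threshold step can be illustrated outside Voyant with a minimal Python sketch. This is not Voyant's implementation or API; the tokenization rule (lowercased runs of letters) and the function names term_counts and candidate_terms are assumptions made for illustration only. The threshold of 35 is the one stated above. The manual screening of the resulting list for actor-like terms is not automated here.

```python
import re
from collections import Counter

MIN_COUNT = 35  # threshold stated in the protocol


def term_counts(documents):
    """Count word tokens across all documents.

    Assumption: a token is a lowercased run of letters; Voyant's own
    tokenizer may behave differently.
    """
    counts = Counter()
    for text in documents:
        counts.update(re.findall(r"[^\W\d_]+", text.lower()))
    return counts


def candidate_terms(documents, min_count=MIN_COUNT):
    """Return (term, count) pairs at or above the threshold, most frequent first."""
    counts = term_counts(documents)
    return [(term, n) for term, n in counts.most_common() if n >= min_count]
```

The output of candidate_terms would still need to be read by a person, since frequency alone does not indicate whether a term is an actor.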
There are several limitations in this process worth noting. First, common names like Paul, James, and Wong have high “counts” because several people in the corpus share the same name, giving the “potential actor” an artificially high count. In these situations, inclusion was based on the following criteria: 1) the actor is mentioned at least 5 times; 2) the actor is found in more than one URL; 3) the actor is related to e-waste in a direct way. Second, someone with a unique name may be mentioned 25 times throughout the corpus and still not be identified, because that count falls below the 35-count threshold. Similarly, organizations like Toxics Link were not identified, since the words “toxics” and “link” do not individually resemble an actor. Finally, other actors appeared many times in the corpus but seemed to be irrelevant to the “e-waste controversy/debates”. Many of these were politicians with no relation to e-waste (e.g. Donald Trump) or celebrities (e.g. Kim Kardashian). Names with no direct connection to e-waste were not included in the “verified actor” list.
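The three inclusion criteria for shared names can be written as a simple check. The sketch below assumes that all three criteria must hold together, which the protocol does not state explicitly; the Candidate record and its field names are hypothetical, and the judgment of whether an actor relates directly to e-waste is entered by the analyst rather than computed.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    mentions: int         # mentions attributable to this specific actor
    urls: set             # URLs in which this actor appears
    ewaste_related: bool  # manual judgment made by the analyst


def meets_inclusion_criteria(candidate):
    """Apply the three criteria; assumes all three must be satisfied."""
    return (
        candidate.mentions >= 5
        and len(candidate.urls) > 1
        and candidate.ewaste_related
    )
```

For example, a hypothetical actor mentioned 12 times across two URLs and judged relevant would pass, while one mentioned 40 times in a single URL would not.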
Once all “potential actors” had been verified, they were categorized into two groups: “institutions” - consisting of government bodies, businesses, media outlets, universities, not-for-profit organizations, QUANGOs, etc., and “people” - consisting of individuals affiliated with institutions. The institutional affiliation of each “person” was based on the context in which that person appeared in the corpus. For example, if a “person” wrote a report for Congress, the “person” would be affiliated with Congress. However, if the same “person” was cited many times in the corpus on the basis of both the report to Congress and other academic publications, then the university would be added as an additional affiliation. Each institution was then categorized into one of the seven meta-actor categories (Business, Scientific Community, Journalism, Government, Not-for-Profit, QUANGO, and Other).
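One way to record the resulting structure is sketched below, under the assumption that a person may carry several affiliations and that each institution carries exactly one of the seven meta-actor categories. The class and method names are hypothetical and are not part of the original protocol.

```python
from dataclasses import dataclass, field

META_CATEGORIES = {
    "Business", "Journalism", "Not-for-Profit", "Scientific Community",
    "Government", "QUANGO", "Other",
}


@dataclass
class Institution:
    name: str
    category: str

    def __post_init__(self):
        # Reject labels outside the seven meta-actor categories.
        if self.category not in META_CATEGORIES:
            raise ValueError(f"unknown meta-actor category: {self.category!r}")


@dataclass
class Person:
    name: str
    affiliations: list = field(default_factory=list)

    def add_affiliation(self, institution):
        """Add an affiliation once, even if the context recurs in the corpus."""
        if institution not in self.affiliations:
            self.affiliations.append(institution)
```

In the example from the text, a person would first receive Congress as an affiliation and later a university as a second one, without the first being replaced.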
Identification of actors debating each meta-issue:
Actors making controversial statements in the debate were identified and extracted from quotes in the “index of issues” (described in the Movement 1 protocol). Each quote in the index of issues was manually scanned to identify actors (i.e. people and institutions). Each identified actor was added as a “protagonist” node in DebateGraph, with the quote(s) attributed to the actor copied into the “details” section. In some circumstances, both an institution and a person representing that institution were identified in a quote (e.g. a business and the owner of the business, or a government agency and an employee of that agency). In such cases, the institution was created as the central node, and the person attached to the institution was added as a sub-node.
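The node hierarchy described above can be modeled with a small Python sketch. It does not use or represent the DebateGraph API; ProtagonistNode and add_quote are hypothetical names. It also assumes that when both a person and an institution are identified, the quote is stored on the person's sub-node, which the protocol does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class ProtagonistNode:
    name: str
    details: list = field(default_factory=list)    # quotes attributed to the actor
    sub_nodes: list = field(default_factory=list)  # people attached to an institution


def _get_or_create(container, name):
    """Return the node called `name` in `container`, creating it if absent."""
    for node in container:
        if node.name == name:
            return node
    node = ProtagonistNode(name)
    container.append(node)
    return node


def add_quote(top_level, quote, person=None, institution=None):
    """Attach a quote to the right node and return that node."""
    if person is None and institution is None:
        raise ValueError("a quote needs at least one actor")
    if institution is None:
        target = _get_or_create(top_level, person)
    else:
        target = _get_or_create(top_level, institution)
        if person is not None:
            # Institution is the central node; the person becomes a sub-node.
            target = _get_or_create(target.sub_nodes, person)
    target.details.append(quote)
    return target
```

Starting from an empty list of top-level nodes, a quote naming both a business and its owner creates the business as a central node with the owner beneath it, while a quote naming only a person creates a top-level node for that person.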