Circular Area to Hyper-Local Tweets using Maltego Chlorine 3.6.1
By Shalin Hai-Jew, Kansas State University
Ever wonder what people are Tweeting (microblogging) about in the space around you? That is fairly easy to find out using the Maltego Chlorine 3.6.1 feature known as the “Circular Area” tool, which “transforms” locational information into a flood of location-based Tweets. This tool gives a person a sense of what Tweets are being shared locally. Further, it is possible to point to any place on Earth where there is Internet connectivity and people are Tweeting.
Figure 1: A Close-in Bubble View of Circular Area at 3000m Radius Capture
It is even possible to capture Tweets from the International Space Station (not by using Earth-based locational tools but by following astronaut Twitter accounts). It is also possible to follow Tweets from Mars (per the Mars Rover account, which is not ‘bot-based but written by multiple NASA scientists based on data beamed 140 million miles back from the audacious Mars Rover).
What is Going On with Locational Short-Text Data Extractions?
Essentially, what is going on is this: the Twitter microblogging site captures a range of information about those who Tweet. It captures who is sharing the message, the content of the message, the time the message was shared, replies to the message, geolocation data, and other information. The geolocation field is filled only if the app, device, or individual actually shares the location of the Tweet, image, or video. The accuracy of the geolocation information varies depending on whether a properly set app or device is reporting the latitude-longitude coordinates or a human being is typing in a location (people will put locations like “outer space” or “nowhere” in the location field).
Twitter gives developers access to a range of its data through application programming interfaces (APIs). Reportedly, only about 1% of Tweets are made available at any one time, and the data extractions are further limited by rate (once a certain amount of access has been reached, the Twitter API will put a user’s access on pause for a certain amount of time, usually about 15 minutes). [To access a full set of data, researchers have to go through a commercial entity, Gnip, which has access to the full data. Twitter announced its acquisition of Gnip in April 2014, and the price was reported that August: “Twitter paid $134 million for data partner Gnip”]
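The pause-and-retry behavior described above can be sketched in a few lines of Python. This is only an illustration and not how Maltego or any Twitter client library handles it: the `RateLimited` exception, the `fetch` callable, and the `fetch_with_pause` helper are all hypothetical names, and the 15-minute default simply mirrors the approximate pause mentioned above.

```python
import time


class RateLimited(Exception):
    """Hypothetical signal that the API has paused this client's access."""


def fetch_with_pause(fetch, pause_seconds=15 * 60, max_attempts=3, sleep=time.sleep):
    """Call fetch(); if it reports a rate limit, wait and then try again.

    fetch is any zero-argument callable supplied by the caller (an assumption).
    sleep is injectable so the waiting can be replaced when experimenting.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            # Give up after the last allowed attempt instead of waiting again.
            if attempt == max_attempts - 1:
                raise
            sleep(pause_seconds)
```

Passing a stand-in for `sleep` (for example, a list's `append` method) makes it possible to try the logic without actually waiting 15 minutes.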
There are researchers engaged in “big data” analyses of Twitter datasets. There are commercial entities that download and process masses of real-time Twitter data. A general algorithm involves “mapping” the incoming data and “reducing” it to summarizable and human-usable information. Depending on the data needs of the entities, various other algorithms may be applied to process the data.
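To make the “mapping” and “reducing” idea concrete, here is a toy word count in Python. It is a minimal sketch on three made-up messages and says nothing about how any particular research group or company actually processes Twitter data; the function names are my own.

```python
from collections import Counter
from functools import reduce


def map_tweet(text):
    """Map step: turn one message into a Counter of its lowercase words."""
    return Counter(text.lower().split())


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def summarize(tweets):
    """Map every message, then fold the partial counts into one summary."""
    return reduce(reduce_counts, map(map_tweet, tweets), Counter())


# Made-up sample messages, for illustration only.
sample = ["Coffee before class", "coffee and hail", "Hail on campus"]
top_words = summarize(sample).most_common(2)
```

In a real pipeline the map step would run in parallel over many machines, and only the small partial counts would be passed along to the reduce step.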
Starting with Google Maps
It helps to know the latitude and longitude coordinates that mark the physical location on Earth that is of interest. It generally helps to start with something local and known. On this particular Sunday mid-morning, I am at my cubicle at Hale/Farrell Library. I find the spot on Google Maps and drop a pin on it. In the URL, the latitude and longitude are indicated, followed by the map zoom level: 39.1904315,-96.5821477,17.25z.
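For those who would rather not copy the numbers by hand, the coordinates can be pulled out of the URL with a regular expression. The sketch below assumes the coordinate segment is prefixed with “@” in the URL; the example URL is constructed for illustration around the coordinates above, and `parse_maps_url` is a hypothetical helper name.

```python
import re

# Assumed URL shape: ".../@<latitude>,<longitude>,<zoom>z"
COORD_PATTERN = re.compile(
    r"@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?),(\d+(?:\.\d+)?)z"
)


def parse_maps_url(url):
    """Return (latitude, longitude, zoom) as floats, or None if not found."""
    match = COORD_PATTERN.search(url)
    if match is None:
        return None
    lat, lng, zoom = (float(group) for group in match.groups())
    return lat, lng, zoom


# Illustrative URL built from the coordinates in the text.
example_url = "https://www.google.com/maps/@39.1904315,-96.5821477,17.25z"
coords = parse_maps_url(example_url)
```

The first two values are what the Circular Area tool needs; the third is only the map zoom level and can be ignored.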
Figure 2: Pinning Hale Library on Google Maps for the Lat-Long Coordinates
The Circular Area tool has a default setting of a 1000 meter (m) radius (1 kilometer or .62 miles or 3,280.84 feet) with the pin at the center. That means the coverage is a circle 2000 m in diameter, with a circumference of about 6,283.19 m and an area of about 3,141,592.65 square meters.
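The circle arithmetic for any radius can be checked with a few lines of Python. This is a plain geometry sketch; `circle_metrics` is my own helper name and is not part of Maltego.

```python
import math


def circle_metrics(radius_m):
    """Return diameter, circumference, and area for a radius in meters."""
    return {
        "diameter_m": 2 * radius_m,
        "circumference_m": 2 * math.pi * radius_m,
        "area_m2": math.pi * radius_m ** 2,
    }


default_circle = circle_metrics(1000)
# Rounded to two places: circumference 6283.19 m, area 3141592.65 square meters.
print({name: round(value, 2) for name, value in default_circle.items()})
```

Because area grows with the square of the radius, tripling the radius to 3000 m multiplies the covered area by nine.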
This extraction does not require that the researcher follow the individuals whose messages are being captured. By definition, public Tweets are open and accessible to anyone interested. Rather, all that is needed is to have a Twitter account in good standing and then to log in to “whitelist” access (which enables the rate-limiting by Twitter, among other things). This then allows skimming from the existing information.
Figure 3: Successful Whitelisting into Twitter API to Enable Maltego Circular Area Transform
A first extraction pulls about 48 messages, showing who is Tweeting and what they are sharing. Messages may contain any text in any language representable with UTF-8 character encoding; they may also include URLs, images, and short videos. This morning, it looks like someone wants to wish Beyonce Happy Birthday! It’s also apparently World Mental Health Day! Several mention the landscape, trips, and blessings. Some of the messages are in Indonesian (thanks to Google Translate!).
The microblogging talk is pretty low-key and sparse. That is probably due both to the location (a very quiet university campus) and to the time (mid-morning on a Sunday). Researchers who study social media have been able to identify different times when people’s communications are much more active (“bursty”). They have also been able to identify diurnal rhythms--collective mood and sentiment swings that are time-variant. (During a Monday morning rush hour in a city, there are many messages about coffee, driving, and work, along with job-recruitment, advertising, and marketing Tweets, and so on.) It makes sense to sample the messaging at a location at different times of day.
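One simple way to compare samples taken at different times is to count messages by hour of day. The sketch below assumes the timestamps have already been exported as ISO 8601 strings (an assumption about the export format, not something Maltego guarantees), and the sample values are made up.

```python
from collections import Counter
from datetime import datetime


def tweets_per_hour(timestamps):
    """Count messages by hour of day (0-23) from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


# Made-up timestamps, for illustration only.
sample_times = [
    "2015-01-04T10:15:00",
    "2015-01-04T10:40:00",
    "2015-01-05T08:05:00",
]
hourly = tweets_per_hour(sample_times)
```

A busy hour shows up as a large count, which gives a rough numeric stand-in for the “burstiness” seen in the graph views.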
Figure 5: Entity List of Twits and Tweets in the Circular Area
A bubble view graph captures a sense of “burstiness” of fast-breaking issues. On a quiet morning, though, there is less of a visceral “burst” than just a low-key mapping of a few messages.
Figure 6: A Bubble View of the Circular Area Tweet Data Extraction
Researchers who want to engage directly with individuals on-scene may identify them on Twitter and send them a message by addressing @whatevertheaccountname (not an actual account). In other words, this does not have to be about data analysis alone but can also be about information elicitation and engagement (if desired and if within the scope of the research).
Figure 7: Identification of Most Active Nodes by Degree Based on Twit Node Size
The Maltego Chlorine 3.6.1 Circular Area tool enables the definition of a physical point and the extraction of messaging within a certain area around it. (It is easy enough to go even a few miles out, but while that results in the capture of much more information, it also dilutes the location specificity of the pinpointed circular area.) The data extractions are near real-time.
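The underlying idea of “everything within a radius of a point” can be illustrated with the haversine great-circle distance. To be clear, this is a generic sketch and not Maltego’s actual implementation, which is not documented here; the function names and the two test points are made up, and the Earth is treated as a sphere with a mean radius of 6,371 km.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude-longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(center, points, radius_m):
    """Keep only the (lat, lon) points that fall inside the circle."""
    return [
        p for p in points
        if haversine_m(center[0], center[1], p[0], p[1]) <= radius_m
    ]


center = (39.1904315, -96.5821477)   # the pinned library location from the text
near = (39.1954315, -96.5821477)     # made-up point roughly 556 m to the north
far = (39.2104315, -96.5821477)      # made-up point roughly 2,224 m to the north
inside_default = within_radius(center, [near, far], 1000)
```

With the default 1000 m radius only the nearer point is kept; widening the radius to 3000 m would capture both, which is the trade-off between volume and location specificity noted above.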
Broadening the Circle
Closer to noon on the same day, another extraction was done with a 3,000 meter radius, and that captured a large number of upcoming fraternity and sorority events, sports chatter, lunch location call-outs, mentions of the weather, mentions of the university, church-based messaging, and job recruitment Tweets. One Twit ranted against double parking. Another hearted K-State. One mentioned half-dollar sized hail.
In the current version of Google Maps, it is still not possible to draw a circle of a given radius around a particular point. A Google developer posted online that this feature is on a list of requested features.
Figure 8: Eight Hundred Entities at 3000m Radius Circle
A More Complex Multi-Stream Instantiation (as Summary Heat Maps)
The Geographic Information Systems Spatial Analysis Laboratory (GISSAL) created an app capturing social media references to Kansas State University. This application may be accessed via the Web at http://kstate.maps.arcgis.com/apps/SocialMedia/index.html?appid=683745c7d20d4fa4a8525764b1bc933b.
This more sophisticated approach combines data streams from Flickr (image- and video-sharing site), Twitter (microblogging site), and YouTube (video-sharing site) and includes a geospatial angle (which suggests that it only includes geotagged messages and contents).
The app was apparently created to find "K-State" and other permutations of it, which picks up Kennesaw State University as well as Kansas State University. This is seen in the intense heat map concentration over the Kennesaw State University campus in Georgia.
Figure 9: "K-State in the Social Media" in the Landing Page View over the Continental U.S.
The following heat map of "K-State in the Social Media" shows this map in action in ArcGIS Pro in the K-State instantiation, with a zoomed-in view close to the campus. The highlighted areas indicate the intensity of electronic sharing on the three target social media platforms, with color intensity as the cue (red indicates a lot of electronic chatter from particular physical locales).
Figure 10: "K-State in the Social Media" Heat Map (in ArcGIS Pro 1.1.0)
At a later time, the heat map was viewed at a different zoom scale and with an underlying "Imagery with Labels" basemap. This map view may be accessed through the app.
Figure 11: "K-State in the Social Media" with an Imagery with Labels Basemap (in ArcGIS Pro 1.1.0)
These latter maps offer more of a distant and summary view of where there is electronic chatter occurring...but not a direct way to access the underlying messages or the active social media account users.
These map visualizations also evoke the phenomenon of "digital maps" that show hot spots of participation in particular regions and around certain topics...as contrasted against dead zones without broadcast social media communications.
About the Author
Shalin Hai-Jew works as an instructional designer at Kansas State University. Her email is firstname.lastname@example.org.