Capturing Social Imagery in Bulk for Research and Analysis
By Shalin Hai-Jew, Kansas State University
One common assumption in research that uses qualitative data analytics techniques is that everything has informational value, and the challenge is how to extract meaning in rigorous and systematic ways. “Found” objects in the world are also potentially informationally valuable. “Social imagery,” images shared via image and video-sharing social media platforms, web logs (blogs), websites, and other spaces on the Social Web, is one very common source for contemporary research. Image sets often include a variety of visuals: photos, diagrams, video stills, data visualizations, and others.
One estimate is that some 1.8 billion digital images were uploaded daily in 2014, or 657 billion photos a year (Meeker, 2014; Eveleth, 2015). [A 2017 version of Mary Meeker’s report, “Internet Trends 2017 – Code Conference,” is available from Kleiner Perkins.] Among Gen Z-ers, sending images is seen to replace typing messages. If new social phenomena are to be explored, it helps to capture communications in their native forms and to learn from them.
Social Imagery in Research
So how has social imagery been used for research? On the commercial side, social imagery has been used to identify trends in fashions for “cool-hunting.” It has been used to identify highly influential individuals, who may be targeted to market goods to their peers. It has also been used to understand cultural norms and practices, such as how selfies manifest in different locations around the world.
In social sciences research, the use of social imagery is a fairly new endeavor. In the same way that it is possible to computationally explore “word senses” and their varied uses, it is possible to study images and their “image senses” and their uses in various contexts. Image sets may be studied en masse, with computer vision harnessed to identify main objects in image sets and to extract sentiment. (There are commercial entities that offer such services.) Image patterns may be insightful about particular phenomena. They may be revelatory about who is capturing the imagery, how they are capturing the imagery, and possibly why. It is also possible to manually code and analyze image sets. Even an image set of a few thousand images is not burdensome per se for human-based analytical methods. There can be studies of small image sets and even of single images, as it is possible to learn from an example of one (Winston, 2014). Researchers may want to combine human-coding and auto-coding methods to learn from social image data.
To over-simplify, one approach to coding data is a top-down approach, which requires a priori application of concept, theory, or framework to the data, and then the creation of a coding method to engage the imagery. Another method is bottom-up, which does not require any pre-concept but applies a “grounded theory” approach (really an inductive method of coding) starting with the image data itself and coding from the raw data.
Image data is high-dimensional data, which means that it may be studied across a range of dimensions. For example, what is depicted…from what angle…and how? What messages are being conveyed? What is the social context? The environmental context? What is in the foreground? The background? What are the aesthetics? What is the informational value? And these dimensions are only a simple beginning.
Also, image data is polysemic—or many-meaninged. The interpretive lens applied to the imagery will affect what may be seen, whether this is extracted through computational or human-coded or mixed methods.
Also, additional information may ride with the downloaded images. For example, metadata may reveal information about the circumstances behind the image capture: geographical locations, times, equipment used, image capture settings, and other data.
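As a rough illustration of the kind of metadata that may ride with a downloaded image, the following Python sketch reads a few text fields (camera make, camera model, and date/time) from the EXIF block of a JPEG file. It uses only the standard library and is a minimal sketch, not a complete EXIF reader: the function names are hypothetical, only three ASCII tags in the first directory are handled, and location data (stored in a separate GPS directory) is not read. Many downloaded images carry no EXIF data at all, in which case an empty dictionary is returned.

```python
import struct

# A few ASCII-valued tags from the first image file directory (IFD0).
ASCII_TAGS = {0x010F: "Make", 0x0110: "Model", 0x0132: "DateTime"}


def find_exif_payload(jpeg_bytes):
    """Return the TIFF-formatted EXIF payload of a JPEG, or None if absent."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            return None  # not at a segment boundary; give up
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            return None
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        segment = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and segment[:6] == b"Exif\x00\x00":
            return segment[6:]  # skip the "Exif\0\0" identifier
        pos += 2 + length
    return None


def read_ascii_tags(tiff):
    """Read the ASCII tags listed in ASCII_TAGS from the first directory."""
    if len(tiff) < 8:
        return {}
    order = {b"II": "<", b"MM": ">"}.get(tiff[:2])  # byte-order mark
    if order is None:
        return {}
    magic, ifd_offset = struct.unpack(order + "HI", tiff[2:8])
    if magic != 42 or ifd_offset + 2 > len(tiff):
        return {}
    (count,) = struct.unpack(order + "H", tiff[ifd_offset:ifd_offset + 2])
    found = {}
    for i in range(count):
        start = ifd_offset + 2 + 12 * i  # each directory entry is 12 bytes
        if start + 12 > len(tiff):
            break
        tag, kind, n, value = struct.unpack(order + "HHI4s", tiff[start:start + 12])
        if tag in ASCII_TAGS and kind == 2:  # type 2 means ASCII text
            if n <= 4:
                raw = value[:n]  # short values are stored inline
            else:
                (offset,) = struct.unpack(order + "I", value)
                raw = tiff[offset:offset + n]
            found[ASCII_TAGS[tag]] = raw.rstrip(b"\x00").decode("ascii", "replace")
    return found


def read_basic_exif(path):
    """Return a dict of the basic EXIF text fields found in a JPEG file."""
    with open(path, "rb") as handle:
        payload = find_exif_payload(handle.read())
    return read_ascii_tags(payload) if payload else {}
```

Calling `read_basic_exif("photo.jpg")` (a placeholder filename) on each file in a downloaded set would give a quick sense of how much capture metadata survived. For serious work, a dedicated image library is likely a better choice than hand-parsing.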
Google Chrome-based Image Downloaders through Extensions
Scraping imagery from the Social Web used to be a fairly complex effort, requiring the writing of code or the deploying of scripted agents. Today, much of the image set collection requires little technological sophistication. There are “add-ons” to web browsers that may enable access to imagery on web pages or from particular image sharing sites.
In this work, one extension from the Chrome Web Store will be showcased: Picture Downloader Professional™ by www.startpage24.com. (This particular add-on is one of several that enable mass image downloads.) While it only has 3/5 stars in reviews on the store site, its capabilities are impressive, particularly its ability to “Save all pictures of this web page” (which can mean more than a thousand images at once).
To access this free tool, open a Google Chrome web browser window. Click the ellipsis (the three-dot menu) at the top right of the window. Go to More Tools -> Extensions -> Get More Extensions, which opens the Chrome Web Store, where the extension can be found by name and added. (To remove an add-on, right-click the add-on icon and choose the removal option from the dropdown menu.)
When capturing an image set, go to Google Search. Type in a word (in any language enabled by the UTF-8 charset), a phrase, or a symbol. Click on the Images link. Scroll down to load more images on the page. When the “Show more results” button appears at the bottom, click it, and continue until no further results load. (The limit is apparently set by Google Search.) At the top of the image set are colorful filters. These may be clicked to filter the image search results. Once a filter is unclicked, the researcher is returned to the starting point for the filters. (The filters seem to come from a mix of folk-tagging, locational data, computer vision analyses of the images, and image hosts / sources.) The tags themselves are an interesting exploratory path.
To download the images, go to the Picture Downloader Professional icon at the top right and click “Save all pictures of this web page.”
Figure 2: Social Images Related to “Educational Technology” on Google Images
Image Filtering on Google Images
An image set for “educational technology” resulted in the following filters. Note that the filters themselves are clustered into groupings that provide semantic value and are suggestive of the types of available technology.
www.edudemic.com 50 educationtechnology tools every teacher should know about
classroom clip art
early childhood education
Table 1: Filters of the Social Imagery around “Educational Technology”
Digital Image Handling
The captured images are only a minuscule sample of available social imagery. They are not randomly selected, and it is hard to make claims about how Google Search chooses the images that are returned.
Cleaning digital image datasets. One of the strengths of being able to collect a large image set is that a lot of everything may be seen at once—like a giant proof sheet. However, this also means that not all of the data will be relevant.
Sometimes, glyphs (logos and icons) may be swept up in an image search. Or there may be an image that seems to have nothing to do with anything. Remember that an image search captures images as separate objects, removed from the context of their original use. Duplicate images should be removed (de-duplication).
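De-duplication of a large download folder can be partly automated. The Python sketch below groups files whose bytes are identical by comparing SHA-256 hashes. It is a minimal illustration under stated assumptions: the function names and the folder layout are hypothetical, and only exact copies are detected. Resized, cropped, or re-compressed versions of the same picture will have different hashes and would still need to be found by eye or with perceptual-similarity tools.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large images do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(folder):
    """Group files in `folder` by content; return groups with 2+ file names."""
    groups = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            groups.setdefault(file_digest(path), []).append(path.name)
    return [names for names in groups.values() if len(names) > 1]
```

The function only reports duplicate groups; it does not delete anything, which leaves the decision about which copy to keep (and the documentation of that decision) with the researcher.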
In a research context, it is important to document what imagery was removed and why. (And there should be a pristine backup master image set of all images scraped in case it is necessary to review what was captured and to bring back some imagery.) Deletions of images from consideration in research should be non-destructive ones, so if those images are seen to be relevant later, they may be accessed and studied.
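One way to keep removals documented and non-destructive is sketched below in Python. It copies the full scraped set to a master folder once, and then moves each excluded image into a separate folder while appending the reason to a CSV log. The folder names, log columns, and function names are all assumptions made for illustration; any scheme that preserves the original files and records the reasons would serve the same purpose.

```python
import csv
import shutil
from datetime import datetime, timezone
from pathlib import Path


def back_up_master(working_dir, master_dir):
    """Copy the full scraped set to a master folder before any cleaning.

    copytree raises an error if master_dir already exists, which guards
    against overwriting an earlier pristine copy.
    """
    shutil.copytree(working_dir, master_dir)


def set_aside(image_path, excluded_dir, log_path, reason):
    """Move an image out of the working set and log why; nothing is deleted."""
    image_path = Path(image_path)
    excluded_dir = Path(excluded_dir)
    excluded_dir.mkdir(parents=True, exist_ok=True)
    destination = excluded_dir / image_path.name
    if destination.exists():
        raise FileExistsError(f"{destination} already exists")
    shutil.move(str(image_path), str(destination))

    log_path = Path(log_path)
    is_new_log = not log_path.exists()
    with open(log_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new_log:
            writer.writerow(["file", "reason", "moved_to", "timestamp_utc"])
        writer.writerow([
            image_path.name,
            reason,
            str(destination),
            datetime.now(timezone.utc).isoformat(),
        ])
    return destination
```

If an excluded image later turns out to be relevant, it can be moved back from the excluded folder, and the log still shows when and why it was set aside.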
What researchers do with the social imagery and resulting data will vary, but it’s likely that new insights will be discovered. There’s a lot to be said for engaging social imagery!
Eveleth, R. (2015, Nov. 2). How many photographs of you are out there in the world? The Atlantic. https://www.theatlantic.com/technology/archive/2015/11/how-many-photographs-of-you-are-out-there-in-the-world/413389/.
Winston, P. (2014, Jan. 10). 15. Learning: Near misses, felicity conditions [Lecture]. MIT OpenCourseWare. https://youtu.be/sh3EPjhhd40.
About the Author
Shalin Hai-Jew works as an instructional designer at Kansas State University. Her email is email@example.com.