Capturing Social Imagery in Bulk for Research and Analysis

By Shalin Hai-Jew, Kansas State University

One common assumption in research that uses qualitative data analytics techniques is that everything has informational value, and the challenge is how to extract meaning in rigorous and systematic ways. “Found” objects in the world are also potentially informationally valuable. “Social imagery,” images shared via image and video-sharing social media platforms, web logs (blogs), websites, and other spaces on the Social Web, is one very common source for contemporary research. Image sets often include a variety of visuals: photos, diagrams, video stills, data visualizations, and others.

Figure 1: Results of an "iot" Image Search on Google Search

One estimate is that some 1.8 billion digital images were uploaded daily in 2014, or about 657 billion photos a year (Meeker, 2014; Eveleth, 2015). [A 2017 version of Mary Meeker’s “Internet Trends 2017 – Code Conference” is available from Kleiner Perkins.] Among Gen Z-ers, sending images is seen to replace typing messages. If new social phenomena are to be explored, it helps to capture communications in their native forms and to learn from them.

Social Imagery in Research

So how has social imagery been used for research? On the commercial side, social imagery has been used to identify fashion trends for “cool-hunting.” It has been used to identify highly influential individuals, who may be targeted to market goods to their peers. It has also been used to understand cultural norms and practices, such as how selfies manifest in different locations around the world.

The use of social imagery in social sciences research is fairly new. In the same way that it is possible to computationally explore “word senses” and their varied uses, it is possible to study images and their “image senses” and their uses in various contexts. Image sets may be studied en masse, with computer vision harnessed to identify the main objects in image sets and to extract sentiment. (There are commercial entities that offer such services.) Image patterns may be insightful about particular phenomena. They may be revelatory about who is capturing the imagery, how they are capturing it, and possibly why. It is also possible to manually code and analyze image sets. Even an image set of a few thousand images is not burdensome per se for human-based analytical methods. There can be studies of small image sets and even of single images, as it is possible to learn from an example of one (Winston, 2014). Researchers may want to combine human-coding and auto-coding methods to learn from social image data.

To over-simplify, one approach to coding data is top-down: it requires the a priori application of a concept, theory, or framework to the data, and then the creation of a coding method to engage the imagery. Another approach is bottom-up: it does not require any prior concept but applies a “grounded theory” approach (really an inductive method of coding), starting with the image data itself and coding from the raw data.
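The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the image names, the observation tags, and the codebook below are all invented for the example and do not come from any real study.

```python
from collections import Counter

# Hypothetical notes a human coder might record per image (illustrative only).
observations = {
    "img_001.jpg": ["tablet", "children", "classroom"],
    "img_002.jpg": ["laptop", "lecture hall"],
    "img_003.jpg": ["tablet", "teacher", "classroom"],
}

# Top-down: a codebook defined in advance from a concept or framework.
CODEBOOK = {
    "device": {"tablet", "laptop"},
    "setting": {"classroom", "lecture hall"},
}


def code_top_down(tags, codebook):
    """Return the predefined codes whose keywords appear among the tags."""
    return sorted(code for code, keywords in codebook.items()
                  if keywords & set(tags))


def code_bottom_up(all_tags):
    """Tally tags as they emerge from the data; frequent tags suggest codes."""
    return Counter(tag for tags in all_tags for tag in tags)


top_down = {name: code_top_down(tags, CODEBOOK)
            for name, tags in observations.items()}
emergent = code_bottom_up(observations.values())
```

In the top-down case, anything outside the codebook (such as "teacher") is simply not counted. In the bottom-up case, it shows up in the tally and may prompt a new code.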

Image data is high-dimensional data, which means that it may be studied across a range of dimensions. For example, what is depicted…from what angle…and how? What messages are being conveyed? What is the social context? The environmental context? What is in the foreground? The background? What are the aesthetics? What is the informational value? And these dimensions are only a simple beginning.

Image data is also polysemic, or many-meaninged. The interpretive lens applied to the imagery will affect what may be seen, whether meaning is extracted through computational, human-coded, or mixed methods.

Also, additional information may ride with the downloaded images. For example, metadata may reveal information about the circumstances behind the image capture: geographical locations, times, equipment used, image capture settings, and other data.
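Not every downloaded image still carries such metadata, so a first step may be to check which files have any. The following is a minimal sketch, using only the Python standard library, that walks the segments of a JPEG file and reports whether an Exif block is present. It does not decode the metadata itself, and the function names are illustrative; a dedicated image library would normally be used to read the actual fields.

```python
import struct
from pathlib import Path


def has_exif_segment(data: bytes) -> bool:
    """Return True if JPEG bytes contain an APP1 segment labeled as Exif."""
    if data[:2] != b"\xff\xd8":          # every JPEG starts with the SOI marker
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:            # not at a marker: stop scanning
            return False
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):       # start of scan / end of image
            return False
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:                   # the length field counts its own 2 bytes
            return False
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length                # jump to the next segment
    return False


def file_has_exif(path) -> bool:
    """Convenience wrapper: read a file from disk and check it."""
    return has_exif_segment(Path(path).read_bytes())
```

Running `file_has_exif` over a folder of captured images gives a quick count of how many files could reveal capture circumstances at all.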

Google Chrome-based Image Downloaders through Extensions

Scraping imagery from the Social Web used to be a fairly complex effort, requiring the writing of code or the deploying of scripted agents. Today, much of the image set collection requires little technological sophistication. There are “add-ons” to web browsers that may enable access to imagery on web pages or from particular image sharing sites.

In this work, one extension from the Chrome Web Store will be showcased: Picture Downloader Professional™ by www.startpage24.com. This particular add-on is one of several that enable mass image downloads. While it only has 3/5 stars in reviews on the store site, its capabilities are impressive, particularly its ability to “Save all pictures of this web page” (at least a thousand images at once).

To access this free tool, open a Google Chrome web browser window. Click the ellipsis at the top right of the window. Go to More Tools -> Extensions -> Get More Extensions. (To remove an add-on, right-click the add-on icon and use the dropdown menu to remove it from the web browser.)

When capturing an image set, go to Google Search. Type in a word (in any language enabled by the UTF-8 charset), a phrase, or a symbol. Click on the Images link. Scroll down to load more images on the page. When the “Show more results” button appears at the bottom, click it, and continue until no further results load. (The limit is apparently set by Google Search.) At the top of the image set are colorful filters. These may be clicked to filter the image search results. Once a filter is unclicked, the researcher is returned to the start-point for the filters. (The filters seem to come from a mix of folk-tagging, locational data, computer vision analyses of the images, and image hosts / sources.) The tags themselves are an interesting exploratory path.

To download the images, go to the Picture Downloader Professional icon at the top right and click “Save all pictures of this web page.”

Figure 2: Social Images Related to “Educational Technology” on Google Images

Image Filtering on Google Images

An image set for “educational technology” resulted in the following filters. Note that the filters themselves are clustered into groupings that provide semantic value and are suggestive of the types of available technology.

education

importance

role

learning

innovative

definition

classroom

impact

internet

room

illustration

game

teaching learning

futuristic

origin

nature

teacher

history

media

communication

benefit

fiat

school

mobile

gadget

class

www.edudemic.com 50 educationtechnology tools every teacher should know about

blog

classroom clip art

animation

software

student

approach

planning

component

education wordle

application

device

higher education

office

hardware

computer

ict

environment

form

process

photograph

new zealand

singapore

malaysia

japan

hong kong

india

graphic organizer

concept map

symbol

infographic

timeline

quote

diagram

statistics

21st century

20th century

19th century

urdu

hindi

uses

instructional

learning environment

learner centered

education quote

digital

comic strip

early childhood education

literacy

technology

youth

modern

contemporary

distance learning

collaborative learning

australia

korea

brunei

qatar

thailand

england

ppt

pdf

used australia

usa

italy

taiwan

africa

indonesia

iot

academic

development

mapping

monitor

fear

artist

science

multimedia

telecommunications

cloud

ipad

the philippines

south korea

china

canada

college

platform

service

info

Table 1: Filters of the Social Imagery around “Educational Technology”
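One way to explore the clustering of the filters is to sort the labels into rough groups. The sketch below does this for a handful of labels taken from Table 1. The group names and the assignment of labels to groups are assumptions made for illustration, not categories supplied by Google Images.

```python
# Hypothetical groupings; only the labels themselves come from Table 1.
GROUPS = {
    "place": {"new zealand", "singapore", "malaysia", "japan", "hong kong", "india"},
    "format": {"infographic", "timeline", "diagram", "concept map", "ppt", "pdf"},
    "period": {"21st century", "20th century", "19th century"},
}


def group_filters(labels, groups):
    """Sort filter labels into named groups; unmatched labels go to 'other'."""
    grouped = {name: [] for name in groups}
    grouped["other"] = []
    for label in labels:
        for name, members in groups.items():
            if label in members:
                grouped[name].append(label)
                break
        else:
            # No group claimed this label.
            grouped["other"].append(label)
    return grouped


sample = ["japan", "pdf", "19th century", "fear", "india", "diagram"]
grouped = group_filters(sample, GROUPS)
```

Labels that land in "other" (here, "fear") are often the most interesting ones to follow up, since they do not fit an obvious category.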

Digital Image Handling

The captured images are only a minuscule sample of available social imagery. These are not randomly selected, and it is hard to make claims about which images are collected through Google Search or why.

Cleaning digital image datasets. One of the strengths of being able to collect a large image set is that a lot of everything may be seen at once—like a giant proof sheet. However, this also means that not all of the data will be relevant.

Sometimes, glyphs (logos and icons) may be swept up in an image search. Or there may be an image that seems to have nothing to do with anything. Remember that an image search captures images as separate objects, detached from their original context of use. Duplicate images should be deleted (de-duplication).
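For exact duplicates, de-duplication can be partly automated by comparing file hashes. The following is a minimal sketch using only the Python standard library; the function names are illustrative. It reports groups of byte-identical files in one folder and leaves the decision about which copy to keep to the researcher.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(folder):
    """Return lists of paths whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

This approach only catches identical files. The same picture saved at a different size or compression level produces a different hash and would still need to be spotted by eye or with a perceptual-similarity method.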

In a research context, it is important to document what imagery was removed and why. (And there should be a pristine backup master image set of all images scraped in case it is necessary to review what was captured and to bring back some imagery.) Deletions of images from consideration in research should be non-destructive ones, so if those images are seen to be relevant later, they may be accessed and studied.
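A non-destructive removal with a written record can be sketched as follows. The folder layout, the log file, and its column names are assumptions chosen for illustration. The image is moved out of the working set rather than deleted, and one row is appended to a CSV log stating what was removed, when, and why.

```python
import csv
import shutil
from datetime import date
from pathlib import Path


def exclude_image(image_path, excluded_dir, log_path, reason):
    """Move an image to an 'excluded' folder and log the reason in a CSV file."""
    image_path = Path(image_path)
    excluded_dir = Path(excluded_dir)
    excluded_dir.mkdir(parents=True, exist_ok=True)

    target = excluded_dir / image_path.name
    if target.exists():
        # Refuse to overwrite an earlier exclusion with the same file name.
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(image_path), str(target))

    is_new_log = not Path(log_path).exists()
    with open(log_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new_log:
            writer.writerow(["date", "file", "reason"])
        writer.writerow([date.today().isoformat(), image_path.name, reason])
    return target
```

Because the file is moved and not deleted, an excluded image can be returned to the working set later, and the log gives a reviewable account of the cleaning decisions. The pristine master copy of the full image set should still be kept separately.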

Staying legal. Simply because imagery is open-access and broadly publicly available does not mean that it is necessarily released to the public domain or licensed for open-source usage. Also, when coding images en masse, it is not practical to run a reverse image search on every image (such as through TinEye Reverse Image Search) to find out who the original owners are and what the terms of use might be. There are no limits to studying available imagery, at least in the U.S. context, but there are limits to the reproduction of others’ images beyond thumbnails because of intellectual property laws.

Conclusion

What researchers do with the social imagery and resulting data will vary, but it’s likely that new insights will be discovered. There’s a lot to be said for engaging social imagery!

References

Eveleth, R. (2015, Nov. 2). How many photographs of you are out there in the world? The Atlantic. https://www.theatlantic.com/technology/archive/2015/11/how-many-photographs-of-you-are-out-there-in-the-world/413389/.

Winston, P. (2014, Jan. 10). 15. Learning: Near Misses, Felicity Conditions [Lecture]. MIT OpenCourseWare. https://youtu.be/sh3EPjhhd40.

About the Author

Shalin Hai-Jew works as an instructional designer at Kansas State University. Her email is shalin@k-state.edu.

C2C Digital Magazine (Fall 2017 / Winter 2018)
