Cherchez le texte: Proceedings of the ELO 2013 Conference

A Network Analysis of Dissertations About Electronic Literature

by Jill Walker Rettberg & Scott Rettberg
This essay was published in the Electronic Book Review, July 2014.

Abstract

More than 60 dissertations in the field of electronic literature have been documented in the ELMCIP Electronic Literature Knowledge Base, including tags, abstracts and in most cases links to full texts of the dissertations. This paper performs a network analysis of the citations in 29 of these dissertations to identify trends, patterns and information about an emerging canon.

Introduction

The first dissertation about electronic literature was written in 1976 by James Richard Meehan, and was titled The Metanovel: Writing Stories by Computer. Since then, at least 60 more dissertations have been written on electronic literature. This vast corpus of research literature about electronic literature has not previously been studied as a whole.

This paper begins to map creative works in electronic literature as represented by dissertations on electronic literature. Information about references to creative works in the dissertations has been entered into the ELMCIP Electronic Literature Knowledge Base, and keywords and tags for each dissertation were also registered. The data was then imported into the open source network analysis software Gephi, where it was visualised and analysed.

Changing Topics, Shifting Concerns

Nobody familiar with the growth of the field of electronic literature over the last few years will be surprised to hear that more and more doctoral students are choosing to write their dissertations about electronic literature. In the first years, dissertations on electronic literature were occasional events, but in the last decade, there have consistently been several published each year, which allows for a critical mass of new scholars who can respond to each other’s work.

In this analysis my goal was not to select a representative sample but to analyse all dissertations on electronic literature, although without a doubt there are many dissertations missing from my data. Ideally we would have data about and the full text of every PhD dissertation written at any university in any country about electronic literature. In practice, the Electronic Literature Research Group at the University of Bergen has done its best to find as many dissertations as possible. We have searched archives such as ProQuest and Google Scholar for “electronic literature”, but just as importantly we have made a point of entering information about every dissertation we came across, saw referenced or read a mention of. Fortunately all the data is documented in the ELMCIP Electronic Literature Knowledge Base, and as new dissertations are recorded in the knowledge base it will be simple for us or other researchers to re-run our analysis with the larger corpus.

Most dissertations are documented online in institutional repositories. Usually at least an abstract is available, often the author has also provided tags or keywords, and in many cases the full text of the dissertation is available. We also added tags when these were not given by the author, and based this on the abstract, and the full text if we had access to it. When we could not find the full text online we attempted to contact the author to ask for a copy. There are, however, some dissertations we could not access beyond the title and abstract. The Knowledge Base allows cross-referencing between nodes, so we have added cross-references (links) from the entry about each dissertation to the creative works it discusses, although this was not possible for dissertations we did not have full text access to. A few dissertations have been left out because we could not find a person able to read the language of the dissertation or the works referenced well enough to register the data - unfortunately, Xiaomeng Lang’s work on Chinese electronic literature (Lang 2008) is one of these, and Asian electronic literature is sorely underrepresented in the Knowledge Base. We hope to be able to include such material in future analyses.

Simple word clouds of the tags used to describe the dissertations clearly show a shifting focus in the study of electronic literature over the years. The first dissertation on electronic literature was James Meehan’s “The Metanovel: Writing Stories by Computer” (1976), followed by Mary Ann Buckles’ “Interactive Fiction: The Computer Storygame 'Adventure'” (1985) and then a steady stream from 1991 onwards.

I split the dissertations into three groups: 1976-2001, 2003-2007 and 2008-2012. We found no dissertations on electronic literature published in 2002. The first time period shows a clear focus on technology and on the mechanics of reading these kinds of text, as shown in the tag clouds in Figure 1. The tag clouds are generated from tags assigned to the dissertations in the Knowledge Base. When the dissertation author specified keywords on the dissertation we used these, and in other cases we added them based on the description and content of the dissertation. The tag clouds were generated using an online tool that displays more frequently used tags in a bigger font. Colors do not signify anything in these images.

As you can see in Figure 1, Hypertext is by far the most common tag (10 of 15 dissertations) and the next most popular tags are only used in three dissertations each (fiction, text, reading, computer) followed by five tags used in two dissertations each (computational, interactive, new media, theory, cybertext).

The next batch of dissertations was published from 2003 until the end of 2007. We see hypertext is still a very important tag, but the focus on technology and the mechanical is dwarfed by more literary words: fiction, literature, narrative, as well as by words showing the broadening of the field: game, art, media.

The third group of dissertations covers the period from 2008 to 2012, and the most obvious shift is the way “hypertext” has disappeared and “digital” has taken its place. “Poetry” has grown a lot, and “fiction” is barely mentioned. The more general term “literature” has shrunk a little, and we also see how “media” has grown steadily over the three periods. The last few years have seen several dissertations on quite specific genres: interactive fiction, kinetic poetry and generative texts, and this is reflected in the more genre-focused vocabulary of current dissertations.

Diversity is the Rule

After going through the dissertations [28 in this draft, close to 60 when completed] and adding links in the Knowledge Base to the creative works they referenced, I was able to see clear patterns in the ways that the dissertations discuss creative works. The most important finding is that PhD dissertations about electronic literature reference a great diversity of creative works.
As of early July 2013, references to creative works are added for 29 of the 33 dissertations published from 2006-2013 that we have records for in the Knowledge Base. The remaining five dissertations were not included either because we lacked access to the full text, or, in the case of two dissertations, due to language challenges (Hungarian and Chinese). These could be added in later analysis.

A total of 341 separate works were referenced in the dissertations, and there was a lot less redundancy than one might have expected. 265 creative works were only cited by one of the 28 dissertations. 57 were cited by two dissertations, and just 19 by three or more dissertations. On average, then, each dissertation cites 16 different creative works. In practice, of course, some cite just a few works while others cite many, but certainly most dissertations cite many creative works. This suggests that the claim that scholarship in electronic literature ignores the creative works and only focuses on theory is at least not the case in current dissertations.

This preliminary corpus gives us the following (preliminary) “top twenty” list of the most cited creative works in dissertations:

Title of work	Citations
afternoon, a story	9
Victory Garden	7
Patchwork Girl	7
Cent mille milliards de poèmes	6
Zork 1: The Great Underground Empire	6
The Impermanence Agent	5
ELIZA	5
Façade	5
my body--a Wunderkammer	4
Lexia to Perplexia	4
Deadline	4
The Last Day of Betty Nkomo	3
Sooth	3
Composition No. 1	3
Text Rain	3
The Legible City	3
The Unknown	3
The Policeman's Beard is Half-Constructed	3
Colossal Cave Adventure	3
JABBER: The Jabberwocky Engine	2

There are no great surprises in the list; these are all familiar works, although it’s interesting to note that most of the works are early and well-known examples in their genres.
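The counts reported above can be combined into a rough check of the average. The following is a minimal sketch, not part of the original analysis: it takes the figures for works cited once and twice from the text, and copies the counts for the 19 works cited three or more times from the table. Because the text mentions both 28 and 29 dissertations, the average is computed for both.

```python
# Figures reported in the text: works cited by exactly one / exactly two dissertations.
cited_once = 265
cited_twice = 57

# Citation counts of the 19 works cited three or more times, copied from the table.
cited_three_plus = [9, 7, 7, 6, 6, 5, 5, 5, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3]

# Every work is counted once here, however often it is cited.
distinct_works = cited_once + cited_twice + len(cited_three_plus)

# Every reference (dissertation -> work) is counted here.
total_citations = cited_once * 1 + cited_twice * 2 + sum(cited_three_plus)

print("distinct works:", distinct_works)
print("total citations:", total_citations)

# The text gives both 28 and 29 as the number of dissertations analysed.
for n_dissertations in (28, 29):
    average = round(total_citations / n_dissertations, 1)
    print(n_dissertations, "dissertations:", average, "works each on average")
```

Under these assumptions the total comes to 341 distinct works and 465 references, which gives an average of roughly 16 works per dissertation for either corpus size.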

Clusters of Works

Network analysis allows us to see connections between the dissertations and the works they discuss as a network graph. This kind of visualization and analysis has its roots in social network analysis, a sociological methodology that started in the 1950s, and in network theory, a mathematical methodology allowing us to understand networks in, for instance, molecular structures, the spread of disease, airline traffic networks or mobile phone usage.

The connections between dissertations and the creative works they cite form a bipartite or two-mode network, which means there are two types of nodes: dissertations and creative works. A reference from a dissertation to a creative work forms a connection or an edge in this network.

Figure 2 shows the complete network of dissertations and creative works that are referenced. Dissertations are shown in blue and creative works in red. Creative works have been sized according to their indegree, that is to say the number of inbound links or the number of different dissertations that reference them. The most-cited creative works tend to cluster in the middle of the network, and you can see other clusters around the edges. Just one dissertation in the sample, Anders Løvlie’s 2011 Textopia: Experiments with Locative Literature, has no shared references with the others, and so it has floated out to the left with no connections to the main network. The blue spots with no connections are the dissertations that are in the Knowledge Base but that have not yet had references to creative works added.
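To make the notion of indegree concrete, here is a minimal sketch in plain Python rather than Gephi. The edge list is only a handful of references named elsewhere in this essay, not the Knowledge Base data, and the short node labels are my own shorthand.

```python
from collections import Counter

# Directed edges (dissertation -> creative work). Only a few references named
# in this essay are listed; the full dataset lives in the Knowledge Base.
edges = [
    ("Engberg 2009", "The Last Day of Betty Nkomo"),
    ("Naji 2012", "The Last Day of Betty Nkomo"),
    ("di Rosario 2011", "The Last Day of Betty Nkomo"),
    ("Montfort 2007", "Anchorhead"),
    ("Leavenworth 2010", "Anchorhead"),
]

# In a bipartite (two-mode) network the two node types never overlap.
dissertations = {source for source, _ in edges}
works = {target for _, target in edges}
assert dissertations.isdisjoint(works)

# Indegree of a work = number of different dissertations that reference it.
# set(edges) guards against the same reference being entered twice.
indegree = Counter(target for _, target in set(edges))

for work, count in indegree.most_common():
    print(work, count)
```

In the figure, a work's node size would scale with this count.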

Looking more closely at the graph, we can see clusters emerging around certain genres of electronic literature. For example, at the top of Figure 2 we see a cluster of dissertations about interactive fiction, shown in the detail view in Figure 3. Nick Montfort’s 2007 dissertation Generating Narrative Variation in Interactive Fiction cites a fan of works that are not cited by other dissertations, but also many that are cited in Jeremy Douglass’s 2007 dissertation Command Lines: Aesthetics and Technique in Interactive Fiction and New Media. Montfort has one shared reference, Anchorhead, with Van Leavenworth’s 2010 dissertation The Gothic in Contemporary Interactive Fictions, which also has a number of shared references with Douglass’ dissertation.

We see that Façade and ELIZA, which both center on conversational characters, and Zork, the first commercial interactive fiction, are referenced by the three dissertations on interactive fiction, but that they are pulled away from it in the layout of the graph. That is because they are also referenced by many other dissertations that do not reference other interactive fictions. This is where the network visualization really begins to get interesting, because it allows us to see how the different creative works relate to each other. Façade and ELIZA are in a bridge or broker position between dissertations on interactive fiction and dissertations on generative narrative and poetry, by Noah Wardrip-Fruin (2006), Fox Harrell (2007) and Daniel C. Howe (2009). This generative cluster is circled in green in Figure 3.

The interconnections are not always as clearly marked by genres as in these cases. If we zoom in to the bottom of the overall graph in Figure 2, we see a different kind of network represented, as shown in Figure 4.

The four dissertations in this part of the network primarily reference newer works, and as you can see by the fans of red creative works around each blue dissertation, they mostly reference works that are not discussed by any of the other dissertations in the sample. Maria Engberg’s Born Digital: Writing Poetry in the Age of New Media (2009) is shown at the top left, Leonardo Flores’ Typing the Dancing Signifier: Jim Andrews' (Vis)Poetics (2010) at the top right, Jeneen Naji’s Poetic Machines: an Investigation into the Impact of the Characteristics of the Digital Apparatus on Poetic Expression (2012) at the bottom left and Giovanna di Rosario’s Electronic Poetry: Understanding Poetry in the Digital Environment (2011) at the bottom right.

We see that some creative works pull these dissertations towards each other. Young Hae Chang Heavy Industries’ The Last Day of Betty Nkomo is discussed by Engberg, Naji and Rosario, while the other works are only discussed by two of the four dissertations.
This way of representing the scholarship and the creative works about electronic literature can suggest interesting genre relationships, or provide a way of visualizing what we already know. It is not very controversial to note that interactive fiction and conversational characters are related, but perhaps it is a little less obvious that interactive fiction is closely related to generative narrative, and that conversational characters are a broker between these genres, as suggested by Figure 2.

The portion of the network shown in Figure 4 might be most interesting as a way of navigating the field. Seeing the works laid out in this manner would be useful for newcomers to the field or anyone looking for new works to read and explore, as is particularly evident in the web-based browsable versions of networks like these presented in Scott Rettberg’s paper at this conference (Scott Rettberg 2013).

Relationships Between Creative Works

Another way of viewing the network is to convert it to a one-mode network: that is, to remove the dissertations from the network and instead view two creative works as being directly connected to each other if they are discussed in the same dissertation. Rather than saying that Engberg’s, Naji’s and Rosario’s dissertations are connected to each other through their shared referencing of The Last Day of Betty Nkomo, a one-mode graph looking only at the creative works would instead say that The Last Day of Betty Nkomo is connected to the other creative works discussed in those dissertations.

Using Jaroslav Kuchar’s Multimode Networks Transformations plugin for Gephi I then converted the first bipartite network, consisting of dissertations connected to the creative works they referenced, into a monopartite or one-mode network consisting only of creative works. The original network had directed edges, because there was a clear direction of references: from a dissertation to a creative work. Kuchar’s plugin requires undirected edges, so for the plugin to work, I exported the network as two CSV files, one for the nodes and one for the edges. I changed the edge type to undirected in Excel, then imported both CSV files into Gephi again and ran the plugin, asking it to remove the dissertation nodes.
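The idea behind the one-mode conversion can be sketched in a few lines of plain Python. This is an illustration of the general technique, not a reimplementation of the Gephi plugin. The function name `project_to_works`, the labels D1, D2, A, B, C and the choice to weight each edge by the number of shared dissertations are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def project_to_works(edges):
    """Turn (dissertation, work) references into undirected work-work edges.

    Two works are connected if at least one dissertation cites both.
    The weight counts how many dissertations they share (an assumption
    of this sketch, not necessarily what the Gephi plugin stores).
    """
    works_by_dissertation = defaultdict(set)
    for dissertation, work in edges:
        works_by_dissertation[dissertation].add(work)

    weights = defaultdict(int)
    for works in works_by_dissertation.values():
        # Sorting gives each undirected pair one canonical (a, b) key.
        for a, b in combinations(sorted(works), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy data: D1 cites works A, B and C; D2 cites B and C.
toy_edges = [("D1", "A"), ("D1", "B"), ("D1", "C"), ("D2", "B"), ("D2", "C")]
one_mode = project_to_works(toy_edges)

for (a, b), weight in sorted(one_mode.items()):
    print(a, "--", b, "shared dissertations:", weight)
```

The dissertation nodes disappear from the result, and the pair B and C, cited together by both toy dissertations, ends up with the heaviest edge.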

The ForceAtlas 2 layout algorithm organizes the graph a little differently each time you run it, but the edges and general clusters are always the same. Figure 6 shows the same cluster as figure 4 as a one-mode network.

As you can see in figure 6, the resulting network is extremely densely linked, because if there are fifty creative works referenced by a single dissertation, each of these works will have edges to each of the other forty-nine works.
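A quick calculation shows how fast this grows. The sketch below counts the undirected edges created when every work cited by one dissertation is linked to every other; the function name `clique_edges` is hypothetical.

```python
from math import comb


def clique_edges(n_works: int) -> int:
    """Number of undirected edges when n_works are all linked pairwise."""
    return comb(n_works, 2)  # n * (n - 1) / 2


# A single dissertation citing fifty works contributes this many edges.
print(clique_edges(50))
```

A single dissertation citing fifty works thus contributes 1225 edges on its own, which is why unfiltered one-mode graphs become hard to read.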

It is more useful to filter out creative works that are only referenced by one dissertation before the transformation to a one-mode network. This leaves 105 nodes (including dissertations) and 200 edges between them. Transforming this to a one-mode network deletes all the dissertations, and we are left with 76 nodes and 1226 edges.
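The filtering step can be sketched as follows, again in plain Python rather than Gephi. The function name `drop_singly_cited` and the toy labels are hypothetical. The second half of the block only re-derives numbers from figures already given in this essay.

```python
from collections import Counter
from math import comb


def drop_singly_cited(edges):
    """Keep only references to works cited by at least two dissertations."""
    unique_edges = set(edges)
    indegree = Counter(work for _, work in unique_edges)
    return sorted(edge for edge in unique_edges if indegree[edge[1]] >= 2)


# Hypothetical toy data: W1 is cited twice, W2 and W3 once each.
toy_edges = [("D1", "W1"), ("D1", "W2"), ("D2", "W1"), ("D2", "W3")]
print(drop_singly_cited(toy_edges))

# Consistency check on the figures reported in the text.
works_kept = 57 + 19                   # works cited by two, and by three or more, dissertations
dissertations_kept = 105 - works_kept  # the remaining nodes in the filtered two-mode network
density = 1226 / comb(works_kept, 2)   # share of possible work-to-work edges present
print(works_kept, dissertations_kept, round(density, 2))
```

The 57 works cited twice and the 19 cited three or more times add up to the 76 nodes of the one-mode network. The reported 1226 edges are a little under half of the edges that 76 nodes could have, so the filtered graph is still dense.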

The genres in this graph are quite clear. The modularity algorithm groups works into just five modularity categories. At the top, in green, we see generative works, with the Oulipoan print classic Cent mille milliards de poèmes an important hub in this group. On the upper right, interactive fiction and classic conversational characters form one cluster, in yellow. In the center, in red, we see classic hypertext fictions and what we perhaps might call “mainstream” or “classic” works of electronic literature. The lower two clusters are interesting. They both represent more visual and kinetic poetry, but it’s not clear why they are as separate as they appear in this representation. Admittedly only three or four dissertations have determined each of these two clusters, but it is surprising that they appear really quite cohesive and yet separate. The purple cluster can perhaps be thought of as kinetic poetry or digital poetry, whereas the blue cluster leans more towards the art world.

This representation of the field does leave us with some questions. There are individual works where the placement seems surprising. Why is Donna Leishman’s Deviant: The Possession of Christian Shaw (2004) in the generative narratives group? Why is Composition No. 1, which is a printed novel in loose leaf format, in the lower right part of the red “classics” cluster, near the art poetry, rather than being more closely connected to the generative works? Perhaps the answers lie in the idiosyncrasies of the dissertation authors, rather than in any generic qualities of the works. And yet even with just 28 dissertations for our corpus, the overall graph corresponds well to real-world genres.

Ways of Visualizing a Field

Visualizing the field of electronic literature by looking at creative works cited by dissertations provides one view of the field that is difficult to see by other means. The clusters do correspond to genres, unsurprisingly, since dissertation authors especially in the last few years tend to focus on a particular kind of electronic literature rather than try to write about it all, as was perhaps a greater temptation in the earlier days of the field.

An important advantage of using the Knowledge Base to visualize the field is that this makes it possible to include and even center the analysis around creative works, which are not part of library catalogs or other systems that co-citation analysis might typically pick up.

In a data sprint at the Digital Methods Winter School at the University of Amsterdam this January, I worked with fellow scholars on an alternative way of visualizing the field of electronic literature: by feeding a selection of “seed books” in the field into the Amazon advertising API and retrieving books that are also bought by people who read the seed books. A full account may be read in Berry, Borra, Helmond, Plantin and Rettberg (2015). It was immediately apparent that while our digital humanities seed books generated a fairly cohesive network that gave a fairly clear idea of what the digital humanities might be, the electronic literature seed books generated a far more disparate network.

While the Amazon related books network for electronic literature does show interesting adjacent fields, such as the conceptual writing books in figure 9, a large and cohesive cluster on games studies, as well as a strong cluster on digital humanities, the lack of a strongly cohesive cluster of books specifically on electronic literature can be read as evidence that the field of electronic literature is not defined by books. There is little surprise in that, of course, though it is interesting that the digital humanities, which is also nominally centered around digital projects and methodologies, is far more clearly defined by its books.

The Amazon related books network does include three works of electronic literature: three of the early Storyspace classics are of course for sale on Amazon. People who buy Hayles’ Electronic Literature, Landow’s Hypertext 3.0 or Murray’s Hamlet on the Holodeck, apparently frequently also buy afternoon, a story, and people who buy afternoon often buy Patchwork Girl or Victory Garden - or techno-utopian or -dystopian books like Clay Shirky’s Here Comes Everybody or Evgeny Morozov’s Net Delusion.
As we detail in the paper discussing the Amazon related books networks, there are many possible methodological flaws and sources for error in the analysis, but it does serve as an interesting mirror view of the field of electronic literature that is, as the digital methods creed demands, completely based on data that is digitally native rather than curated by experts as the ELMCIP Knowledge Base is.

Dissertations as Collective Curation

As shown by this and related research using network analysis to analyse data from the ELMCIP Electronic Literature Knowledge Base, there is much interesting work that can be done. The Knowledge Base offers a promising platform for scholars interested in understanding the big picture of electronic literature, whether one is most interested in creative works or in scholarship about and around them.

In effect, this paper repurposes dissertations on electronic literature, using them as a form of collective curators of the creative works in the field. Individually, the scholars writing these dissertations were not intending their selection of a particular set of creative works to be appropriated by a scholar such as myself in order to map the field. Perhaps they would have chosen differently if they had realized that their work would be used in this way. If a scholar currently writing a dissertation on electronic literature hears or reads this paper, they may think differently about their selection. Perhaps the knowledge that their references will transform to links that can be used to visualize and understand a field will bring us full circle: if our references really do become links, then of course our collective scholarship should be viewed as a collective hypertext. Using visualization software such as Gephi, we can make the map view of our collective field.

Acknowledgements

I could not have done this work without the support and active participation of the University of Bergen Electronic Literature Research Group and of the ELMCIP project, both led by Scott Rettberg. While the analyses in this paper are my own, a lot of the data was entered by fellow research group members and by the many other contributors to the Knowledge Base. My switching use of “we” and “I” reflects this shared labor.