Visible Prices

Visible Prices: Technical Statement

One of the major challenges of working with economic data is its heterogeneity. Some prices will be fixed, and others variable. Some wages will include non-monetary supplements: 6 shillings a day, plus beer and potatoes. XML-based encoding languages (which have been highly popular for digital humanities and scholarly editing projects) assume that the things you're encoding (plays by Shakespeare, poems by late Victorians) will look relatively similar, so that the same tags can be used to describe them.

MySQL and other relational databases, which use tables rather than markup language, assume that if you have a table with 5 columns that describe certain qualities, most of the entries in the table will have data in each column. Otherwise, the structure becomes rickety: queries are harder to construct, and they are more likely to return errors or crash the database.

Choosing a platform has been a long process, because learning enough to evaluate how well a particular tool will work with the data is slow work. It seems probable to me that if I had been willing to limit my scope (say, to prices related to governesses' salaries, or to prices in a particular author's body of work), MySQL or TEI might have worked more effectively. But a smaller project, while more immediately gratifying, wouldn't have taught me nearly as much as a monumental project.

In July 2013, I attended the Digital Humanities Oxford Summer School, and took a crash course in semantic web programming, focusing on two closely related specifications: OWL (Web Ontology Language) and RDF (Resource Description Framework). Both OWL and RDF are intended to model complex data. You've probably encountered them before: they provide the structure behind resources like Wikipedia, and the databases of music metadata that iTunes uses to identify your CDs. Semantic web description attempts to capture as much detail as it can, and to make that detail searchable.

The basic unit of RDF, on which OWL is built, is called a triple, and it contains a subject, a predicate, and an object. For example:

Subject: Jane Eyre Predicate: hasAuthor Object: Charlotte Bronte

Subject: Charlotte Bronte Predicate: hasBirthdate Object: April 21, 1816

Triples are linked together to form what semantic web programmers call a graph. (To outsiders, it looks more like a cluster.) For example, the graph for Charlotte Bronte would involve both of the above triples, as well as several others, including triples that would tell you that Bronte has two sisters, used the pseudonym Currer Bell, etc.
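As a rough illustration of how triples link into a graph, the sketch below stores the two example triples, plus one for the pseudonym, as plain Python tuples. This is not OWL or RDF syntax: the tuple layout, the predicate name hasPseudonym, and the helper names describe and mentions are assumptions made only for this example.

```python
# Each triple is a (subject, predicate, object) tuple; a graph is a set of them.
# "hasPseudonym" is a hypothetical predicate name used for illustration.
graph = {
    ("Jane Eyre", "hasAuthor", "Charlotte Bronte"),
    ("Charlotte Bronte", "hasBirthdate", "April 21, 1816"),
    ("Charlotte Bronte", "hasPseudonym", "Currer Bell"),
}


def describe(graph, subject):
    """Return every (predicate, object) pair recorded for one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


def mentions(graph, node):
    """Return every triple in which the node is the subject or the object."""
    return sorted(t for t in graph if t[0] == node or t[2] == node)


for predicate, obj in describe(graph, "Charlotte Bronte"):
    print(predicate, "->", obj)
```

Because "Charlotte Bronte" is the object of one triple and the subject of two others, mentions(graph, "Charlotte Bronte") returns all three. That shared node is what joins separate statements into a graph.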

Triples are queried using SPARQL (SPARQL Protocol and RDF Query Language, pronounced “sparkle”), a language that matches a pattern against the subject, predicate, or object of each triple (any one of them, any two, or all three) and returns the information that matches.

So, you might write a SPARQL query that asks for all the novels whose “hasAuthor” predicate points to “Charlotte Bronte.” Alternatively, you might write a query that asks for all the novels written by authors with pseudonyms, and specify that the pseudonym include the name “Bell.” This would return the Bronte sisters’ works, along with the works of any other authors whose pseudonym included “Bell.”
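The matching idea behind those two queries can be sketched in plain Python. This is not SPARQL and does not use a real triple store. It is a minimal illustration, assuming a small hand-made list of triples, a hypothetical hasPseudonym predicate, and hypothetical helper names (match, novels_by, novels_by_pseudonym_containing).

```python
# Hypothetical sample data; "hasPseudonym" is an assumed predicate name.
TRIPLES = [
    ("Jane Eyre", "hasAuthor", "Charlotte Bronte"),
    ("Villette", "hasAuthor", "Charlotte Bronte"),
    ("Wuthering Heights", "hasAuthor", "Emily Bronte"),
    ("Agnes Grey", "hasAuthor", "Anne Bronte"),
    ("David Copperfield", "hasAuthor", "Charles Dickens"),
    ("Charlotte Bronte", "hasPseudonym", "Currer Bell"),
    ("Emily Bronte", "hasPseudonym", "Ellis Bell"),
    ("Anne Bronte", "hasPseudonym", "Acton Bell"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def novels_by(triples, author):
    """First query: every subject whose hasAuthor object is the given author."""
    return [s for s, _, _ in match(triples, predicate="hasAuthor", obj=author)]


def novels_by_pseudonym_containing(triples, fragment):
    """Second query: find authors whose pseudonym contains the fragment,
    then collect the novels those authors wrote."""
    authors = {
        s for s, _, o in match(triples, predicate="hasPseudonym") if fragment in o
    }
    return [s for s, _, o in match(triples, predicate="hasAuthor") if o in authors]


print(novels_by(TRIPLES, "Charlotte Bronte"))
print(novels_by_pseudonym_containing(TRIPLES, "Bell"))
```

The second function chains two patterns through a shared value (the author), which is roughly what a SPARQL query does when the same variable appears in two triple patterns.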

The advantage of OWL, and other semantic web specifications, is that they can handle my highly heterogeneous data without crashing. They balance structure and flexibility, modeling data from sources that may have significantly different types of metadata without sacrificing expressivity in representing detailed features of the source texts. This means that OWL and semantic web programming are a good fit for Visible Prices.
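To make the contrast with a fixed-column table concrete, here is a small sketch using the wage example from the opening paragraph (6 shillings a day, plus beer and potatoes) alongside a plain cash salary. The column names, the record identifiers wage1 and wage2, and the predicates hasCashAmount, hasPeriod, and hasSupplement are all hypothetical, not terms from the Visible Prices ontology.

```python
# A fixed-column row must leave gaps (None) for qualities a record lacks.
COLUMNS = ("amount", "period", "supplement_1", "supplement_2")
table_rows = [
    ("6 shillings", "day", "beer", "potatoes"),
    ("30 pounds", "year", None, None),  # no supplements: two empty cells
]

# As triples, each record carries only the statements that apply to it.
triples = [
    ("wage1", "hasCashAmount", "6 shillings"),
    ("wage1", "hasPeriod", "day"),
    ("wage1", "hasSupplement", "beer"),
    ("wage1", "hasSupplement", "potatoes"),
    ("wage2", "hasCashAmount", "30 pounds"),
    ("wage2", "hasPeriod", "year"),
]


def supplements(triples, record):
    """Return the non-monetary supplements recorded for one wage."""
    return [o for s, p, o in triples if s == record and p == "hasSupplement"]


def empty_cells(rows):
    """Count the cells a fixed-column table had to leave blank."""
    return sum(value is None for row in rows for value in row)


print(supplements(triples, "wage1"))
print(supplements(triples, "wage2"))
print(empty_cells(table_rows))
```

A wage with three supplements would need a new column in the table, but only one more triple in the graph.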

Like TEI, which encourages users to customize markup language for their needs, and to develop new terms and categories, OWL allows users to develop new vocabularies for particular subjects, and to share them, making them available for other similar projects. This is why semantic web programming is often referred to as “linked open data.” It’s meant to be open and shareable, meaning that if another scholar developed a digital humanities project, focusing only on Charlotte Bronte, they could utilize my data on the prices that show up in Bronte’s novels.

The vocabularies that semantic web programmers develop are called ontologies, because they define concepts and relationships within a specific area. If you’ve worked with metadata, then you may have made use of the Dublin Core ontology.

A semantic web database can also make a previously unwieldy set of data usable. One of the difficulties of building Visible Prices has involved the non-decimal currency values for British money before 1971. No existing database has indexed those currency values, from one farthing upwards. As a result, the values that you might see in texts (1 shilling or 5 pounds or 20 guineas) are arguably data, but they’re not good data, because they’re much harder to work with. Making such an index would be slightly tedious (though much of the process could be automated), but once created, it would transform currency amounts from unwieldy to usable objects.
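As a hedged sketch of how the automated part of such an index might work, the code below reduces every amount to a count of farthings so that values in different units become comparable. It assumes the standard pre-1971 ratios (4 farthings to the penny, 12 pence to the shilling, 20 shillings to the pound) and counts the guinea at 21 shillings. The function names are hypothetical, and this is not the project's actual dataset.

```python
# Pre-1971 British units expressed in farthings, the smallest unit used here.
FARTHINGS_PER_UNIT = {
    "farthing": 1,
    "penny": 4,       # 4 farthings to the penny
    "shilling": 48,   # 12 pence to the shilling
    "pound": 960,     # 20 shillings to the pound
    "guinea": 1008,   # assumed here to be 21 shillings
}


def to_farthings(amount, unit):
    """Convert a whole-number amount in one unit to farthings."""
    return amount * FARTHINGS_PER_UNIT[unit]


def to_pounds_shillings_pence(farthings):
    """Split a farthing count into (pounds, shillings, pence, farthings)."""
    pounds, rest = divmod(farthings, 960)
    shillings, rest = divmod(rest, 48)
    pence, farthings_left = divmod(rest, 4)
    return pounds, shillings, pence, farthings_left


# The three amounts mentioned in the text, on one comparable scale.
for amount, unit in [(1, "shilling"), (5, "pound"), (20, "guinea")]:
    print(amount, unit, "=", to_farthings(amount, unit), "farthings")
```

Storing one integer per price is a design choice for the sketch: it makes sorting and comparison trivial, and the original units can still be kept alongside it.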

Creating the dataset for pre-1971 currency is part of my continuing work on the Visible Prices project. But I’ll also be working to create my own ontology that allows me to encode prices into my database. There are parts of my database that will make use of existing vocabularies, like Dublin Core. Other parts of it will require me to develop my own terms, for things like wages that are supplemented with non-monetary goods such as alcohol. Because my own semantic web programming knowledge is still relatively new, I’ll be consulting with a professional web ontologist this spring as I work out the structure. This work will be supported by a Small Project Grant from the European Office of Digital Humanities.

Developing my own ontology is a significant step forward for Visible Prices. It will allow me to populate the database, and set up an interface through which users can query my data. Once that database is set up, Visible Prices will be ready to grow at a much faster rate. At that point, I’ll be ready to seek large-scale grants for its ongoing expansion and support.

This page is referenced by:

Introduction: Visible Prices Prototype
Note: You can explore this website on your own tablet or laptop at http://www.tinyurl/vpmla14.
Welcome to the latest prototype of my ongoing digital humanities project, Visible Prices (VP). This demo was created for display at MLA 2014, as part of the DH From the Ground Up panel, and focuses on one 10-year period in the 19th century, 1845-54. I chose this span because I indexed prices in Charles Dickens' novel The Personal History of David Copperfield, which was published during 1849 and 1850. While VP will eventually be much larger, this is a small prototype, designed for experimenting with Scalar. You can navigate it using the path below, which is also available via the left-hand column.
I've been working on VP since 2009. I had noticed that references to specific prices for specific incomes, goods, services, and experiences appeared regularly in a variety of genres; and also, that these prices were all but invisible in the texts, and difficult for my students to understand.
My original question involved Charlotte Bronte's Jane Eyre. In the novel, Jane is offered £30 per annum as governess. But what does £30 mean for Jane? Is it a generous salary, or an impecunious one? What would we understand about Charlotte Bronte's novel if we knew what £30 would buy?
Though inflation calculators exist, like this one, created by the Bank of England, and based on the Consumer Price Index, such calculators communicate something that most users already know: that the value of money changes over time; and that enough money to live on in 1847 would be insufficient to live on in the 21st century.

This introduction is the start of a path that will take you through a subset of the full collection, which I used to test out Scalar, to see whether it would work as a platform. I've also included a statement on the technical side of building VP and on the next steps I'll be taking as the project continues.

This page references:

Understanding British currency values before 1971