◎ JADH2016

Sep 12-14, 2016 The University of Tokyo

MEDEA (Modeling semantically Enriched Digital Edition of Accounts) as Historical Method[*1]
Kathryn Tomasek (Wheaton College)

MEDEA, formed in 2014, is a cooperative project among historians in Germany, Austria, and the United States who are interested in developing models for digital scholarly edition of account books for comparative historical analysis. Our goals are data modeling and expanding the community of practice for these activities. We have spent the past year introducing these ideas to scholars in Europe and the United States, holding workshops in Regensburg in October 2015 and at Wheaton College in Massachusetts in April 2016. Georg Vogeler, Professor of Digital Humanities at the Austrian Centre for Digital Humanities, Centre for Information Modeling, at Karl Franzens University in Graz, is testing files created by other members of our community against his bookkeeping ontology.

Problem Space: Accounts

Accounts of various sorts—municipal, state, organizational, merchant, and individual—are abundant in archives, but they are underutilized as sources at least in part because of the technologies that have been used to produce and analyze them. They have been important sources both for those who created them and for historians who have sought to understand economic changes over time and space. Historians have sampled such sources using social science methodologies for almost one hundred years, but few scholarly editions of accounts have been produced (Ciula, Spence & Veira 2008, Keating et al. 2010, Teehan & Keating 2010, Bolt & van Zanden 2014, Frantz & Sarnowsky 2014, Burghartz 2015).

At least some of the challenges in producing digital scholarly editions of accounts are related to the development of the very technologies that have been used to record accounts in the past several hundred years in the global North. Account books (codices containing lists of commodities, currencies, and services exchanged among people) developed over time into printed ledgers and spreadsheets: analog books and papers that could be used to record information about these exchanges in tabular format. The formats of the various cross-referenced books of accounts associated with the business of running cities, estates, mercantile operations, and other enterprises gave people opportunities to track inventories, obligations, and assets with a view to such questions as personal, organizational, or state or municipal wealth. (Discussion of accounts kept on clay tablets, papyri, scrolls, and other media that preceded the codex is omitted only as a reflection that MEDEA currently does not include any scholars who are working with sources in such forms; we welcome such scholars and the challenges their sources will bring.) Accounting practices are themselves a technology that has undergone changes over time and space (Ijiri 1975, Everest & Weber 1977, Bywater & Yamey 1982, McCarthy 1982, Wigley n.d., Mersiowsky 2000, Wang, Du & Lee 2002, Arlinghaus 2004, Vogeler 2005, Vogeler 2010).

Digital versions of this analog technology in the form of spreadsheet software, relational databases, and web-based forms, such as the business software XBRL-General Ledger, have the advantage of simplifying the tracking of sums, balances, and in fact most numerical or mathematical operations as well as producing visualizations. However, spreadsheet software handles semantic values much less efficiently. Information about which currencies, commodities, services, individuals, and geographical locations are referenced in exchanges between groups or individuals can easily be lost or misrepresented in spreadsheets. Even more flexible relational databases are often idiosyncratic in their references to such semantic values and fail to meet any sort of standard for interoperability, despite the considerable social scientific literature based on sampling from analog sources. Oxford historian Robert C. Allen has undertaken to assess the quality of data extracted from accounts and electronically available (Allen 2001, Allen et al. 2004, Allen 2014).

Possible Solutions: An Event-Based Ontology for Accounts

Vogeler and Tomasek have been working on somewhat parallel paths for the past decade or so. Both began from the position of the Guidelines of the Text Encoding Initiative (TEI) as an accepted method for producing stable humanities-oriented data, and both have sought to leverage the TEI’s position as a standard for such work to explore models for creating reusable and interoperable digital scholarly editions of accounts from original sources distant in time and space (Tomasek & Bauman 2013, Vogeler 2015).

Vogeler has outlined a preliminary version of an RDF model for comparing accounts. In describing this model, he has argued that the “transactionography” TEI customization that Tomasek and Bauman developed a few years ago amounted to “a simple ontology for accounting facts” (Vogeler 2016). Thus, the ontology that he has been developing begins from the transaction, incorporating the notion that

a transaction between two parties or accounts consists of at least one transfer from one to the other. It transfers a measurable and can be attested by text. The transfer occurs at a place. Booking a transfer into an account can create liabilities held by a party and owed to another (Vogeler 2016).

Vogeler borrows additional data types from XBRL-GL and TEI. These include monetary values, the entry, debit/credit, the balance, totals, and measure. And attending to the interests of historians, he adds prices, commodities, services, and conversions of measurements. Vogeler suggests further that common terms from the taxonomies developed by individual projects can be identified, exposed as RDF data, and described using the W3C’s Simple Knowledge Organization System (SKOS).
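The quoted definition of a transaction can be sketched as a small data model. The following is only an illustration in Python, not Vogeler's ontology itself: the class names, field names, and the sample sale are assumptions introduced here to show how a transaction made of one or more transfers of a measurable might be represented and queried.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass(frozen=True)
class Measurable:
    """Something that can be transferred (hypothetical class name)."""
    quantity: Decimal
    unit: str             # a currency or a unit of measure
    kind: str = "money"   # illustrative labels: "money", "commodity", "service"


@dataclass(frozen=True)
class Transfer:
    """One movement of a measurable from one party or account to another."""
    sender: str
    receiver: str
    what: Measurable
    place: Optional[str] = None        # where the transfer occurred, if known
    attested_by: Optional[str] = None  # the source text that records it


@dataclass
class Transaction:
    """A transaction consists of at least one transfer."""
    transfers: List[Transfer] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.transfers:
            raise ValueError("a transaction needs at least one transfer")

    def balance(self, party: str, unit: str) -> Decimal:
        """Net amount of `unit` that `party` received across all transfers."""
        total = Decimal("0")
        for t in self.transfers:
            if t.what.unit != unit:
                continue
            if t.receiver == party:
                total += t.what.quantity
            if t.sender == party:
                total -= t.what.quantity
        return total


# Invented example: a shop hands over sugar and receives money in return.
sale = Transaction([
    Transfer("shop", "customer", Measurable(Decimal("2"), "lb", "commodity"),
             attested_by="2 lb sugar"),
    Transfer("customer", "shop", Measurable(Decimal("0.25"), "USD")),
])
print(sale.balance("shop", "USD"))
```

Keeping each transfer as a separate record, rather than a single row with debit and credit columns, is what lets the same structure hold money, commodities, and services side by side.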

Along with Øyvind Eide, Vogeler presented slides at DH2016 that draw on CIDOC-CRM’s event-based modeling to point towards an ontology that can express both the human interactions and the accounting practices represented in account books (Eide & Orr 2009). Vogeler’s slide outlines the production of accounts as traces of human activity, historians’ interests in what accounts can tell us about the past, and the technologies most appropriate to creating digital surrogates susceptible to analysis. Eide’s slide illustrates how an event-based model of the activities that produced accounts can be expressed using principles from CIDOC-CRM. And Vogeler’s sketch of his bookkeeping ontology on GitHub offers a picture of its current status.

Fig 1. (Image Credit: Georg Vogeler, DH2016.)

Fig 2. (Image Credit: Øyvind Eide, DH2016.)

Fig 3. (Image Credit: Georg Vogeler, DH2016.)

Currently, Vogeler suggests using the TEI @ana attribute to add markup for this bookkeeping ontology. Such markup bypasses the need for the kind of “transactionography” described by Tomasek and Bauman, allowing markup of such information as “transfer,” “from,” “to,” or the ambiguous “between,” as well as “monetary value,” “what” was transferred, and whether the transfer was “mutual,” “multiple,” “unilateral,” or “enforced.” Example markup from my own project will show this ontology in use.
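To make the @ana approach concrete, here is a minimal sketch in Python that reads a small TEI fragment and collects the annotated parts of each transfer. The fragment is invented for illustration, and the pointer values (#bk_transfer, #bk_from, #bk_to, #bk_monetary_value) are hypothetical stand-ins for whatever identifiers the bookkeeping ontology actually defines; only the TEI elements and the @ana, @quantity, and @unit attributes are taken from the TEI Guidelines.

```python
import xml.etree.ElementTree as ET

# Invented sample; the "#bk_..." pointers are assumptions, not the real ontology terms.
SAMPLE = """
<div xmlns="http://www.tei-c.org/ns/1.0">
  <p ana="#bk_transfer">
    <name ana="#bk_from">Customer A</name> paid
    <measure ana="#bk_monetary_value" quantity="0.25" unit="USD">25 cents</measure>
    to <name ana="#bk_to">Shop B</name>.
  </p>
</div>
"""


def extract_transfers(xml_text):
    """Return one dict per element whose @ana contains '#bk_transfer'."""
    root = ET.fromstring(xml_text)
    transfers = []
    for element in root.iter():
        # @ana holds a whitespace-separated list of pointers.
        if "#bk_transfer" not in element.get("ana", "").split():
            continue
        record = {}
        for part in element.iter():
            for pointer in part.get("ana", "").split():
                if pointer == "#bk_transfer":
                    continue
                record[pointer.lstrip("#")] = {
                    "text": "".join(part.itertext()).strip(),
                    "quantity": part.get("quantity"),
                    "unit": part.get("unit"),
                }
        transfers.append(record)
    return transfers


for transfer in extract_transfers(SAMPLE):
    print(transfer)
```

Because the annotations sit on the transcription itself, the running text of the entry stays intact and no separate list of transactions has to be maintained.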

Following best practices for digital scholarly editions, the XML/TEI file can be stored, either alongside images of the original archival documents or with pointers to them. The bookkeeping information from the XML/TEI can be converted to RDF for comparison to other documents marked up in similar manner. Vogeler has tested such comparisons with a small set of files and is eager to increase the number of files marked up in such manner for further testing (Vogeler 2016).
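The conversion step might look roughly like the following sketch, which serializes one transfer as N-Triples using only the Python standard library. Everything specific in it is an assumption made for illustration: the example.org namespaces are placeholders rather than the URIs of Vogeler's ontology, and the property names (from, to, amount, unit) and the sample identifiers are hypothetical.

```python
BK = "http://example.org/bk#"          # placeholder namespace, not the real ontology URI
BASE = "http://example.org/edition/"   # placeholder base for resources in an edition
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, datatype=None):
    """Serialize a string as an N-Triples literal, optionally typed."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    if datatype:
        return f'"{escaped}"^^<{datatype}>'
    return f'"{escaped}"'


def transfer_to_ntriples(transfer_id, sender, receiver, amount, unit):
    """Describe one transfer as five triples (hypothetical property names)."""
    subject = iri(BASE + transfer_id)
    triples = [
        (subject, iri(RDF_TYPE), iri(BK + "Transfer")),
        (subject, iri(BK + "from"), iri(BASE + sender)),
        (subject, iri(BK + "to"), iri(BASE + receiver)),
        (subject, iri(BK + "amount"), literal(amount, XSD_DECIMAL)),
        (subject, iri(BK + "unit"), literal(unit)),
    ]
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


print(transfer_to_ntriples("t1", "customerA", "shopB", "0.25", "USD"))
```

Once several editions emit triples against the same vocabulary, their transfers can be loaded into one store and queried together, which is the kind of comparison described above.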


Widespread scholarly edition of accounts using the ontology that Vogeler and Eide are developing has the potential eventually to offer an unprecedented source of aggregated data for historical research. As a result of MEDEA workshops, we have added to the community of practice for transcription and markup of accounts following Vogeler’s recommendations for use of XML/TEI with RDF using his bookkeeping ontology. Along with some colleagues in the United States, I am currently seeking funding to encourage use of Vogeler’s model both in educational contexts and in those occupied by citizen archivists. We hope thus to increase the number of accounts available for comparison and to demonstrate the advantages of digital scholarly edition of accounts for making available historical data that can be reused by other scholars. Described in this way, our goals are nothing new in Digital Humanities with regard to digital scholarly edition of texts. In its focus on accounts, however, MEDEA marks a significant new opportunity. MEDEA challenges historians especially to consider digital scholarly edition of accounts as a new model of scholarship that takes advantage of the affordances of the Semantic Web.


[*1] A portion of the activities described in this paper is supported jointly by the National Endowment for the Humanities and the Deutsche Forschungsgemeinschaft. Any views, findings, conclusions, or recommendations expressed in this paper do not necessarily reflect those of the National Endowment for the Humanities or the Deutsche Forschungsgemeinschaft.


