◎ JADH2016

Sep 12-14, 2016 The University of Tokyo

Image recognition and statistical analysis of the Gutenberg’s 42-line Bible types
Mari Agata (Keio University), Teru Agata (Asia University)

Traditionally, analyses of the types used in early printed books have been conducted by the naked but trained eyes of bibliographers. The types of the Gutenberg 42-line Bible (hereafter “B42”), the earliest book printed in Europe with movable metal type, are no exception.

In 1900 Paul Schwenke published the results of his minute and painstaking investigation of the B42 type.[*1] He identified and listed two hundred ninety types. The reason for such a large number of types is the existence of abbreviations, contractions, and secondary forms, or abutting types, of almost every letter of the alphabet. The left side of an abutting type was flat, without the diamond-shaped spur, so it could be placed close to the preceding type according to defined rules. Schwenke observed that this abutting type was used after the letters c, e, f, g, r, t, x, and y.

This composition rule was so strict that some deviations were even corrected during the actual print run, as the collation using superimposition of digital images by the present author demonstrated.[*2] The collation also raised new questions about the composition rules. For example, four stop-press corrections concern a shorter abutting “r”; its usage has not previously been studied in detail and thus needs further analysis. In addition, the collation results suggest that the types were not perfectly locked up but set loosely, resulting in many variations of word spacing, shifted lines, and both inclined and drifted letters.

Furthermore, other scholars have identified different numbers of B42 types. Schwenke’s close observation may therefore require several amendments.

In 2000, Paul Needham and Blaise Agüera y Arcas questioned how Gutenberg cast his types.[*3] The traditional view is that he produced types with a steel punch, copper matrix, and adjustable hand mould, and could thus produce thousands of “identical” types from a single matrix. Needham and Agüera y Arcas carried out a clustering analysis of the lower-case “i”s used in a 20-page Papal Bull printed in the DK type, which was made earlier than the B42 types and closely resembles them. Several hundred “i” clusters were discovered, a far greater number than expected. They claimed that these “i” types could not have been made from a common punch and matrix and suggested that many matrices had been used in parallel or, equivalently, that the matrix had been temporary and needed to be re-formed between castings. This question shakes the foundations of printing history. Despite the considerable attention their research attracted, there have been few substantial follow-up studies.[*4]

The adoption of computer-based research now allows us to conduct experiments on a much larger scale than was previously possible. The present authors have developed a new method of semi-automatic image recognition of the B42 types and demonstrated that it has explanatory power beyond the influence of inking and photographic conditions when applied to large-scale data.[*5]

The purpose of this study is to analyze the B42 types further with an improved method of image recognition reinforced by machine learning. The image data of the B42 held in the Keio Gijuku Library was used for the analysis. The X and Y coordinates, pixel width and height, and transcribed character of each type image were collected and used for the statistical analysis.

To analyze vertical alignment, the average variance of the Y coordinates of the type images in each line, excluding types with descenders and capitals, was calculated. In a page-by-page variance analysis, pages that are thought to have been printed earlier exhibited greater variance.
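The page-level measure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record layout (dicts with `page`, `line`, `char`, and `y` keys), the set of descender letters, and the choice of population variance are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Assumed set of letters with descenders; the abstract does not list them.
DESCENDERS = set("gpqy")


def is_excluded(char):
    """True for types left out of the alignment analysis (descenders, capitals)."""
    return char in DESCENDERS or char.isupper()


def page_alignment_variance(records):
    """Average, per page, of the per-line variance of the Y coordinate.

    `records` is an iterable of dicts with the hypothetical keys
    'page', 'line', 'char', and 'y' (one dict per type image).
    """
    ys_by_line = defaultdict(list)
    for r in records:
        if not is_excluded(r["char"]):
            ys_by_line[(r["page"], r["line"])].append(r["y"])

    line_variances = defaultdict(list)
    for (page, _line), ys in ys_by_line.items():
        if len(ys) >= 2:  # variance is uninformative for a single type image
            line_variances[page].append(pvariance(ys))

    return {page: mean(vs) for page, vs in line_variances.items()}
```

Comparing the resulting per-page values against the presumed printing order of the pages would then correspond to the page-by-page analysis mentioned above.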

The width data of each type image provided useful information. The frequency distribution of the width of several types had two mild peaks; the wider types were those of primary forms, while the narrower ones were those of secondary, abutting forms. The transcribed character data showed that the narrower ones were positioned after the letters c, e, f, g, r, t, x, and y. This result supports one of the composition rules observed in Schwenke’s study.
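As a rough illustration of this width analysis, the sketch below builds a width histogram for one character and compares its mean width after the eight letters listed by Schwenke with its mean width elsewhere. The record keys (`page`, `line`, `char`, `w`) and the function names are hypothetical, records are assumed to be in reading order, and word boundaries are ignored for simplicity; this is not the authors' code.

```python
from collections import Counter, defaultdict
from statistics import mean

# Letters after which the abutting form is expected, as listed in the text.
ABUTTING_TRIGGERS = set("cefgrtxy")


def width_histogram(records, char):
    """Frequency distribution of pixel widths for one transcribed character."""
    return Counter(r["w"] for r in records if r["char"] == char)


def mean_width_by_context(records, char):
    """Mean width of `char` after a trigger letter versus in any other position.

    `records` must be in reading order; only the preceding type on the
    same page and line counts as context.
    """
    groups = defaultdict(list)
    prev = None
    for r in records:
        if r["char"] == char:
            same_line = (
                prev is not None
                and (prev["page"], prev["line"]) == (r["page"], r["line"])
            )
            after_trigger = same_line and prev["char"] in ABUTTING_TRIGGERS
            key = "after_trigger" if after_trigger else "other"
            groups[key].append(r["w"])
        prev = r
    return {key: mean(widths) for key, widths in groups.items()}
```

If the composition rule holds, the `after_trigger` mean should be noticeably smaller than the `other` mean for letters that have both a primary and an abutting form.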

Further statistical analyses make it possible to investigate such characteristics as variance in the body size, the relative distance between a contraction bar and a main letter, and more. A close examination of these characteristics will lead to the identification of type variants and their distribution in the book. An accumulation of the results could give further clues to questions regarding specific details of the first printing shop in Europe and, hopefully, of Gutenberg’s casting method.


[*1] Paul Schwenke, Untersuchungen zur Geschichte des ersten Buchdrucks. Berlin, Behrend, 1900.

[*2] Author. Stop-press Variants in the Gutenberg Bible: The first report of the collation. The Papers of the Bibliographical Society of America. 2003, vol. 97, no. 2, p. 139-165; Author. デジタル書物学事始め:グーテンベルク聖書とその周辺. 勉誠出版, 2010 [Author. Introduction to digital bibliography: the Gutenberg Bible and beyond. Bensei Shuppan, 2010.]; Author. “Improvements, corrections, and changes in the Gutenberg Bible.” Scribes, Printers, and the Accidentals of their Texts. Frankfurt am Main, Peter Lang, 2011, p. 135-155.

[*3] Agüera y Arcas, Blaise. “Temporary Matrices and Elemental Punches in Gutenberg’s DK Type.” Incunabula and Their Readers: Printing, Selling and Using Books in the Fifteenth Century, Jensen, Kristian, ed. London, British Library, 2003, p. 1-12.

[*4] Pratt, Stephen. The myth of identical types: A study of printing variations from handcast Gutenberg type. Journal of the Printing Historical Society. 2003, new series 6, p. 7-17.

[*5] Authors. 活字の識別とその応用:グーテンベルク聖書の活字のクラスタリング. 日本図書館情報学会2014年度研究大会. 2014-11-29, 梅花女子大学(大阪府). 第62回日本図書館情報学会研究大会発表論文集. 2014, p. 117-120 [Authors. Recognition of types and its bibliographical application. Annual conference of Japan Library and Information Science. 2014-11-29, Baika Women’s University.]; Authors. A new approach to image recognition and clustering of the Gutenberg’s B42 types. Memory, the (Re-)Creation of Past and Digital Humanities. 2016-03-15, Keio University (Tokyo).