◎ JADH2016

Sep 12-14, 2016 The University of Tokyo

A Web Based Service to Retrieve Handwritten Character Pattern Images on Japanese Historical Documents
Akihito Kitadai (J. F. Oberlin University), Yuichi Takata, Miyuki Inoue, Guohua Fang, Hajime Baba, Akihiro Watanabe (Nara National Research Institute for Cultural Properties), Satoshi Inoue (University of Tokyo)

We present a web-based service for retrieving images of handwritten character patterns from historical Japanese documents.

Digital images of handwritten character patterns are important research products in history and archaeology. We provide two digital archives of such images. The first contains images extracted from mokkan written in and around the 8th century. Mokkan is the Japanese name for a type of historical document in which wooden tablets served as the recording medium and the characters were written with a brush and India ink. The second contains images from paper documents written from around the 9th to the 18th century. Every character pattern image was selected by experts in Japanese history, archaeology and calligraphy.

Information retrieval methods and technologies are critical for digital archives in history and archaeology. Using a character code as the retrieval key is a reasonable approach for digital archives of character pattern images. We provide a crossover retrieval system for the two digital archives, in which both archives return the images that belong to the key code (http://r-jiten.nabunken.go.jp/kensaku.php). However, character codes for historical languages have not yet been clearly defined; defining them is an ongoing research activity in history and archaeology. For this reason, we need to provide alternative methods that use other information as the retrieval key.
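The code-keyed crossover retrieval described above can be illustrated with a minimal sketch. It assumes that each archive is indexed by Unicode code point. The record fields, archive labels and image identifiers are hypothetical and are not taken from the actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterImage:
    archive: str      # hypothetical label of the source archive
    code_point: int   # character code assigned to the image by experts
    image_id: str     # hypothetical identifier of the image


def build_index(records):
    """Map each character code to the list of images that carry it."""
    index = {}
    for rec in records:
        index.setdefault(rec.code_point, []).append(rec)
    return index


def crossover_retrieve(key_char, *indexes):
    """Collect the images for key_char from every archive index, in order."""
    code = ord(key_char)
    hits = []
    for index in indexes:
        hits.extend(index.get(code, []))
    return hits


# Two toy archives standing in for the mokkan and paper-document archives.
mokkan_index = build_index([
    CharacterImage("mokkan", ord("山"), "m-0001"),
    CharacterImage("mokkan", ord("川"), "m-0002"),
])
paper_index = build_index([
    CharacterImage("paper", ord("山"), "p-0001"),
])

results = crossover_retrieve("山", mokkan_index, paper_index)
```

A single key code is looked up in both indexes, so one query returns images from both archives. The sketch also shows the limitation noted above: a character with no defined code cannot be used as a key at all.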

Mojizo, the web-based service that we present in this abstract, is one such alternative. Like the system mentioned above, Mojizo provides crossover retrieval across the two digital archives, but it uses a handwritten character pattern image as the key.

Mojizo has a shape evaluation engine built on pattern matching technologies. The engine calculates the similarity between the key image and the images in the digital archives. Because this evaluation requires a large amount of computation, we designed and implemented the engine and the other modules of Mojizo to run on the server side. As a result, Mojizo can be used from small portable devices that have only a network connection and low computing power. The digital cameras commonly built into such devices work well for capturing key images of handwritten character patterns on historical documents. Mojizo is available on our web site (http://mojizo.nabunken.go.jp/). Web browsers provide the user interface for submitting key images and viewing the similar handwritten character pattern images. Mojizo also provides links to the metadata sets for each of the similar images. These metadata sets are the results of the decoding of historical documents by historians and archaeologists. We therefore expect Mojizo to help users who have images of handwritten character patterns that they cannot read.
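The idea of ranking archive images by their similarity to a key image can be illustrated with a minimal sketch. This abstract does not specify the matching technique used by the engine, so the sketch substitutes zero-mean normalized cross-correlation over equally sized grayscale grids as a stand-in. The function names, image identifiers and toy 3x3 images are all hypothetical.

```python
import math


def flatten(image):
    """Turn a 2D grid of pixel values into a flat list of floats."""
    return [float(p) for row in image for p in row]


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-sized images."""
    xs, ys = flatten(a), flatten(b)
    if len(xs) != len(ys):
        raise ValueError("images must have the same size")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    num = sum(p * q for p, q in zip(dx, dy))
    den = math.sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))
    # A flat image has zero variance; treat it as having no similarity.
    return num / den if den else 0.0


def rank(key, archive, top_k=3):
    """Return the top_k (score, image_id) pairs, most similar first."""
    scored = [(ncc(key, img), image_id) for image_id, img in archive.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy key image: a single vertical stroke.
key = [[0, 1, 0],
       [0, 1, 0],
       [0, 1, 0]]

# Toy archive standing in for the stored character pattern images.
archive = {
    "a-vertical":   [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "b-horizontal": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
    "c-near":       [[0, 1, 0], [0, 1, 0], [0, 1, 1]],
}

ranked = rank(key, archive)
```

Every query compares the key against each stored image, so the cost grows with the size of the archive. That is consistent with the decision to run the evaluation on the server rather than on the portable device.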

Broadening the range of applications of the digital archives is one aim of our research. Users of Mojizo need no keyboard to enter character codes, which means that Mojizo can provide ubiquitous gateways to the digital archives. Promoting the use of digital archives is important for passing on the history of human activity to modern society. Mojizo currently provides about 28,000 images of handwritten character patterns with links to their metadata sets, and the number is increasing.

Our presentation will describe the detailed design and implementation of our web-based service, including the shape evaluation engine. We will also present some examples of information retrieval using Mojizo.


This work was supported by Grants-in-Aid for Scientific Research (S)-25220401, (A)-26244041 and (C)-15K02841.