ANR GRAPHEM : Grapheme based Retrieval and Analysis for PaleograpHic Expertise of Middle Age manuscripts

The GRAPHEM project is a multidisciplinary project based on a rich corpus of medieval manuscripts. It has two ambitions:

  • to contribute to the creation of a truly objective and scientific paleography;
  • to create accurate methods for accessing the contents of these manuscripts, using word-image similarities (Word Spotting, Word Retrieval).

The richness and diversity of the writings (in Latin) will allow us to develop and test shape descriptors that will be used in both applications. Particular attention will be paid to the study of graphemes, understood as pieces of strokes that carry relevant information.

The laboratories that make up the consortium are the LIRIS, the IRHT, the LIFO, the CRIP5 and the Ecole des Chartes. They worked together a few years ago on a project aimed at exploring medieval manuscripts. That project was part of the "Société de l'information" program of the CNRS and was entitled "Formes et Couleurs, outils de recherche". It was in this context that the foundations of GRAPHEM were laid.

However, the expected results are of a different kind. For paleography, we aim to develop knowledge and to formalize the domain. The methods for accessing the textual content of medieval manuscripts are intended to be usable in contexts other than medieval manuscripts. They are semiotic in nature and can be regarded as alternatives to optical character recognition (OCR) methods.

