POPP. Project for the optical character recognition of Parisian population censuses

Sandra Brée, Centre national de la recherche scientifique (CNRS)
Thierry Paquet, Université de Rouen, LITIS
François Merveille, GED Campus Condorcet
Thomas Constum, CNRS, LITIS
Nicolas Kempf, CNRS, LITIS
Pierrick Traounez, Université de Rouen, LITIS

This communication presents a new project in historical demography: POPP. The POPP project (Project for the optical character recognition of Parisian population censuses) has been awarded funding to create a large database (12 million people) from the nominal population censuses of Paris of 1926, 1931, 1936 and 1941, which are the only censuses of the Parisian population that exist before the end of the 20th century. Thanks to a collaboration between computer scientists, historians and archivists, and to recent deep learning techniques, the data will be available for use in early 2022. We will first present the optical character recognition methods that the team's computer scientists use to read the documents, and the role of the historians in preparing the data. We will then show the range of topics that the database makes it possible to address, especially those made possible by linking the successive censuses.

Keywords: Methodology, Historical demography/methods, Urbanization and urban populations, Census data

