This repository covers different aspects of Automated Text Recognition (ATR) in Historical Research, from post-processing scans of historical documents to OCR/HWR technology and improving text recognition with AI. Monika Barget, Assistant Professor in the Faculty of Arts and Social Sciences at Maastricht University, started this repository in preparation for the Bring Your Own Data Lab workshop hosted by IEG Mainz in February 2025. Apart from code samples, the repository will also include short instructions and sample workflows, all with a special focus on historical documents.
Over time, the materials in this repository will be expanded and improved through follow-up workshops and research collaborations across different (digital) humanities disciplines. We will experiment with different documents in handwriting and print, especially exploring solutions for texts in languages other than English.
To share (video) tutorials and readings that contributing colleagues found helpful in their work, we have created the ATR History group library, which we will update continuously. Please contact us if you are interested in becoming a co-editor of this shared library.
If you have questions or are interested in exchanging experiences, please get in touch!
Monika Barget can be contacted via the Faculty of Arts and Social Sciences (FASoS) at Maastricht University and is active on Mastodon.
Koen Hufkens is the founder of and a researcher at BlueGreen Labs and is also active on Mastodon.