Download
1. Source files
The text is encoded in XML and follows the TEI P5 Guidelines. Linguistic annotations are stored in a relational database. They will be integrated into the TEI document as soon as the automatically assigned analyses have been verified, probably by the end of 2026.
Meanwhile, you can browse the tagged text online or download the tokenized text and database below.
- TEI document:
  - gotica.xml (zip)
  - documentation: TEI header, converted to HTML with this style sheet
- MS Access database:
  - gotica.mdb.zip (*.mdb file)
  - gotica.mdb.export.zip (XML export with schema, generated by Access)
  - documentation: relational structure and some older notes on the dictionary
The Access file served token and lemma data on the website from 2004 to 2025. It was last edited on 2026-05-15 and will no longer be updated. The data is currently being converted to a new format and moved to an open-source platform.
There is also a preliminary XML-encoded version of Streitberg's glossary.
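For readers who want to inspect the TEI document programmatically, the following Python sketch shows one possible way to read a TEI P5 file from a zip archive, count its elements, and pull out the running text. It uses only the standard library. The only fact taken from the TEI Guidelines is the namespace URI; the helper names (`summarize_tei`, `read_first_xml_from_zip`) and the tiny `SAMPLE` document are hypothetical and do not reflect the actual element structure of gotica.xml, which should be checked against the TEI header documentation.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"  # standard TEI P5 namespace

# Hypothetical miniature TEI document, for illustration only;
# the real file's element structure may differ.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt></fileDesc></teiHeader>
  <text>
    <body>
      <div>
        <ab>first segment</ab>
        <ab>second segment</ab>
      </div>
    </body>
  </text>
</TEI>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def summarize_tei(xml_bytes):
    """Return (element counts, whitespace-normalized text of <text>)."""
    root = ET.fromstring(xml_bytes)
    counts = Counter(local_name(el.tag) for el in root.iter())
    text_el = root.find(f"{{{TEI_NS}}}text")
    if text_el is None:
        plain = ""
    else:
        # Concatenate all text nodes, then collapse runs of whitespace.
        plain = " ".join("".join(text_el.itertext()).split())
    return counts, plain


def read_first_xml_from_zip(zip_path):
    """Read the first *.xml member of a zip archive (member name not assumed)."""
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".xml"):
                return zf.read(name)
    raise FileNotFoundError("no .xml member found in archive")


if __name__ == "__main__":
    counts, plain = summarize_tei(SAMPLE)
    print(counts.most_common(3))
    print(plain)
```

To try this on the downloaded archive, pass its local path to `read_first_xml_from_zip` and hand the result to `summarize_tei`. The element counts give a quick overview of which TEI elements the edition actually uses before writing more specific queries.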
2. Derived formats
- Plain text (generated from the TEI document with this style sheet):
- JSON data for the client-side search engine:
- HTML 4.01†:
† This file is not derived from the master TEI document, but from a (very) old HTML file that predates the TEI edition.