Download

1. Source files

The text is encoded in XML and follows the TEI P5 Guidelines. Linguistic annotations are stored in a relational database. They will be integrated into the TEI document as soon as the automatically assigned analyses have been verified, probably by the end of 2026.

Meanwhile, you can browse the tagged text online or download the tokenized text and database below.

The Access file served token and lemma data on the website from 2004 to 2025. It was last edited on 2026-05-15 and will no longer be updated. The data is currently being converted to a new format and an open-source platform.

There is also a preliminary XML-encoded version of Streitberg's glossary.
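
For readers who want to process the tokenized TEI text after downloading it, the following is a minimal sketch of reading tokens with Python's standard library. It assumes that tokens are marked up as <w> elements in the standard TEI namespace, which is a common TEI convention but is not confirmed for this edition; check the downloaded file and adjust the element name if it differs. The SAMPLE fragment and the function name extract_tokens are hypothetical and serve only as an illustration; the fragment is hand-written, not an excerpt from the downloadable file.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written illustrative fragment (assumption: tokens are <w> elements).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div n="1">
      <s><w>atta</w> <w>unsar</w></s>
    </div>
  </body></text>
</TEI>"""


def extract_tokens(xml_text):
    """Return the text content of every <w> element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        "".join(w.itertext()).strip()
        for w in root.iterfind(".//tei:w", TEI_NS)
    ]


tokens = extract_tokens(SAMPLE)
print(tokens)
```

To apply the same function to a downloaded file, read the file contents into a string first and pass that string to extract_tokens.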

2. Derived formats

This file is not derived from the master TEI document, but from a (very) old HTML file that predates the TEI edition.