Corpus description

The WoPoss corpus is a diachronic corpus of Latin texts covering the period from the 3rd century BCE to the 7th century CE. Texts have been selected not only on chronological grounds but also to reflect sociolinguistic (mainly diatopic and diastratic) variation. The WoPoss corpus includes literary and documentary texts, direct sources (such as inscriptions) and replica documents (such as modern editions of ancient texts based on the manuscript tradition).

Diachronic and sociolinguistic variation is encoded in the metadata of each text file, according to the following tagset:

The corpus builds on projects that provide open-access text files (see the credits page for the list of projects with which we have either informal or formal agreements).

A full description of the annotated corpus will be provided soon.

Current state of the corpus

The WoPoss corpus is currently under construction and the annotation of the texts according to the WoPoss guidelines is ongoing. We have made available a small sample of the corpus through the search interface. The annotated texts are available in this repository under a CC-BY licence. The sample consists of the following texts (for a total of 91,641 words):

How to cite

To cite single files or the results related to single files, please use the following format:

To cite the project and/or website as a whole, please use the following format: