


Micro-cues of language evolution: A Multifactorial model of V2 loss in Central Romance (MICLE).


A collaborative project between the University of Caen (France) and Goethe University, Frankfurt am Main (Germany).

The MICLE project aims to:

  • create a corpus of texts of the same types for Norman French and Venetian, from the origins to the seventeenth century
  • explain syntactic change in the sentence structure in two Romance language varieties for the period in question
  • advance methodological practices and conceptual perspectives in the field by creating a model linking grammatical change to the acquisition of language.


01/06/2021 – 31/05/2024


Logo ANR

In order to achieve MICLE’s main goal of elaborating and testing a multifactorial model of word order evolution, we are creating a bilingual corpus of texts in two Romance language varieties with similar trajectories of development, Venetian and the French of Normandy.

Our corpus aims, as far as possible, to give access to everyday language from the earliest vernacular witnesses to the seventeenth century. The texts of the corpus belong to the same two non-literary genres, which allows us to avoid the distortions and biases that collecting a diverse range of texts would introduce. The genres in question are:

  1. legal texts and trial accounts;
  2. personal and business correspondence

since they are likely to contain evidence of spoken language and traces of dialogue. At the same time, we take into account the specificity of both languages’ extant witnesses. The Venetian part of the corpus will thus include statutes and other legal texts that illuminate the system of government in different regions under Venetian control. The French part, in turn, will showcase witch trials in Normandy, from Jeanne d’Arc to Madeleine Bavent.

Digitised and tagged following the rules of TEI (Text Encoding Initiative), lemmatised and annotated in order to enable the search function, the texts of the corpus will be made available to researchers and the public via the project website, currently under construction.
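To illustrate how lemmatised TEI markup can support searching, here is a minimal sketch in Python. It is not the project's actual pipeline or schema: it assumes each token is encoded as a TEI `<w>` element carrying `lemma` and `pos` attributes inside numbered `<s>` elements, and the sample sentence, the tag values and the helper `find_by_lemma` are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample for illustration; not taken from the MICLE corpus.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p>
        <s n="1">
          <w lemma="venir" pos="VER">vint</w>
          <w lemma="le" pos="DET">li</w>
          <w lemma="roi" pos="NOM">rois</w>
        </s>
      </p>
    </body>
  </text>
</TEI>"""


def find_by_lemma(xml_text, lemma):
    """Return (sentence number, surface form) pairs for every token
    whose lemma attribute equals the requested lemma."""
    root = ET.fromstring(xml_text)
    hits = []
    for sentence in root.iterfind(".//tei:s", TEI_NS):
        for word in sentence.iterfind("tei:w", TEI_NS):
            if word.get("lemma") == lemma:
                hits.append((sentence.get("n"), word.text))
    return hits


if __name__ == "__main__":
    print(find_by_lemma(SAMPLE, "roi"))
```

Searching by lemma rather than by surface form is what lets a single query retrieve spelling and inflectional variants, which matters for medieval texts with unstable orthography.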

Principal investigators


Pierre LARRIVÉE | CRISCO · université de Caen Normandie · France

Principal Investigator of the French team of the project, in charge of the overall conceptual and operational coherence of the project, in close association with the German PI.

My goal through the project is to develop and share the resources, the methodological approaches and the conceptual tools to advance the understanding of the nature of language change. 


I trained in Quebec City (PhD, 1998) and Strasbourg (Habilitation, 2001), and have worked in Birmingham (Aston University, 1998-2011) and in Caen, where I accepted a chair in 2011. My current work seeks to establish the contextual determinants of grammatical change.

More information and list of publications.


Cecilia POLETTO | Institut für Romanische Sprachen und Literaturen · Goethe Universität · Frankfurt, Germany

Principal Investigator (German team). Full professor at Goethe University in Frankfurt am Main.


Cecilia Poletto has published extensively on several topics in Old Romance, with a particular focus on Old Italian syntax, including V2 effects and information structure. For many years Professor Poletto was the PI on the Syntactic Atlas of Italy (ASIt) project, as part of which she conducted extensive fieldwork. She has been on the steering board of various dialectological projects, such as European Dialect Syntax and Scandinavian Dialect Syntax.

More information and list of publications.

Post-doctoral researchers

Mathieu GOUX | CRISCO · université de Caen Normandie · France

Post-doctoral fellow of the project (French team), in charge of the digitization, transcription and TEI-XML structuring of the texts of the corpus, as well as their annotation and linguistic analysis.


Holds a doctorate in French language and literature; specialist in diachronic textual grammar and Classical French; lecturer at UNICAEN (Normandy, France).

More information and list of publications.

Francesco PINZIN | Institut für Romanische Sprachen und Literaturen · Goethe Universität · Frankfurt, Germany

Post-doctoral fellow of the project (German team), in charge of selection, digitization, transcription and TEI-XML structuring of the texts of the corpus, as well as their annotation and linguistic analysis.


I hold a PhD in Linguistics. My research focuses on formal approaches to morphosyntax and on the description and analysis of grammatical phenomena in Latin and the Romance languages, with a focus on Italian dialects, from both synchronic and diachronic perspectives.

Previously, I worked as a post-doctoral researcher on the binational German-Swiss project DiFuPaRo, in which I took an active part in creating the tagged corpus.

More information and list of publications.


Natalia ROMANOVA | CRISCO · université de Caen Normandie · France

As project manager, I work with the PIs and post-doctoral fellows on all aspects of the MICLE corpus creation. I am involved in identifying and digitally editing unpublished manuscripts that will be made accessible via the project website.


I hold a PhD in Medieval French literature (University College London) and an MA in Digital Humanities (King’s College London). I have worked on digital edition projects in the UK and at King’s Digital Lab. I am particularly interested in medieval French palaeography, digital editing and research data management.

More information

Larrivée, P. 2021. An Information Structure scenario for V2 loss in Medieval French. Diachronica 38.2: 189-209.

Poletto, C. 2019. More than one way out: on the factors influencing the loss of V to C movement. Linguistic Variation 19.1: 47-81.

Workshop of the GRAVO · Grammatica del Veneto delle Origini project · Thursday 24 February 2022 at 4 PM (CET) | University of Padua

Cecilia Poletto and Francesco Pinzin will present a paper entitled « Infinitival-verbal inversion, a preliminary study on three Old Venetian texts ».
If you would like to join the workshop on Zoom, please contact us for the link.


«Linguistic Connections» seminar · Thursday 16 December 2021 at 12:30 (CET) | University of Padua

Mathieu Goux will give a lecture in the «Linguistic Connections» seminar at the University of Padua. Delivered in English, the talk « Text transcription and artificial intelligence » will be accessible via a Zoom link.

In this lecture, we will present the (semi-)automated workflow for corpus linguistics adopted by the MICLE project. We will focus on the tools and software used to digitise, transcribe and structure textual data from manuscript and printed sources and to convert it into XML-TEI format, as well as on the visualisation tools available to researchers. General principles of metalinguistic annotation will also be discussed. In addition, we will address the challenges and difficulties encountered in applying these methodologies to the field of historical linguistics.

Seminar « Stéréotypes de la langue juridique et dans la langue juridique | Stereotypes of legal language and in legal language » · Friday 3 December 2021 at 2 PM (CET) | University of Nantes

In the seminar organised by the Centre de Recherche sur les Identités, les Nations et l’Interculturalité (CRINI), University of Nantes, Pierre Larrivée will give a lecture entitled « Textes juridiques et changement linguistique | Legal texts and linguistic change ».

This paper examines three aspects of syntactic change with the help of legal corpora. Firstly, we demonstrate that the use of the case system was characteristic of written practice but not of everyday language from at least the 13th century onwards. Secondly, we compare the presence of the null subject and V2 inversion in legal material, on the one hand, and in literary texts, on the other, in order to argue that the rate of usage was probably considerably higher in literature than in everyday use. Finally, we show how comparing the presence of these phenomena in different sub-genres of legal texts allows us to determine whether grammatical phenomena are linked to linguistic register. This assessment of variation and change helps us demonstrate that the assumption that legal texts, given their formulaic nature, have little value for the study of syntax is profoundly incorrect.

More information