MICLE · University of Caen Normandy

Micro-cues of language evolution: A Multifactorial model of V2 loss in Central Romance (MICLE).

A collaborative project between the University of Caen (France) and Goethe University, Frankfurt am Main (Germany).

The MICLE project aims to:

create a corpus of texts of the same types in Norman French and Venetian, from the origins to the seventeenth century
explain syntactic change in sentence structure in the two Romance language varieties over the period in question
advance methodological practices and conceptual perspectives in the field by creating a model linking grammatical change to language acquisition.

Dates

1 June 2021 – 31 May 2024

Funders

More information

To achieve MICLE’s main goal of developing and testing a multifactorial model of word order evolution, we are creating a bilingual corpus of two Romance language varieties with similar trajectories of development: Venetian and Norman French.

Our corpus aims, as far as possible, to give access to everyday language from the earliest vernacular witnesses to the seventeenth century. The texts of the corpus belong to the same non-literary textual types, which allows us to avoid the distortions and biases that a diverse range of genres would introduce.

For the French part of the corpus, we have included trial accounts, depositions and so-called styles de procès, since they contain evidence of spoken language and traces of dialogue. In particular, the French corpus showcases witch trials in Normandy and the Channel Islands, from Jeanne d’Arc to Madeleine Bavent.

Since April 2023, the first version of the French corpus, lemmatised and part-of-speech annotated, has been available for consultation via the CRISCO lab’s TXM portal (select MICLE-PREVIEW from the left-hand side menu).

Considering the specificity and availability of extant witnesses, the Venetian part of the corpus will include not only trial accounts but also statutes and other legal texts that illuminate the system of government in different regions under Venetian control. The full annotated Venetian corpus will be made available at the end of the project. Wherever possible, annotated XML/TEI files of the texts of the corpus as well as the software developed and used for the purposes of annotation will also be made available in spring 2024.

Principal investigators

Pierre LARRIVÉE | CRISCO · université de Caen Normandie · France

Principal Investigator of the French team of the project, in charge of the overall conceptual and operational coherence of the project, in close association with the German PI.

My goal through the project is to develop and share the resources, the methodological approaches and the conceptual tools to advance the understanding of the nature of language change.

Biography

I trained in Quebec City (PhD, 1998) and Strasbourg (Habilitation, 2001), and have worked in Birmingham (Aston University, 1998-2011) and in Caen, where I accepted a chair in 2011. My current work seeks to establish the contextual determinants of grammatical change.

More information and list of publications.

Cecilia POLETTO | Institut für Romanische Sprachen und Literaturen · Goethe Universität · Frankfurt, Germany

Principal Investigator (German team). Full professor at Goethe University in Frankfurt am Main.

Biography

Cecilia Poletto has published extensively on several topics in Old Romance, with a particular focus on Old Italian syntax, including V2 effects and information structure. For many years Professor Poletto was the PI on the Syntactic Atlas of Italy (ASIt) project, as part of which she conducted extensive fieldwork. She has been on the steering board of various dialectological projects, such as European Dialect Syntax and Scandinavian Dialect Syntax.

More information and list of publications.

Post-doctoral researchers

Mathieu GOUX | CRISCO · université de Caen Normandie · France

Post-doctoral fellow of the project (French team), in charge of the digitization, transcription and TEI-XML structuring of the texts of the corpus, as well as their annotation and linguistic analysis.

Biography

PhD in French language and literature; specialist in diachronic textual grammar and Classical French; lecturer at UNICAEN (Normandy, France).

More information and list of publications.

Francesco PINZIN | Institut für Romanische Sprachen und Literaturen · Goethe Universität · Frankfurt, Germany

Post-doctoral fellow of the project (German team), in charge of selection, digitization, transcription and TEI-XML structuring of the texts of the corpus, as well as their annotation and linguistic analysis.

Biography

I hold a PhD in Linguistics. My research focuses on formal approaches to morphosyntax and on the description and analysis of grammatical phenomena in Latin and the Romance languages, with a focus on Italian dialects, from both synchronic and diachronic perspectives.

Previously, I worked as a post-doctoral researcher on the binational German-Swiss project DiFuPaRo, in which I took an active part in the creation of the tagged corpus.

More information and list of publications.

Natalia ROMANOVA | CRISCO · université de Caen Normandie · France

As project manager, I work with the PIs and post-doctoral fellows on all aspects of the MICLE corpus creation. I am involved in identifying and digitally editing unpublished manuscripts that will be made accessible via the project website.

Biography

I hold a PhD in Medieval French literature (University College London) and an MA in Digital Humanities (King’s College London). I have worked on digital edition projects in the UK and at King’s Digital Lab. I am particularly interested in medieval French palaeography, digital editing and research data management.

More information

Research Assistants

Enrico CASTRO
Emmanuele ROMANINI

Student Interns

2022-2023

Manon LAVERGNE | Université de Caen
Leah PAVCIC | Goethe Universität, Frankfurt
Francesca SANTANGELO | Università Ca’ Foscari Venezia

2021-2022

Agathe AUBERT
Lucy MARIE-LEBLANC
Marie PICART
Valentin SIMENEL | Université de Caen

Goux, M. & Pinzin, F. 2023. Challenges of a Multilingual Corpus (Old French/Old Venetian): The Example of the MICLE project. In Castro, E., Della Fontana, A. and Pezzini, E. (eds), Venise et la France. Similitudes, spécificités, interrelations. Franco Cesati (forthcoming).

Larrivée, P. 2021. An Information Structure scenario for V2 loss in Medieval French. Diachronica 38.2: 189-209.

Poletto, C. 2019. More than one way out: on the factors influencing the loss of V to C movement. Linguistic Variation 19.1: 47-81.

The MICLE project team regularly gives presentations and training sessions (online and in person) to present the corpus, its annotation programme and annotation workflow to colleagues and students at universities in France, Germany and abroad. MICLE researchers also share their findings at national and international conferences and via invited lectures.

Call for papers, “Tracing the Curve of Evolution: Syntactic Change Through Text Types” · 28-30 March 2024

Upcoming academic presentations

On hold

Past academic presentations

Cecilia Poletto (et al.) “Learning how to count: a treebank analysis of V2 word order in two Medieval Romance languages through time” · 4-8 September 2023
ICHL26: 26th International Conference on Historical Linguistics, University of Heidelberg

Cecilia Poletto (et al.) “Are French and Venetian V2 languages? A diachronic treebank analysis” (conference presentation) · 26-30 June 2023
LSRL53: 53rd Linguistic Symposium on Romance Languages, Paris

Pierre Larrivée and Mathieu Goux “Antéposition stylistique de l’infinitif et du participe dans l’histoire du français” (conference presentation) · 24 March 2023
Colloque de la Société Internationale de Diachronie du Français (SIDF), Ludwig-Maximilians-Universität, Munich
More information

Francesco Pinzin and Cecilia Poletto “Economy and verb movement: the diachronic perspective” (conference paper) · 10 March 2023
DGfS 2023, Cologne

Tommaso Balsemin, Francesco Pinzin and Cecilia Poletto “Universal 20 restriction reloaded: the view from Old Italo-Romance” (poster presentation) · 1 December 2022
Going Romance 2022, Universitat Autònoma, Barcelona

Cecilia Poletto “On Deriving the Consistency Principle in Old Italo-Romance” (keynote lecture) · 17 November 2022
“Mapping Syntax” Workshop, University of Oxford

Mathieu Goux and Pierre Larrivée “Antéposition stylistique de l’infinitif et du participe dans l’histoire du français” (conference paper) · 13 October 2022
“La constitution de corpus en diachronie longue : méthodologies, objectifs et exploitations linguistiques et stylistiques” Conference, University of Grenoble
More information

Mathieu Goux “Enjeux des corpus multilingues en diachronie longue : l’exemple du projet MICLE” (conference paper) · 13 October 2022
“La constitution de corpus en diachronie longue : méthodologies, objectifs et exploitations linguistiques et stylistiques” Conference, University of Grenoble
More information

Pierre Larrivée “Les causes du changement syntaxique : Un modèle multifactoriel” (lecture) · 29 September 2022
Séminaire BCL, Université Côte d’Azur, Nice

Cecilia Poletto and Francesco Pinzin “What’s left of the V2 high tide: Infinitival Anteposition” (conference paper) · 14 September 2022
16th Cambridge Italian Dialect Syntax-Morphology Meeting, University of Naples

Pierre Larrivée and Cecilia Poletto (organisers), Hiwa Asadpour, Alessandra Giorgi, Onkar Singh, Sam Wolfe (speakers) “Multifactorial Approaches to Word Order Change” (conference workshop) · 24-25 August 2022
Societas Linguistica Europaea Conference, University of Bucharest
More information

Pierre Larrivée “Micro-cues and language change: There’s a quantitative correlation between V2 and particle si in (non-literary) Medieval French” (conference paper) · 4 August 2022
25th International Conference on Historical Linguistics, University of Oxford

Pierre Larrivée and Cecilia Poletto “Ordre des mots, changement syntaxique et micro-indicateurs dans deux langues romanes” (conference paper) · 18 June 2022
Conference of the “Société de Linguistique de Paris”, Paris

Mathieu Goux and Francesco Pinzin “MICLE Project: theoretical goals and methodological considerations” (conference paper) · 29 March 2022
Conference “Venise et la France: Similitudes, Spécificités, Interrelations”, University of Lausanne and online

Cecilia Poletto and Francesco Pinzin “Infinitival Modal Inversion: A Preliminary Study of Three Old Venetian Texts” (conference paper) · 24 February 2022
“GRaVO” conference, University of Padua and online

Mathieu Goux “Text Transcription and Artificial Intelligence” (lecture) · 12 December 2021
Lecture series “Seminari di Linguistica”, University of Padua and online

Pierre Larrivée “Textes juridiques et changement linguistique” (lecture) · 3 December 2021
Seminar series “Tournant de jurilinguistique”, University of Nantes

Pierre Larrivée “Triangulations” (workshop presentation) · 16 September 2021
Workshop “Pour une histoire de la langue ‘par en bas’: textes privés et variation des langues dans le passé”, Paris 3 University
