ATILF

Computer Processing and Analysis of the French Language

Organization

Web page: http://www.atilf.fr/

Country: France

ISNI: 0000000405991931

ROR: https://ror.org/02cy9g557

Wikidata: Q2845219

- Funding / Projects (12)

12 Projects, page 1 of 3

CODIM (Compositionality and Discourse Markers)
Project, from 2023
Partners: Institut national de la recherche en informatique et automatique; Université de Paris; ATILF
Funder: French National Research Agency (ANR) Project Code: ANR-22-CE38-0002
Funder Contribution: 412,460 EUR
The CODIM project focuses on the two main linguistic resources for organizing monologues or conversations in human languages: discourse markers (DMs), such as therefore/donc and well/ben, bon in English/French, and prosody (in particular intonation). It will evaluate their status with respect to two major views on communication: compositionality (the possibility of combining meaningful expressions into more complex meaningful expressions) and pattern- or construction-based approaches (the idea that language users exploit partly ‘frozen’ strings of words). We will compare the semantic and prosodic properties of simple and complex French DMs (e.g. ah + bon) found in corpora of written and spoken French, using a variety of complementary approaches for DM identification (category-driven text mining), clustering (statistics and machine learning) and research in prosody (ToBI representation, speech analysis/synthesis). This will foster or reinforce strong collaborations between linguists and computer scientists.
PREFAB (Constructions of prefabricated sentences in language interactions)
Project, from 2022
Partners: ATILF; Centre national de la recherche scientifique; LINGUISTIQUE ET DIDACTIQUE DES LANGUES ETRANGERES ET MATERNELLES
Funder: French National Research Agency (ANR) Project Code: ANR-22-CE54-0013
Funder Contribution: 462,332 EUR
The PREFAB project aims to identify and analyze the prefabricated patterns of French interactions (e.g. comment dirais-je ‘how shall I put it’) in spoken corpora, interactional written corpora, and dialogues in fiction (these resources are already available). The project, initiated by LIDILEM, brings together researchers from ICAR, ATILF and BCL, research units with complementary skills. The modeling is based on construction grammars and includes syntactic, semantic, pragmatic and interactional dimensions. The innovative aspects of the project are: (a) the study of a very wide range of prefabricated patterns, from expressive expressions to metadiscursive patterns; (b) an integrated approach to levels of linguistic processing (in a model based on "constructicons"); (c) an innovative inductive methodology of corpus exploration (including treebanks); (d) the study of variation between sub-genres and mediums, including a comparison between French and German. The data compiled will be freely available. They will contribute to the linguistic heritage and will be useful for language teaching as well as computer applications.
AMPERE 2014 (Exploiting Ampère's corpus)
Project, from 2013
Partners: EHESS; ATILF; CAK
Funder: French National Research Agency (ANR) Project Code: ANR-12-CORP-0017
Funder Contribution: 240,000 EUR
This project starts from the view that it is time to combine inquiry into a scientific corpus with information technology research tools. Until now, the building of information technology infrastructures for scientific corpora has mainly been devoted to making the images and transcriptions of the texts available. AMPERE2014 starts from the existing electronic resource “@Ampère et l’histoire de l’électricité” (www.ampere.cnrs.fr) and intends to exploit Ampère’s corpus, which is qualitatively and quantitatively impressive. Ampère wrote thousands of pages and discussed the most important scientific subjects of the first third of the nineteenth century, proposed a vast inquiry into philosophy, and analyzed both knowledge and its creative process. The main aim of AMPERE2014 is to strengthen analysis of and research on Ampère’s corpus through IT applications. The project is designed to analyze, compare and connect elements within the different texts of Ampère’s corpus (publications, manuscripts, correspondence, private writings). The process will run from indexing, through the production of semantic interrelations, to actual research: a real synthesis between scholarship and information technology, which is a novelty in the history of science and in the study of scientific corpora.
STAR-FLE (STrategic Adaptations for better Reading and Text Comprehension in FFL)
Project, from 2024
Partners: LPL; Linguistique, Langues et Parole (EA 1339 - UR 1339 depuis 01.01.2020); ATILF; Centre de traitement automatique du langage, Louvain-La-Neuve
Funder: French National Research Agency (ANR) Project Code: ANR-23-CE38-0007
Funder Contribution: 536,351 EUR
Learners in the same class very often have heterogeneous levels, and handling this heterogeneity is a major problem for language teachers, who should provide personalised resources to each learner. The STAR-FLE project therefore aims to propose innovative digital solutions from the field of Natural Language Processing (NLP) that may improve the text comprehension of French L2 learners and help teachers handle multiple learner levels. We propose context-based aids for understanding lexical difficulties, including multiword expressions (MWEs) found in original texts. Our system provides MWE identification, generation of definitions addressed to a specific learner’s profile, synonym search, word sense disambiguation, and the possibility to choose simpler synonyms for better comprehension of a text. In addition, we build original NLP resources such as an annotated CEFR corpus and lexicons, and an MWE-annotated corpus.
ALIENTO (Linguistic analysis, intercultural aspects of sapiential utterances and their transmission from East to West and from West to East)
Project, from 2014
Partners: Centre de Recherche Moyen-Orient et Méditerranée; ATILF; Maison des Sciences de l'Homme Lorraine; MSHE; UFC
Funder: French National Research Agency (ANR) Project Code: ANR-13-BSH3-0009
Funder Contribution: 239,948 EUR
In the ninth century, the rich Arabic tradition of adab found its way to Spain, to al-Andalus, which then played a central role in the exchange of knowledge from the Orient, later relayed to the West by monasteries in the north of the Iberian Peninsula in the 11th and 12th centuries. In al-Andalus, adab literature met the Jewish sapiential tradition of midrashic literature. New collections were composed, including original works, in the 10th and 11th centuries, and from the 12th century on, exempla and philosophers’ sayings were translated into Hebrew, Latin, and Romance languages. Much of this complex heritage is found in the extensive Spanish paremiological literature, which reached its height in the 16th and 17th centuries, and in current Spanish, Judeo-Spanish and Maghrebian collections of proverbs.

Although the main lines of these exchanges are known, we lack specific information on the circulation of these short sapiential statements (our basic research unit), on the successive choices made by the translators, on the cultural reinterpretations, and on the weight of one borrowing over another. If sapiential textual filiations and translation sequences should be treated cautiously, this is particularly true for the sapiential statements contained in these texts. Because they are difficult to understand, these volatile elements, whose categorization varies with the period and culture considered, have never been the subject of comprehensive textual studies that would recount their sources, circulation and evolution through the different languages, spoken or written, of the three cultures within the Iberian Peninsula during the Middle Ages. Paremiological studies have principally produced compilations of proverbs (thesauri), editions, and erudite studies dedicated to a single work, a single language or a single culture, with the exception of D. Gutas’ remarkable groundbreaking work on the Philosophical Quartet (1975).

The few existing databases cover contemporary corpora of “paremiae”, most often monolingual or approached from a translation-studies perspective. The aim of the ALIENTO project is therefore to compute matches, including partial ones, and close or distant connections, in order to reassess intertextual relations by comparing a large quantity of data and cross-referencing encoded texts written in different languages. This is why the project, which requires close interdisciplinary collaboration between computational researchers (ATILF) and linguists and literature specialists (MSH Lorraine + INALCO and the international network of collaborators), will develop software transferable to other similar texts, using a large reference corpus composed of 8 related texts that circulated in the Iberian Peninsula (in Latin, Arabic, Hebrew, Spanish and Catalan), representing 582 pages and an estimated 9,570 sapiential statements. The software will extract and connect brief sapiential units through matching driven by a purpose-built encoding system documented in an XML-TEI encoding manual. The choice and type of annotations result from collaborative reflection among the project members (specialists in linguistic paremiology and ancient texts, designers of textual databases, and computational researchers) during dedicated scientific sessions. It will evolve collaboratively during the matching processes. At the end we will have:
- a body of texts belonging to a multilingual corpus, digitized, tagged in XML/TEI and publicly accessible, linked to a set of data on each text and its author.
- a set of brief sapiential units with their XML/TEI annotations, accessible free of charge.
- a trilingual query interface that displays the matched statements contained in these works, with information that can be used to study them regardless of the language.
- an encoding methodology and data-matching software transferable to other similar corpora.

