DOING – Project APR-IA

DOING – INTELLIGENT DATA

  • APR-IA (PROJET RECHERCHE D’INITIATIVE ACADEMIQUE: ACADEMIC INITIATIVE RESEARCH PROJECT)
  • REGION CENTRE VAL DE LOIRE

The DOING project aims to develop methods and tools that first extract information from textual data and structure it in a graph database, and then manipulate the resulting knowledge graph intelligently. The chosen application domain is health, starting with freely available data (such as clinical cases). DOING aims to design data science queries, i.e., a new form of declarative query that can integrate analyses, to guide healthcare specialists in their decision making. DOING relies on a genuinely interdisciplinary collaboration (Natural Language Processing, Databases and Artificial Intelligence) to transform data into information and knowledge. The goal of the project is to turn into concrete results the proposals of the DOING working group of the RTR-DIAMS and the DOING action of the GDR-MADICS.

The objective of the DOING project is to propose and build methods, algorithms and tools that transform data into information and then into knowledge. The idea is to bring together the expertise of researchers in Natural Language Processing (NLP), Databases (DB) and Artificial Intelligence (AI) to:
1) extract information from textual data and represent it to populate graph databases;
2) propose intelligent methods for the manipulation and maintenance of these databases with new forms of queries.

These objectives are broken down into three tasks developed in parallel; the relationships between them form the main thread of the project.
T1: Task 1 – Extraction of information from textual data
T2: Task 2 – Data science queries: language and algorithms
T3: Task 3 – Analysis and prediction of physician needs (not financed by the project)

MEMBERS

  • Mirian Halfeld Ferrari Alves (45%), project leader, LIFO
  • Anne-Lyse Minard-Forst (35%), LLL
  • Donatello Conte (25%), LIFAT
  • Jacques Chabin (30%), LIFO
  • Genoveva Vargas-Solar (20%), LIRIS (external participant)
  • Jean-Yves Antoine (20%), LIFAT
  • Jean-Yves Ramel (15%), LIFAT
  • Agata Savary (15%), LISN (external participant)
  • Anais Lefeuvre-Halfermeyer (15%), LIFO
  • Flora Badin (10%), LLL
  • Lotfi Abouda (10%), LLL
  • Emmanuel Schang (10%), LLL
  • Thi-Bich-Hanh Dao (8%), LIFO

POSTDOC

  • Placido A. Souza Neto (1 March 2023 – 28 February 2024): Data science queries (T2)
  • Silvia Federzoni: Information extraction from textual data (T1)

PhD STUDENTS (collaborating on the project)

  • Lingchen Wang, LIFO

RESULTS

SOFTWARE

EASI-GDS:

A user-friendly interface that helps users build declarative analytical queries on property graphs. These queries are then implemented as Neo4j pipelines.

  • Demonstration: https://youtu.be/pd1s7hOVMx8
  • Installation: https://gitlab.com/mirian/easi-gds_install
  • Developers: Valentin Bouvresse and Virgile Crvenka (master's students, Université d’Orléans)
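
To give a rough idea of what a declarative analytical query on a property graph can look like, here is a minimal Python sketch. It is not EASI-GDS code and does not use Neo4j: the toy graph, the query format and every name in it (NODES, EDGES, ANALYSES, run_query) are hypothetical and were invented for this illustration. The user states what they want (a relationship type, a target label, an analysis, a property to return) and a small function turns that statement into a three-step pipeline.

```python
from collections import Counter

# Hypothetical toy property graph: nodes carry a label and properties,
# edges carry a relationship type. All data is invented for illustration.
NODES = {
    "c1": {"label": "ClinicalCase", "age": 54},
    "c2": {"label": "ClinicalCase", "age": 61},
    "s1": {"label": "Symptom", "name": "fever"},
    "s2": {"label": "Symptom", "name": "cough"},
}
EDGES = [
    ("c1", "HAS_SYMPTOM", "s1"),
    ("c1", "HAS_SYMPTOM", "s2"),
    ("c2", "HAS_SYMPTOM", "s1"),
]


def in_degree(edges):
    """Count incoming edges per target node."""
    return Counter(target for _, _, target in edges)


# Registry of available analyses (only one in this sketch).
ANALYSES = {"in_degree": in_degree}


def run_query(query, nodes, edges):
    """Turn a declarative query (a dict) into a three-step pipeline and run it."""
    # Step 1: select the edges of the requested relationship type.
    selected = [e for e in edges if e[1] == query["relationship"]]
    # Step 2: run the requested analysis on the selected edges.
    scores = ANALYSES[query["analysis"]](selected)
    # Step 3: keep nodes with the requested label and return one property.
    rows = [
        (nodes[n][query["return"]], score)
        for n, score in scores.items()
        if nodes[n]["label"] == query["target_label"]
    ]
    # Highest score first, ties broken alphabetically.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# The query says what is wanted, not how to compute it.
QUERY = {
    "relationship": "HAS_SYMPTOM",
    "target_label": "Symptom",
    "analysis": "in_degree",
    "return": "name",
}

result = run_query(QUERY, NODES, EDGES)
print(result)  # [('fever', 2), ('cough', 1)]
```

In this sketch the query is plain data, so new analyses can be added to the registry without changing how queries are written.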