DOING@ADBIS-2022

DOING : Intelligent Data – from data to knowledge

WORKSHOP in ADBIS 2022

September 5, 2022. Torino, Italy

Chairs

Mírian Halfeld Ferrari – Université d’Orléans, INSA CVL, LIFO EA, France
Carmem S. Hara – Universidade Federal do Paraná, Curitiba, Brazil

A word about DOING

DOING workshop is connected to:

DOING@DIAMS (part of the RTR DIAMS)
DOING@MADICS ( ACTION in the MADICS network)

PROGRAM – 5 September 2022

10:45-10:50 Opening
10:50-11:50 Session 1
11:50-13:00 Break & Keynote (Angela Bonifati)
14:00-15:00 Session 2
15:00-15:10 Break
15:10-16:05 Session 3
16:05-16:15 Closing

Session 1 : Graph databases (Full papers: 23 min talk + 7 min questions)

Title: Privacy Operators for Semantic Graph Databases as Graph Rewriting
Authors:
Adrien Boiret (INSA Centre Val de Loire, France)
Cédric Eichler (INSA Centre Val de Loire, France)
Benjamin Nguyen (INSA Centre Val de Loire, France)

Title: Using Provenance in Data Analytics for Seismology: Challenges and Directions
Authors:
Umberto S. da Costa (Universidade Federal do Rio Grande do Norte, Brazil)
Javier A. Espinosa-Oviedo (Université de Lyon, France)
Martin A. Musicante (Universidade Federal do Rio Grande do Norte, Brazil)
Genoveva Vargas-Solar (CNRS, Université de Lyon, INSA Lyon, France)
José-Luis Zechinelli-Martini (Universidad de las Américas Puebla, Mexico)

Keynote Talk:
Towards Quality-Aware Data Science and its Application to Health Data
Angela Bonifati

Session 2 : Natural Language Processing (Full papers: 23 min talk + 7 min questions)

Title: Automatic classification of stigmatizing articles of mental illness: the case of Portuguese online newspapers
Authors: Alina Yanchuk (University of Aveiro, Portugal)
Alina Trifan (University of Aveiro, Portugal)
Olga Fajarda (University of Aveiro, Portugal)
José Luís Oliveira (University of Aveiro, Portugal)

Title: Relation extraction from clinical cases for a knowledge graph
Authors:
Agata Savary (Paris-Saclay University, France)
Alena Silvanovich (Université d’Orléans, France)
Anne-Lyse Minard (Université d’Orléans, France)
Nicolas Hiot (EnnovLabs, Université d’Orléans, France)
Mirian Halfeld-Ferrari (Université d’Orléans, France)

Session 3 : Query processing
(Short papers: 20 min talk + 5 min questions)

Title: A Multiplex Network framework based Recommendation Systems for Technology Intelligence
Authors:
Foutse Yuehgoh (Léonard de Vinci Pôle Universitaire, France)
Djebali Sonia (Léonard de Vinci Pôle Universitaire, France)
Nicolas Travers (Léonard de Vinci Pôle Universitaire, France)

Title: Storing Feature Vectors in Relational Image Data Warehouses
Authors:
Guilherme M. Rocha (University of São Paulo, Brazil)
Piero L. Capelo (University of São Paulo, Brazil)
Anderson C. Carniel (Federal University of São Carlos, Brazil)
Cristina D. Aguiar (University of São Paulo, Brazil)

KEYNOTE ABSTRACT: Towards Quality-Aware Data Science and its Application to Health Data, concerns one of the key challenges in data science pipelines, i.e., how to ensure the quality of the results and understand their relationship with the quality of the analyzed inputs. Data collection and acquisition in several scientific domains, such
as for instance in health and life sciences domains, tend to often introduce errors into the data, such as violations of business rules, typos, missing values and replicated entries. As an example, data collected for the patients’ signals might exhibit peculiar features that need to be taken into account within the analytical and inference processes. Patient’s healthcare records might exhibit inconsistencies that need to be properly quantified and accounted for in the subsequent analysis.

We present our latest results on enhancing the quality of querying and inference processes on scientific data and beyond. Among the others, we operate on real-life data of patients from several hospitals in the EU and provide the domain experts with useful data science and learning techniques that can help them with their diagnoses and analyses.
First, inconsistency-aware annotations can improve the input to analytical processes. These annotations are further exploited during query processing in order to enhance the output of queries with inconsistency degrees. Second, feature-based similarities among time series corresponding to patients’ signals help to better identify groups of patients and to assess their risks. Third, logic-based declarative privacy-preserving data integration allows to migrate clinical data between hospitals while ensuring privacy guarantees. In all cases, our research aims at providing the caregivers with a better understanding of their clinical data thanks to the improved outcomes of the performed analytical and data science processes.

BIO: Angela Bonifati (IEEE Senior Member), is a Professor of Computer Science at Lyon 1 University and at the CNRS Liris research lab (France), where she leads the Database Group. Angela Bonifati is also an Adjunct Professor at the University of Waterloo (Canada) in the Data Systems Group. Prior to that, she was working as a Professor at Lille 1 University, France (2011-2015) and as a researcher at CNR, Italy until 2011. She received her Ph.D. from Politecnico di Milano in 2002.

Her current research interests are on the interplay between relational and graph-oriented data paradigms, particularly query processing, data integration and learning for both structured and unstructured data models and their impact on real-world scientific data science applications. She is involved in several grants at Lyon 1 University, including French, EU H2020 and industrial grants. She has also co-authored more than 150 publications in top venues of the data management field along with two books (“Schema Matching and Mapping” edited by Springer in 2011 and “Querying Graphs” edited by Morgan Claypool in 2018) and an invited paper in ACM Sigmod Record 2018. She was the Program Chair of ACM Sigmod 2022 and is currently an Associate Editor for both Proceedings of VLDB and IEEE ICDE. She is also an Associate Editor for several journals, such as the VLDB Journal, ACM TODS, Distributed and Parallel Databases and Frontiers in Big Data. She is currently the President of the EDBT Executive Board and was a former member of the ICDT council.

IMPORTANT DATES

Paper submission: ~~May 3, 2022~~ ~~June 1, 2022~~ June 6, 2022 (extended) at 2 p.m. CET (14h)
Notification of acceptance: ~~May 23,~~ ~~June 24~~ June 27 2022 (extended)
Camera-ready due: ~~June 7,~~ July 5 2022 (extended)
Workshop day: September 5, 2022

Aims and scope.

Text are important sources of information and communication in diverse domains. The intelligent, efficient and secure use of this information requires, in most cases, the transformation of unstructured textual data into data sets with some structure, and organized according to an appropriate schema that follows the semantics of an application domain. Indeed, solving the problems of modern society requires interdisciplinary research and information cross-referencing, thus surpassing the simple provision of unstructured data. There is a need for representations that are more flexible, subtle and context-sensitive, which can also be easily accessible via consultation tools and evolve according to these principles. In this context, consultation requires robust and efficient processing of requests, which may involve information analysis, with quality, consistency, and privacy preservation guarantees. Knowledge bases can be built as these new generation infrastructures which support data science queries on a user-friendly framework and are capable of providing the required machinery for advised decision-making.

The workshop focuses on transforming data into information and then into knowledge. The idea is to gather researchers in NLP (Natural Language Processing), DB (Databases), and AI (Artificial Intelligence) to discuss two main problems :

how to extract information from textual data and represent it in knowledge bases;
how to propose intelligent methods for handling and maintaining these databases with new forms of requests, including efficient, flexible, and secure analysis mechanisms, adapted to the user, and with quality and privacy preservation guarantees.

This workshop focuses on all aspects concerning these modern infrastructures, giving particular attention (but not limited to) to data related to health and environmental domains.

KEYNOTE: Angela Bonifati (Lyon 1 University)

Topics of interest.

We invite the submission of work-in-progress that address various aspects of information extraction from textual data, intelligent and efficient interrogation, and maintenance of knowledge bases. The workshop welcomes submissions of theoretical, technical, experimental, methodological papers, application papers, position papers and papers on experience reports addressing – though not limited to – the following topics:

Artificial intelligence in databases and information systems
Data curation, annotation, and provenance
Data management and analytics
Data mining and knowledge discovery
Data models and query languages
Data quality and data cleansing
Data science (theory and techniques)
Context-aware and adaptive information systems
Constraints extraction from text
Natural language processing
Indexing, query processing and optimization
Information and knowledge extraction
Information integration
Information quality
Graph databases
Knowledge bases (querying, management, evolution and dynamics)
Machine learning for knowledge graph construction, completion, refinement
Machine learning for knowledge and information extraction, for instance, named entity disambiguation, sentiment analysis, relation extraction, or the detection of claims, facts and stances from unstructured documents
Machine Learning in NLP
Management of large volumes of data
Methodologies, models, algorithms, and architectures for applied data science
NLP for Digital Humanities
NLP & Knowledge Graphs
Privacy, trust and security in databases
Query processing and optimization
Question answering over knowledge graphs
Text databases

Preferred Application Domains (but not limited to).

Bio-sciences and healthcare
Environmental issues

Submissions

Papers must be submitted via EASY CHAIR : HERE ( track for DOING: 3rd Workshop on Intelligent Data)

DOING workshop intends to accept short (limited to 6-8 pages) and long (limited to 12 pages) papers. DOING reserves the right to accept as short papers those submitted as long, describing interesting and innovative ideas but still requiring further technical development. Papers should be written in English, formatted in Latex and present substantially original results. We adopt a double blind review policy: the papers submitted for review MUST NOT contain the authors’ names, affiliations, or any information that may disclose the authors’ identity. Authors should consult Springer’s authors’ guidelines and use their proceedings templates (you can download the templates available on the bottom of that page).

ADBIS 2022 follows a Diversity and Inclusion policy that invites authors to adopt inclusive language in their papers and presentations (https://dbdni.github.io/pages/inclusivewriting.html and https://dbdni.github.io/pages/inclusivetalks.html). We also kindly ask all participants to adopt a proper code on conduct (https://dbdni.github.io/pages/codeofconduct.html).

Accepted papers will be published in the Springer CCIS series and the best papers will be invited to a special issue of the journal Computer Science and Information Systems.

Program Committee.

Ahmed Awad (University of Tartu, Estonia)
Cheikh Ba (UGB – Université Gaston Berger, Senegal)
Davide Buscaldi (LIPN, Université Sorbonne Paris Nord, France)
Karin Becker (UFRGS – Universidade Federal do Rio Grande do Sul, Brazil)
Javam de Castro Machado (UFC – Universidade Federal do Ceará, Brazil)
Laurent d’Orazio (IRISA, Université de Rennes, France)
Danilo Giordano (Politecnico di Torino, Italy)
Sven Groppe (University of Lubeck, Germany)
Nicolas Hiot (Université d’Orléans, France)
Jixue Liu (University of South Australia, Australia)
Wagner Machado Nunan Zola (UFPR – Universidade Federal do Paraná, Brazil)
Anne-Lyse Minard-Forst (LLL, Université d’Orléans, France)
Dung Viet Nguyen Nghiem (LIFO, Université d’Orléans, France)
Agata Savary (LIFAT, Université de Tours, France)
Rebecca Schroeder Freitas (UDESC, Universidade Estadual de Santa Catarina, Brazil)
Aurora Trinidad Ramirez Pozo (UFPR – Universidade Federal do Paraná, Brazil)
Domagoj Vrgoc, Pontificia Universidad Católica de Chile.