Website of the CA team


Research internship with Involvd project

5 or 6-month internship
Multiple Constrained Clustering.
LIFO - University of Orléans - France
Contact: Christel Vrain
christel.vrain@univ-orleans.fr

 


Supervisors: Christel Vrain, Marcilio Pereira de Souto, Thi-Bich-Hanh Dao

This master's internship is part of the national project InvolvD, supported by the ANR (Agence Nationale de la Recherche), starting on 1 February 2020.


Clustering is a type of unsupervised learning whose goal is to find the underlying structure present in the data, for example a partition (clustering) composed of groups (clusters). Observations belonging to the same cluster should share some relevant property (similarity) with respect to the data domain. Integrating prior knowledge can help guide the process toward a clustering closer to the expert's needs. This knowledge can take the form of pairwise constraints, such as must-link or cannot-link constraints, expressing that two points should be, resp. cannot be, in the same cluster, or of constraints on the clusters (for instance their size, diameter, ...). This has led to a new research area called Constrained Clustering, and many methods have already been developed for integrating constraints into a clustering algorithm. Some of them are dedicated to one kind of constraint; others are generic, usually based on declarative frameworks such as Integer Linear Programming, Constraint Programming, or SAT.
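
As an illustration of the pairwise constraints described above, here is a minimal Python sketch that checks a given clustering against must-link and cannot-link constraints. The function name `violated_constraints` and the toy data are assumptions made for this example only; they do not come from the InvolvD project.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def violated_constraints(
    labels: Sequence[int],
    must_link: Sequence[Pair],
    cannot_link: Sequence[Pair],
) -> List[Tuple[str, int, int]]:
    """Return the pairwise constraints that a clustering does not satisfy.

    labels[i] is the cluster index assigned to point i (hypothetical encoding).
    """
    violations = []
    # A must-link pair is violated when its two points are in different clusters.
    for i, j in must_link:
        if labels[i] != labels[j]:
            violations.append(("must-link", i, j))
    # A cannot-link pair is violated when its two points share a cluster.
    for i, j in cannot_link:
        if labels[i] == labels[j]:
            violations.append(("cannot-link", i, j))
    return violations


# Toy example: five points grouped into three clusters.
labels = [0, 0, 1, 1, 2]
must_link = [(0, 1), (2, 4)]
cannot_link = [(1, 2), (2, 3)]
print(violated_constraints(labels, must_link, cannot_link))
# [('must-link', 2, 4), ('cannot-link', 2, 3)]
```

A constrained clustering method would search for a partition for which this list is empty; the sketch only verifies a partition that is already given.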


Instead of presenting a single clustering on which the user can give feedback, one can present her with several partitions and let her pick one, or merge those that show desired characteristics in some of their parts. In this internship, we are interested in integrating feedback given by an expert on several partitions. To do this, we will need to develop two aspects.

  1. Interpretability: we are interested in applications (chemo-informatics) where data is represented by discrete descriptors. To help the expert give relevant feedback, we will develop approaches that highlight the differences and similarities between pairs of clusters and thus provide interpretations, with the level of explanation depending on the structural or semantic information available.
  2. Merging different clusterings under constraints given by the expert: there might not be a single preferred clustering but several that should be merged into a consensus partition while satisfying certain constraints. We will consider purely declarative methods, which guarantee to find a consensus partition satisfying all the constraints, at the cost of efficiency.
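
To make the idea of a constrained consensus partition concrete, here is a small Python sketch under explicit assumptions. It enumerates every partition of a tiny data set, discards those that violate the expert's constraints, and keeps the one with the fewest pairwise disagreements with the input clusterings. The exhaustive enumeration stands in for a declarative solver (ILP, CP or SAT), and the disagreement count is one possible consensus criterion among others; neither is claimed to be the method used in the project, and all names are hypothetical.

```python
from itertools import combinations


def restricted_growth_strings(n):
    """Yield every partition of n points as a canonical list of labels."""
    def extend(prefix, max_label):
        if len(prefix) == n:
            yield list(prefix)
            return
        # A point may join an existing cluster or open exactly one new cluster.
        for label in range(max_label + 2):
            prefix.append(label)
            yield from extend(prefix, max(max_label, label))
            prefix.pop()
    yield from extend([], -1)


def pair_disagreements(a, b):
    """Count the pairs of points grouped together in one partition but not the other."""
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )


def constrained_consensus(partitions, must_link=(), cannot_link=()):
    """Exhaustive search: exact, but only feasible for a handful of points."""
    n = len(partitions[0])
    best, best_cost = None, None
    for candidate in restricted_growth_strings(n):
        if any(candidate[i] != candidate[j] for i, j in must_link):
            continue
        if any(candidate[i] == candidate[j] for i, j in cannot_link):
            continue
        cost = sum(pair_disagreements(candidate, p) for p in partitions)
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


# Three input clusterings of four points (toy data).
partitions = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 0, 1, 1]]
print(constrained_consensus(partitions))
# ([0, 0, 1, 1], 3)
print(constrained_consensus(partitions, cannot_link=[(2, 3)]))
# ([0, 0, 1, 2], 4)
```

The second call shows the trade-off mentioned above: the expert's cannot-link constraint is guaranteed to hold, but the resulting consensus agrees less with the input clusterings, and the search visits every partition, which quickly becomes intractable as the number of points grows.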

This internship aims specifically at:

  • Producing a review of the state-of-the-art on ensemble methods and on constrained ensemble clustering
  • Proposing explanations on the multiple clusterings
  • Proposing and testing novel (or improved) constrained ensemble methods

Required skills:
- Experience in machine learning, data mining, computer programming or applied mathematics is highly appreciated.
- French and/or English are the working languages.

Candidates are encouraged to contact us as soon as possible. Start is expected on February 1st, 2020. The complete application consists of the documents below, which should be sent as a single PDF file to Christel Vrain (christel.vrain@univ-orleans.fr):

  • CV
  • One-page cover letter (clearly indicating available start date as well as relevant qualifications, experience and motivation)
  • University certificates and transcripts (marks for both B.Sc. and M.Sc. degrees)
  • Contact details of up to three referees
  • Possibly an English language certificate and a list of publications
  • Note: all documents should be in English or French.