Lifo - Laboratoire d'Informatique Fondamentale d'orléans INSA Centre Val de Loire Université d'Orléans Université d'Orléans

Lifo > LIFO Research Report for Year 2014

 Site en Français



Contact

LIFO - Bâtiment IIIA
Rue Léonard de Vinci
B.P. 6759
F-45067 ORLEANS Cedex 2

Email: contact.lifo
Tel: +33 (0)2 38 41 70 11
Fax: +33 (0)2 38 41 71 37



Go to Research Reports of Year : 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017

LIFO Research Report for Year 2014


RR-2014-01 Scalable and Skew-insensitive Algorithm for Join Operations using Map/Reduce Model
Mostafa Bamha and Frédéric Loulergue
Date : 2014-01-15
Abstract Download

RR-2014-02 Towards more Precise Rewriting Approximations (full version)
Yohan Boichut, Jacques Chabin and Pierre Réty
Date : 2014-10-06
Abstract Download

RR-2014-03 Unsupervised Dependency Parsing, a new PCFG
Marie Arcadias, Guillaume Cleuziou, Edmond Lassalle, Christel Vrain
Date : 2014-09-26
Abstract Download


Abstracts of Research Reports


RR-2014-01 Scalable and Skew-insensitive Algorithm for Join Operations using Map/Reduce Model
Mostafa Bamha and Frédéric Loulergue
Abstract : For over a decade, Map/Reduce has become a prominent programming model to handle vast amounts of raw data in large scale systems. This model ensures scalability, reliability and availability aspects with reasonable query processing time. However these large scale systems still face some challenges: data skew, task imbalance, high disk i/o and redistribution costs can have disastrous effects on performance. In this paper, we introduce MRFA-Join algorithm: a new Frequency Adaptive algorithm based on Map/Reduce Programming model and distributed histograms for join processing on large-scale datasets. A cost analysis of this algorithm shows that our approach is insensitive to data skew and ensures perfect balancing properties during all stages of join computation. Performances have been experimented on Grid'5000 infrastructure.
Keywords : Join operations, Data skew, Map/Reduce model, Hadoop framework

RR-2014-02 Towards more Precise Rewriting Approximations (full version)
Yohan Boichut, Jacques Chabin and Pierre Réty
Abstract : To check a system, some verification techniques consider a set of terms I that represents the initial configurations of the system, and a rewrite system R that represents the system behavior. To check that no undesirable configuration is reached, they compute an over-approximation of the set of descendants (successors) issued from I by R, expressed by a tree language. Their success highly depends on the quality of the approximation. Some techniques have been presented using regular tree languages, and more recently using non-regular languages to get better approximations: using context-free tree languages on the one hand, using synchronized tree languages on the other hand. In this paper, we merge these two approaches to get even better approximations: we compute an over-approximation of the descendants, using synchronized-context-free tree languages expressed by logic programs. We give several examples for which our procedure computes the descendants in an exact way, whereas the former techniques compute a strict over-approximation.
Keywords : term rewriting, tree languages, logic programming, reachability.

RR-2014-03 Unsupervised Dependency Parsing, a new PCFG
Marie Arcadias, Guillaume Cleuziou, Edmond Lassalle, Christel Vrain
Abstract : Dependency learning aims at building a model that allows transforming textual sentences into trees representing a syntactical hierarchy between the words of the sentence. We present an intermediate model between full syntactic parsing of a sentence and bags of words. It is based on a very light probabilistic context free grammar, allowing to express dependencies between the words of a sentence. Our model can be tuned a little depending on the language. Experimentally, we were able to surpass the scores of the DMV reference on attested benchmarks for five over ten languages, such as English, Portuguese or Japanese. We give the first results on French corpora. Learning is very fast and parsing is almost instantaneous.
Keywords : Natural Language Processing, syntactic dependency trees, unsupervised learning of grammars, Inside-Outside, CYK, Context-free grammars