Changes between Version 10 and Version 11 of workshop


Timestamp:
Jun 12, 2014, 2:58:04 PM
Author:
frederic.loulergue@…
Comment:

--

  • workshop

    v10 v11  
    2323* 9h15-9h30: Introduction 
    2424 
    25 ==== Session: Algorithmic skeleton libraries I (9h30-12h30) ====  
    26  
    27 * 9h30-10h30 ''Invited Talk'': Joel Falcou, '''Costless Software Abstractions for Parallel Architectures'''[[BR]]Performing large, intensive or non-trivial computing on array-like data structures is one of the most common tasks in scientific computing, video game development and other fields. This fact is backed up by the large number of tools, languages and libraries to perform such tasks. If we restrict ourselves to C++ based solutions, more than a dozen such libraries exist, from BLAS/LAPACK C++ bindings to template meta-programming based Blitz++ or Eigen. While all of these libraries provide good performance or good abstraction, none of them seems to fit the needs of so many different user types. Moreover, as parallel system complexity grows, the need to maintain all those components quickly becomes unwieldy. This talk explores various software design techniques and their application to the implementation of a parallel computing library in such a way that:[[BR]]- abstraction and expressiveness are maximized through the use of Parallel Skeletons[[BR]]- cost over efficiency is minimized thanks to Generative Programming[[BR]]- architecture-specific hints are used throughout the whole library thanks to architecture-aware tag dispatching[[BR]]We'll skim over various applications and see how they can benefit from such tools. We will conclude by discussing what lessons were learnt from this kind of implementation and how those lessons can translate into new directions for the C++ language itself and for the design of future Parallel Skeletons.  
    28  
    29 * 10h30-10h50: '''Coffee Break''' 
    30  
    31 * 10h50-11h30: Kento Emoto, Kiminori Matsuzaki, '''The !SkeTo Library'''[[BR]]The !SkeTo (Skeletons in Tokyo) library is a library of algorithmic skeletons, originally designed to allow users to describe parallel computations in a sequential manner and implemented in C++ on top of MPI.  It provides three distributed data structures for lists (1D-arrays), matrices (2D-arrays) and trees, as well as skeletons for their manipulation.  Recent work on the !SkeTo library has focused on its automatic optimization mechanism, which uses meta-programming with C++ templates.  In this talk, we will outline the !SkeTo library and show how we integrated the optimization mechanism into the algorithmic skeletons.  
    32  
    33 * 11h30-12h10: Joeffrey Légaux, Noman Javed, Sylvain Jubertie, and Frédéric Loulergue, '''OSL: The Orléans Skeleton Library'''[[BR]]Structured parallel models such as algorithmic skeletons offer a global view of the parallel program in contrast with the fragmented view of the SPMD style. This makes programs easier for users to write and read, and offers additional opportunities for optimisation by the libraries, compilers and/or run-time systems. Algorithmic skeletons are or can be seen as patterns or higher-order functions implemented in parallel, often manipulating distributed data structures. Orléans Skeleton Library (OSL) is a library of parallel algorithmic skeletons, written in C++ on top of MPI, which uses meta-programming techniques for optimisation. This talk will present the recent work on OSL: skeletons used to manage arbitrary distributions of distributed arrays, support for BSP homomorphisms, and an exception mechanism that ensures the global coherence of the system after exceptions are caught. 
    34  
    35 ==== Lunch (12h10-13h50)==== 
    36  
    37 ==== Session: Algorithmic skeleton libraries II (13h50-15h10) ====  
    38  
    39 * 13h50-14h30: Shigeyuki Sato, Kiminori Matsuzaki, '''A Generic Implementation of Tree Skeletons'''[[BR]]In data-parallel skeleton libraries, the implementation of skeletons is usually tightly coupled with that of data structures. However, a loose coupling between the two, as in the C++ STL, would improve the modularity and flexibility of skeletons and data structures.  This flexibility is particularly valuable for tree skeletons.  To achieve such a loose coupling, we present an iterator-based interface of trees for tree skeletons.  We have implemented tree skeletons on the basis of our interface; we present their design and implementation.  This paper also reports the results of preliminary experiments.  
    40  
    41 * 14h30-15h10: Wadoud Bousdira, Frédéric Loulergue, Julien Tesson, Vitor Rodrigues, and Sylvain Dailler, '''A Verified Library of Algorithmic Skeletons on Evenly Distributed Arrays'''[[BR]]To make parallel programming as widespread as parallel architectures, more structured parallel programming paradigms are necessary. One possible approach is algorithmic skeletons, which are abstract parallel patterns. They can be seen as higher-order functions implemented in parallel. Algorithmic skeletons offer a simple interface to the programmer without all the details of parallel implementations as they abstract the communications and the synchronisations of parallel activities. To write a parallel program, users have to combine and compose the skeletons. Orléans Skeleton Library (OSL) is an efficient meta-programmed C++ library of algorithmic skeletons that manipulate distributed arrays. A prototype implementation of OSL exists as a library written with the functional parallel language Bulk Synchronous Parallel ML. In this paper we are interested in verifying the correctness of a subset of this prototype implementation. To do so, we give a functional specification (i.e. without the parallel details) of a subset of OSL and we prove the correctness of the BSML implementation with respect to this functional specification, using the Coq proof assistant.  
    42  
    43 * 15h10-15h30: '''Coffee Break''' 
    44  
    45 ==== Session: Verified compilation (15h30-18h10) ==== 
    46  
    47 * 15h30-16h30: ''Invited Speaker'' Francesco Zappa Nardelli, '''Languages definitions and verified compilation for shared memory concurrency''' 
    48  
    49 * 16h30-16h50: '''Coffee Break''' 
    50  
    51 * 16h50-17h30: Thomas Pinsard, Frédéric Dabrowski, Frédéric Loulergue, '''Nested Atomic Sections with Thread Escape: From a Formal Definition to Verified Compilation'''[[BR]]We consider a simple imperative language with fork/join parallelism and lexically scoped nested atomic sections from which threads can escape. In this context, our contribution is the precise definition of atomicity, well-synchronisation on execution traces and the proof that the latter implies the strong form of the former. Then we define the formal operational semantics of this language that satisfies these specifications. 
    52  
    53 * 17h30-18h10: Sylvain Dailler, Frédéric Dabrowski, '''Modular Verified Compilation for Parallel Languages'''[[BR]]We will present our attempts at providing extensions to an existing verified compiler of parallel languages. These extensions were designed to allow high-level synchronization primitives and therefore be a possible target for the compilation of algorithmic skeletons. We will show that we can adapt the semantics and proofs of correctness in a modular way, which should help us to easily change the memory model and/or the synchronization primitives of the languages (in a limited way) without changing the whole compiler specifications and proofs. 
    54  
    55 ==== Dinner ==== 
    56  
    57 === Wednesday, July 2 === 
    58  
    5925==== Session: Constructive algorithms (9h30-12h30) ==== 
    6026 
     
    6531* 10h50-11h10: '''Coffee Break''' 
    6632 
    67 * 11h10-11h50: Reina Miyazaki and Kiminori Matsuzaki, '''Parallel Tree Accumulations on !MapReduce'''[[BR]]!MapReduce is a remarkable parallel programming model as well as a parallel processing infrastructure for large-scale data processing. As !MapReduce is now widely available in cloud environments, developing methodologies or patterns for !MapReduce programming is important.  In particular, since XML is the de facto standard for representing data, processing semi-structured data is involved in many applications.  The target computational patterns in this paper are tree accumulations. Tree accumulations are shape-preserving computations over trees in which values are updated through flows over the tree.  We develop BSP algorithms for two tree accumulations as extensions of the BSP algorithm for tree reduction by Kakehi et al. (2006).  We also implemented the two-superstep algorithms as a single !MapReduce 
    68 execution.  Experimental results on a 16-node PC cluster show good speedups with factors of 10.9-12.7.  
     33* 11h10-11h50: Reina Miyazaki and Kiminori Matsuzaki, '''Parallel Tree Accumulations on !MapReduce'''[[BR]]!MapReduce is a remarkable parallel programming model as well as a parallel processing infrastructure for large-scale data processing. As !MapReduce is now widely available in cloud environments, developing methodologies or patterns for !MapReduce programming is important.  In particular, since XML is the de facto standard for representing data, processing semi-structured data is involved in many applications.  The target computational patterns in this paper are tree accumulations. Tree accumulations are shape-preserving computations over trees in which values are updated through flows over the tree.  We develop BSP algorithms for two tree accumulations as extensions of the BSP algorithm for tree reduction by Kakehi et al. (2006).  We also implemented the two-superstep algorithms as a single !MapReduce execution.  Experimental results on a 16-node PC cluster show good speedups with factors of 10.9-12.7.  
    6934 
    7035* 11h50-12h30: Frédéric Loulergue, Simon Robillard, Julien Tesson, Joeffrey Légaux, and Zhenjiang Hu. '''Formal Derivation and Extraction of a Parallel Program for the All Nearest Smaller Values Problem'''[[BR]]The All Nearest Smaller Values (ANSV) problem is an important problem for parallel programming as it can be used to solve several problems and is one of the phases of several other parallel algorithms. We formally develop by construction a functional parallel program for solving the ANSV problem using the theory of Bulk Synchronous Parallel (BSP) homomorphisms within the Coq proof assistant. The performance of the Bulk Synchronous Parallel ML program obtained from Coq is compared to a version derived without software support (pen-and-paper) and implemented using the Orléans Skeleton Library of algorithmic skeletons, and to a direct implementation (not proved correct) of the BSP algorithm of He and Huang. 
     36 
     37==== Lunch (12h30-14h00)==== 
     38 
     39==== Session: Algorithmic skeleton libraries  (14h00-18h10) ====  
     40 
     41* 14h00-15h00 ''Invited Talk'': Joel Falcou, '''Costless Software Abstractions for Parallel Architectures'''[[BR]]Performing large, intensive or non-trivial computing on array-like data structures is one of the most common tasks in scientific computing, video game development and other fields. This fact is backed up by the large number of tools, languages and libraries to perform such tasks. If we restrict ourselves to C++ based solutions, more than a dozen such libraries exist, from BLAS/LAPACK C++ bindings to template meta-programming based Blitz++ or Eigen. While all of these libraries provide good performance or good abstraction, none of them seems to fit the needs of so many different user types. Moreover, as parallel system complexity grows, the need to maintain all those components quickly becomes unwieldy. This talk explores various software design techniques and their application to the implementation of a parallel computing library in such a way that:[[BR]]- abstraction and expressiveness are maximized through the use of Parallel Skeletons[[BR]]- cost over efficiency is minimized thanks to Generative Programming[[BR]]- architecture-specific hints are used throughout the whole library thanks to architecture-aware tag dispatching[[BR]]We'll skim over various applications and see how they can benefit from such tools. We will conclude by discussing what lessons were learnt from this kind of implementation and how those lessons can translate into new directions for the C++ language itself and for the design of future Parallel Skeletons.  
     42 
     43* 15h00-15h40: Kento Emoto, Kiminori Matsuzaki, '''The !SkeTo Library'''[[BR]]The !SkeTo (Skeletons in Tokyo) library is a library of algorithmic skeletons, originally designed to allow users to describe parallel computations in a sequential manner and implemented in C++ on top of MPI.  It provides three distributed data structures for lists (1D-arrays), matrices (2D-arrays) and trees, as well as skeletons for their manipulation.  Recent work on the !SkeTo library has focused on its automatic optimization mechanism, which uses meta-programming with C++ templates.  In this talk, we will outline the !SkeTo library and show how we integrated the optimization mechanism into the algorithmic skeletons.  
     44 
     45* 15h40-16h10: '''Coffee Break''' 
     46 
     47* 16h10-16h50: Joeffrey Légaux, Noman Javed, Sylvain Jubertie, and Frédéric Loulergue, '''OSL: The Orléans Skeleton Library'''[[BR]]Structured parallel models such as algorithmic skeletons offer a global view of the parallel program in contrast with the fragmented view of the SPMD style. This makes programs easier for users to write and read, and offers additional opportunities for optimisation by the libraries, compilers and/or run-time systems. Algorithmic skeletons are or can be seen as patterns or higher-order functions implemented in parallel, often manipulating distributed data structures. Orléans Skeleton Library (OSL) is a library of parallel algorithmic skeletons, written in C++ on top of MPI, which uses meta-programming techniques for optimisation. This talk will present the recent work on OSL: skeletons used to manage arbitrary distributions of distributed arrays, support for BSP homomorphisms, and an exception mechanism that ensures the global coherence of the system after exceptions are caught. 
     48 
     49* 16h50-17h30: Shigeyuki Sato, Kiminori Matsuzaki, '''A Generic Implementation of Tree Skeletons'''[[BR]]In data-parallel skeleton libraries, the implementation of skeletons is usually tightly coupled with that of data structures. However, a loose coupling between the two, as in the C++ STL, would improve the modularity and flexibility of skeletons and data structures.  This flexibility is particularly valuable for tree skeletons.  To achieve such a loose coupling, we present an iterator-based interface of trees for tree skeletons.  We have implemented tree skeletons on the basis of our interface; we present their design and implementation.  This paper also reports the results of preliminary experiments.  
     50 
     51* 17h30-18h10: Wadoud Bousdira, Frédéric Loulergue, Julien Tesson, Vitor Rodrigues, and Sylvain Dailler, '''A Verified Library of Algorithmic Skeletons on Evenly Distributed Arrays'''[[BR]]To make parallel programming as widespread as parallel architectures, more structured parallel programming paradigms are necessary. One possible approach is algorithmic skeletons, which are abstract parallel patterns. They can be seen as higher-order functions implemented in parallel. Algorithmic skeletons offer a simple interface to the programmer without all the details of parallel implementations as they abstract the communications and the synchronisations of parallel activities. To write a parallel program, users have to combine and compose the skeletons. Orléans Skeleton Library (OSL) is an efficient meta-programmed C++ library of algorithmic skeletons that manipulate distributed arrays. A prototype implementation of OSL exists as a library written with the functional parallel language Bulk Synchronous Parallel ML. In this paper we are interested in verifying the correctness of a subset of this prototype implementation. To do so, we give a functional specification (i.e. without the parallel details) of a subset of OSL and we prove the correctness of the BSML implementation with respect to this functional specification, using the Coq proof assistant.  
     52 
     53==== Dinner ==== 
     54 
     55=== Wednesday, July 2 === 
     56 
     57==== Session: Verified compilation (10h00-12h30) ==== 
     58 
     59* 10h00-10h50: ''Invited Speaker'' Francesco Zappa Nardelli, '''Can we ever get language memory-models right ?''' 
     60 
     61* 10h50-11h10: '''Coffee Break''' 
     62 
     63* 11h10-11h50: Thomas Pinsard, Frédéric Dabrowski, Frédéric Loulergue, '''Nested Atomic Sections with Thread Escape: From a Formal Definition to Verified Compilation'''[[BR]]We consider a simple imperative language with fork/join parallelism and lexically scoped nested atomic sections from which threads can escape. In this context, our contribution is the precise definition of atomicity, well-synchronisation on execution traces and the proof that the latter implies the strong form of the former. Then we define the formal operational semantics of this language that satisfies these specifications. 
     64 
     65* 11h50-12h30: Sylvain Dailler, Frédéric Dabrowski, '''Modular Verified Compilation for Parallel Languages'''[[BR]]We will present our attempts at providing extensions to an existing verified compiler of parallel languages. These extensions were designed to allow high-level synchronization primitives and therefore be a possible target for the compilation of algorithmic skeletons. We will show that we can adapt the semantics and proofs of correctness in a modular way, which should help us to easily change the memory model and/or the synchronization primitives of the languages (in a limited way) without changing the whole compiler specifications and proofs. 
    7166 
    7267==== Lunch ====