Abstract. 
  Joins constitute a computation-intensive part of database request and 
 their parallelization is necessary for dealing with very large data sets. 
 However, effectiveness of parallel joins depends on the ability to evenly 
 divide load among processors. Research has shown that the join operation is
 parallelizable with near-linear speed-up on shared nothing machines only 
 under ideal balancing conditions. Data skew can have a disastrous effect on
 performance. Although many skew-handling algorithms have been proposed they
 remain generally inefficient in the case of multi-joins due to join product
 skew, costly and unnecessary redistribution and communication costs.  
 In this paper, we introduce a new parallel join algorithm with optimal
 complexity and deterministic perfect balancing.  Its predictably low 
 join-product and attribute-value skews makes it suitable for repeated 
 use in multi-join operations. Its tradeoff between balancing overhead 
 and speedup is analyzed using the scalable and portable BSP (Bulk 
 Synchronous parallel) cost model which predicts a negligible join product 
 skew and a near-linear speed-up.  The algorithm presented here, improves 
 our fa_join and sfa_join algorithms by reducing their communication and 
 synchronisation cost to a minimum while offering the same load balancing 
 properties. 
   
Key words. PDBMS (Parallel Database Management Systems), Intra-transaction 
 parallelism, Parallel joins, Multi-joins, Data-skew, Join-product skew, 
 Dynamic load balancing.