Abstract. Joins constitute a computation-intensive part of database request and their parallelization is necessary for dealing with very large data sets. However, effectiveness of parallel joins depends on the ability to evenly divide load among processors. Research has shown that the join operation is parallelizable with near-linear speed-up on shared nothing machines only under ideal balancing conditions. Data skew can have a disastrous effect on performance. Although many skew-handling algorithms have been proposed they remain generally inefficient in the case of multi-joins due to join product skew, costly and unnecessary redistribution and communication costs. In this paper, we introduce a new parallel join algorithm with optimal complexity and deterministic perfect balancing. Its predictably low join-product and attribute-value skews makes it suitable for repeated use in multi-join operations. Its tradeoff between balancing overhead and speedup is analyzed using the scalable and portable BSP (Bulk Synchronous parallel) cost model which predicts a negligible join product skew and a near-linear speed-up. The algorithm presented here, improves our fa_join and sfa_join algorithms by reducing their communication and synchronisation cost to a minimum while offering the same load balancing properties. Key words. PDBMS (Parallel Database Management Systems), Intra-transaction parallelism, Parallel joins, Multi-joins, Data-skew, Join-product skew, Dynamic load balancing.