  Cornell University

MAE Publications and Papers

Sibley School of Mechanical and Aerospace Engineering

New article: Computationally-efficient and scalable parallel implementation of chemistry in simulations of turbulent combustion

Article: Hiremath V, Lantz SR, Wang HF and Pope SB (2012). “Computationally-efficient and scalable parallel implementation of chemistry in simulations of turbulent combustion.” Combustion and Flame 159(10): 3096-3109.

Abstract: Large scale combined Large-Eddy Simulation (LES)/Probability Density Function (PDF) parallel computations of reactive flows with detailed chemistry involving large numbers of species and reactions are computationally expensive. Among the various techniques used to reduce the computational cost of representing chemistry, the three approaches in widest use are: (1) mechanism reduction, (2) dimension reduction, and (3) tabulation. In addition to these approaches, in large scale parallel LES/PDF computations, we need strategies to distribute the chemistry workload among the participating cores to reduce the overall wall clock time of the computations. Here we present computationally-efficient strategies for implementing chemistry in parallel LES/PDF computations using in situ adaptive tabulation (ISAT) and x2f_mpi, a Fortran library for parallel vector-valued function evaluation (used with ISAT in this context). To test the strategies, we perform LES/PDF computations of the Sandia Flame D with chemistry represented using (a) a 16-species augmented reduced mechanism; and (b) a 38-species C1-C4 skeletal mechanism. We present three parallel strategies for redistributing the chemistry workload, namely (a) PLP, purely local processing; (b) URAN, the uniform random distribution of chemistry computations among all cores following an early stage of PLP; and (c) P-URAN, a Partitioned URAN strategy that redistributes the workload only among partitions or subsets of the cores. We show that among these three strategies, the P-URAN strategy (i) yields the lowest wall clock time, which is within a factor of 1.5 and 1.7 of estimates for the lowest theoretically achievable wall clock time for the 16-species and 38-species mechanisms, respectively; and (ii) for reaction, achieves a relative weak scaling efficiency of about 85% when scaling from 2304 to 9216 cores and a relative strong scaling efficiency of over 60% when scaling from 1152 to 6144 cores.
(c) 2012 The Combustion Institute. Published by Elsevier Inc. All rights reserved.
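
To make the difference between the three redistribution strategies named in the abstract concrete, here is a minimal Python sketch of a toy load-balancing model. It is not the x2f_mpi library or the authors' implementation. It assumes every chemistry evaluation costs one unit of time, ignores communication cost and ISAT table effects (the factors that, per the abstract, make P-URAN faster than URAN in practice), and uses hypothetical function names (`plp`, `uran`, `p_uran`, `wall_time`). It shows only how each strategy changes the most heavily loaded core, which is the proxy used here for wall clock time.

```python
import random


def plp(loads):
    """Purely local processing: every core keeps its own work items."""
    return list(loads)


def p_uran(loads, partition_size, rng):
    """Toy Partitioned URAN: each work item is reassigned uniformly at
    random to a core in the same partition (a contiguous block of cores).
    Assumption: unit cost per item, no communication cost."""
    new_loads = [0] * len(loads)
    for start in range(0, len(loads), partition_size):
        members = range(start, min(start + partition_size, len(loads)))
        for core in members:
            for _ in range(loads[core]):
                new_loads[rng.choice(members)] += 1
    return new_loads


def uran(loads, rng):
    """Toy URAN: the same redistribution with one partition of all cores."""
    return p_uran(loads, len(loads), rng)


def wall_time(loads):
    """Proxy for wall clock time: the busiest core finishes last."""
    return max(loads)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical imbalance: 2 of 16 cores hold most of the chemistry work.
    loads = [1000, 1000] + [10] * 14
    print("PLP   :", wall_time(plp(loads)))
    print("P-URAN:", wall_time(p_uran(loads, 4, rng)))
    print("URAN  :", wall_time(uran(loads, rng)))
```

In this idealized model URAN always balances best, because moving work is free. The trade-off reported in the abstract arises only once the cost of communicating among all cores is taken into account, which this sketch deliberately leaves out.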