Output list
Journal article
HARD: A performance portable radiation hydrodynamics code based on FleCSI framework
First online publication 11/10/2025
SoftwareX, 32, 102441
Conference proceeding
FleCSI 2.0: The Flexible Computational Science Infrastructure Project
Published 01/01/2022
EURO-PAR 2021: PARALLEL PROCESSING WORKSHOPS, 13098, 480 - 495
The FleCSI 2.0 programming system supports multiphysics application development through a runtime abstraction layer, and by providing core topology types that can be customized for specific numerical methods. The abstraction layer provides a single-source programming interface for distributed and shared-memory data parallelism through task and kernel execution, and has been demonstrated to introduce virtually no runtime overhead. FleCSI's core topology types represent a rich set of basic data structures that can be specialized to create application-facing interfaces for a variety of different physics packages. Using the FleCSI control and data models, it is straightforward to compose multiple packages to create full multiphysics applications. When used with a task-based backend, FleCSI offers extended runtime analysis that can increase task concurrency, facilitate load balancing, and allow for portability across heterogeneous computing architectures.
Journal article
Published 07/2020
SoftwareX, 12, 100602
Conference proceeding
FleCSPH: a Parallel and Distributed Smoothed Particle Hydrodynamics Framework Based on FleCSI
Published 01/01/2018
PROCEEDINGS 2018 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 484 - 491
FleCSPH is a complement to the FleCSI framework, focusing on tree data structures with support for binary trees, quadtrees, and octrees. The framework provides parallel, distributed, and accelerated tree construction and search in the context of multi-physics problems. FleCSI is a compile-time configurable framework designed to support multi-physics applications and is developed and maintained by Los Alamos National Laboratory. FleCSI provides domain scientists with a set of data structures and tools to target parallel and distributed architectures on current and future supercomputers, including the ongoing 2020 target to support the first Exascale supercomputers. Our work on FleCSPH is based on Smoothed Particle Hydrodynamics (SPH), a method that emphasizes different walls in HPC. This method can be solved efficiently using binary trees, quadtrees, and octrees, while exhibiting irregularities in computation and communication. This paper is organized as follows. The introduction describes the SPH method and the reasons that make it a good test case for the FleCSPH framework, and gives more details on the FleCSI framework. The second part is dedicated to the tree data structure itself and the choices we made for the domain decomposition, tree construction, and search; it also describes our distribution strategies and their reliability to the FleCSI model. The third part describes our test cases and the current results of the application. The test cases are the Sod shock tube, the Sedov blast, and 2D/3D fluid flows.
Conference proceeding
Taking Lessons Learned from a Proxy Application to a Full Application for SNAP and PARTISN
Published 01/01/2017
INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE (ICCS 2017), 108, 555 - 565
SNAP is a proxy application which simulates the computational motion of a neutral particle transport code, PARTISN. In this work, we have adapted parts of SNAP separately; we have re-implemented the iterative shell of SNAP in the task-model runtime Legion, showing an improvement to the original schedule, and we have created multiple Kokkos implementations of the computational kernel of SNAP, displaying similar performance to the native Fortran. We then translate our Kokkos experiments in SNAP to PARTISN, necessitating engineering development, regression testing, and further thought. (C) 2017 The Authors. Published by Elsevier B.V.
Journal article
Published 03/2016
Physics of Plasmas, 23, 3, 032703
Journal article
Published 01/01/2016
Computer Physics Communications, 198, C, 47 - 58
We describe a spectral method for the numerical solution of the Vlasov-Poisson system where the velocity space is decomposed by means of an Hermite basis, and the configuration space is discretized via a Fourier decomposition. The novelty of our approach is an implicit time discretization that allows exact conservation of charge, momentum and energy. The computational efficiency and the cost-effectiveness of this method are compared to the fully-implicit PIC method recently introduced by Markidis and Lapenta (2011) and Chen et al. (2011). The following examples are discussed: Langmuir wave, Landau damping, ion-acoustic wave, two-stream instability. The Fourier Hermite spectral method can achieve solutions that are several orders of magnitude more accurate at a fraction of the cost with respect to PIC. (C) 2015 Elsevier B.V. All rights reserved.
Journal article
Stimulated scattering in laser driven fusion and high energy density physics experiments
Published 09/01/2014
Physics of Plasmas, 21, 9, 092707
In laser driven fusion and high energy density physics experiments, one often encounters a kλ_D range of 0.15 < kλ_D < 0.5, where stimulated Raman scattering (SRS) is active (k is the initial electron plasma wave number and λ_D is the Debye length). Using particle-in-cell simulations, the SRS reflectivity is found to scale as ~(kλ_D)^(-4) for kλ_D ≥ 0.3, where electron trapping effects dominate SRS saturation; the reflectivity scaling deviates from the above for kλ_D < 0.3, when Langmuir decay instability (LDI) is present. The SRS risk is shown to be highest for kλ_D between 0.2 and 0.3. SRS re-scattering processes are found to be unimportant under conditions relevant to ignition experiments at the National Ignition Facility (NIF). Large-scale simulations of the hohlraum plasma show that the SRS wavelength spectrum peaks below 600 nm, consistent with most measured NIF spectra, and that nonlinear trapping in the presence of plasma gradients determines the SRS spectral peak. Collisional effects on SRS, stimulated Brillouin scattering (SBS), LDI, and re-scatter, together with three dimensional effects, are examined. Effects of collisions are found to include de-trapping as well as cross-speckle electron temperature variation from collisional heating, the latter of which reduces gain, introduces a positive frequency shift that counters the trapping-induced negative frequency shift, and affects SRS and SBS saturation. Bowing and breakup of ion-acoustic wavefronts saturate SBS and cause a dramatic, sharp decrease in SBS reflectivity. Mitigation of SRS and SBS in the strongly nonlinear trapping regime is discussed. (C) 2014 AIP Publishing LLC.
Journal article
Published 01/01/2013
Physics of Plasmas, 20, 1, 012702
Nonlinear physics governing the kinetic behavior of stimulated Raman scattering (SRS) in multi-speckled laser beams has been identified in the trapping regime over a wide range of kλ_D values (here k is the wave number of the electron plasma waves and λ_D is the Debye length) in homogeneous and inhomogeneous plasmas. Hot electrons from intense speckles, both forward and side-loss hot electrons produced during SRS daughter electron plasma wave bowing and filamentation, seed and enhance the growth of SRS in neighboring speckles by reducing Landau damping. Trapping-enhanced speckle interaction through transport of hot electrons, backscatter, and sidescatter SRS light waves enables the system of speckles to self-organize and exhibit coherent, sub-ps SRS bursts with more than 100% instantaneous reflectivity, resulting in an SRS transverse coherence width much larger than a speckle width and an SRS spectrum that peaks outside the incident laser cone. SRS reflectivity is found to saturate above a threshold laser intensity at a level of reflectivity that depends on kλ_D: higher kλ_D leads to lower SRS, and the reflectivity scales as ~(kλ_D)^(-4). As kλ_D and Landau damping increase, speckle interaction via sidescattered light and side-loss hot electrons decreases and the occurrence of self-organized events becomes infrequent, leading to the reduction of time-averaged SRS reflectivity. It is found that the inclusion of a moderately strong magnetic field in the laser direction can effectively control SRS by suppressing transverse speckle interaction via hot electron transport. (C) 2013 American Institute of Physics. [http://dx.doi.org/10.1063/1.4774964]
Conference proceeding
Poster: The Hashed Oct-Tree N-Body Algorithm at a Petaflop
Published 11/2012
2012 SC Companion: High Performance Computing, Networking Storage and Analysis, 1442 - 1442
We have recently demonstrated our hashed oct-tree N-body code (HOT) scaling to 256k processors on Jaguar at Oak Ridge National Laboratory with a performance of 1.79 Petaflops (single precision) on 2 trillion particles. We have additionally performed preliminary studies with NVIDIA Fermi GPUs, achieving single GPU performance on our hexadecapole inner loop near 1 Tflop (single precision) and application performance speedup of 2x by offloading the most computationally intensive part of the code to the GPU.