Output list
Journal article
Breaking the mold: Overcoming the time constraints of molecular dynamics on general-purpose hardware
First online publication 02/19/2025
The Journal of Chemical Physics, 162, 7
Journal article
Quantum optimization algorithms: Energetic implications
Published 07/25/2024
Concurrency and computation, 36, 16, n/a
Since the dawn of quantum computing (QC), theoretical developments like Shor's algorithm proved the conceptual superiority of QC over traditional computing. However, such quantum supremacy claims are difficult to achieve in practice because of the technical challenges of realizing noiseless qubits. In the near future, QC applications will need to rely on noisy quantum devices that offload part of their work to classical devices. One way to achieve this is by using parameterized quantum circuits in optimization or even in machine learning tasks. The energy requirements of quantum algorithms have not yet been studied extensively. In this article, we explore several optimization algorithms using both theoretical insights and numerical experiments to understand their impact on energy consumption. Specifically, we highlight why and how algorithms like quantum natural gradient descent, simultaneous perturbation stochastic approximations or circuit learning methods, are at least 2× to 4× more energy efficient than their classical counterparts; why feedback-based quantum optimization is energy-inefficient; and how techniques like Rosalin can improve the energy efficiency of other algorithms by a factor of ≥20×. Finally, we use the NchooseK high-level programming model to run optimization problems on both gate-based quantum computers and quantum annealers. Empirical data indicate that these optimization problems run faster, have better success rates, and consume less energy on quantum annealers than on their gate-based counterparts.
Conference proceeding
Synthesis of Approximate Parametric Circuits for Variational Quantum Algorithms
Published 09/17/2023
2023 IEEE International Conference on Quantum Computing and Engineering (QCE), 2, 306 - 307
This work develops a novel approach to exploit synthesized, approximate circuits for the ansatz of variational quantum algorithms (VQA) and demonstrates its effectiveness for NchooseK, a domain-specific language supporting quantum-based solving of constraint-based problems. Synthesis is generalized to produce parametric circuits of short depth in close approximation of the original circuit offline. This removes synthesis from the critical path (online) between repeated quantum circuit executions of VQA while reducing circuit depth, thereby resulting in higher fidelity results than the baseline without synthesis. Simulation experiments indicate improvements of 98% on average. Further, experiments indicate that this approach can obtain viable solutions when the baseline could not. All of this is achieved with an average variation in circuit depth of less than 10%.
Journal article
CLC: A cross-level program characterization method
First online publication 07/20/2023
Performance Evaluation, 102354
Conference proceeding
Harnessing Extreme Heterogeneity for Ocean Modeling with Tensors
Published 02/25/2023
Proceedings of the 2nd International Workshop on Extreme Heterogeneity Solutions, 1 - 6
ExHET 23: 2nd International Workshop on Extreme Heterogeneity Solutions
Specialized processors designed to accelerate tensor operations are evolving faster than conventional processors. This trend of architectural innovations greatly benefits artificial intelligence (AI) workloads. However, it is unknown how well AI-optimized accelerators can be retargeted to scientific applications. To answer this question we explore (1) whether a typical scientific modeling kernel can be mapped efficiently to tensor operations and (2) whether this approach is portable across diverse processors and AI accelerators. In this paper we implement two versions of tracer advection in an ocean-modeling application using PyTorch and evaluate these on one CPU, two GPUs, and Google's TPU. Our findings are that scientific modeling can observe both a performance boost and improved portability by mapping key computational kernels to tensor operations.
Journal article
Quantum Algorithm Implementations for Beginners
Published 12/31/2022
ACM Transactions on Quantum Computing, 3, 4, 18
Conference proceeding
Combining Hard and Soft Constraints in Quantum Constraint-Satisfaction Systems
Published 11/2022
SC22: International Conference for High Performance Computing, Networking, Storage and Analysis, 2022-, 1 - 14
This work presents a generalization of NchooseK, a constraint satisfaction system designed to target both quantum circuit devices and quantum annealing devices. Previously, NchooseK supported only hard constraints, which made it suitable for expressing problems in NP (e.g., 3-SAT) but not NP-hard problems (e.g., minimum vertex cover). In this paper we show how support for soft constraints can be added to the model and implementation, broadening the classes of problems that can be expressed elegantly in NchooseK without sacrificing portability across different quantum devices. Through a set of examples, we argue that this enhanced version of NchooseK enables problems to be expressed in a more concise, less error-prone manner than if these problems were encoded manually for quantum execution. We include an empirical evaluation of performance, scalability, and fidelity on both a large IBM Q system and a large D-Wave system.
Conference proceeding
Cross-Level Characterization of Program Execution
Published 10/2022
2022 30th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2022-, 33 - 40
Characterization of program execution plays a key role in performance improvement. There are numerous transformations applied to each step a program takes on its lowering from source code to a compiler intermediate representation to machine language to microarchitecture-specific execution. The unpredictable benefit of each transformation step could lead a notionally superior algorithm to exhibit inferior performance once actually run, and it can be opaque which step in the transformation path contradicted the code developer's assumptions. However, conventional approaches to program execution characterization consider the behavior after only a single one of those steps, which limits the information that can be provided to the user. To help address the issue of myopic views of program execution, this paper presents a novel cross-level characterization approach for understanding the behavior of program execution at different levels in the process of writing, compiling, and running a program. We show that this approach provides a richer view of the sources of performance gains and losses and helps identify program execution in a more accurate manner.
Conference proceeding
Cross-Level Characterization of Program Behavior: (Extended Poster Abstract)
Published 05/2022
2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 245 - 247
Program behavior can be defined as a collection of executions [1]. Program behavior strongly relates to actual program performance but can be complicated to characterize and analyze. Characterization is important as it helps better understand program behavior by measuring various operations a program performs. There are many existing techniques [2]-[7] for program characterization, which operate at different levels of instrumentation: source code, intermediate representation (IR), instruction set architecture (ISA), and CPU microarchitecture. Each of these levels provides different capabilities and limitations. In this paper, we introduce Cross-Level Characterization (CLC), an analysis of similarities and differences in resource counts as measured at each level of instrumentation during a program's transformation from source code through execution on a specific microarchitecture.
Journal article
Published 02/01/2022
IEEE transactions on parallel and distributed systems, 33, 2, 249 - 250