Ab Initio Molecular Dynamics at the Exa-Scale

PI: Jürg Hutter (University of Zurich)

July 1, 2021 – June 30, 2024

Project Summary

Electronic structure calculations for realistic condensed-phase systems require more resources than single-molecule calculations. Therefore, condensed-phase electronic structure modeling often relies on simple approximations such as Kohn-Sham density functional theory (KS-DFT) with generalized gradient approximation (GGA) functionals. GGA DFT combines computational feasibility for extended systems with reasonable accuracy for many important system types and properties. However, GGA functionals suffer from inaccuracies and have limited reliability. A systematic way of eliminating sources of error is to include explicit many-body correlation from wave function theory. One option is to compute the correlation energy on the KS-DFT reference with the random phase approximation (RPA) or with double-hybrid density functionals (DHDFs). The improvement in accuracy and reliability comes at a considerable computational price: RPA and DHDFs scale as N^4 to N^5 with system size N, carry significant prefactors, and are 2-3 orders of magnitude more expensive than KS-DFT calculations for extended systems.
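To make the quoted scaling concrete, the following minimal sketch computes how much the cost grows when the system size is multiplied by a given factor, assuming the cost is a pure power law in N. The function name cost_growth is hypothetical and chosen for illustration only; prefactors are ignored.

```python
def cost_growth(size_factor, exponent):
    """Factor by which the cost grows when the system size N is
    multiplied by size_factor, assuming cost is proportional to N**exponent.
    Prefactors are ignored (illustrative assumption)."""
    return size_factor ** exponent


# Doubling the system size for the two scaling exponents quoted in the text.
for exponent in (4, 5):
    print(f"N^{exponent}: doubling N multiplies the cost by {cost_growth(2, exponent)}")
```

Under this assumption, doubling the number of atoms makes an N^4 method 16 times and an N^5 method 32 times more expensive.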

Solving relevant problems in chemistry and materials science requires extensive sampling of configuration space, which can be done with either Monte Carlo (MC) methods or molecular dynamics (MD). MD requires both energies and forces but is the more versatile approach. The high computational cost of force evaluation with MP2/RPA/DHDF sets the current limits on the applicability of these methods: ca. 10000 time steps for systems of a few hundred atoms. This is not sufficient to achieve reliable statistics, but it is typically enough to train reliable interaction potentials with modern machine-learning (ML) techniques. These ML potentials increase the scope of applications dramatically, allowing MD trajectories of nanoseconds and even quantum dynamics with path-integral techniques. We will develop RPA and DHDF methods within the CP2K code for MD applications. The methods, enhanced with ML potentials, will be applied to systems that are challenging for GGA KS-DFT. Including dynamics will bring the simulations closer to experimental reality and increase the scope of applications.

The CP2K code is actively developed, widely used in many different fields of computational science, and has a large community following. Depending on the application profile, different parts of the program dominate the CPU usage. However, in all electronic structure applications we were able to identify five basic kernels: sparse linear algebra, dense linear algebra, Gaussian grid manipulations, 3D fast Fourier transforms, and exchange-correlation (XC) functional grid integration. Some of these kernels are provided by external libraries; others are CP2K internal routines. Code refactoring and better interfaces to external libraries will enhance the overall performance and scalability by increasing GPU usage.

The project is organized in four independent tasks. The major task, Task 1, is related to new features within the CP2K code, namely implementing analytic nuclear gradients for RPA and DHDF. These new features lead directly to the possibility of performing new science. We will make use of them to submit Tier-0 projects on ab initio molecular dynamics of covalent organic frameworks (COFs) and metal-organic frameworks (MOFs), as well as solvated electron systems. Task 2 is related to the DBCSR library. This sparse block matrix and tensor library is the main performance library of CP2K and is essential for efficient use of modern supercomputing hardware. It has been developed in recent years by us in collaboration with the CSCS/PASC team, and we will build on this successful partnership in this project. In Task 3 we will work together with the CP2K developer community and the CSCS/PASC team on a fully refactored version of the Gaussian grid library. This library is at the core of the GPW (Gaussian and Plane Wave) method. The goal of the refactoring is to maintain the performance of the implementation while making a GPU-enabled implementation possible. Task 4 addresses a very old part of the CP2K code, the grid treatment of XC functionals using FFT. We plan either to adapt external libraries or to follow an alternative route using finite difference methods. All developments will be directly available to the community through the development version of CP2K.
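The two routes named for Task 4 can be illustrated with a small, self-contained sketch. It assumes that the role of the FFT in the XC grid treatment is to obtain derivatives of periodic grid quantities (for example the density gradient needed by GGA functionals); this is an assumption made for illustration, not a description of the CP2K implementation. The function names, the one-dimensional grid, the naive O(n^2) Fourier transform and the second-order stencil are all hypothetical choices.

```python
import cmath
import math


def spectral_derivative(values, length):
    """First derivative of a periodic function sampled on a uniform grid,
    computed in Fourier space with a naive O(n^2) transform (illustrative only)."""
    n = len(values)
    # Forward discrete Fourier transform.
    coeffs = [
        sum(values[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]
    derivative = []
    for j in range(n):
        total = 0j
        for k in range(n):
            # Map the index k to a signed frequency; the Nyquist mode is dropped.
            freq = k if k < n / 2 else k - n
            if n % 2 == 0 and k == n // 2:
                freq = 0
            wave_number = 2 * math.pi * freq / length
            total += 1j * wave_number * coeffs[k] * cmath.exp(2j * math.pi * k * j / n)
        derivative.append((total / n).real)
    return derivative


def central_difference(values, length):
    """Second-order central finite difference with periodic wrap-around."""
    n = len(values)
    spacing = length / n
    return [
        (values[(j + 1) % n] - values[(j - 1) % n]) / (2 * spacing)
        for j in range(n)
    ]


# Test function: sin(x) on a periodic grid of 32 points over [0, 2*pi).
n_points = 32
box_length = 2 * math.pi
grid = [box_length * j / n_points for j in range(n_points)]
samples = [math.sin(x) for x in grid]
exact = [math.cos(x) for x in grid]

spectral_error = max(
    abs(a - b) for a, b in zip(spectral_derivative(samples, box_length), exact)
)
fd_error = max(
    abs(a - b) for a, b in zip(central_difference(samples, box_length), exact)
)
print(f"max error, Fourier route:           {spectral_error:.2e}")
print(f"max error, finite-difference route: {fd_error:.2e}")
```

In this toy setting the Fourier route is exact up to rounding for a smooth periodic function, while the finite-difference stencil has a truncation error that shrinks with the grid spacing but needs only neighbouring grid points. That locality is one plausible reason a finite-difference route could be attractive on distributed or GPU-accelerated hardware.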

The development of an efficient module for the calculation of nuclear gradients for RPA and double-hybrid functionals will strengthen the lead of the CP2K code in the application of these methods to soft condensed matter systems. The possibility of coupling these calculations to machine learning of neural network potentials will open new possibilities for high-level electronic structure methods. The application team plans to submit several computer time proposals for Tier-0 systems based on the developments made in this project. The applications are geared towards systems of interest currently studied by our group.

The immediate availability of the methodology to the whole community will most likely drive further growth of the CP2K user base. The improvements to basic density functional methods expected from these developments will benefit the whole community. The developments envisaged in this proposal will affect most users of the code at CSCS. The further development of the DBCSR library, the interface to the dense linear algebra routines developed by the CSCS future systems group, and the refactoring of the Gaussian grid routines for accelerator usage will lead to a better-performing code in general and more efficient use of GPU-accelerated compute nodes.