Development and optimization of the domain-specific library SIRIUS for electronic-structure calculations, and integration within the Quantum-ESPRESSO distribution

PI: Nicola Marzari (EPFL)

July 1, 2017 - June 30, 2020

Project Summary

The characterization, design, and discovery of novel or complex materials with first-principles simulations is one of the most urgent and strategic areas of growth and investment in science and technology. The reasons for this are a combination of the predictive accuracy of current quantum-mechanical simulations, especially when paired with the moderate costs of density-functional theory calculations, and with the emerging trend of high-throughput calculations (HTC) to explore systematically many different materials, compositions, or phases. It is often said that we will reach the exascale through a combination of HPC (high-performance computing) and HTC.

A simple but compelling example is shown in Fig. 1 below, where we used validated van der Waals density-functional theory to calculate the binding energies of a set of 6,000+ compounds that were classified as layered according to simple geometric and bonding criteria, starting from a curated set of more than 110,000 experimental structures. Massive HTC runs performed on the Piz Dora Cray XC40 supercomputer at CSCS, and involving more than half a million density-functional theory (DFT) calculations, have led to the identification of 1,800+ compounds that could easily or potentially be exfoliated into novel two-dimensional materials, with novel, exciting, and promising electronic, optical, magnetic, and chemical properties (see Fig. 2 for a more detailed breakdown of the calculations performed).

Figure 1: The set of 1,800+ materials that can be exfoliated into novel 2D materials, as obtained from high-throughput computations on a database of 110,000+ experimentally recorded compounds. All materials that have been experimentally exfoliated have been recovered - we show a few in the graph above.

The need for optimal performance in such massive undertakings is therefore obvious. At the same time, the challenge for the scientific computing community is as high as ever, given the complexity of the present and evolving hardware landscape, which is changing even more dramatically than it did in the early 1990s, when serial and vector computers gave way to parallel architectures. In particular, graphics processing units and other accelerators are going to be a key component of current and forecast systems, and the roadmap both worldwide and especially at CSCS depends on extracting optimal performance from such systems.

For this reason, the most powerful forward-looking strategy in the field is to develop domain-specific libraries that abstract away some of the core, computationally intensive, but common tasks that all electronic-structure codes need to perform (e.g. the calculation of the Hamiltonian and its application to a wave function). Such libraries can be put in the hands of computer scientists and specialists, and give supercomputing centres a single place to focus their performance analysis, profiling, and optimization. This move mirrors the efforts that were made, starting in the 1970s, to first codify common linear-algebra operations into libraries for serial, vector, and finally parallel architectures (e.g. LINPACK to LAPACK to ScaLAPACK).
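As an illustration of the kind of kernel such a library would encapsulate, the following is a minimal sketch of applying a Hamiltonian to a wave function in a plane-wave basis. It is a one-dimensional toy model written with NumPy, not the SIRIUS or Quantum-ESPRESSO interface: the function name `apply_hamiltonian`, the grid parameters, and the constant potential are all assumptions made for the example, and non-local pseudopotential terms are left out.

```python
import numpy as np


def apply_hamiltonian(psi_g, v_local_r, g_vectors):
    """Apply H = -1/2 d^2/dx^2 + V_loc(x) to plane-wave coefficients.

    Hypothetical helper for illustration only (Hartree atomic units).
    """
    # The kinetic term is diagonal in reciprocal space: (1/2)|G|^2 psi(G).
    kinetic = 0.5 * g_vectors**2 * psi_g
    # The local potential is diagonal in real space: transform to the
    # real-space grid, multiply point by point, and transform back.
    psi_r = np.fft.ifft(psi_g)
    potential = np.fft.fft(v_local_r * psi_r)
    return kinetic + potential


# Toy setup (assumed values): periodic cell of length 10 bohr, 32 grid points.
n = 32
length = 10.0
g_vectors = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)

# A constant potential, so that every plane wave is an exact eigenstate.
v_local_r = np.full(n, 0.3)

# Wave function made of a single plane wave with G = 2*pi/length.
psi_g = np.zeros(n, dtype=complex)
psi_g[1] = 1.0

h_psi = apply_hamiltonian(psi_g, v_local_r, g_vectors)
```

In this toy case the result is simply the input multiplied by the eigenvalue (1/2)G² + 0.3, which makes the sketch easy to check. In a production code the two Fourier transforms dominate the cost of this step, which is why it is a natural target for accelerator-specific optimization inside a shared library.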

The present project will be devoted to the development of missing core functionalities in SIRIUS, and their integration in the Quantum-ESPRESSO distribution: the calculation of forces and stress tensors, of Hubbard U corrections, of collinear and non-collinear magnetism, spin-orbit coupling, and linear response, together with extensive verification and continuous testing of SIRIUS results against standard calculations. While this effort is targeted first at Quantum-ESPRESSO, which is currently the most used open-source code for pseudopotential calculations, these developments will benefit the entire community that uses open or even closed-source plane-wave codes, thanks to the BSD license of SIRIUS.

Figure 2: A breakdown of all the calculations that have been performed to produce the results summarised in Fig. 1 (CSCS Piz Dora XC40). Each calculation is stored, together with its entire provenance (all parent and child calculations), in a fully traversable AiiDA directed acyclic graph.
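To make the idea of a traversable provenance graph concrete, the sketch below walks upstream from one calculation to all of its ancestors. It uses a plain Python dictionary rather than the AiiDA API, and the node names and the helper `ancestors` are hypothetical, chosen only to resemble the workflow described above.

```python
from collections import deque


def ancestors(parents, node):
    """Return every node upstream of `node`, in breadth-first order.

    `parents` maps each node to the list of nodes it was derived from.
    Hypothetical helper for illustration; this is not an AiiDA call.
    """
    seen = []
    queue = deque(parents.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            # A node can be reached along several paths; record it once.
            continue
        seen.append(current)
        queue.extend(parents.get(current, []))
    return seen


# Hypothetical provenance: a binding energy derived from two relaxations,
# which in turn trace back to one experimental structure.
parents = {
    "binding_energy": ["relax_bulk", "relax_monolayer"],
    "relax_bulk": ["experimental_structure"],
    "relax_monolayer": ["relax_bulk"],
}

provenance = ancestors(parents, "binding_energy")
```

Because the graph is acyclic, the traversal always terminates, and the same walk in the opposite direction would list every child calculation derived from a given structure.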