February 1, 2021 – The Texascale Days event in December 2020 allowed nine research groups to use large sections of the National Science Foundation-funded Frontera supercomputer at the Texas Advanced Computing Center (TACC) to solve problems that, in many cases, had never been attempted.
What follows are abbreviated first-person accounts of how some of the researchers used their one-day access to the Frontera system to run groundbreaking simulations.
SURVEY OF COVID-19 MEMBRANES BY COMPUTER
Rommie Amaro, Professor and Endowed Chair in Chemistry and Biochemistry at the University of California, San Diego and winner of the 2020 Gordon Bell Special Prize for High Performance Computing-Based COVID-19 Research
We used Texascale Days to continue our efforts to construct and simulate the full envelope of SARS-CoV-2 using all-atom molecular dynamics simulations with NAMD2.
The virion has ~300 million atoms with explicit solvent. A significant challenge of working with systems of this size and complexity is ensuring membrane stability, which requires extensive equilibration of hundreds of millions of atoms.
We were grateful for the opportunity to use nearly 4,000 Frontera nodes to speed up our virion equilibration. The additional data we gained from this run will help us develop methods and better understand how bilayer dynamics are affected by surface pressure, the presence of the spike protein, and the curvature of the viral envelope.
INFLUENCE OF GRAVITY WAVES ON STELLAR GASES
Paul Woodward, Professor of Astronomy, University of Minnesota – Twin Cities
In mid-December 2020, our research team had access to the entire Frontera machine at TACC for 24 hours. We took this opportunity to run special high-resolution simulations of an 80-solar-mass main-sequence star. We ran two such simulations side-by-side, using 3,672 nodes for each run, to see the effects of radiation scattering on the results.
These simulations are part of a study of central convection in massive main sequence stars. We investigate gas mixing at the boundary of the central convection zone, as well as the behavior and observational signatures of internal gravity waves created by convection in the stably stratified outer envelopes of these stars. When they reach the stellar surface, these gravity waves can be detected by satellites like TESS that monitor the time-varying brightness of stars as they search for exoplanets.
Not only do these gravity waves carry information about gas movements in the star’s core to the surface, but they can, in principle, cause mixing of material and, for rotating stars, angular momentum transport in the envelope. To extract such subtle effects from our simulation data, our code must provide very high precision and track the behavior of the gas over a long enough period. We meet both of these requirements with our PPMstar code, and we could not hope to obtain the necessary fidelity of the simulation without the formidable computing power of the Frontera system.
AI + SIMULATION DRUG DISCOVERY PIPELINE FOR COVID-19
Andre Merzky, Senior Research Programmer, Rutgers University, Radical Group; Arvind Ramanathan, Principal Investigator, Argonne National Laboratory
We ran a drug testing pipeline similar to the one in our last Texascale Days run. While the first Texascale run was used to ensure we could operate efficiently at scale, the second was used to test different configurations of our software stack to ensure high resource utilization under different workloads. Again, this is something we can’t easily test at smaller scale: results at, say, 1,000 nodes don’t easily translate to similar numbers at 8,000 nodes.
The results were mixed: we had to rule out a number of configurations that were found to choke our communication system, but we finally managed to determine a configuration that allows us to run the pipeline at around 94% steady-state resource utilization, which we are really happy with (this is production data, not a benchmark!).
In summary, we consider the pipeline to be stable at both small and large scale now, and production cycles are currently faster than the lab preparatory work needed to feed data into the production pipeline. In other words: we’re fast enough right now not to slow down the entire research pipeline.
SCALING QUANTUM ESPRESSO TO THE LIMIT
Feliciano Giustino, professor of physics and director of the Center for Quantum Materials Engineering, Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin; Hyungjun Lee, Principal Software Architect, Oden Institute
The Center for Quantum Materials Engineering at the University of Texas at Austin specializes in computational modeling and design of functional materials at the atomic scale using quantum mechanics. One of our main goals is to exploit the interaction between electrons and lattice vibrations (phonons) in material design. To this end, we are leading the development of EPW, an open-source community code of the Quantum ESPRESSO simulation suite for predictive calculations of electron-phonon physics and associated material properties.
In 2020, we refactored EPW in preparation for exascale computing, and extended the previous single-tier MPI parallelization to a new hybrid two-tier MPI and OpenMP parallelization. This hybrid hierarchical parallelization scheme allows us to reduce the overhead in MPI communications as well as the memory footprint of the calculations. During Texascale Days in December 2020, we had the opportunity to test this new hybrid parallelization scheme and assess the readiness of the EPW code for future exascale simulations on leading supercomputers.
With exclusive access to Frontera, we benchmarked the EPW code with calculations on the superconductor MgB2 (magnesium diboride) on up to 7,840 nodes (out of a total of 8,008). We demonstrated very good scaling performance, reaching 86% of the ideal speedup on 439,040 cores. These tests are documented on the project web page.
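The figures quoted above can be related with a little arithmetic. The sketch below is not taken from the EPW project: the function names and the baseline timings are hypothetical, chosen only to show how a parallel efficiency such as 86% is typically computed from two wall-clock times. The node and core counts are the ones given in the text.

```python
def cores_per_node(total_cores: int, nodes: int) -> int:
    """Cores used on each node, assuming cores are spread evenly."""
    return total_cores // nodes


def scaling_efficiency(base_nodes: int, base_time: float,
                       nodes: int, time: float) -> float:
    """Fraction of the ideal speedup achieved relative to a baseline run.

    Ideal scaling keeps (nodes * time) constant, so the efficiency is
    the ratio of the baseline node-seconds to the larger run's node-seconds.
    """
    return (base_nodes * base_time) / (nodes * time)


# Figures from the text: 439,040 cores on 7,840 nodes.
print(cores_per_node(439_040, 7_840))

# Hypothetical timings (NOT measured values): a baseline on 980 nodes
# taking 800 s, and an 8x larger run whose ideal time would be 100 s.
measured_time = 100.0 / 0.86  # a run that reaches 86% of ideal
print(round(scaling_efficiency(980, 800.0, 7_840, measured_time), 2))
```

An efficiency of 1.0 would mean the larger run is faster by exactly the factor by which the node count grew.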
MASSIVE TORNADOES – BUILDING AT THE FULL STORM
Leigh Orf, Atmospheric Scientist, Space Science and Engineering Center, University of Wisconsin
I simulate thunderstorms producing tornadoes. The fidelity of simulations increases with increased resolution (smaller spacing between grid points on a cubic lattice), especially with respect to tornado behavior. However, doubling the resolution results in a 16x increase in the amount of resources needed. Things get crazy fast and it’s easy to use all available resources.
As the resolution increases, the time step of the model must decrease in order to maintain stability. (This is a universal problem in fluid dynamics codes.) Not only do we have to do more computations because there are more grid points, but we also have to step forward in smaller time increments to prevent the model from blowing up.
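The 16x figure follows from combining the two effects just described. The sketch below is an illustration, not code from the storm model: it assumes a three-dimensional grid refined equally in every direction and a time step that shrinks in proportion to the grid spacing, and the function name `cost_factor` is made up for this example.

```python
def cost_factor(refinement: int, spatial_dims: int = 3) -> int:
    """Relative compute cost when the grid spacing shrinks by `refinement`.

    Assumptions (for illustration only):
    - the number of grid points grows as refinement ** spatial_dims;
    - the stable time step shrinks in proportion to the grid spacing,
      so the number of time steps grows as refinement.
    """
    more_grid_points = refinement ** spatial_dims
    more_time_steps = refinement
    return more_grid_points * more_time_steps


print(cost_factor(2))  # doubling the resolution: 8x points times 2x steps
print(cost_factor(4))  # quadrupling the resolution
```

Under these assumptions the cost grows with the fourth power of the refinement factor, which is why each doubling of resolution quickly consumes all available resources.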
I was given 7,700 nodes to run for a day. I started the simulation from scratch in an environment that matches one of the large tornadoes that occurred during the April 27, 2011 outbreak in the southeastern United States. We went from t=0 to t=1700 seconds, with low I/O overhead because not much is happening at the start of these simulations.
Simulations generally need to run out to t=9000 seconds [150 minutes] or so to finish. What we have so far is a perfectly reasonable looking storm, early in its life cycle. Because it’s so early in the simulation, nothing of note has happened. It would still take about four days at 7,700 nodes to get the simulation to the point where interesting things happen, if they even happen. (There is no guarantee that the simulated storm will produce a tornado, much like tornado warnings that are issued when no tornado is observed.)
If this were a movie, we’d only be 20 minutes in! I hope to be able to continue the simulation during the next Texascale Days.
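As a rough consistency check on the "about four days" estimate, the remaining wall-clock time can be extrapolated from the first day's progress. This is only a back-of-the-envelope sketch: it assumes the simulation keeps advancing at the first day's rate (I/O overhead is expected to grow later in the run) and takes t=9000 seconds as the end point; the function name is hypothetical.

```python
def remaining_days(simulated_so_far: float, target: float,
                   days_used: float = 1.0) -> float:
    """Wall-clock days still needed, assuming a constant simulation rate."""
    rate_per_day = simulated_so_far / days_used  # simulated seconds per day
    return (target - simulated_so_far) / rate_per_day


# Figures from the text: 1700 simulated seconds in one day, about 9000 needed.
print(round(remaining_days(1700.0, 9000.0), 1))
```

The estimate is a lower bound under these assumptions, since later stages of the simulation write more data and advance more slowly.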
EVOLVING THE EARLY UNIVERSE IN 24 HOURS ON FRONTERA
Simeon Bird, professor of physics and astronomy, University of California, Riverside; Tiziana DiMatteo, Principal Investigator, Carnegie Mellon University
Read the full first-person account.
TURBULENCE IN THE COSMOS — AND ON EARTH
Alexei Kritsuk, researcher, Center for Astrophysics and Space Science, University of California, San Diego; Michael Norman, Principal Investigator, UCSD
We had a large portion of the Frontera system (up to 4,000 nodes) available to us for 24 hours on December 14-15, 2020. We used part of the time to run a series of scaling tests for our ADPDIS3D code ranging from 2,900 to 3,900 nodes. These scaling tests explored the performance of the code on a stochastic, homogeneous, and compressible turbulence problem at a grid resolution of 4096^3 points.
Most of the allotted 24-hour period was used for a production run, which employed 3,900 nodes, or about half the machine. This simulation evolved a Mach 1 turbulence model on a grid of 2048^3 points for about 3.3 dynamical times, which is sufficient to approximate a statistically stationary turbulence regime.
These simulations address important applications in astrophysics and cosmology such as interstellar turbulence and star formation, turbulence in the dilute intergalactic medium, and polarized galactic foregrounds to the cosmic microwave background radiation. They also address the basic physics of compressible turbulence, which has a wide range of potential applications in astrophysics, solar wind physics, magnetospheric physics, atmospheric science, and vehicle design.
In the near future, we plan to evolve the model further in time and collect statistics characterizing energy transfer across scales in compressible turbulence. The planned run will not be the largest in terms of grid resolution, but it will provide unprecedented scale separation and highly accurate statistics of transonic turbulence using a class of new, computationally efficient high-order methods implemented in the code.
Source: Aaron Dubrow, TACC