Accelerating HEP Science: Inference and Machine Learning at Extreme Scales

This SciDAC-4 project brings together ASCR and HEP researchers to develop and apply new methods and algorithms in the area of extreme-scale inference and machine learning. The research program melds high-performance computing (HPC) and techniques for “big data” analysis to enable new avenues of scientific discovery. The focus is on developing powerful and widely applicable approaches to attack problems that would otherwise be largely intractable. We will apply the new techniques to cosmological surveys across multiple wavebands and explore applications to the broader landscape of high energy physics.

The role of leading-edge HPC in scientific discovery is undergoing a sea change driven by massive growth in problem size as modeling tasks rapidly increase in complexity and descriptive detail. Consequently, HPC systems often function as “big data” generators when modeling complex systems and experiments with end-to-end realism; at the same time they are also “big data” analysis engines when targeted at extracting science from large-scale experimental facilities. The ready availability of computational power and of large data sets also drives the burgeoning use of Bayesian methods for scientific inference, as well as machine learning, especially deep learning, for a variety of classification and reconstruction tasks. In this project, we aim to harness HPC and cutting-edge inference and machine learning techniques to attack outstanding problems within HEP’s Cosmic Frontier mission space. More significantly, we will also explore a number of new approaches that arise only as a consequence of the ability to combine HPC with inference and learning algorithms.

The key science drivers cover some of the most important issues in fundamental physics and the methods needed to address them. These include the nature of dark energy, the nature and distribution of dark matter, primordial fluctuations, new cosmological probes, searches for new particles and dark matter, and object classification and significance. It is widely recognized that the use of HPC platforms for data-intensive tasks, while still in its early stages, has enormous future potential. In the HEP context, a roadmap for realizing this potential was recently laid out in the joint ASCR/HEP Exascale Requirements Review report, with significant contributions from several members of the proposing team. Team members have also made significant contributions in applying HPC systems to large-scale scientific simulations and data-intensive applications, and in developing state-of-the-art emulation and statistical inference techniques.