The HEP research infrastructure relies on high-bandwidth data movement between facilities and data/compute hubs. Very large datasets originating from observations and simulations often have to be moved over short timescales. The synergy between simulation and observation is a new driver for large-scale data movement.
HEP-CCE is collaborating with ESnet on a joint project to achieve and maintain an automated, production-level capability to move data at a rate of roughly one petabyte per week between major ASCR and HEP facilities. Considerable testing has been undertaken between ALCF (Argonne), NERSC (LBNL), OLCF (Oak Ridge), and NCSA (UIUC). BNL (Brookhaven), SDSC (San Diego), and Fermilab (FNAL) are being added to the list of testing sites. The Globus transfer service is used as the standard automated file transfer mechanism across all the sites. Our work has already resulted in considerable increases in achievable network rates between the collaborating centers (see figure below), due to improvements in the provisioning of data transfer nodes (DTNs), network tuning, and Globus configuration tuning. Further improvements are planned, including considerable investment in DTN infrastructure at the facilities.
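To put the one-petabyte-per-week target in network terms, the short sketch below converts it into a sustained line rate. It is only back-of-the-envelope arithmetic under stated assumptions: a decimal petabyte (10^15 bytes), a transfer running continuously for the full seven days, and no allowance for protocol overhead, retries, or idle time. The function and constant names are illustrative and are not taken from the project.

```python
# Back-of-the-envelope conversion of "one petabyte per week" into a
# sustained network rate. Assumptions (not from the project itself):
# decimal petabyte, continuous transfer, no protocol overhead.

PETABYTE_BYTES = 10**15              # SI petabyte, assumed
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def sustained_rate_gbps(bytes_moved: int, seconds: int) -> float:
    """Average rate in gigabits per second needed to move bytes_moved in seconds."""
    return bytes_moved * 8 / seconds / 1e9


def sustained_rate_gbytes_per_s(bytes_moved: int, seconds: int) -> float:
    """Average rate in gigabytes per second needed to move bytes_moved in seconds."""
    return bytes_moved / seconds / 1e9


target_gbps = sustained_rate_gbps(PETABYTE_BYTES, SECONDS_PER_WEEK)
target_gbytes = sustained_rate_gbytes_per_s(PETABYTE_BYTES, SECONDS_PER_WEEK)

print(f"Required sustained rate: {target_gbps:.1f} Gb/s "
      f"({target_gbytes:.2f} GB/s)")
# prints "Required sustained rate: 13.2 Gb/s (1.65 GB/s)"
```

Under these assumptions the target corresponds to an average of roughly 13 Gb/s held around the clock, so any real transfer has to run faster than this whenever it is active in order to absorb pauses and overhead.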
The achieved performance reached the target in April 2017 across all the transfer hubs; the link between ALCF and NCSA was the first to exceed the requirement.
(Petascale DTN Project interim data report, Eli Dart et al. 2017, HEP-CCE report in preparation)
As a first step, HEP Cosmic Frontier simulation datasets with a mix of file sizes were used to test, build out, and optimize the service. Other use cases will be added in later stages. The initial discussions for this project took place at the ESnet/Internet2-sponsored CrossConnects 2015 Workshop on Improving Data Mobility and Management for International Cosmology. The topics, discussions, and conclusions are presented in the CrossConnects 2015 Workshop Summary Report.