Archive for the ‘publications’ Category

New Parallel I/O textbook: High Performance Parallel I/O

October 23rd, 2014

Prabhat and Quincey Koziol have edited a new parallel I/O textbook. In addition to ROMIO and MPI-IO concepts, the book spans the entire software stack, covering storage devices, I/O libraries, file systems, applications, and facilities. This book marks a major update over the field’s previous survey textbook (2001’s “Parallel I/O for High Performance Computing” by John May, now out of print).

The book contains material from me (Rob Latham),  several other Argonne researchers, and a large number of our past and present collaborators.

“High Performance Parallel I/O”, Prabhat and Quincey Koziol, Editors. Chapman and Hall/CRC, October 2014.

You can get the book from Amazon and all the other places books are sold: Amazon, Google Books.


Understanding I/O performance at Leadership Scale

November 14th, 2009

Getting the best performance out of ROMIO can take a good bit of work and a great deal of experience.  Before one can  properly tune the various performance optimizations (data sieving, two-phase collective buffering, I/O aggregator selection and placement, etc), one needs to understand the entire storage stack.   Such a study is time and labor intensive, but sometimes the resulting paper can get you an SC publication:

I/O Performance Challenges at Leadership Scale (a link to the PDF) (a link to the citation)

We spent a million CPU hours and benchmarked all the links between disks and compute nodes.  While Intrepid will only be with us for a few years (UPDATE: Intrepid was decommissioned at the end of 2013), the approach in this paper should be applied to all new leadership-class machines.
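The tuning knobs mentioned above (data sieving, collective buffering, aggregator count) are exposed as MPI-IO hints. One low-effort way to experiment with them is ROMIO’s hints file, which ROMIO reads at open time when the ROMIO_HINTS environment variable names it. A quick sketch — the hint names are real ROMIO hints, but the values are illustrative starting points, not tuned recommendations:

```shell
# Sketch: write a ROMIO hints file and point ROMIO_HINTS at it.
# Any MPI-IO program launched from this shell will pick up these
# hints at MPI_File_open time without recompiling.
cat > romio_hints.txt <<'EOF'
romio_cb_write enable
romio_ds_write disable
cb_buffer_size 16777216
cb_nodes 8
EOF
export ROMIO_HINTS=$PWD/romio_hints.txt
```

Setting hints through MPI_Info in the application works too, but the hints file lets you re-run the same binary under different tuning choices while benchmarking.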


Tuning collective I/O strategies for GPFS and Lustre

November 20th, 2008

ROMIO’s strength lies in how it hides the details of the underlying file system from the end-user. MPI_File_open looks the same if you are writing to PVFS or GPFS, even though both systems have very different lower-level interfaces.

When it comes to the venerable but powerful two-phase collective I/O optimization, those file system quirks matter more than we thought. Wei-keng Liao at Northwestern has studied the problem and demonstrated just how badly a “one size fits all” approach performs.

  • If your storage system uses locks to enforce POSIX consistency, ROMIO can work around that.
  • A little bit of file-system awareness can result in much better performance.

The summary: if you align your ROMIO file domains to file system block boundaries, you get great GPFS performance, but terrible Lustre performance. If you carefully split up your domain into a “group cyclic” style, where each ROMIO I/O aggregator communicates with only one Lustre OSS, you get great Lustre performance, but terrible GPFS performance.

If you try to do this in an application, you are probably going to get it wrong. Let ROMIO deal with it!

Here’s the paper: Dynamically Adapting File Domain Partitioning Methods for Collective I/O Based on Underlying Parallel File System Locking Protocols.


ROMIO on Blue Gene/L

February 15th, 2006

The Blue Gene/L machine came out of IBM’s research division, and then became a product.  This unusual lineage resulted in its software stack using a bit more open source technology than usual.  For example, ROMIO provides the underlying MPI-IO implementation instead of IBM’s PE product.  Hao Yu published a paper at HPCA 2006 describing some of the improvements IBM made to ROMIO:

  • align file domains to block boundaries
  • use MPI collectives instead of point-to-point operations in ROMIO’s two-phase algorithm
  • select ROMIO’s two-phase aggregators so that they are spread out over Blue Gene I/O nodes

H. Yu, R. K. Sahoo, C. Howson, G. Almasi, J. G. Castanos, M. Gupta, J. E. Moreira, J. J. Parker, T. E. Engelsiepen, R. Ross, R. Thakur, R. Latham, and W. D. Gropp, “High Performance File I/O for the BlueGene/L Supercomputer”, in Proc. of the 12th International Symposium on High-Performance Computer Architecture (HPCA-12), February 2006. DOI