Welcome to the Darshan project

July 31st, 2009

This is the home page for Darshan, a scalable HPC I/O characterization tool. Darshan is designed to capture an accurate picture of application I/O behavior, including properties such as patterns of access within files, with minimum overhead.  The name is taken from a Sanskrit word for “sight” or “vision”.

Darshan can be used to investigate and tune the I/O behavior of complex HPC applications.  In addition, Darshan’s lightweight design makes it suitable for full time deployment for workload characterization of large systems.  We hope that such studies will help the storage research community to better serve the needs of scientific computing.

Darshan was originally developed on the IBM Blue Gene series of computers deployed at the Argonne Leadership Computing Facility, but it is portable across a wide variety of platforms, including the Cray XE6, Cray XC30, and Linux clusters.  Darshan routinely instruments jobs using up to 786,432 compute cores on the Mira system at ALCF.

You will find current news about the Darshan project posted below.  Additional documentation and details about Darshan are available from the links at the top of this page.

Darshan 2.3.1-pre1 experimental release

October 29th, 2014

Darshan 2.3.1-pre1 is now available for download, and the release changelog is listed below. Please let us know if you have any feedback or suggestions. We’ll be working to turn this into a stable release in the coming weeks.

darshan-2.3.1-pre1


  • Add wrappers for mkstemp(), mkostemp(), mkstemps(), and mkostemps() (reported by Tom Peterka)
  • Change OPEN_TIMESTAMP field to report the timestamp taken right before open() is invoked rather than the timestamp taken after open() has completed.
    NOTE: updated log format version to 2.06 to reflect this change.
  • Change start_time and end_time fields in job record to use min and max (respectively) across all ranks
  • Fix bug in write volume data reported in file system table in darshan-job-summary.pl (reported by Matthieu Dorier)
  • Clean up autoconf test for zlib and make zlib mandatory (reported by Kalyana Chadalavada)
  • Add --start-group and --end-group notation to Darshan libraries for Cray PE 2.x environment to fix link-time corner cases (Yushu Yao)
  • Improve y axis labels on time interval graphs in darshan-job-summary.pl (reported by Tom Peterka)
  • Misc. improvements to darshan-parser --perf output (reported by Shane Snyder)
    • indicate which rank was slowest in unique file results
    • label I/O vs. meta time more clearly
    • include unique file meta time in agg_perf_by_slowest calculation
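
As an illustration of the start_time and end_time change above, here is a minimal Python sketch of taking the minimum start and maximum end across ranks. It is not Darshan's implementation: the function name job_time_bounds and the list of per-rank (start, end) pairs are hypothetical, and in an MPI program this would more likely be done with a reduction rather than on a gathered list.

```python
from typing import Iterable, Tuple


def job_time_bounds(rank_times: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (earliest start, latest end) across all ranks.

    rank_times holds one hypothetical (start_time, end_time) pair per rank,
    both in seconds.
    """
    times = list(rank_times)
    if not times:
        raise ValueError("at least one rank is required")
    start = min(s for s, _ in times)  # job starts when the first rank starts
    end = max(e for _, e in times)    # job ends when the last rank finishes
    return start, end


# Three ranks whose clocks start and stop at slightly different times.
example = [(1000, 1060), (998, 1055), (1001, 1072)]
print(job_time_bounds(example))  # (998, 1072)
```

Using the extremes across all ranks means the job record covers the whole run even when a single rank starts late or finishes early.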

Upcoming Darshan events, Fall 2014

October 3rd, 2014
  • October 26, 2014, Raleigh NC: “Darshan – I/O Workload Characterization for MPI Applications” tutorial to be held at IISWC 2014.  The presenters are Yushu Yao of Lawrence Berkeley National Laboratory and Phil Carns of Argonne National Laboratory.  For more information see the tutorial website (http://www.mcs.anl.gov/research/projects/darshan/tutorials/iiswc2014/) or the conference web site (http://www.iiswc.org/iiswc2014/index.html).
  • November 20, 2014, New Orleans LA: “Analyzing Parallel I/O” BOF at SC 2014.  This BOF will include a discussion of multiple I/O instrumentation tools, including Darshan.  The session leaders are Julian Kunkel (German Climate Computing Center), Phil Carns (Argonne National Laboratory), and Alvaro Aguilera (Technical University Dresden).  For more information see the SC BOF website (http://sc14.supercomputing.org/schedule/event_detail?evid=bof121).

Source repository migrated to git

September 2nd, 2014

The Darshan source repository has been migrated to git.  You can find information about how to access the new repository here:

http://www.mcs.anl.gov/research/projects/darshan/developer-access/

Access to the old svn repository will be disabled shortly.

Darshan 2.3.0 release

May 28th, 2014

Darshan 2.3.0 is a minor update from Darshan 2.2.9.  It fixes incorrect version numbering in the darshan-runtime component.

darshan-2.3.0

  • Fix incorrect version numbering in darshan-runtime component of Darshan 2.2.9, reported by Jean-Guillaume Piccinali

 

Darshan 2.2.9 release

May 6th, 2014

Darshan 2.2.9 is now available for download on the Darshan web site. This is primarily a maintenance and bug fix release.  It also includes support for versions 2.0 and newer of the Cray Programming Environment. The full change log is listed below:

darshan-2.2.9

  • Bug fixes:
    • Fix mnt table overflow if a large number of file systems are mounted, reported by David Shrader.
    • Fix argument parsing for darshan-convert, reported by Mouhamed Gueye.
    • Fix metadata annotation overflow in darshan-convert, reported by Mouhamed Gueye.
    • Fix const-correctness in dynamic library when built against MPI 3.x libraries.
    • Fix “undefined symbol: dlsym” error when using preloaded dynamic library on some platforms, reported by Florin Isaila.
    • Normalize timestamps to always be relative to MPI_Init().
    • Better library name matching in compiler wrappers to handle more MPICH variations on Blue Gene systems.
    • Fixed header dependency problems when using darshan-logutils.h in external programs (Shane Snyder).
  • Enhancements:
    • Add support (both in documentation and in provided module files) for Cray PE 2.x.
    • Honor CC variable to allow darshan-util to be built with other compilers besides gcc.
    • Update documentation to clarify the interaction between rpath and LD_PRELOAD when instrumenting dynamic libraries, issue reported and investigated by Davide Del Vento.
    • Ability to disable shared-file reduction by setting the DARSHAN_DISABLE_SHARED_REDUCTION environment variable.
    • More thorough output from darshan-parser --perf, suggested by Huong Luu.
    • Increased metadata annotation room from 64 bytes to 1KiB in header.
    • CP_F_{FASTEST/SLOWEST}_RANK_TIME counters now take MPI-IO time into account, not just POSIX time, issue reported by Huong Luu.
    • Better handling of systems with many mounted file systems, issue reported and investigated by David Shrader. (Once the tracking limit is exceeded, Darshan assumes that a file on an untracked file system resides on the / file system.)
      • Track up to 64 rather than 32 mounted file systems at runtime.
      • Increase header space available for storing mount point information in log file from approximately 1 KiB to approximately 3 KiB.
      • Prioritize storing information about non-NFS volumes over NFS volumes if too many file systems are mounted to record them all.
    • Added darshan-util pkgconfig file (Shane Snyder).
    • Added --enable-shared configure option to darshan-util to build a shared library version of libdarshan-util
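
The mount table behavior described in the enhancements above can be sketched roughly as follows. This is an illustrative Python model, not Darshan's code: the names select_mounts and lookup_mount, the NFS_TYPES set, and the longest-prefix matching rule are all assumptions made for the example. Only the limit of 64 entries, the preference for non-NFS volumes, and the fallback to / come from the change log.

```python
from typing import List, Tuple

MAX_MOUNTS = 64              # runtime limit stated in the change log
NFS_TYPES = {"nfs", "nfs4"}  # assumption: type strings treated as NFS

Mount = Tuple[str, str]      # hypothetical (mount_point, fs_type) pair


def select_mounts(mounts: List[Mount], limit: int = MAX_MOUNTS) -> List[Mount]:
    """Keep at most `limit` entries, placing non-NFS volumes first."""
    non_nfs = [m for m in mounts if m[1] not in NFS_TYPES]
    nfs = [m for m in mounts if m[1] in NFS_TYPES]
    return (non_nfs + nfs)[:limit]


def lookup_mount(path: str, table: List[Mount]) -> str:
    """Return the longest tracked mount point containing `path`, else '/'."""
    best = "/"
    for mount_point, _ in table:
        prefix = mount_point.rstrip("/") + "/"
        matches = path == mount_point or path.startswith(prefix)
        if matches and len(mount_point) > len(best):
            best = mount_point
    return best


mounts = [("/nfs1", "nfs"), ("/lustre", "lustre"),
          ("/nfs2", "nfs"), ("/gpfs", "gpfs")]
table = select_mounts(mounts, limit=3)          # /nfs2 does not fit
print(lookup_mount("/lustre/run/out.h5", table))  # /lustre
print(lookup_mount("/nfs2/data/out.h5", table))   # /
```

With a limit of 3, the two non-NFS volumes are kept ahead of the NFS ones, so the file on the untracked /nfs2 volume is attributed to /.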

 

“I/O characterization of large-scale applications with Darshan” presented at SDAV meeting

March 20th, 2014

A poster and presentation entitled “I/O characterization of large-scale applications with Darshan” were presented at the SciDAC Institute of Scalable Data Management, Analysis and Visualization All Hands Meeting in February 2014 at Georgia Tech.  The poster and presentation are available for download on the publications page.

ALCF I/O Data Repository update

February 18th, 2014

The ALCF I/O Data Repository has been updated to include all of the Darshan logs collected on Intrepid from October through December of 2013.  This brings the total log count in the repository to 156,338.

2012-2013 Darshan logs added to the ALCF I/O Data Repository

October 29th, 2013

Anonymized versions of all Darshan logs collected on the Intrepid Blue Gene/P system at the Argonne Leadership Computing Facility from January 1, 2012 through October 22, 2013 are now available for download as part of the ALCF I/O Data Repository.

Darshan is a scalable HPC I/O characterization tool that collects concise I/O access pattern information from large-scale production applications.  The Darshan data provided in the ALCF I/O Data Repository includes:

  • I/O characterization from 152,167 unique production application runs*
  • over 721 million core hours of execution time
  • 31 PiB of I/O activity
  • examples of application runs with up to 163,840 processes
  • examples of application runs that accessed up to 204 TiB of data

More information about how to use the data can be found at http://www.mcs.anl.gov/research/projects/darshan/data/.

* Note: the previously announced log file count of 195,233 was in error, but all other statistics are accurate.

Darshan-Ruby 1.1

September 10th, 2013

Matthieu Dorier of ENS Cachan Brittany and IRISA has released Darshan-Ruby 1.1, including support for Darshan 2.2.8 and a gem installation package.  Darshan-Ruby provides object-oriented Ruby bindings for analysis of Darshan log files.

Darshan 2.2.8 release

September 3rd, 2013

Darshan 2.2.8 is now available for download on the Darshan web site. This release contains a number of bug fixes for problems recently reported on the mailing list. It also includes a few new tools for splitting up Darshan logs based on files opened by the application. The full change log is listed below:

darshan-2.2.8

  • WARNING: please note that the Darshan module file for Cray environments has been updated, especially in the DARSHAN_POST_LINK_OPTS variable. Please update your module file accordingly when upgrading from 2.2.7 or earlier on Cray platforms.
  • Improved ability to analyze I/O activity related to particular files opened by an application
    • darshan-summary-per-file.sh script to generate a separate pdf summary for each file opened by an application. Developed by Rob Latham.
    • Added --file-list and --file-list-detailed options to darshan-parser to list files opened by an application along with brief statistics
    • Added --file option to darshan-convert to extract the activity for a specific file from a Darshan log
  • Add wrappers for POSIX AIO operations (fixes tracking of underlying POSIX operations resulting from nonblocking MPI-IO operations. Bug reported by David Shrader.)
  • Fix compile-time errors when Darshan is configured for use with MPICH 3.x installations with const-correct MPI function prototypes. Also affects MPICH 1.5+ installations with optional const support. Reported by Yushu Yao.
  • Fix segmentation fault when using LD_PRELOAD instrumentation on programs that use MPI_Init_thread() rather than MPI_Init(). Reported by Myriam Botalla.
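
To give a rough idea of what per-file analysis involves, the sketch below lists files with brief statistics and then isolates the activity for one file. It is only a conceptual model in Python: the Record type and the functions file_list and only_file are hypothetical, and it does not read the Darshan log format or reproduce the actual output of darshan-parser or darshan-convert.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Record(NamedTuple):
    """Hypothetical, simplified stand-in for one per-file record in a log."""
    file_hash: int
    rank: int
    bytes_read: int
    bytes_written: int


def file_list(records: List[Record]) -> Dict[int, Tuple[int, int]]:
    """Map each file hash to (total bytes read, total bytes written)."""
    totals = defaultdict(lambda: [0, 0])
    for r in records:
        totals[r.file_hash][0] += r.bytes_read
        totals[r.file_hash][1] += r.bytes_written
    return {h: (rd, wr) for h, (rd, wr) in totals.items()}


def only_file(records: List[Record], file_hash: int) -> List[Record]:
    """Keep only the records that belong to the given file hash."""
    return [r for r in records if r.file_hash == file_hash]


records = [
    Record(0xA1, 0, 100, 0),
    Record(0xA1, 1, 50, 10),
    Record(0xB2, 0, 0, 4096),
]
for file_hash, (rd, wr) in file_list(records).items():
    print(hex(file_hash), rd, wr)
print(only_file(records, 0xB2))
```

A per-file summary could then be produced by running the same analysis once on each filtered subset.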