Welcome to the Darshan project

This is the home page for Darshan, a scalable HPC I/O characterization tool. Darshan is designed to capture an accurate picture of application I/O behavior, including properties such as patterns of access within files, with minimal overhead. The name is taken from a Sanskrit word for “sight” or “vision”.
Darshan can be used to investigate and tune the I/O behavior of complex HPC applications. In addition, Darshan’s lightweight design makes it suitable for full-time deployment for workload characterization of large systems. We hope that such studies will help the storage research community better serve the needs of scientific computing.
Darshan was originally developed on IBM Blue Gene series computers at the Argonne Leadership Computing Facility, but today it is portable across a wide variety of platforms and is deployed in production at computing facilities around the world.
You will find current news about the Darshan project posted below. Additional documentation and details about Darshan are available from the links at the top of this page.

Darshan 3.4.4 now available

Darshan version 3.4.4 is now officially available for download HERE. This point release includes a few minor bug fixes:

  • Fixed bug leading to inconsistent heatmap record shapes when Darshan shared file reductions are disabled
    • Also added a darshan-util library fix to resolve this inconsistency on already impacted logs (any generated with 3.4.0+ versions of Darshan)
  • Added workaround for potential undefined symbol errors for ‘H5FD_mpio_init’ when LD_PRELOADing an HDF5-enabled runtime library
    • Bug triggered by 1.13+ versions of HDF5

We have also released PyDarshan 3.4.4.0 on PyPI, though this is just to track the 3.4.4 darshan-util library. There are no modifications to PyDarshan functionality.

Documentation for Darshan and PyDarshan is available HERE.

Please report any questions, issues, or concerns with this release using the darshan-users mailing list or by opening an issue on our GitHub.

Darshan 3.4.3 now available

Darshan version 3.4.3 is now officially available for download here: https://www.mcs.anl.gov/research/projects/darshan/download/. This point release includes a few minor bug fixes for darshan-runtime libraries:

  • Added new configure option `--with-username-env` to allow specification of an environment variable to use when determining the username associated with a job (e.g., SLURM_JOB_USER)
  • Fixed bug causing crashes for applications that call fork() and use Darshan app exclusions settings
  • Fixed bug related to not closing open HDF5 file ID when instrumenting H5Fflush() calls

More notably, we have also released PyDarshan 3.4.3.0 on PyPI, with this release including a number of improvements/changes to the log analysis package and corresponding tools:

  • PyDarshan job summary tool improvements:
    • Added new module overview table
    • Added new file count summary table
    • Added new plot of POSIX module sequential/consecutive accesses
    • Included PnetCDF `wait` time in I/O cost figures
    • Dropped default generation of DXT-based heatmaps and added a new command-line option (`--enable_dxt_heatmap`) to force their generation
    • Dropped usage of scientific notation in “Data access by category” plot
    • Made captions, axis labels, and annotations clearer and easier to read
  • Integrated Python support for darshan-util accumulator API for aggregating file records and calculating derived metrics
    • Added backend routine `accumulate_records`, which returns a derived metric structure and a summary record for an input set of records
    • Added backend routine `_df_to_rec` to allow conversion of a DataFrame of records into raw byte arrays to pass into the darshan-util C library (e.g., for using accumulator API)
  • Fixed bug allowing binary wheel installs to prefer darshan-util libraries found in LD_LIBRARY_PATH
  • Fixed bug in DXT heatmap plotting code related to determining the job’s runtime
  • Updated docs for installation/usage of PyDarshan
  • Dropped support for Python 3.6

For reference, an example report generated by the updated PyDarshan job summary tool can be found here: https://www.mcs.anl.gov/research/projects/darshan/docs/e3sm_io_report.html.

Documentation for Darshan and PyDarshan is available here: https://www.mcs.anl.gov/research/projects/darshan/documentation/.

Please report any questions, issues, or concerns with this release using our mailing list, or by opening an issue on our GitHub: https://github.com/darshan-hpc/darshan.

Join us on Slack

Follow the invitation below to join Darshan’s new Slack workspace:

https://join.slack.com/t/darshan-io/shared_invite/zt-1n6rhkqu8-waSQCVWYDrUpBdcg_1DwqQ

We hope this workspace will provide another opportunity for the Darshan team and users to engage, whether about bug reports, usage questions, feature requests, the project roadmap, or anything else. The Darshan team will also use this workspace to gather user feedback on upcoming Darshan enhancements and other changes, as well as to announce new software releases.

Hope to see you there!

Darshan 3.4.2 release is now available

Darshan version 3.4.2 is now officially available for download here. This point release includes important bug fixes for Darshan’s new PnetCDF module:

  • Fixed segfault when defining scalar variables in PnetCDF module
  • Fixed bug attributing all PnetCDF variable instrumentation to the first variable instrumented
  • Fixed memory corruption (and potential segfault) when reading/writing high-dimensional PnetCDF variables using vara/vars/varm interfaces
  • Fixed crashes related to using PnetCDF vard interfaces with input MPI_DATATYPE_NULL datatypes

Note that these bugs can only be triggered by the PnetCDF module released in Darshan version 3.4.1, which is disabled by default. There should be no impact on Darshan 3.4.1 configurations that did not explicitly enable PnetCDF instrumentation.

We have also released PyDarshan 3.4.2.0 on PyPI, though this is just to track the 3.4.2 darshan-util library. There are no new modifications to PyDarshan functionality.

Documentation for Darshan and PyDarshan is available here.

Please report any questions, issues, or concerns with this release using the darshan-users mailing list, or by opening an issue on our GitHub: https://github.com/darshan-hpc/darshan.

Darshan 3.4.1 release is now available

Darshan version 3.4.1 is now officially available for download here. This release includes the following new features, bug fixes, etc.:

  • Added comprehensive instrumentation of PnetCDF APIs via PNETCDF_FILE and PNETCDF_VAR modules (contributed by Wei-Keng Liao)
    • disabled by default; enabled by passing `--enable-pnetcdf-mod` to configure
  • Modified the Darshan log format to support a maximum of 64 instrumentation modules, since Darshan had reached the previous maximum (16)
  • Modified Darshan to report job start/end times at nanosecond granularity (previously only second granularity was possible)
  • Added support for instrumenting H5Oopen family of calls
  • Modified HDF5 module extraction of dataspace selection details
    • Extraction of point selections now possible regardless of HDF5 version
    • H5S_ALL selections are no longer counted as regular hyperslab accesses
  • Fixed bug causing no instrumentation of child processes of fork() calls (reported by Rui Wang)
  • Deprecated the --file-list and --file-list-detailed options in darshan-parser
  • Added “darshan_accumulator” API to the logutils library
    • _create(), _inject(), _emit(), and _destroy()
    • generalizes the mechanism for producing summation records and derived metrics for sets of records from a given module
    • refactored darshan-parser to use new API
    • implemented support for accumulators in POSIX, STDIO, and MPIIO modules
  • Fixed memory leak in darshan-util helper functions used by PyDarshan
    • darshan_log_get_name_records
    • darshan_log_get_filtered_name_records
  • Integrated the µnit Testing Framework in darshan-util
    • implemented unit tests for darshan_accumulator API

We have also released PyDarshan 3.4.1.0 on PyPI, which includes a number of improvements:

  • Fixed memory leaks in the following backend CFFI bindings (reported by Jesse Hines):
    • log_get_modules
    • log_get_mounts
    • log_get_record
    • log_get_name_records
    • log_lookup_name_records
  • Added PnetCDF module information to job summary tool
  • Testing modifications:
    • Switched to use of context managers for log Report objects to avoid test hangs in certain environments
    • Marked tests requiring lxml package as xfail when not installed

Documentation for Darshan and PyDarshan is available here.

Please report any questions, issues, or concerns with this release using the darshan-users mailing list, or by opening an issue on our GitHub: https://github.com/darshan-hpc/darshan.

Darshan 3.4.0 release is now available

Darshan version 3.4.0 is now officially available for download here. This release is a follow-up to our recent 3.4.0-pre1 pre-release, and we believe it is stable and ready for production use. In addition to features and bug fixes introduced in 3.4.0-pre1, this full release includes the following bug fixes to Darshan libraries/tools:

  • Fix segfault affecting new DARSHAN_MOD_DISABLE/ENABLE environment variables
  • Fix divide-by-zero condition that can potentially be triggered by new heatmap module
  • Fix potential MPI errors related to calling MPI_Type_size() on a user-supplied MPI_DATATYPE_NULL type (reported by Jim Edwards)
  • cuserid() is no longer the default method for determining the username and must be manually enabled at configure time
  • Fix backwards compatibility bug affecting darshan-3.0.0 logs in darshan-util C library functions used by PyDarshan
  • Suppress noisy output warnings when using darshan-job-summary.pl
  • Clarify units displayed by darshan-job-summary.pl (reported by Jeff Layton)

We have also released PyDarshan 3.4.0.1 on PyPI, which includes a number of improvements:

  • New Darshan job summary report styling
    • HTML job summary reports can be generated using `python -m darshan summary <logfile_path>`
  • Bug fix to heatmap module plotting code caused by logs with inactive ranks
  • Fix warnings related to Pandas deprecation of df.append

Documentation for Darshan and PyDarshan is available here.

Please report any questions, issues, or concerns with this release using the darshan-users mailing list, or by opening an issue on our GitHub: https://github.com/darshan-hpc/darshan.

darshan-3.4.0-pre1 release is now available

We are pleased to announce a pre-release version of Darshan 3.4.0 (3.4.0-pre1) is now available HERE. As always, please be aware that Darshan pre-releases are experimental and not recommended for full-time use in production yet. An official 3.4.0 release will be made available soon.

This release contains a number of exciting new features and enhancements to Darshan:

  • Added new heatmap module to record per-process histograms of I/O activity over time for POSIX, MPI-IO, and STDIO modules
  • Added comprehensive darshan-runtime library configuration support, via environment variables and/or configuration file
  • Implemented performance optimizations to Darshan’s wrappers, locking mechanisms, and timing mechanisms
    • Includes optional RDTSCP-based timers via the `--enable-rdtscp` configure option
  • Removed deprecated performance estimates from darshan-parser and added two new derived metrics when using `--perf`:
    • agg_time_by_slowest (total elapsed time performing I/O by the slowest rank)
    • slowest_rank_rw_only_time (total elapsed time performing read/write operations by the slowest rank)
  • Adopted automake/libtool support for Darshan build (contributed by Wei-Keng Liao)
  • Increased the default record name memory to 1 MiB per process, following recent user reports of exceeding the old limit (256 KiB)
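To give a flavor of the new runtime configuration support, a darshan-runtime config file might contain directives like the following. This is an illustrative sketch only: the directive names and the DARSHAN_CONFIG_PATH variable shown here are examples, so check the darshan-runtime documentation for the exact syntax supported by your version.

```
# illustrative darshan-runtime config file; its path would typically be
# given to the library via an environment variable such as DARSHAN_CONFIG_PATH
MOD_DISABLE   LUSTRE                       # skip Lustre instrumentation
MAX_RECORDS   4000   POSIX                 # raise the per-module record limit
NAME_EXCLUDE  ^/proc,^/sys   POSIX,STDIO   # ignore pseudo-filesystem paths
```

The same settings can generally also be supplied through environment variables, which is convenient for per-job overrides without editing a shared config file.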

This release also marks our first stable release of the PyDarshan log analysis module, including a new PyDarshan-based job summary tool (which will ultimately replace the darshan-job-summary script). Users can get PyDarshan directly from PyPI, e.g., using `pip install darshan`. Documentation can be found here: https://www.mcs.anl.gov/research/projects/darshan/documentation/.

Please report any questions, issues, or concerns with this pre-release using the darshan-users mailing list, or by opening an issue on our GitHub: https://github.com/darshan-hpc/darshan.

Darshan 3.3.1 point release now available

A new 3.3.1 point release of Darshan is now available for download HERE.

This release contains an important bug fix that resolves dynamic linker errors when using Darshan’s Lustre instrumentation module in conjunction with Darshan’s LD_PRELOAD mechanism for instrumenting applications.

Additionally, this release is the first Darshan release that supports instrumenting non-MPI applications that call fork(), with log files now accurately generated for both the parent process and the child process.

Please report any issues, comments, or questions to us using the Darshan-users mailing list or our GitHub page.

Darshan version 3.3.0 is now available!

Following up on our recent pre-releases, a new stable release of Darshan 3.3.0 is now available for download. You can get it HERE.

In addition to the new features and bug fixes introduced in 3.3.0 pre-releases, this release marks the first Darshan version with AutoPerf support. AutoPerf implements two additional Darshan instrumentation modules that can provide details on application MPI communication usage and application performance characteristics on Cray XC platforms:

  • APMPI: Instrumentation of over 70 MPI-3 communication routines, providing operation counts, datatype sizes, and timing information for each application MPI rank.
  • APXC: Instrumentation of Cray XC environments to provide network and compute counters of interest, via PAPI.

See the darshan-runtime documentation for more details on how to build Darshan with AutoPerf support.

Please report any issues, comments, or questions to us using the Darshan-users mailing list or our GitLab page.