
Toward understanding I/O behavior in HPC workflows

Lüttgau, J., Snyder, S., Carns, P., Wozniak, J. M., Kunkel, J. and Ludwig, T. (2018) Toward understanding I/O behavior in HPC workflows. In: PDSW-DISCS, 12 November 2018, Dallas, Texas, pp. 64-75.

Text - Accepted Version (905kB)

Official URL: https://ieeexplore.ieee.org/document/8638425

Abstract/Summary

Scientific discovery increasingly depends on complex workflows consisting of multiple phases and sometimes millions of parallelizable tasks or pipelines. These workflows access storage resources for a variety of purposes, including preprocessing, simulation output, and postprocessing steps. Unfortunately, most workflow models focus on the scheduling and allocation of computational resources for tasks while the impact on storage systems remains a secondary objective and an open research question. I/O performance is not usually accounted for in workflow telemetry reported to users. In this paper, we present an approach to augment the I/O efficiency of the individual tasks of workflows by combining workflow description frameworks with system I/O telemetry data. A conceptual architecture and a prototype implementation for HPC data center deployments are introduced. We also identify and discuss challenges that will need to be addressed by workflow management and monitoring systems for HPC in the future. We demonstrate how real-world applications and workflows could benefit from the approach, and we show how the approach helps communicate performance-tuning guidance to users.
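The core idea of the abstract — joining per-task workflow descriptions with system I/O telemetry to surface performance-tuning guidance — can be illustrated with a minimal sketch. All names, data, and the bandwidth threshold below are illustrative assumptions, not the paper's actual implementation or data formats.

```python
# Hypothetical sketch: join workflow task records with per-task I/O
# counters (as a Darshan-style log might provide, aggregated per task)
# and flag tasks whose effective I/O bandwidth falls below a threshold.

# Per-task records from a workflow description framework (task id -> phase).
workflow_tasks = {
    "preprocess-001": "preprocessing",
    "sim-001": "simulation",
    "post-001": "postprocessing",
}

# Per-task I/O counters, e.g. aggregated from system telemetry.
io_telemetry = {
    "preprocess-001": {"bytes": 2 * 1024**3, "io_time_s": 4.0},
    "sim-001": {"bytes": 50 * 1024**3, "io_time_s": 400.0},
    "post-001": {"bytes": 1 * 1024**3, "io_time_s": 1.0},
}

SLOW_MIB_PER_S = 200.0  # illustrative threshold for flagging tasks

def io_report(tasks, telemetry):
    """Return (task, phase, MiB/s) tuples for tasks below the threshold."""
    flagged = []
    for task, phase in tasks.items():
        counters = telemetry.get(task)
        if counters is None:
            continue  # no telemetry captured for this task
        mib_per_s = (counters["bytes"] / 1024**2) / counters["io_time_s"]
        if mib_per_s < SLOW_MIB_PER_S:
            flagged.append((task, phase, round(mib_per_s, 1)))
    return flagged

print(io_report(workflow_tasks, io_telemetry))
# → [('sim-001', 'simulation', 128.0)]
```

A report like this attributes I/O behavior to individual workflow tasks rather than to the job as a whole, which is what lets guidance be communicated to users in terms of the workflow phases they recognize.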

Item Type: Conference or Workshop Item (Paper)
Refereed: Yes
Divisions: Faculty of Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 80104
Publisher: IEEE

