Semi-automatic assessment of I/O behavior by inspecting the individual client-node timelines — an explorative study on 10^6 jobs

Betke, E. and Kunkel, J. (2020) Semi-automatic assessment of I/O behavior by inspecting the individual client-node timelines — an explorative study on 10^6 jobs. In: ISC HPC, 21-25 Jun 2020, Frankfurt, Germany. (In Press)

Text - Accepted Version
· Restricted to Repository staff only
· The Copyright of this document has not been checked yet. This may affect its availability.

1MB

It is advisable to refer to the publisher's version if you intend to cite from this work.

Abstract/Summary

HPC applications with suboptimal I/O behavior interfere with well-behaved applications and increase application runtime. In some cases, this may even lead to unresponsive systems and unfinished jobs. HPC monitoring systems can help users and support staff identify problematic behavior and optimize the affected applications. The key issue is how to identify the relevant applications. A profile of an application does not allow problematic phases during execution to be identified, while tracing each individual I/O operation is too invasive. In this work, we split the execution into segments, i.e., windows of fixed size, and analyze their profiles. We develop three I/O metrics to identify three relevant classes of inefficient I/O behavior and evaluate them on raw data from 1,000,000 jobs on the supercomputer Mistral. The advantage of our method is that temporal information about I/O activity during job runtime is preserved to some extent and can be used to identify phases of inefficient I/O. The main contribution of this work is the segmentation of time series and the computation of metrics (Job-I/O-Utilization, Job-I/O-Problem-Time, and Job-I/O-Balance) that are effective in identifying problematic I/O phases and jobs.
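The segmentation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `segment_means`, the per-node sample values, and the choice of the mean as the per-segment summary are all assumptions made for illustration, and the three metrics named above are not reproduced here because the abstract does not define them.

```python
from statistics import mean


def segment_means(samples, window):
    """Split equally spaced samples into consecutive fixed-size windows
    and return the mean of each window (the last window may be shorter)."""
    if window <= 0:
        raise ValueError("window must be a positive number of samples")
    return [mean(samples[i:i + window]) for i in range(0, len(samples), window)]


# Hypothetical monitoring data: one I/O counter per client node,
# sampled at a fixed interval over the runtime of a single job.
node_series = {
    "node0": [0, 0, 8, 8, 0, 0],
    "node1": [0, 0, 0, 4, 0, 0],
}

# One summary value per node and segment; the temporal order of the
# segments is kept, so an I/O-heavy phase remains visible.
segments = {node: segment_means(series, 2) for node, series in node_series.items()}
```

Summarizing per segment rather than over the whole job keeps a coarse timeline for each node, which is the property the abstract highlights, at a far lower cost than recording every individual I/O operation.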

Item Type: Conference or Workshop Item (Paper)
Refereed: Yes
Divisions: Faculty of Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 89639
