Accessibility navigation


The SIOX architecture – coupling automatic monitoring and optimization of parallel I/O

Kunkel, J. M., Zimmer, M., Hübbe, N., Aguilera, A., Mickler, H., Wang, X., Chut, A., Bönisch, T., Lüttgau, J., Michel, R. and Weging, J. (2014) The SIOX architecture – coupling automatic monitoring and optimization of parallel I/O. In: Kunkel, J. M., Ludwig, T. and Meuer, H. W. (eds.) Supercomputing. Lecture Notes in Computer Science (8488). Springer, pp. 245-260. ISBN 9783319075174

[img]
Preview
Text - Accepted Version
· Please see our End User Agreement before downloading.

972kB

It is advisable to refer to the publisher's version if you intend to cite from this work. See Guidance on citing.

To link to this item DOI: 10.1007/978-3-319-07518-1_16

Abstract/Summary

Performance analysis and optimization of high-performance I/O systems is a daunting task. Mainly, this is due to the overwhelmingly complex interplay of the involved hardware and software layers. The Scalable I/O for Extreme Performance (SIOX) project provides a versatile environment for monitoring I/O activities and learning from this information. The goal of SIOX is to automatically suggest and apply performance optimizations, and to assist in locating and diagnosing performance problems. In this paper, we present the current status of SIOX. Our modular architecture covers instrumentation of POSIX, MPI and other high-level I/O libraries; the monitoring data is recorded asynchronously into a global database, and recorded traces can be visualized. Furthermore, we offer a set of primitive plug-ins with additional features to demonstrate the flexibility of our architecture: A surveyor plug-in to keep track of the observed spatial access patterns; an fadvise plug-in for injecting hints to achieve read-ahead for strided access patterns; and an optimizer plug-in which monitors the performance achieved with different MPI-IO hints, automatically supplying the best known hint-set when no hints were explicitly set. The presentation of the technical status is accompanied by a demonstration of some of these features on our 20 node cluster. In additional experiments, we analyze the overhead for concurrent access, for MPI-IO’s 4-levels of access, and for an instrumented climate application. While our prototype is not yet full-featured, it demonstrates the potential and feasibility of our approach.

Item Type:Book or Report Section
Refereed:Yes
Divisions:Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code:77680
Publisher:Springer

Downloads

Downloads per month over past year

University Staff: Request a correction | Centaur Editors: Update this record

Page navigation