pyfive: A pure-Python HDF5 reader

Text (Open Access), Published Version. Available under License Creative Commons Attribution.

Lawrence, B. N. ORCID: https://orcid.org/0000-0001-9262-7860, Cimadevilla, E. ORCID: https://orcid.org/0000-0002-8437-2068, Nolf, W. D. ORCID: https://orcid.org/0000-0003-2258-9402, Hassell, D. ORCID: https://orcid.org/0000-0002-5312-4950, Helmus, J., Hodel, B., Maranville, B. ORCID: https://orcid.org/0000-0002-6105-8789, Mühlbauer, K. ORCID: https://orcid.org/0000-0001-6599-1034 and Predoi, V. ORCID: https://orcid.org/0000-0002-9729-6578 (2026) pyfive: A pure-Python HDF5 reader. Journal of Open Source Software, 11 (118). 9688. ISSN 2475-9066 doi: 10.21105/joss.09688

Abstract/Summary

pyfive is an open-source, thread-safe, pure-Python package for reading data stored in HDF5. Although it does not implement every specification and capability of HDF5, it includes all the core functionality needed to read gridded datasets, whether stored contiguously or in chunks (with or without standard compression options). All data access is fully lazy: data is read from storage only when the numpy data arrays are manipulated. Originally developed some years ago, the package has recently been expanded to support lazy data access and to add the missing features needed to handle all the HDF5-based environmental data known to the authors. It is now a realistic option for production data access in environmental science and, more broadly, in other domains. The API is based on that of h5py (https://github.com/h5py/h5py), a Python shim over the HDF5 C library, which is itself not thread-safe. pyfive adds some API extensions to help optimise remote access. Together with thread safety, these extensions remove many of the limitations that have precluded the efficient use of HDF5 (and netCDF4) on cloud storage.
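
Because the abstract describes an h5py-style API with lazy reads, a short sketch of that access pattern may help. It is only an illustration under stated assumptions, not code from the paper: the helper `read_selection` and its `file_cls` parameter are hypothetical names introduced here, and the only library call assumed is `pyfive.File`, used in the same way as `h5py.File` (opened by path, indexed by dataset name).

```python
def read_selection(path, dataset_name, selection, file_cls=None):
    """Open an HDF5 file, look up one dataset, and return only `selection`.

    `file_cls` is a hypothetical hook for this sketch: any h5py-style file
    class (for example pyfive.File or h5py.File) can be passed in.
    """
    if file_cls is None:
        # Assumption: pyfive is installed and exposes an h5py-like File class.
        # Imported here so the helper can be exercised without pyfive present.
        import pyfive
        file_cls = pyfive.File

    with file_cls(path) as hfile:
        # Looking up the dataset by name returns a handle to it.
        dataset = hfile[dataset_name]
        # Indexing the handle is the step that requests the values, so only
        # the selected part of the dataset needs to be returned to the caller.
        return dataset[selection]
```

A call such as `read_selection("example.h5", "tas", (0, slice(0, 10)))` would then return the first ten values of the first row of a dataset named `tas`. Both the file name and the dataset name are hypothetical. Passing the file class as a parameter is a design choice for the sketch: it reflects the abstract's point that the pyfive API follows h5py, so the same calling code can be pointed at either library.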

Item Type: Article
URI: https://centaur.reading.ac.uk/id/eprint/128494
Identification Number/DOI: 10.21105/joss.09688
Refereed: Yes
Divisions: Science > School of Mathematical, Physical and Computational Sciences > NCAS; Science > School of Mathematical, Physical and Computational Sciences > Department of Meteorology