Storing and manipulating environmental big data with JASMIN
Lawrence, B. N. , Bennett, V. L., Churchill, J., Juckes, M., Kershaw, P., Pascoe, S., Pepler, S., Priitchard, M. and Stephens, A. (2013) Storing and manipulating environmental big data with JASMIN. In: IEEE Big Data, October 6-9, 2013, San Francisco.
Official URL: http://dx.doi.org/10.1109/BigData.2013.6691556
JASMIN is a super-data-cluster designed to provide a high-performance high-volume data analysis environment for the UK environmental science community. Thus far JASMIN has been used primarily by the atmospheric science and earth observation communities, both to support their direct scientific workflow, and the curation of data products in the STFC Centre for Environmental Data Archival (CEDA). Initial JASMIN configuration and first experiences are reported here. Useful improvements in scientific workflow are presented. It is clear from the explosive growth in stored data and use that there was a pent up demand for a suitable big-data analysis environment. This demand is not yet satisfied, in part because JASMIN does not yet have enough compute, the storage is fully allocated, and not all software needs are met. Plans to address these constraints are introduced.