Fast retrieval of weather analogues in a multi-petabyte meteorological archive

Raoult, B. (2020) Fast retrieval of weather analogues in a multi-petabyte meteorological archive. PhD thesis, University of Reading.


To link to this item DOI: 10.48683/1926.00093430


The European Centre for Medium-Range Weather Forecasts (ECMWF) manages the largest archive of meteorological data in the world. At the time of writing, it holds around 300 petabytes and grows at a rate of 1 petabyte per week. This archive is now mature and contains valuable datasets such as several reanalyses, which provide a consistent view of the weather over several decades. Weather analogue is the term used by meteorologists to refer to similar weather situations. Looking for analogues in an archive using a brute-force approach requires data to be retrieved from tape and then compared to a user-provided weather pattern, using a chosen similarity measure. Such an operation would be very slow and costly. In this work, a wavelet-based fingerprinting scheme is proposed to index all weather patterns from the archive over a selected geographical domain. The system answers search queries by computing the fingerprint of the query pattern and looking for close matches in the index. Searches are fast enough to be perceived as instantaneous. A web-based application is provided, allowing users to express their queries interactively in a friendly and straightforward manner by sketching weather patterns directly in their web browser. Matching results are then presented as a series of weather maps, labelled with the date and time at which they occur. The system has been deployed as part of the Copernicus Climate Data Store and allows the retrieval of weather analogues from ERA5, a 40-year hourly reanalysis dataset. Some preliminary results of this work were presented at the International Conference on Computational Science 2018 (Raoult et al., 2018).
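
The indexing idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fingerprint construction, so everything below is an assumption made for illustration: the fingerprint is taken to be the coarse approximation coefficients of a 2D Haar wavelet transform, similarity is Euclidean distance, and the index is a plain in-memory array scanned linearly. The names `haar_approx`, `fingerprint`, `build_index` and `search` are hypothetical and are not taken from the thesis.

```python
import numpy as np


def haar_approx(a):
    """One level of the 2D orthonormal Haar transform, keeping only the
    approximation (low-pass) coefficients. Output is half the size per axis."""
    return (a[0::2, 0::2] + a[0::2, 1::2] + a[1::2, 0::2] + a[1::2, 1::2]) / 2.0


def fingerprint(field, levels=2):
    """Reduce a 2D field to a short vector of coarse Haar coefficients.
    Assumption: both dimensions are divisible by 2**levels."""
    a = np.asarray(field, dtype=float)
    step = 2 ** levels
    if a.ndim != 2 or a.shape[0] % step or a.shape[1] % step:
        raise ValueError("field must be 2D with sides divisible by 2**levels")
    for _ in range(levels):
        a = haar_approx(a)
    return a.ravel()


def build_index(fields, levels=2):
    """Stack the fingerprints of all archived fields into one array."""
    return np.stack([fingerprint(f, levels) for f in fields])


def search(index, query, k=3, levels=2):
    """Return the positions and distances of the k fingerprints closest
    to the fingerprint of the query pattern (Euclidean distance)."""
    q = fingerprint(query, levels)
    distances = np.linalg.norm(index - q, axis=1)
    order = np.argsort(distances)[:k]
    return order, distances[order]


# Usage on synthetic data standing in for archived weather fields.
rng = np.random.default_rng(0)
archive = rng.normal(size=(200, 16, 16))      # 200 hypothetical 16x16 fields
index = build_index(archive)                  # 200 fingerprints of 16 values
query = archive[42] + 0.01 * rng.normal(size=(16, 16))
positions, distances = search(index, query, k=3)
```

In this sketch each 256-value field is reduced to 16 coefficients, so a query only touches the compact index and never the full fields. That is the property that lets a search avoid retrieving archived data from tape, although a production system would presumably use a more elaborate fingerprint and index structure than this linear scan.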

Item Type: Thesis (PhD)
Thesis Supervisor: Di Fatta, G.
Thesis/Report Department: School of Mathematical, Physical & Computational Sciences
Divisions: Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 93430

