Copula-based synthetic data augmentation for machine-learning emulators

Meyer, D., Nagler, T. and Hogan, R. J. (2021) Copula-based synthetic data augmentation for machine-learning emulators. Geoscientific Model Development, 14 (8). pp. 5205-5215. ISSN 1991-9603

Text (Open access) - Published Version
· Available under License Creative Commons Attribution.


It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item DOI: 10.5194/gmd-14-5205-2021


Can we improve machine-learning (ML) emulators with synthetic data? If data are scarce or expensive to source and a physical model is available, statistically generated data may be useful for augmenting training sets cheaply. Here we explore the use of copula-based models for generating synthetically augmented datasets in weather and climate by testing the method on a toy physical model of downwelling longwave radiation and corresponding neural network emulator. Results show that for copula-augmented datasets, predictions are improved by up to 62 % for the mean absolute error (from 1.17 to 0.44 W m⁻²).
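To make the idea of copula-based augmentation concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which copula family or software was used, so a bivariate Gaussian copula with empirical margins is swapped in here as a simple stand-in. The variable names (`temperature_k`, `humidity`), the toy data, and all helper functions are hypothetical. In an emulator workflow, the synthetic inputs would presumably then be passed through the physical model to obtain training targets; that step is an assumption and is not shown.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def normal_scores(values):
    """Map a sample to standard-normal scores via its ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # rank / (n + 1) keeps the pseudo-observation strictly inside (0, 1)
        scores[idx] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Pearson correlation of two equal-length samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def empirical_quantile(sorted_values, u):
    """Linearly interpolated empirical quantile for u in (0, 1)."""
    pos = u * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def sample_gaussian_copula(x, y, n_samples, rng):
    """Draw synthetic (x, y) pairs that mimic the rank dependence of the data."""
    rho = pearson(normal_scores(x), normal_scores(y))  # copula parameter
    xs, ys = sorted(x), sorted(y)
    synthetic = []
    for _ in range(n_samples):
        # Correlated standard normals with correlation rho
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(max(0.0, 1.0 - rho * rho)) * rng.gauss(0.0, 1.0)
        # Map to uniforms, then back to the data scale via empirical margins
        u1, u2 = _STD_NORMAL.cdf(z1), _STD_NORMAL.cdf(z2)
        synthetic.append((empirical_quantile(xs, u1), empirical_quantile(ys, u2)))
    return synthetic


# Hypothetical toy inputs: two correlated variables standing in for real data
rng = random.Random(0)
temperature_k = [rng.gauss(280.0, 10.0) for _ in range(200)]
humidity = [0.5 * t + rng.gauss(0.0, 2.0) for t in temperature_k]

synthetic = sample_gaussian_copula(temperature_k, humidity, 500, rng)
augmented = list(zip(temperature_k, humidity)) + synthetic
```

Because the margins are empirical, every synthetic value lies within the range of the observed data; the copula only controls how the two variables vary together. A richer copula family would be needed to capture dependence structures that a single correlation parameter cannot describe.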

Item Type: Article
Divisions: Science > School of Mathematical, Physical and Computational Sciences > Department of Meteorology
ID Code: 101309
Publisher: European Geosciences Union

