Copula-based synthetic data augmentation for machine-learning emulators

Meyer, D. ORCID: https://orcid.org/0000-0002-7071-7547, Nagler, T. and Hogan, R. J. ORCID: https://orcid.org/0000-0002-3180-5157 (2021) Copula-based synthetic data augmentation for machine-learning emulators. Geoscientific Model Development, 14 (8). pp. 5205-5215. ISSN 1991-9603
It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item DOI: 10.5194/gmd-14-5205-2021

Abstract/Summary

Can we improve machine-learning (ML) emulators with synthetic data? If data are scarce or expensive to source and a physical model is available, statistically generated data may be useful for augmenting training sets cheaply. Here we explore the use of copula-based models for generating synthetically augmented datasets in weather and climate by testing the method on a toy physical model of downwelling longwave radiation and corresponding neural network emulator. Results show that for copula-augmented datasets, predictions are improved by up to 62 % for the mean absolute error (from 1.17 to 0.44 W m⁻²).
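As a rough illustration of the augmentation idea summarised above, the sketch below fits a copula to a small set of input samples, draws synthetic inputs from it, and labels them with a physical model. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which copula family is used, so a bivariate Gaussian copula with empirical marginals is assumed here for simplicity. The variable names (temps, humid), the toy data and the function stand_in_physical_model are hypothetical placeholders and do not come from the paper.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def pseudo_obs(values):
    """Map each value to (0, 1) using its rank divided by n + 1."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def empirical_quantile(sorted_values, u):
    """Inverse empirical CDF with linear interpolation between order statistics."""
    pos = u * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def fit_gaussian_copula(x, y):
    """Estimate the copula correlation from the normal scores of the ranks."""
    zx = [STD_NORMAL.inv_cdf(u) for u in pseudo_obs(x)]
    zy = [STD_NORMAL.inv_cdf(u) for u in pseudo_obs(y)]
    sxy = sum(a * b for a, b in zip(zx, zy))
    sxx = sum(a * a for a in zx)
    syy = sum(b * b for b in zy)
    return sxy / (sxx * syy) ** 0.5


def sample_gaussian_copula(x, y, rho, n, rng):
    """Draw n synthetic (x, y) pairs that keep the marginals and dependence."""
    xs, ys = sorted(x), sorted(y)
    samples = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        samples.append((empirical_quantile(xs, u1), empirical_quantile(ys, u2)))
    return samples


def stand_in_physical_model(t, q):
    """Hypothetical placeholder for a physical model; NOT the paper's model."""
    return 5.67e-8 * t ** 4 * (0.6 + 0.005 * q)


rng = random.Random(0)
# Hypothetical "real" inputs: two correlated variables.
temps = [rng.gauss(288.0, 10.0) for _ in range(200)]
humid = [0.5 * t + rng.gauss(0.0, 3.0) for t in temps]

rho = fit_gaussian_copula(temps, humid)
synthetic = sample_gaussian_copula(temps, humid, rho, 1000, rng)

# Augmented training set: real and synthetic inputs, all labelled by the model.
inputs = list(zip(temps, humid)) + synthetic
training_set = [(t, q, stand_in_physical_model(t, q)) for t, q in inputs]
```

Because the physical model is cheap to evaluate, every synthetic input receives an exact label, so the emulator can be trained on many more samples than the original dataset provides.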