Copula-based synthetic data augmentation for machine-learning emulators
Meyer, D.
It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item DOI: 10.5194/gmd-14-5205-2021

Abstract/Summary

Can we improve machine-learning (ML) emulators with synthetic data? If data are scarce or expensive to source and a physical model is available, statistically generated data may be useful for augmenting training sets cheaply. Here we explore the use of copula-based models for generating synthetically augmented datasets in weather and climate by testing the method on a toy physical model of downwelling longwave radiation and corresponding neural network emulator. Results show that for copula-augmented datasets, predictions are improved by up to 62 % for the mean absolute error (from 1.17 to 0.44 W m−2).
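To make the augmentation idea concrete, the following is a minimal sketch of copula-based sampling, not the method used in the paper. It assumes a Gaussian copula with empirical marginals (the abstract does not say which copula family is used), and the names `sample_gaussian_copula` and `toy_model` are hypothetical, with `toy_model` standing in for the physical model that labels the synthetic inputs.

```python
from statistics import NormalDist

import numpy as np

# Vectorized standard-normal CDF and inverse CDF from the standard library.
_STD_NORMAL = NormalDist()
_norm_cdf = np.vectorize(_STD_NORMAL.cdf)
_norm_ppf = np.vectorize(_STD_NORMAL.inv_cdf)


def sample_gaussian_copula(data, n_samples, seed=0):
    """Draw synthetic rows that mimic the marginals and dependence of `data`.

    Assumption: a Gaussian copula with empirical marginals.
    """
    rng = np.random.default_rng(seed)
    n, d = data.shape
    # 1. Pseudo-observations: per-column ranks scaled into the open interval (0, 1).
    ranks = data.argsort(axis=0).argsort(axis=0) + 1
    u = ranks / (n + 1)
    # 2. Normal scores and their correlation matrix (the copula parameter).
    z = _norm_ppf(u)
    corr = np.corrcoef(z, rowvar=False)
    # 3. Sample correlated normals and map them back to uniforms.
    z_new = rng.multivariate_normal(np.zeros(d), corr, size=n_samples)
    u_new = _norm_cdf(z_new)
    # 4. Invert the empirical marginals with per-column quantiles.
    return np.column_stack(
        [np.quantile(data[:, j], u_new[:, j]) for j in range(d)]
    )


def toy_model(inputs):
    """Hypothetical stand-in for a physical model: maps input rows to one output."""
    return inputs.sum(axis=1)


# Small "real" input set with two dependent, non-Gaussian variables.
rng = np.random.default_rng(42)
x1 = rng.gamma(2.0, 1.0, size=500)
x2 = x1 + rng.normal(0.0, 0.5, size=500)
real_inputs = np.column_stack([x1, x2])

# Generate synthetic inputs, label them with the model, and build the augmented set.
synthetic_inputs = sample_gaussian_copula(real_inputs, n_samples=1000)
augmented_inputs = np.vstack([real_inputs, synthetic_inputs])
augmented_targets = toy_model(augmented_inputs)
```

In this sketch the synthetic inputs are cheap to draw, and the targets come from running the model on them, so the augmented set can be used to train an emulator in the usual way. Because the marginals are inverted by interpolating the observed quantiles, synthetic values never fall outside the range of the original sample.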