
Methane prediction equations including genera of rumen bacteria as predictor variables improve prediction accuracy

Zhang, B., Lin, S., Moraes, L., Firkins, J., Hristov, A. N., Kebreab, E., Janssen, P. H., Bannink, A., Bayat, A. R., Crompton, L. A., Dijkstra, J., Eugène, M. A., Kreuzer, M., McGee, M., Reynolds, C. K., Schwarm, A., Yáñez-Ruiz, D. R. and Yu, Z. (2023) Methane prediction equations including genera of rumen bacteria as predictor variables improve prediction accuracy. Scientific Reports, 13. 21305. ISSN 2045-2322

Text (Open Access) - Published Version
· Available under License Creative Commons Attribution.


It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item DOI: 10.1038/s41598-023-48449-y


Methane (CH4) emissions from ruminants are of significant environmental concern, necessitating accurate prediction for emission inventories. Existing models rely solely on dietary and host animal‑related data, ignoring the predictive power of rumen microbiota, the source of CH4. To address this limitation, we developed novel CH4 prediction models incorporating rumen microbes as predictors, alongside animal‑ and feed‑related predictors, using four statistical/machine learning (ML) methods. These include random forest combined with boosting (RF‑B), least absolute shrinkage and selection operator (LASSO), generalized linear mixed model with LASSO (glmmLasso), and smoothly clipped absolute deviation (SCAD) implemented on linear mixed models. With a sheep dataset (218 observations) of both animal data and rumen microbiota data (relative sequence abundance of 330 genera of rumen bacteria, archaea, protozoa, and fungi), we developed linear mixed models to predict CH4 production (g CH4/animal·d, ANIM‑B models) and CH4 yield (g CH4/kg of dry matter intake, DMI‑B models). We also developed models solely based on animal‑related data. Prediction performance was evaluated 200 times with random data splits, while fitting performance was assessed without data splitting. The inclusion of microbial predictors improved the models, as indicated by decreased root mean square prediction error (RMSPE) and mean absolute error (MAE), and increased Lin’s concordance correlation coefficient (CCC). Both glmmLasso and SCAD reduced the Akaike information criterion (AIC) and Bayesian information criterion (BIC) for both the ANIM‑B and the DMI‑B models, while the other two ML methods had mixed outcomes. By balancing prediction performance and fitting performance, we obtained one ANIM‑B model (containing 10 genera of bacteria and 3 animal data) fitted using glmmLasso and one DMI‑B model (5 genera of bacteria and 1 animal datum) fitted using SCAD.
This study highlights the importance of incorporating rumen microbiota data in CH4 prediction models to enhance accuracy and robustness. Additionally, ML methods facilitate the selection of microbial predictors from high‑dimensional metataxonomic data of the rumen microbiota without overfitting. Moreover, the identified microbial predictors can serve as biomarkers of CH4 emissions from sheep, providing valuable insights for future research and mitigation strategies.
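
As an illustration of the three prediction metrics named in the abstract, the sketch below computes RMSPE, MAE, and Lin's CCC in plain Python. It is a minimal sketch and not the authors' code: the function names and the five observed/predicted values are hypothetical, RMSPE is returned in the units of the observations (it is sometimes reported instead as a percentage of the observed mean), and CCC is computed with population (1/n) variances and covariance.

```python
import math


def rmspe(observed, predicted):
    """Root mean square prediction error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error, in the units of the observations."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def lins_ccc(observed, predicted):
    """Lin's concordance correlation coefficient (population 1/n moments)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    # Penalises both scatter (via cov) and systematic bias (via the mean shift).
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Hypothetical CH4 production values (g/animal per day), for illustration only.
observed = [20.0, 22.0, 25.0, 19.0, 24.0]
predicted = [21.0, 21.0, 24.0, 20.0, 25.0]

print(rmspe(observed, predicted))
print(mae(observed, predicted))
print(lins_ccc(observed, predicted))
```

With these made-up values every prediction is off by exactly 1 g/d, so RMSPE and MAE both equal 1.0, and CCC works out to 8/9 (about 0.889). Lower RMSPE and MAE and a CCC closer to 1 indicate better agreement between predicted and observed values.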

Item Type: Article
Divisions: Life Sciences > School of Agriculture, Policy and Development > Department of Animal Sciences
ID Code: 114218
Publisher: Nature Publishing Group

