Benchmarking of AlphaFold2 accuracy self-estimates as indicators of empirical model quality and ranking: a comparison with independent model quality assessment programmes

Edmunds, N. S., Genc, A. G. and McGuffin, L. J. ORCID: https://orcid.org/0000-0003-4501-4767 (2024) Benchmarking of AlphaFold2 accuracy self-estimates as indicators of empirical model quality and ranking: a comparison with independent model quality assessment programmes. Bioinformatics, 40 (8). btae491. ISSN 1460-2059

Text (Open Access) - Published Version
· Available under License Creative Commons Attribution.

It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item DOI: 10.1093/bioinformatics/btae491

Abstract/Summary

Motivation: Despite an increase in protein modelling accuracy following the development of AlphaFold2, there remains an accuracy gap between predicted and observed model quality assessment (MQA) scores. In CASP15, variations in AlphaFold2 model accuracy prediction were noticed for quaternary models of very similar observed quality. In this study, we compare plDDT and pTM to their observed counterparts, the local distance difference test (lDDT) and TM-score, for both tertiary and quaternary models to examine whether reliability is retained across the scoring range under normal modelling conditions and in situations where AlphaFold2 functionality is customized. We also explore plDDT and pTM ranking accuracy in comparison with the published independent MQA programmes ModFOLD9 and ModFOLDdock.

Results: plDDT was found to be an accurate descriptor of tertiary model quality compared to observed lDDT-Cα scores (Pearson r = 0.97), and achieved a ranking agreement true positive rate (TPR) of 0.34 with observed scores, which ModFOLD9 could not improve. However, quaternary structure accuracy was reduced (plDDT r = 0.67, pTM r = 0.70) and significant overprediction was seen with both scores for some lower quality models. Additionally, ModFOLDdock was able to improve upon AF2-Multimer model ranking compared to TM-score (TPR 0.34) and oligo-lDDT score (TPR 0.43). Finally, evidence is presented for increased variability in plDDT and pTM when using custom template recycling, which is more pronounced for quaternary structures.
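The two kinds of comparison reported above, correlation between predicted and observed scores and agreement between predicted and observed rankings, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the toy score values are hypothetical, and `top_k_agreement` is only one simple way to quantify ranking agreement; the TPR reported in the abstract may be defined differently in the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def top_k_agreement(predicted, observed, k):
    """Fraction of the top-k models by predicted score that are also
    in the top-k by observed score (an assumed, simplified metric)."""
    def top_k(scores):
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return set(order[:k])
    return len(top_k(predicted) & top_k(observed)) / k


# Hypothetical toy values for five models of one target (not data from the paper).
predicted_plddt = [92.1, 85.4, 78.0, 66.3, 51.2]
observed_lddt = [90.5, 86.0, 74.2, 69.8, 48.7]

print(f"Pearson r: {pearson_r(predicted_plddt, observed_lddt):.3f}")
print(f"Top-2 agreement: {top_k_agreement(predicted_plddt, observed_lddt, 2):.2f}")
```

Pearson r is unaffected by linear rescaling, so it does not matter whether the predicted and observed scores are expressed on a 0 to 1 or a 0 to 100 scale, provided each list is internally consistent.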

Item Type: Article
Refereed: Yes
Divisions: Interdisciplinary centres and themes > Institute for Cardiovascular and Metabolic Research (ICMR)
Life Sciences > School of Biological Sciences > Biomedical Sciences
ID Code: 117806
Publisher: Oxford University Press
