Transformer-decoder GPT models for generating virtual screening libraries of HMG-Coenzyme A reductase inhibitors: effects of temperature, prompt-length and transfer-learning strategies

Cafiero, M. ORCID: https://orcid.org/0000-0002-4895-1783 (2024) Transformer-decoder GPT models for generating virtual screening libraries of HMG-Coenzyme A reductase inhibitors: effects of temperature, prompt-length and transfer-learning strategies. Journal of Chemical Information and Modeling. ISSN 1549-960X
It is advisable to refer to the publisher's version if you intend to cite from this work.

DOI: 10.1021/acs.jcim.4c01309

Abstract/Summary

Attention-based decoder models were used to generate libraries of novel inhibitors for the HMG-Coenzyme A reductase (HMGCR) enzyme. These deep neural network models were pre-trained on previously synthesized drug-like molecules from the ZINC15 database to learn the syntax of SMILES strings, and then fine-tuned with a set of ~1,000 molecules that inhibit HMGCR. The numbers of layers used for pre-training and fine-tuning were varied to find the optimal balance for robust library generation. Virtual screening libraries were also generated with different temperatures and numbers of input tokens (prompt-length) to find the most desirable molecular properties. The resulting libraries were screened against several criteria, including: IC50 values predicted by a Dense Neural Network (DNN) trained on experimental HMGCR IC50 values, docking scores from AutoDock Vina (via Dockstring), a calculated Quantitative Estimate of Druglikeness (QED), and Tanimoto similarity to known HMGCR inhibitors. It was found that 50/50 or 25/75% pre-trained/fine-tuned models with a non-zero temperature and shorter prompt-lengths produced the most robust libraries, and the DNN-predicted IC50 values had good correlation with docking scores and statin-similarity. 42% of generated molecules were classified as statin-like by k-means clustering, with the rosuvastatin-like group having the lowest IC50 values and lowest docking scores.
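
The role of temperature in token generation can be illustrated with a minimal sketch. This is not the author's implementation: the function names, the toy SMILES vocabulary, and the logit values are all hypothetical, and treating a temperature of zero as greedy (argmax) decoding is an assumption based on common practice. The sketch only shows how dividing logits by a temperature before the softmax changes the sampling distribution.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use argmax for greedy decoding")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature, rng=random):
    """Return a token index: argmax when temperature is 0 (assumed), else sample."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits over a toy SMILES vocabulary (illustrative values only)
vocab = ["C", "N", "O", "(", ")", "="]
logits = [2.0, 1.0, 0.5, 0.2, 0.1, -1.0]

sharp = softmax_with_temperature(logits, 0.5)  # low temperature: peaked distribution
flat = softmax_with_temperature(logits, 2.0)   # high temperature: flatter distribution
greedy_token = vocab[sample_next_token(logits, 0)]
```

Under these assumptions, a lower temperature concentrates probability on the most likely token, while a higher temperature spreads it across the vocabulary, which is one way a generated library can become more diverse.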