Explainable machine learning-based prediction of psoriatic arthritis flares using heterogenous real-world data for personalised patient care

Text: Accepted Version. Restricted to Repository staff only. The copyright of this document has not been checked yet; this may affect its availability.

It is advisable to refer to the publisher's version if you intend to cite from this work.

Moon, P., Li, W. ORCID: https://orcid.org/0000-0003-2878-3185, Chan, A., Wang, B. ORCID: https://orcid.org/0000-0003-1403-1847 and Bazuaye, E. (2025) Explainable machine learning-based prediction of psoriatic arthritis flares using heterogenous real-world data for personalised patient care. Methods. ISSN 1046-2023 doi: 10.1016/j.ymeth.2025.10.010 (In Press)

Abstract/Summary

Psoriatic arthritis (PsA) is a chronic inflammatory disease characterised by flare-ups that are difficult to forecast, particularly in patients without an acute phase response. In this paper, we propose and apply an explainable, multimodal machine learning framework to predict PsA flares. The framework jointly leverages structured temporal electronic patient records (EPRs) – sequential blood tests, disease activity scores, comorbidity burden, medications, and demographics – and unstructured clinical referral letters pre-processed with large language models (LLMs; Qwen-2.5 family). Two gradient boosting models, Light Gradient Boosting Machine (LGBM) and eXtreme Gradient Boosting (XGBoost), achieved the highest predictive performance 3 months before a clinic visit (accuracy = 92.8 %, AUROC = 0.94). Model performance gradually declined over longer timeframes (6 months: accuracy = 78.2 %, AUROC = 0.80; 9 months: accuracy = 76.6 %, AUROC = 0.78; 12 months: accuracy = 72.2 %, AUROC = 0.75). LLMs applied to unstructured GP referral letters had limited standalone predictive value, but enhanced sensitivity and specificity when combined with the structured models in an ensemble approach. SHapley Additive exPlanations (SHAP) helped explain the predictions and identified comorbidity count, disease scores, and immunosuppressive medications as the top predictors. Our results show that integrating structured longitudinal data with unstructured clinical narratives using interpretable multimodal artificial intelligence can enable time-sensitive, personalised management of PsA flares and early clinical intervention.
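The abstract reports accuracy and AUROC and mentions combining the structured models with the letter-based LLM output in an ensemble, but it does not say how the ensemble is formed. The following is a minimal sketch, assuming a simple weighted average of predicted flare probabilities; the function names, the weight of 0.3, and the six toy patients are all hypothetical and are not taken from the paper. It only illustrates how the two reported metrics could be computed for each model and for a combined score.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the chance that a random flare case outranks a random
    non-flare case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at the given probability cut-off."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


def weighted_average(structured: Sequence[float], text: Sequence[float],
                     text_weight: float = 0.3) -> List[float]:
    """ASSUMPTION: combine the two models by a fixed weighted average.
    The paper's actual ensemble rule is not described in the abstract."""
    return [(1 - text_weight) * a + text_weight * b
            for a, b in zip(structured, text)]


# Hypothetical toy data: 1 = flare, 0 = no flare (not results from the paper).
labels = [1, 1, 1, 0, 0, 0]
structured_scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]  # e.g. a gradient boosting model
text_scores = [0.5, 0.4, 0.7, 0.3, 0.6, 0.5]        # e.g. an LLM-derived letter score
combined = weighted_average(structured_scores, text_scores)

for name, scores in [("structured", structured_scores),
                     ("text", text_scores),
                     ("combined", combined)]:
    print(f"{name}: accuracy={accuracy(labels, scores):.3f}, "
          f"AUROC={auroc(labels, scores):.3f}")
```

In practice the weight and the decision threshold would be tuned on held-out data; a fixed weighted average is only one of several plausible ways to combine models.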

Item Type: Article
URI: https://centaur.reading.ac.uk/id/eprint/127277
Identification Number/DOI: 10.1016/j.ymeth.2025.10.010
Refereed: Yes
Divisions: Interdisciplinary centres and themes > Health Innovation Partnership (HIP); Henley Business School > Digitalisation, Marketing and Entrepreneurship
Publisher: Elsevier