Multi-stage multimodal fusion network with language models and uncertainty evaluation for early risk stratification in rheumatic and musculoskeletal diseases

Wang, B. ORCID: https://orcid.org/0000-0003-1403-1847, Li, W. ORCID: https://orcid.org/0000-0003-2878-3185, Bradlow, A., Watt, A., Chan, A. T. Y. and Bazuaye, E. (2025) Multi-stage multimodal fusion network with language models and uncertainty evaluation for early risk stratification in rheumatic and musculoskeletal diseases. Information Fusion. ISSN 1566-2535 (In Press)

Text - Accepted Version (restricted to repository staff only)

It is advisable to refer to the publisher's version if you intend to cite from this work.

Abstract/Summary

Precise risk stratification of rheumatic and musculoskeletal diseases (RMDs) is crucial for ensuring that patients receive the right referrals and treatments quickly. However, it is challenging because symptoms are non-specific and no single biomarker is diagnostically definitive. Real-world referral data pose further challenges, such as free-format text and incomplete records, which add modeling complexity and make uncertainty quantification crucial for reliable predictions. To address these challenges, we developed a multi-stage multimodal fusion network with a conformal prediction method that can accurately risk stratify RMDs at the point of referral, quantify uncertainty, and flag unreliable predictions for physician intervention. The proposed models were trained and evaluated using referral data from 128 General Practices (GPs) in the UK, covering patients who were referred by GPs with suspected inflammatory conditions in RMDs between February 2018 and January 2024. Our model achieved 0.73 accuracy, 0.79 AUC, and 0.75 G-Mean in differentiating inflammatory conditions (IC) from non-inflammatory conditions (NIC) using patients' presenting condition description (PCD) and medical history (MH) data, and 0.90 accuracy, 0.92 AUC, and 0.89 G-Mean using PCD, MH and additional blood test data (BTD). Furthermore, a conformal prediction-based method was developed to evaluate prediction uncertainty; it identifies 75.71% of unreliable predictions for patients with PCD and MH data, and 66.67% of unreliable predictions for patients with additional BTD, so that these cases could be given a second-round examination by GP or secondary care clinicians for patient safety. The findings of this study suggest that language models with multi-stage multimodal fusion and uncertainty evaluation can risk stratify RMDs accurately using data available at the point of referral in the real world. The approach could therefore be used by GPs and clinicians to help patients get the right treatment faster, demonstrating practical potential to improve RMD referrals.
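
The abstract reports G-Mean and a conformal prediction-based reliability check without giving formulas. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it takes G-Mean to be the geometric mean of sensitivity and specificity (the common definition), and it uses standard split conformal prediction with the nonconformity score 1 - p(true class), flagging a prediction as unreliable when its prediction set is not a single class. The function names, the significance level `alpha`, and the flagging rule are all hypothetical choices made for illustration.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed definition).

    Labels are binary: 1 = inflammatory (IC), 0 = non-inflammatory (NIC).
    """
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return math.sqrt((tp / pos) * (tn / neg))


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split conformal calibration (assumed variant).

    cal_probs[i] holds the class probabilities for calibration case i,
    cal_labels[i] its true class index. The nonconformity score is
    1 - p(true class); the threshold is the ceil((n + 1) * (1 - alpha))-th
    smallest score.
    """
    scores = sorted(1.0 - probs[label]
                    for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration cases for this alpha: include every class.
        return 1.0
    return scores[k - 1]


def prediction_set(probs, qhat):
    """Classes whose nonconformity score does not exceed the threshold."""
    return [c for c, p in enumerate(probs) if 1.0 - p <= qhat]


def is_unreliable(probs, qhat):
    """Hypothetical flagging rule: empty or multi-class sets need review."""
    return len(prediction_set(probs, qhat)) != 1
```

With a threshold calibrated on held-out referrals, a confident case such as `[0.9, 0.1]` yields a single-class set, while a near-tie such as `[0.52, 0.48]` would be flagged for a second-round examination under this rule.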

Item Type: Article
Refereed: Yes
Divisions: Interdisciplinary centres and themes > Health Innovation Partnership (HIP)
Henley Business School > Digitalisation, Marketing and Entrepreneurship
ID Code: 121660
Publisher: Elsevier