Multiprobabilistic prediction in early medical diagnoses

Nouretdinov, I.; Devetyarov, D.; Vovk, V.; Burford, B.; Camuzeaux, S.; Gentry-Maharaj, A.; Tiss, Ali; Smith, Celia; Luo, Z.; Chervonenkis, A.; Hallett, R.; Waterfield, M.; Cramer, Rainer; Timms, J. F.; Jacobs, I.; Menon, U.; Gammerman, A.


Nouretdinov, I., Devetyarov, D., Vovk, V., Burford, B., Camuzeaux, S., Gentry-Maharaj, A., Tiss, A., Smith, C., Luo, Z., Chervonenkis, A., Hallett, R., Waterfield, M., Cramer, R. ORCID: https://orcid.org/0000-0002-8037-2511, Timms, J. F., Jacobs, I., Menon, U. and Gammerman, A. (2015) Multiprobabilistic prediction in early medical diagnoses. Annals of Mathematics and Artificial Intelligence, 74 (1-2). pp. 203-222. ISSN 1573-7470 doi: 10.1007/s10472-013-9367-5

Abstract/Summary

This paper describes a methodology for providing multiprobability predictions for proteomic mass spectrometry data. The methodology is based on a newly developed machine learning framework called Venn machines, which output a valid probability interval for each prediction. The methodology is designed for mass spectrometry data. For demonstrative purposes, we applied it to MALDI-TOF data sets in order to predict the diagnosis of heart disease and the early diagnosis of ovarian cancer and breast cancer. The experiments showed that the probability intervals are narrow, that is, the output of the multiprobability predictor is close to a single probability distribution. In addition, the probability intervals produced for the heart disease and ovarian cancer data were more accurate than the output of the corresponding probability predictor. When Venn machines were forced to make point predictions, the accuracy of such predictions was, for most of the data, better than the accuracy of the underlying algorithm, which outputs a single probability distribution over the label. Application of this methodology to MALDI-TOF data sets empirically demonstrates its validity. The accuracy of the proposed method on the ovarian cancer data rises from 66.7 % at 11 months before the moment of diagnosis to up to 90.2 % at the moment of diagnosis. The same approach was applied to heart disease data without time dependency, although the achieved accuracy was not as high (up to 69.9 %). The methodology allowed us to confirm mass spectrometry peaks previously identified as carrying statistically significant information for discrimination between controls and cases.
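To make the idea of a probability interval concrete, the following is a minimal sketch of a Venn predictor in Python. It is an illustration under stated assumptions, not the implementation used in the paper: the abstract does not say which taxonomy or underlying algorithm was used, so a simple 1-nearest-neighbour taxonomy is assumed here, and the names `nearest_label` and `venn_predict` are hypothetical. For each candidate label, the test object is provisionally added to the training set with that label, every example is assigned to a category, and the label frequencies within the test object's category form one probability distribution. The lower and upper bounds over all candidate labels give the interval.

```python
from collections import Counter


def nearest_label(i, xs, ys):
    """Assumed taxonomy: the category of example i is the label of its
    nearest other example (squared Euclidean distance)."""
    others = (j for j in range(len(xs)) if j != i)
    best = min(others,
               key=lambda j: sum((a - b) ** 2 for a, b in zip(xs[i], xs[j])))
    return ys[best]


def venn_predict(train_x, train_y, test_x, labels):
    """Return {label: (lower, upper)} probability intervals for test_x."""
    rows = {}
    for y in labels:
        # Provisionally extend the training set with (test_x, y).
        xs = train_x + [test_x]
        ys = train_y + [y]
        cats = [nearest_label(i, xs, ys) for i in range(len(xs))]
        # Label frequencies among examples sharing the test object's category.
        members = [ys[i] for i in range(len(xs)) if cats[i] == cats[-1]]
        counts = Counter(members)
        rows[y] = {l: counts[l] / len(members) for l in labels}
    # One distribution per candidate label; take min and max per label.
    return {l: (min(rows[y][l] for y in labels),
                max(rows[y][l] for y in labels))
            for l in labels}


if __name__ == "__main__":
    # Toy one-dimensional data, invented for illustration only.
    train_x = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
    train_y = [0, 0, 0, 1, 1, 1]
    print(venn_predict(train_x, train_y, [0.05], labels=[0, 1]))
```

With only six toy training examples the intervals are wide; the abstract's observation that intervals are narrow refers to the much larger real data sets. A point prediction can be forced by choosing the label with the highest lower (or mean) probability.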

Item Type	Article
URI	https://centaur.reading.ac.uk/id/eprint/58185
Identification Number/DOI	10.1007/s10472-013-9367-5
Refereed	Yes
Divisions	Interdisciplinary centres and themes > Chemical Analysis Facility (CAF); Life Sciences > School of Chemistry, Food and Pharmacy > Department of Chemistry
Publisher	Springer

Date Deposited:	21 Mar 2016 10:19
Last Modified:	29 Jun 2025 01:30