Durham Research Online

Scalable prediction of acute myeloid leukemia using high-dimensional machine learning and blood transcriptomics.

Warnat-Herresthal, Stefanie and Perrakis, Konstantinos and Taschler, Bernd and Becker, Matthias and Baßler, Kevin and Beyer, Marc and Günther, Patrick and Schulte-Schrepping, Jonas and Seep, Lea and Klee, Kathrin and Ulas, Thomas and Haferlach, Torsten and Mukherjee, Sach and Schultze, Joachim L. (2020) 'Scalable prediction of acute myeloid leukemia using high-dimensional machine learning and blood transcriptomics', iScience, 23 (1), p. 100780.


Acute myeloid leukemia (AML) is a severe, mostly fatal hematopoietic malignancy. We were interested in whether transcriptomic-based machine learning could predict AML status without requiring expert input. Using 12,029 samples from 105 different studies, we present a large-scale study of machine learning-based prediction of AML in which we address key questions relating to the combination of machine learning and transcriptomics and their practical use. We find data-driven, high-dimensional approaches—in which multivariate signatures are learned directly from genome-wide data with no prior knowledge—to be accurate and robust. Importantly, these approaches are highly scalable with low marginal cost, essentially matching human expert annotation in a near-automated workflow. Our results support the notion that transcriptomics combined with machine learning could be used as part of an integrated -omics approach wherein risk prediction, differential diagnosis, and subclassification of AML are achieved by genomics while diagnosis could be assisted by transcriptomic-based machine learning.

Item Type: Article
Full text: (VoR) Version of Record
Available under License - Creative Commons Attribution Non-commercial No Derivatives.
Publisher statement: © 2020 The Authors. This is an open access article under the CC BY-NC-ND license.
Date accepted: 12 December 2019
Date deposited: 18 June 2020
Date of first online publication: 18 December 2019
Date first made open access: 18 June 2020