EHA Library - The official digital education library of European Hematology Association (EHA)

APPLICATION OF MACHINE LEARNING IN STRATIFYING MINIMAL RESIDUAL DISEASE LEVEL WITH FLOW CYTOMETRY IN ACUTE MYELOID LEUKEMIA
Author(s):

Bor-Sheng Ko
Affiliations: Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan, Province of China; Department of Hematological Oncology, National Taiwan University Cancer Center, Taipei, Taiwan, Province of China

Yu-Fen Wang
Affiliations: AHEAD Medicine Corporation, Berkeley, CA, United States; National Taiwan University, Taipei, Taiwan, Province of China

Chi-Chun Lee
Affiliations: Department of Electrical Engineering, National Tsing Hua University, HsinChu, Taiwan, Province of China

Hsin-An Hou
Affiliations: Department of Internal Medicine, National Taiwan University Hospital, Taipei, Taiwan, Province of China

Jeng-Lin Li
Affiliations: Department of Electrical Engineering, National Tsing Hua University, HsinChu, Taiwan, Province of China

Ting-Yu Chang
Affiliations: AHEAD Medicine Corporation, Berkeley, CA, United States

Wang-Ting Hsieh
Affiliations: AHEAD Medicine Corporation, Berkeley, CA, United States

Jih-Luh Tang
Affiliations: Department of Hematological Oncology, National Taiwan University Cancer Center, Taipei, Taiwan, Province of China; Tai-Cheng Stem Cell Therapy Center, National Taiwan University, Taipei, Taiwan, Province of China

EHA Library. Ko B. 06/09/21; 325243; EP483
Presentation during EHA2021: All e-poster presentations will be made available as of Friday, June 11, 2021 (09:00 CEST) and will be accessible for on-demand viewing until August 15, 2021 on the Virtual Congress platform.

Abstract: EP483

Type: E-Poster Presentation

Session title: Acute myeloid leukemia - Clinical

Background
Residual disease detection and monitoring by flow cytometry guides clinicians in modifying treatment strategies according to each patient’s risk profile. The current approach to flow cytometry analysis relies on manual interpretation, which is labor-intensive and time-consuming.

Aims
We propose using a machine learning algorithm to classify residual disease percentage (RDP) from clinical flow cytometry (FC) data.

Methods
Retrospective clinical FC data from AML patients, together with demographic data (age and gender), were collected at National Taiwan University Hospital. A total of 487 FC samples positive for residual disease, obtained from 249 AML patients between 2013 and 2016, were included in this study. Of these, 81 samples had an RDP from 0.01% to less than 1%, and 406 had an RDP greater than or equal to 1%. The median age at the time of flow cytometry testing was 51.8 years.

Our proposed machine learning framework consists of a phenotype representation learning stage and a classification model. To derive the phenotype representation, we trained a multivariate Gaussian mixture model (GMM) on the 38-dimensional FC data to capture the distribution and characteristics of the training data in a probabilistic, unsupervised manner. A Fisher scoring method, derived from the gradient with respect to the learned GMM parameters, was then used to encode each sample as a high-dimensional phenotype vector, and these vectors were fed into a random forest (RF) classifier. To alleviate the negative effects of class imbalance in the RDP identification task, we applied the synthetic minority oversampling technique (SMOTE), which augments the minority class with synthetic samples generated by linear interpolation between existing minority-class samples. We trained separate RF models on the original Fisher vector feature set and on the oversampled set to discriminate between the RDP classes. The algorithm was evaluated by 5-fold cross-validation with random partitioning, in which each fold used 80% of the data for training and 20% for testing.
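
The two less familiar steps above, Fisher vector encoding and SMOTE-style interpolation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a diagonal-covariance GMM whose parameters are already fitted, assumes each FC sample is a matrix of events by 38 dimensions, and uses the commonly cited mean and standard-deviation gradient form of the Fisher vector. The function names `fisher_vector` and `smote_like`, the two-component GMM, the neighbor count `k=5`, and all data are hypothetical placeholders.

```python
import numpy as np

def fisher_vector(X, weights, means, sigmas):
    """Encode one sample X (n_events, n_dims) against a diagonal-covariance GMM.

    weights: (K,), means: (K, D), sigmas: (K, D) standard deviations.
    Returns the stacked gradients w.r.t. means and sigmas, length 2*K*D.
    """
    T = X.shape[0]
    # Standardized residual of every event under every component: (T, K, D)
    z = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # log(weight * Gaussian density), up to a constant shared by all components
    log_joint = np.log(weights) - np.log(sigmas).sum(axis=1) - 0.5 * (z ** 2).sum(axis=2)
    log_joint -= log_joint.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_joint)
    post /= post.sum(axis=1, keepdims=True)  # component posteriors, (T, K)
    g_mu = np.einsum("tk,tkd->kd", post, z) / (T * np.sqrt(weights)[:, None])
    g_sigma = np.einsum("tk,tkd->kd", post, z ** 2 - 1.0) / (T * np.sqrt(2.0 * weights)[:, None])
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])

def smote_like(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbors."""
    rng = np.random.default_rng(seed)
    n = X_min.shape[0]
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, : min(k, n - 1)]
    base = rng.integers(0, n, size=n_new)
    partner = neighbors[base, rng.integers(0, neighbors.shape[1], size=n_new)]
    lam = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + lam * (X_min[partner] - X_min[base])

# Hypothetical data: one sample of 500 events in 38 dimensions, 2-component GMM.
rng = np.random.default_rng(0)
events = rng.normal(size=(500, 38))
weights = np.array([0.5, 0.5])
means = np.stack([np.full(38, -0.5), np.full(38, 0.5)])
sigmas = np.ones((2, 38))
phenotype = fisher_vector(events, weights, means, sigmas)  # length 2 * 2 * 38

# Stand-in phenotype vectors for the 81-sample class, topped up to match 406.
minority = rng.normal(size=(81, phenotype.size))
synthetic = smote_like(minority, n_new=406 - 81)
```

In a cross-validated setup, oversampling is normally applied to the training folds only so that synthetic points never leak into the test fold; the abstract does not state how this was handled.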

Results
The RDP prediction models achieved an accuracy (ACC) of 0.897 and an area under the ROC curve (AUC) of 0.934 (Table 1a). With the oversampled set, 91.9% of FC samples with an RDP greater than or equal to 1% and 79.0% of those with an RDP from 0.01% to less than 1% were correctly classified (Table 1b).
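
As a rough consistency check, the per-class rates can be combined with the class sizes given in Methods (406 and 81). The sketch below assumes that the reported percentages are rounded from whole-number counts and that the per-class rates and the overall accuracy come from the same model, which the abstract does not state. It also computes the accuracy of always predicting the larger class, for context.

```python
# Class sizes from Methods; per-class correct rates from Results (Table 1b).
n_high, n_low = 406, 81            # RDP >= 1% and RDP 0.01% to < 1%
rate_high, rate_low = 0.919, 0.790

# Assumption: the percentages are rounded from whole-number counts.
correct_high = round(rate_high * n_high)   # 373
correct_low = round(rate_low * n_low)      # 64

total = n_high + n_low                     # 487
accuracy = (correct_high + correct_low) / total
majority_baseline = n_high / total         # always predict RDP >= 1%

print(f"implied accuracy:  {accuracy:.3f}")            # 0.897
print(f"majority baseline: {majority_baseline:.3f}")   # 0.834
```

Under these assumptions the implied accuracy agrees with the reported 0.897, and it sits above the 0.834 obtained by always predicting the majority class.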

Conclusion
This study demonstrates the potential of machine learning for RDP prediction in patients with AML. Further studies with larger cohorts or different data sources are needed to validate this machine learning-based prediction model as a clinical decision support tool for physicians.

Keyword(s): Acute myeloid leukemia, Automation, Flow cytometry, Minimal residual disease (MRD)
