Differential dementia diagnosis on incomplete data with latent trees

Christian Ledig, Sebastian Kaltwang, Antti Tolonen, Juha Koikkalainen, Philip Scheltens, Frederik Barkhof, Hanneke Rhodius-Meester, Betty Tijms, Afina W. Lemstra, Wiesje van der Flier, Jyrki Lötjönen, Daniel Rueckert

Research output: Contribution to conference > Paper > Other research output

Abstract

Incomplete patient data is a substantial problem that is not sufficiently addressed in current clinical research. Many published methods assume that study data are both complete and valid. However, this assumption is often violated: individual features may be unavailable because a patient examination is missing, or distorted or wrong because of inaccurate measurements or human error. In this work we propose to use the Latent Tree (LT) generative model to address current limitations due to missing data. We show on 491 subjects of a challenging dementia dataset that LT feature estimation is more robust to incomplete data than mean or Gaussian Mixture Model imputation, and that it has a synergistic effect when combined with common classifiers (we use an SVM as an example). We also show that LTs allow incomplete samples to be included in classifier training. Using LTs, we obtain a balanced accuracy of 62% for the classification of all patients into five distinct dementia types even though 20% of the features are missing in both training and testing data (68% on complete data). Further, we confirm the potential of LTs to detect outlier samples within the dataset.
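The abstract reports balanced accuracy and compares against mean imputation as a baseline. The following is a minimal sketch of those two ideas only, not the authors' implementation; it does not reproduce the Latent Tree model. The function names, the use of `None` for a missing value, the class labels, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def fit_feature_means(rows):
    """Per-feature mean over the observed (non-None) training values."""
    n_features = len(rows[0])
    means = []
    for j in range(n_features):
        observed = [r[j] for r in rows if r[j] is not None]
        # Assumption of this sketch: fall back to 0.0 if a feature is never observed.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def impute_with_means(rows, means):
    """Replace each missing entry with the training mean of its feature."""
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


# Toy data (hypothetical, not from the paper).
train = [[1.0, None], [3.0, 4.0], [None, 8.0]]
means = fit_feature_means(train)
filled = impute_with_means(train, means)

# Hypothetical labels for two classes; the paper distinguishes five dementia types.
y_true = ["type_a", "type_a", "type_a", "type_b"]
y_pred = ["type_a", "type_a", "type_b", "type_b"]
score = balanced_accuracy(y_true, y_pred)
```

Estimating the means on training rows only and reusing them for test rows keeps test information out of the imputation step. Averaging recall per class means a classifier cannot score well simply by favouring the largest diagnostic group.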

Original language: English
Pages: 44-52
Number of pages: 9
DOI: 10.1007/978-3-319-46723-8_6
Publication status: Published - 2016

Cite this

@conference{7423b3656c1b44dcbedf9f3ab3c9fee8,
title = "Differential dementia diagnosis on incomplete data with latent trees",
abstract = "Incomplete patient data is a substantial problem that is not sufficiently addressed in current clinical research. Many published methods assume that study data are both complete and valid. However, this assumption is often violated: individual features may be unavailable because a patient examination is missing, or distorted or wrong because of inaccurate measurements or human error. In this work we propose to use the Latent Tree (LT) generative model to address current limitations due to missing data. We show on 491 subjects of a challenging dementia dataset that LT feature estimation is more robust to incomplete data than mean or Gaussian Mixture Model imputation, and that it has a synergistic effect when combined with common classifiers (we use an SVM as an example). We also show that LTs allow incomplete samples to be included in classifier training. Using LTs, we obtain a balanced accuracy of 62{\%} for the classification of all patients into five distinct dementia types even though 20{\%} of the features are missing in both training and testing data (68{\%} on complete data). Further, we confirm the potential of LTs to detect outlier samples within the dataset.",
keywords = "Dementia, Differential diagnosis, Incomplete data, Latent trees",
author = "Christian Ledig and Sebastian Kaltwang and Antti Tolonen and Juha Koikkalainen and Philip Scheltens and Frederik Barkhof and Hanneke Rhodius-Meester and Betty Tijms and Lemstra, {Afina W.} and {van der Flier}, Wiesje and Jyrki L{\"o}tj{\"o}nen and Daniel Rueckert",
year = "2016",
doi = "10.1007/978-3-319-46723-8_6",
language = "English",
pages = "44--52",

}

Differential dementia diagnosis on incomplete data with latent trees. / Ledig, Christian; Kaltwang, Sebastian; Tolonen, Antti; Koikkalainen, Juha; Scheltens, Philip; Barkhof, Frederik; Rhodius-Meester, Hanneke; Tijms, Betty; Lemstra, Afina W.; van der Flier, Wiesje; Lötjönen, Jyrki; Rueckert, Daniel.

2016. 44-52.


TY - CONF

T1 - Differential dementia diagnosis on incomplete data with latent trees

AU - Ledig, Christian

AU - Kaltwang, Sebastian

AU - Tolonen, Antti

AU - Koikkalainen, Juha

AU - Scheltens, Philip

AU - Barkhof, Frederik

AU - Rhodius-Meester, Hanneke

AU - Tijms, Betty

AU - Lemstra, Afina W.

AU - van der Flier, Wiesje

AU - Lötjönen, Jyrki

AU - Rueckert, Daniel

PY - 2016

Y1 - 2016

AB - Incomplete patient data is a substantial problem that is not sufficiently addressed in current clinical research. Many published methods assume that study data are both complete and valid. However, this assumption is often violated: individual features may be unavailable because a patient examination is missing, or distorted or wrong because of inaccurate measurements or human error. In this work we propose to use the Latent Tree (LT) generative model to address current limitations due to missing data. We show on 491 subjects of a challenging dementia dataset that LT feature estimation is more robust to incomplete data than mean or Gaussian Mixture Model imputation, and that it has a synergistic effect when combined with common classifiers (we use an SVM as an example). We also show that LTs allow incomplete samples to be included in classifier training. Using LTs, we obtain a balanced accuracy of 62% for the classification of all patients into five distinct dementia types even though 20% of the features are missing in both training and testing data (68% on complete data). Further, we confirm the potential of LTs to detect outlier samples within the dataset.

KW - Dementia

KW - Differential diagnosis

KW - Incomplete data

KW - Latent trees

UR - http://www.scopus.com/inward/record.url?scp=84996598640&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-46723-8_6

DO - 10.1007/978-3-319-46723-8_6

M3 - Paper

SP - 44

EP - 52

ER -