Correcting for cell-type effects in DNA methylation studies: Reference-based method outperforms latent variable approaches in empirical studies

Mohammad W. Hattab, Andrey A. Shabalin, Shaunna L. Clark, Min Zhao, Gaurav Kumar, Robin F. Chan, Lin Ying Xie, Rick Jansen, Laura K.M. Han, Patrik K.E. Magnusson, Gerard van Grootheest, Christina M. Hultman, Brenda W.J.H. Penninx, Karolina A. Aberg, Edwin J.C.G. van den Oord

Research output: Contribution to journal, Letter, Academic, peer-review

Abstract

Based on an extensive simulation study, McGregor and colleagues recently recommended the use of surrogate variable analysis (SVA) to control for the confounding effects of cell-type heterogeneity in DNA methylation association studies in scenarios where no cell-type proportions are available. As their recommendation was mainly based on simulated data, we sought to replicate findings in two large-scale empirical studies. In our empirical data, SVA did not fully correct for cell-type effects, its performance was somewhat unstable, and it carried a risk of missing true signals caused by removing variation that might be linked to actual disease processes. By contrast, a reference-based correction method performed well and did not show these limitations. A disadvantage of this approach is that if reference methylomes are not (publicly) available, they will need to be generated once for a small set of samples. However, given the notable risk we observed for cell-type confounding, we argue that, to avoid introducing false-positive findings into the literature, it could be well worth making this investment. Please see related Correspondence article: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1149-7 and related Research article: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0935-y

Original language: English
Article number: 24
Journal: Genome Biology
Volume: 18
Issue number: 1
DOI: https://doi.org/10.1186/s13059-017-1148-8
Publication status: Published - 30 Jan 2017

Cite this

Hattab, M. W., Shabalin, A. A., Clark, S. L., Zhao, M., Kumar, G., Chan, R. F., ... van den Oord, E. J. C. G. (2017). Correcting for cell-type effects in DNA methylation studies: Reference-based method outperforms latent variable approaches in empirical studies. Genome Biology, 18(1), [24]. https://doi.org/10.1186/s13059-017-1148-8
Hattab, Mohammad W. ; Shabalin, Andrey A. ; Clark, Shaunna L. ; Zhao, Min ; Kumar, Gaurav ; Chan, Robin F. ; Xie, Lin Ying ; Jansen, Rick ; Han, Laura K.M. ; Magnusson, Patrik K.E. ; van Grootheest, Gerard ; Hultman, Christina M. ; Penninx, Brenda W.J.H. ; Aberg, Karolina A. ; van den Oord, Edwin J.C.G. / Correcting for cell-type effects in DNA methylation studies : Reference-based method outperforms latent variable approaches in empirical studies. In: Genome Biology. 2017 ; Vol. 18, No. 1.
@article{634e2746595742ef95f7bce0f1385bb5,
title = "Correcting for cell-type effects in DNA methylation studies: Reference-based method outperforms latent variable approaches in empirical studies",
abstract = "Based on an extensive simulation study, McGregor and colleagues recently recommended the use of surrogate variable analysis (SVA) to control for the confounding effects of cell-type heterogeneity in DNA methylation association studies in scenarios where no cell-type proportions are available. As their recommendation was mainly based on simulated data, we sought to replicate findings in two large-scale empirical studies. In our empirical data, SVA did not fully correct for cell-type effects, its performance was somewhat unstable, and it carried a risk of missing true signals caused by removing variation that might be linked to actual disease processes. By contrast, a reference-based correction method performed well and did not show these limitations. A disadvantage of this approach is that if reference methylomes are not (publicly) available, they will need to be generated once for a small set of samples. However, given the notable risk we observed for cell-type confounding, we argue that, to avoid introducing false-positive findings into the literature, it could be well worth making this investment. Please see related Correspondence article: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1149-7 and related Research article: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0935-y",
author = "Hattab, {Mohammad W.} and Shabalin, {Andrey A.} and Clark, {Shaunna L.} and Min Zhao and Gaurav Kumar and Chan, {Robin F.} and Xie, {Lin Ying} and Rick Jansen and Han, {Laura K.M.} and Magnusson, {Patrik K.E.} and {van Grootheest}, Gerard and Hultman, {Christina M.} and Penninx, {Brenda W.J.H.} and Aberg, {Karolina A.} and {van den Oord}, {Edwin J.C.G.}",
year = "2017",
month = "1",
day = "30",
doi = "10.1186/s13059-017-1148-8",
language = "English",
volume = "18",
journal = "Genome Biology",
issn = "1465-6906",
publisher = "BioMed Central Ltd.",
number = "1",

}

Correcting for cell-type effects in DNA methylation studies : Reference-based method outperforms latent variable approaches in empirical studies. / Hattab, Mohammad W.; Shabalin, Andrey A.; Clark, Shaunna L.; Zhao, Min; Kumar, Gaurav; Chan, Robin F.; Xie, Lin Ying; Jansen, Rick; Han, Laura K.M.; Magnusson, Patrik K.E.; van Grootheest, Gerard; Hultman, Christina M.; Penninx, Brenda W.J.H.; Aberg, Karolina A.; van den Oord, Edwin J.C.G.

In: Genome Biology, Vol. 18, No. 1, 24, 30.01.2017.

TY - JOUR

T1 - Correcting for cell-type effects in DNA methylation studies

T2 - Reference-based method outperforms latent variable approaches in empirical studies

AU - Hattab, Mohammad W.

AU - Shabalin, Andrey A.

AU - Clark, Shaunna L.

AU - Zhao, Min

AU - Kumar, Gaurav

AU - Chan, Robin F.

AU - Xie, Lin Ying

AU - Jansen, Rick

AU - Han, Laura K.M.

AU - Magnusson, Patrik K.E.

AU - van Grootheest, Gerard

AU - Hultman, Christina M.

AU - Penninx, Brenda W.J.H.

AU - Aberg, Karolina A.

AU - van den Oord, Edwin J.C.G.

PY - 2017/1/30

Y1 - 2017/1/30

N2 - Based on an extensive simulation study, McGregor and colleagues recently recommended the use of surrogate variable analysis (SVA) to control for the confounding effects of cell-type heterogeneity in DNA methylation association studies in scenarios where no cell-type proportions are available. As their recommendation was mainly based on simulated data, we sought to replicate findings in two large-scale empirical studies. In our empirical data, SVA did not fully correct for cell-type effects, its performance was somewhat unstable, and it carried a risk of missing true signals caused by removing variation that might be linked to actual disease processes. By contrast, a reference-based correction method performed well and did not show these limitations. A disadvantage of this approach is that if reference methylomes are not (publicly) available, they will need to be generated once for a small set of samples. However, given the notable risk we observed for cell-type confounding, we argue that, to avoid introducing false-positive findings into the literature, it could be well worth making this investment. Please see related Correspondence article: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1149-7 and related Research article: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0935-y

AB - Based on an extensive simulation study, McGregor and colleagues recently recommended the use of surrogate variable analysis (SVA) to control for the confounding effects of cell-type heterogeneity in DNA methylation association studies in scenarios where no cell-type proportions are available. As their recommendation was mainly based on simulated data, we sought to replicate findings in two large-scale empirical studies. In our empirical data, SVA did not fully correct for cell-type effects, its performance was somewhat unstable, and it carried a risk of missing true signals caused by removing variation that might be linked to actual disease processes. By contrast, a reference-based correction method performed well and did not show these limitations. A disadvantage of this approach is that if reference methylomes are not (publicly) available, they will need to be generated once for a small set of samples. However, given the notable risk we observed for cell-type confounding, we argue that, to avoid introducing false-positive findings into the literature, it could be well worth making this investment. Please see related Correspondence article: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1149-7 and related Research article: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0935-y

UR - http://www.scopus.com/inward/record.url?scp=85011097316&partnerID=8YFLogxK

U2 - 10.1186/s13059-017-1148-8

DO - 10.1186/s13059-017-1148-8

M3 - Letter

VL - 18

JO - Genome Biology

JF - Genome Biology

SN - 1465-6906

IS - 1

M1 - 24

ER -