Correcting for cell-type effects in DNA methylation studies: Reference-based method outperforms latent variable approaches in empirical studies

Mohammad W. Hattab, Andrey A. Shabalin, Shaunna L. Clark, Min Zhao, Gaurav Kumar, Robin F. Chan, Lin Ying Xie, Rick Jansen, Laura K.M. Han, Patrik K.E. Magnusson, Gerard van Grootheest, Christina M. Hultman, Brenda W.J.H. Penninx, Karolina A. Aberg, Edwin J.C.G. van den Oord*

*Corresponding author for this work

Research output: Contribution to journal › Letter › Academic › peer-review


Based on an extensive simulation study, McGregor and colleagues recently recommended surrogate variable analysis (SVA) to control for the confounding effects of cell-type heterogeneity in DNA methylation association studies when no cell-type proportions are available. As their recommendation was mainly based on simulated data, we sought to replicate their findings in two large-scale empirical studies. In our empirical data, SVA did not fully correct for cell-type effects, its performance was somewhat unstable, and it risked missing true signals because it removed variation that might be linked to actual disease processes. By contrast, a reference-based correction method performed well and did not show these limitations. A disadvantage of this approach is that if reference methylomes are not (publicly) available, they need to be generated once for a small set of samples. However, given the notable risk of cell-type confounding that we observed, we argue that this investment could be well worth making to avoid introducing false-positive findings into the literature. Please see the related Correspondence article and the related Research article.

Original language: English
Article number: 24
Journal: Genome Biology
Issue number: 1
Publication status: Published - 30 Jan 2017
