Multi-set Pre-processing of Multicolor Flow Cytometry Data

Rita Folcarelli*, Gerjen H. Tinnevelt, Bart Hilvering, Kristiaan Wouters, Selma van Staveren, Geert J. Postma, Nienke Vrisekoop, Lutgarde M.C. Buydens, Leo Koenderman, Jeroen J. Jansen

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Flow Cytometry is an analytical technology that simultaneously measures multiple markers per single cell. Tens of thousands to millions of single cells can be measured per sample, and each sample may contain a different number of cells. When all samples are bundled together, the data take on a ‘multi-set’ structure. Many multivariate methods have been developed for Flow Cytometry data, but none of them considers this structure in its quantitative handling of the data. The standard pre-processing used by existing multivariate methods yields models that are mainly influenced by the samples with more cells, whereas such a model should provide a balanced view of the biomedical information within all measurements. We propose an alternative ‘multi-set’ pre-processing that corrects for the difference in the number of cells measured, balancing the relative importance of each multi-cell sample in the data while using all data collected from these expensive analyses. Moreover, one case example shows how multi-set pre-processing may benefit the removal of undesired measurement-to-measurement variability, and another shows how class-based multi-set pre-processing enhances the studied response upon comparison with the control reference samples. Our results show that adjusting data analysis algorithms to consider this multi-set structure may greatly benefit immunological insight and classification performance of Flow Cytometry data.
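
The imbalance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `pooled_center_scale`, `multiset_center_scale`, and `apply_scaling` are hypothetical, and the specific choice of averaging per-sample means and per-sample standard deviations is an assumption made here only to show how each sample can be given equal weight regardless of its cell count.

```python
import numpy as np


def pooled_center_scale(samples):
    """Standard pre-processing: stack all cells, then compute one mean and
    one standard deviation per marker. Samples with more cells dominate."""
    stacked = np.vstack(samples)
    return stacked.mean(axis=0), stacked.std(axis=0, ddof=1)


def multiset_center_scale(samples):
    """Illustrative 'multi-set' variant (an assumption, not the paper's
    exact formula): compute statistics per sample first, then average them
    so every sample contributes equally."""
    means = np.array([s.mean(axis=0) for s in samples])
    stds = np.array([s.std(axis=0, ddof=1) for s in samples])
    return means.mean(axis=0), stds.mean(axis=0)


def apply_scaling(samples, center, scale):
    """Center and scale every sample (cells x markers) with shared parameters."""
    return [(s - center) / scale for s in samples]


# Toy data: sample A has 1000 cells around 0, sample B has 10 cells around 10.
sample_a = np.tile(np.array([[-1.0], [1.0]]), (500, 2))       # shape (1000, 2)
sample_b = 10.0 + np.tile(np.array([[-1.0], [1.0]]), (5, 2))  # shape (10, 2)
samples = [sample_a, sample_b]

pooled_center, pooled_scale = pooled_center_scale(samples)
multi_center, multi_scale = multiset_center_scale(samples)

# The pooled center sits close to sample A; the multi-set center sits
# midway between the two sample means.
scaled = apply_scaling(samples, multi_center, multi_scale)
```

In this toy case the pooled center is pulled almost entirely toward the larger sample, while the per-sample averaging places it halfway between the two, which is the kind of balancing the abstract argues for.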

Original language: English
Article number: 9716
Journal: Scientific Reports
Issue number: 1
Publication status: Published - 1 Dec 2020
