To examine the consistency among FSL, FreeSurfer, and SPM for grey matter (GM) atrophy measurement, in terms of volumes, patient/control discrimination, and correlations with cognition.
Materials and Methods
A total of 127 patients with multiple sclerosis (MS) and 50 controls were included, and cortical and deep grey matter (DGM) volumetry was performed. Consistency of volumes was assessed with the intraclass correlation coefficient (ICC). Consistency of patient/control discrimination was assessed with Cohen’s d, t-tests, MANOVA, and a penalized double-loop logistic classifier. Consistency of association with cognition was assessed with the Pearson correlation coefficient and ANOVA. Voxel-based morphometry (SPM-VBM and FSL-VBM) and vertex-wise FreeSurfer analysis were used for group-level comparisons.
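As an illustration of two of the consistency measures named above, the following is a minimal sketch in Python with NumPy. It rests on assumptions not stated in the abstract: the ICC form is taken to be the two-way random-effects, absolute-agreement, single-measure ICC(2,1) of Shrout and Fleiss, and Cohen's d is computed with a pooled standard deviation. The function names `icc_2_1` and `cohens_d` are hypothetical and are not taken from the study.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `ratings` is an (n_subjects, k_methods) array, e.g. one column of
    regional volumes per software package (assumed layout).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)  # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)  # between methods
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    pooled_var = ((a.size - 1) * a.var(ddof=1) + (b.size - 1) * b.var(ddof=1)) / (
        a.size + b.size - 2
    )
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)
```

Under this layout, `icc_2_1` would be called once per region with one column per software package, and `cohens_d` once per region and package with the control and patient volumes as the two groups. Because ICC(2,1) measures absolute agreement, a systematic offset between packages lowers it even when the packages rank subjects identically.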
The highest volumetric ICCs were between SPM and FreeSurfer for cortical regions, and the lowest were between SPM and FreeSurfer for DGM. The caudate nucleus and temporal lobes had high consistency across all software packages, while the amygdala had the lowest volumetric consistency. Consistency of patient/control discrimination was highest in the DGM for all software packages, especially for the thalamus and pallidum. The penalized double-loop logistic classifier most often selected the thalamus, pallidum, and amygdala for all software packages. FSL yielded the largest number of significant correlations. DGM volumes yielded stronger correlations with cognition than cortical volumes. Bilateral putamen and left insula volumes correlated with cognition using all methods.
GM volumes from FreeSurfer, FSL, and SPM differ, especially for cortical regions. While group-level separation between MS patients and controls is comparable, correlations between regional GM volumes and clinical/cognitive variables in MS should be interpreted cautiously.