The human hippocampal formation can be divided into a set of cytoarchitecturally and functionally distinct subregions that are involved in different aspects of memory formation. Neuroanatomical disruptions within these subregions are associated with several debilitating brain disorders, including Alzheimer's disease, major depression, schizophrenia, and bipolar disorder. Multi-center brain imaging consortia, such as the Enhancing Neuro Imaging Genetics through Meta-Analysis (ENIGMA) consortium, are interested in studying disease effects on these subregions and the genetic factors that affect them. For such large-scale studies, automated extraction of hippocampal subregion measures, followed by genomic association analyses of those measures, may provide additional insight. Here, we evaluated the test–retest reliability and transplatform reliability (1.5 T versus 3 T) of the subregion segmentation module in the FreeSurfer software package using three independent cohorts: one of young healthy adults (Queensland Twins Imaging Study, QTIM, N = 39), one of elderly healthy adults (Alzheimer's Disease Neuroimaging Initiative, ADNI-2, N = 163), and one mixed cohort of healthy and depressed participants (Max Planck Institute, MPIP, N = 598). We also investigated agreement between the most recent version of this algorithm (v6.0) and an older version (v5.3), again using the ADNI-2 and MPIP cohorts in addition to a sample from the Netherlands Study for Depression and Anxiety (NESDA) (N = 221). Finally, we estimated the heritability (h²) of the segmented subregion volumes using the full sample of young, healthy QTIM twins (N = 728). Test–retest reliability was high for all twelve subregions in the 3 T ADNI-2 sample (intraclass correlation coefficient (ICC) = 0.70–0.97) and moderate-to-high in the 4 T QTIM sample (ICC = 0.50–0.89).
Transplatform reliability was strong for eleven of the twelve subregions (ICC = 0.66–0.96); however, the hippocampal fissure was not consistently reconstructed across 1.5 T and 3 T field strengths (ICC = 0.47–0.57). Between-version agreement was moderate for the hippocampal tail, subiculum, and presubiculum (ICC = 0.78–0.84; Dice similarity coefficient (DSC) = 0.55–0.70), and poor for all other subregions (ICC = 0.34–0.81; DSC = 0.28–0.51). All hippocampal subregion volumes were highly heritable (h² = 0.67–0.91). Our findings indicate that eleven of the twelve human hippocampal subregions segmented using FreeSurfer version 6.0 may serve as reliable and informative quantitative phenotypes for future multi-site imaging genetics initiatives such as those of the ENIGMA consortium.
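For readers less familiar with the two agreement metrics reported above, the following is a minimal sketch of how a DSC and an ICC can be computed. It is illustrative only and is not the analysis code used in this study. The function names `dice_coefficient` and `icc_2_1` are hypothetical, and the choice of the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption, since the ICC variant is not specified here. The example inputs are toy values, not study data.

```python
from typing import Sequence


def dice_coefficient(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice similarity coefficient for two flattened binary label masks.

    DSC = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap)
    to 1 (identical masks).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        raise ValueError("both masks are empty; DSC is undefined")
    return 2.0 * overlap / total


def icc_2_1(volumes: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) from a subjects-by-measurements table of volumes.

    Rows are subjects; columns are repeated measurements (for example
    two sessions, two field strengths, or two software versions).
    Assumed form: two-way random effects, absolute agreement, single
    measurement.
    """
    n = len(volumes)        # number of subjects
    k = len(volumes[0])     # number of measurements per subject
    grand = sum(sum(row) for row in volumes) / (n * k)
    row_means = [sum(row) / k for row in volumes]
    col_means = [sum(row[j] for row in volumes) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in volumes for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy inputs (hypothetical values, not study data).
toy_dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
toy_icc = icc_2_1([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
```

Because the absolute-agreement form penalizes systematic offsets between measurements, a constant shift between two sessions or scanners lowers this ICC even when subjects are ranked identically; a consistency-type ICC would not.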