Abstract

Background: Tumor segmentation of glioma on MRI is a technique to monitor, quantify and report disease progression. Manual MRI segmentation is the gold standard but very labor intensive. At present the quality of this gold standard is not known for different stages of the disease, and prior work has mainly focused on treatment-naive glioblastoma. In this paper we studied the inter-rater agreement of manual MRI segmentation of glioblastoma and WHO grade II-III glioma for novices and experts at three stages of disease. We also studied the impact of inter-observer variation on extent of resection and growth rate. Methods: In 20 patients with WHO grade IV glioblastoma and 20 patients with WHO grade II-III glioma (defined as non-glioblastoma) both the enhancing and non-enhancing tumor elements were segmented on MRI, using specialized software, by four novices and four experts before surgery, after surgery and at time of tumor progression. We used the generalized conformity index (GCI) and the intra-class correlation coefficient (ICC) of tumor volume as main outcome measures for inter-rater agreement. Results: For glioblastoma, segmentations by experts and novices were comparable. The inter-rater agreement of enhancing tumor elements was excellent before surgery (GCI 0.79, ICC 0.99) poor after surgery (GCI 0.32, ICC 0.92), and good at progression (GCI 0.65, ICC 0.91). For non-glioblastoma, the inter-rater agreement was generally higher between experts than between novices. The inter-rater agreement was excellent between experts before surgery (GCI 0.77, ICC 0.92), was reasonable after surgery (GCI 0.48, ICC 0.84), and good at progression (GCI 0.60, ICC 0.80). The inter-rater agreement was good between novices before surgery (GCI 0.66, ICC 0.73), was poor after surgery (GCI 0.33, ICC 0.55), and poor at progression (GCI 0.36, ICC 0.73). Further analysis showed that the lower inter-rater agreement of segmentation on postoperative MRI could only partly be explained by the smaller volumes and fragmentation of residual tumor. The median interquartile range of extent of resection between raters was 8.3% and of growth rate was 0.22 mm/year. Conclusion: Manual tumor segmentations on MRI have reasonable agreement for use in spatial and volumetric analysis. Agreement in spatial overlap is of concern with segmentation after surgery for glioblastoma and with segmentation of non-glioblastoma by non-experts.
Original languageEnglish
Article number101727
JournalNeuroImage: Clinical
Volume22
DOIs
Publication statusPublished - 2019

Cite this

@article{54239f655b7b4423859907abf3493294,
title = "Inter-rater agreement in glioma segmentations on longitudinal MRI",
abstract = "Background: Tumor segmentation of glioma on MRI is a technique to monitor, quantify and report disease progression. Manual MRI segmentation is the gold standard but very labor intensive. At present the quality of this gold standard is not known for different stages of the disease, and prior work has mainly focused on treatment-naive glioblastoma. In this paper we studied the inter-rater agreement of manual MRI segmentation of glioblastoma and WHO grade II-III glioma for novices and experts at three stages of disease. We also studied the impact of inter-observer variation on extent of resection and growth rate. Methods: In 20 patients with WHO grade IV glioblastoma and 20 patients with WHO grade II-III glioma (defined as non-glioblastoma) both the enhancing and non-enhancing tumor elements were segmented on MRI, using specialized software, by four novices and four experts before surgery, after surgery and at time of tumor progression. We used the generalized conformity index (GCI) and the intra-class correlation coefficient (ICC) of tumor volume as main outcome measures for inter-rater agreement. Results: For glioblastoma, segmentations by experts and novices were comparable. The inter-rater agreement of enhancing tumor elements was excellent before surgery (GCI 0.79, ICC 0.99) poor after surgery (GCI 0.32, ICC 0.92), and good at progression (GCI 0.65, ICC 0.91). For non-glioblastoma, the inter-rater agreement was generally higher between experts than between novices. The inter-rater agreement was excellent between experts before surgery (GCI 0.77, ICC 0.92), was reasonable after surgery (GCI 0.48, ICC 0.84), and good at progression (GCI 0.60, ICC 0.80). The inter-rater agreement was good between novices before surgery (GCI 0.66, ICC 0.73), was poor after surgery (GCI 0.33, ICC 0.55), and poor at progression (GCI 0.36, ICC 0.73). Further analysis showed that the lower inter-rater agreement of segmentation on postoperative MRI could only partly be explained by the smaller volumes and fragmentation of residual tumor. The median interquartile range of extent of resection between raters was 8.3{\%} and of growth rate was 0.22 mm/year. Conclusion: Manual tumor segmentations on MRI have reasonable agreement for use in spatial and volumetric analysis. Agreement in spatial overlap is of concern with segmentation after surgery for glioblastoma and with segmentation of non-glioblastoma by non-experts.",
keywords = "Glioblastoma, Glioma, Inter-rater agreement, Low-grade glioma, MRI, Manual segmentation",
author = "M. Visser and M{\"u}ller, {D. M. J.} and {van Duijn}, {R. J. M.} and M. Smits and N. Verburg and Hendriks, {E. J.} and Nabuurs, {R. J. A.} and Bot, {J. C. J.} and Eijgelaar, {R. S.} and M. Witte and {van Herk}, {M. B.} and F. Barkhof and {de Witt Hamer}, {P. C.} and {de Munck}, {J. C.}",
year = "2019",
doi = "10.1016/j.nicl.2019.101727",
language = "English",
volume = "22",
journal = "NeuroImage: Clinical",
issn = "2213-1582",
publisher = "Elsevier BV",

}

Inter-rater agreement in glioma segmentations on longitudinal MRI. / Visser, M.; Müller, D. M. J.; van Duijn, R. J. M.; Smits, M.; Verburg, N.; Hendriks, E. J.; Nabuurs, R. J. A.; Bot, J. C. J.; Eijgelaar, R. S.; Witte, M.; van Herk, M. B.; Barkhof, F.; de Witt Hamer, P. C.; de Munck, J. C.

In: NeuroImage: Clinical, Vol. 22, 101727, 2019.

Research output: Contribution to journalArticleAcademicpeer-review

TY - JOUR

T1 - Inter-rater agreement in glioma segmentations on longitudinal MRI

AU - Visser, M.

AU - Müller, D. M. J.

AU - van Duijn, R. J. M.

AU - Smits, M.

AU - Verburg, N.

AU - Hendriks, E. J.

AU - Nabuurs, R. J. A.

AU - Bot, J. C. J.

AU - Eijgelaar, R. S.

AU - Witte, M.

AU - van Herk, M. B.

AU - Barkhof, F.

AU - de Witt Hamer, P. C.

AU - de Munck, J. C.

PY - 2019

Y1 - 2019

N2 - Background: Tumor segmentation of glioma on MRI is a technique to monitor, quantify and report disease progression. Manual MRI segmentation is the gold standard but very labor intensive. At present the quality of this gold standard is not known for different stages of the disease, and prior work has mainly focused on treatment-naive glioblastoma. In this paper we studied the inter-rater agreement of manual MRI segmentation of glioblastoma and WHO grade II-III glioma for novices and experts at three stages of disease. We also studied the impact of inter-observer variation on extent of resection and growth rate. Methods: In 20 patients with WHO grade IV glioblastoma and 20 patients with WHO grade II-III glioma (defined as non-glioblastoma) both the enhancing and non-enhancing tumor elements were segmented on MRI, using specialized software, by four novices and four experts before surgery, after surgery and at time of tumor progression. We used the generalized conformity index (GCI) and the intra-class correlation coefficient (ICC) of tumor volume as main outcome measures for inter-rater agreement. Results: For glioblastoma, segmentations by experts and novices were comparable. The inter-rater agreement of enhancing tumor elements was excellent before surgery (GCI 0.79, ICC 0.99) poor after surgery (GCI 0.32, ICC 0.92), and good at progression (GCI 0.65, ICC 0.91). For non-glioblastoma, the inter-rater agreement was generally higher between experts than between novices. The inter-rater agreement was excellent between experts before surgery (GCI 0.77, ICC 0.92), was reasonable after surgery (GCI 0.48, ICC 0.84), and good at progression (GCI 0.60, ICC 0.80). The inter-rater agreement was good between novices before surgery (GCI 0.66, ICC 0.73), was poor after surgery (GCI 0.33, ICC 0.55), and poor at progression (GCI 0.36, ICC 0.73). Further analysis showed that the lower inter-rater agreement of segmentation on postoperative MRI could only partly be explained by the smaller volumes and fragmentation of residual tumor. The median interquartile range of extent of resection between raters was 8.3% and of growth rate was 0.22 mm/year. Conclusion: Manual tumor segmentations on MRI have reasonable agreement for use in spatial and volumetric analysis. Agreement in spatial overlap is of concern with segmentation after surgery for glioblastoma and with segmentation of non-glioblastoma by non-experts.

AB - Background: Tumor segmentation of glioma on MRI is a technique to monitor, quantify and report disease progression. Manual MRI segmentation is the gold standard but very labor intensive. At present the quality of this gold standard is not known for different stages of the disease, and prior work has mainly focused on treatment-naive glioblastoma. In this paper we studied the inter-rater agreement of manual MRI segmentation of glioblastoma and WHO grade II-III glioma for novices and experts at three stages of disease. We also studied the impact of inter-observer variation on extent of resection and growth rate. Methods: In 20 patients with WHO grade IV glioblastoma and 20 patients with WHO grade II-III glioma (defined as non-glioblastoma) both the enhancing and non-enhancing tumor elements were segmented on MRI, using specialized software, by four novices and four experts before surgery, after surgery and at time of tumor progression. We used the generalized conformity index (GCI) and the intra-class correlation coefficient (ICC) of tumor volume as main outcome measures for inter-rater agreement. Results: For glioblastoma, segmentations by experts and novices were comparable. The inter-rater agreement of enhancing tumor elements was excellent before surgery (GCI 0.79, ICC 0.99) poor after surgery (GCI 0.32, ICC 0.92), and good at progression (GCI 0.65, ICC 0.91). For non-glioblastoma, the inter-rater agreement was generally higher between experts than between novices. The inter-rater agreement was excellent between experts before surgery (GCI 0.77, ICC 0.92), was reasonable after surgery (GCI 0.48, ICC 0.84), and good at progression (GCI 0.60, ICC 0.80). The inter-rater agreement was good between novices before surgery (GCI 0.66, ICC 0.73), was poor after surgery (GCI 0.33, ICC 0.55), and poor at progression (GCI 0.36, ICC 0.73). Further analysis showed that the lower inter-rater agreement of segmentation on postoperative MRI could only partly be explained by the smaller volumes and fragmentation of residual tumor. The median interquartile range of extent of resection between raters was 8.3% and of growth rate was 0.22 mm/year. Conclusion: Manual tumor segmentations on MRI have reasonable agreement for use in spatial and volumetric analysis. Agreement in spatial overlap is of concern with segmentation after surgery for glioblastoma and with segmentation of non-glioblastoma by non-experts.

KW - Glioblastoma

KW - Glioma

KW - Inter-rater agreement

KW - Low-grade glioma

KW - MRI

KW - Manual segmentation

UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85062145311&origin=inward

UR - https://www.ncbi.nlm.nih.gov/pubmed/30825711

U2 - 10.1016/j.nicl.2019.101727

DO - 10.1016/j.nicl.2019.101727

M3 - Article

VL - 22

JO - NeuroImage: Clinical

JF - NeuroImage: Clinical

SN - 2213-1582

M1 - 101727

ER -