OBJECTIVES: We sought to quantitatively determine the inter-observer variability of expert radiotherapy target-volume delineation for thymic cancers, as part of a larger effort to develop an expert-consensus contouring atlas.
METHODS: A pilot dataset was created consisting of a standardized case presentation with pre- and post-operative DICOM CT image sets from a single patient with Masaoka-Koga Stage III thymoma. Expert thoracic radiation oncologists delineated tumor targets on the pre- and post-operative scans as they would for a definitive and an adjuvant case, respectively. Respondents also completed a survey covering recommended dose prescription and target-volume margins for the definitive and post-operative scenarios. Inter-observer variability was analyzed quantitatively with Warfield's simultaneous truth and performance level estimation (STAPLE) algorithm and the Dice similarity coefficient (DSC).
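For readers unfamiliar with the DSC, the following is a minimal sketch of how it quantifies overlap between two contours, using DSC = 2|A ∩ B| / (|A| + |B|). It makes assumptions not stated in this abstract: each contour is represented as a set of voxel indices, agreement is summarized over all observer pairs (the study may instead have compared each observer with a STAPLE consensus), and the names `dice`, `pairwise_dice`, and the toy contours are hypothetical illustrations rather than study data.

```python
from itertools import combinations
from statistics import mean, stdev


def dice(contour_a, contour_b):
    """Dice similarity coefficient between two sets of voxel indices."""
    a, b = set(contour_a), set(contour_b)
    if not a and not b:
        # Convention assumed here: two empty contours agree perfectly.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def pairwise_dice(contours):
    """DSC for every unordered pair of observers (an assumed summary scheme)."""
    return [dice(a, b) for a, b in combinations(contours, 2)]


# Toy contours from three hypothetical observers: (slice, row, column) voxels.
observer_1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
observer_2 = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}
observer_3 = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}

scores = pairwise_dice([observer_1, observer_2, observer_3])
print(f"DSC per pair: {scores}")
print(f"mean±SD DSC: {mean(scores):.2f}±{stdev(scores):.2f}")
```

A DSC of 1 indicates identical contours and 0 indicates no overlap. In practice the voxel sets would be derived from the delineated structures on the CT image grid rather than written by hand.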
RESULTS: Seven users completed contouring for the definitive and adjuvant cases; of these, 5 completed the online survey. Mean STAPLE-estimated segmentation sensitivity was high for the definitive-case GTV and CTV (0.77 and 0.80, respectively) but lower for the post-operative CTV (0.55); all volumes had specificity ≥0.99. Inter-observer agreement was markedly higher for the definitive target volumes, with mean±SD DSC of 0.88±0.03 and 0.89±0.04 for GTV and CTV, respectively, compared with a post-operative CTV DSC of 0.69±0.06 (Kruskal-Wallis p<0.01).
CONCLUSION: Expert agreement for the definitive-case volumes was exceptionally high, though agreement was significantly lower for the post-operative volume. Technique and dose prescription were substantively consistent among experts, and these preliminary results will be used to create an expert-consensus contouring atlas to aid non-expert radiation oncologists in treatment planning for these challenging, rare tumors.