TY - JOUR
T1 - Global clinical performance rating, reliability and validity in an undergraduate clerkship
AU - Daelmans, H. E.M.
AU - van der Hem-Stokroos, H. H.
AU - Hoogenboom, R. J.I.
AU - Scherpbier, A. J.J.A.
AU - Stehouwer, C. D.A.
AU - van der Vleuten, C. P.M.
PY - 2005/7
Y1 - 2005/7
N2 - Background: Global performance rating is frequently used in clinical training despite its known psychometric drawbacks. Inter-rater reliability is low in undergraduate training but better in residency training, possibly because residency offers more opportunities for supervision. The low or moderate predictive validity of global performance ratings in undergraduate and residency training may be due to low or unknown reliability of both global performance ratings and criterion measures. In an undergraduate clerkship, we investigated whether reliability improves when raters are more familiar with students' work and whether validity improves with increased reliability of the predictor and criterion instruments. Methods: Inter-rater reliability was determined in a clerkship with more student-rater contacts than usual. The in-training assessment programme of the clerkship that immediately followed was used as the criterion measure to determine predictive validity. Results: With four ratings, inter-rater reliability was 0.41 and predictive validity was 0.32. Reliability was lower and validity slightly higher than comparable results published for residency training. Conclusion: Even with increased student-rater interaction, the reliability and validity of global performance ratings were too low to warrant their use as an individual assessment format. However, combined with other assessment measures, global performance ratings may lead to improved integral assessment.
KW - Clerkship
KW - Disattenuation
KW - Global clinical performance rating
KW - Inter-rater reliability
KW - Predictive validity
UR - http://www.scopus.com/inward/record.url?scp=23844500326&partnerID=8YFLogxK
M3 - Article
C2 - 16093582
AN - SCOPUS:23844500326
SN - 0300-2977
VL - 63
SP - 279
EP - 284
JO - Netherlands Journal of Medicine
JF - Netherlands Journal of Medicine
IS - 7
ER -