Interrater agreement and reliability of clinical tests for assessment of patients with shoulder pain in primary care

Adri T. Apeldoorn, Marjolein C. den Arend, Ruud Schuitemaker, Dick Egmond, Karin Hekman, Tjeerd van der Ploeg, Steven J. Kamper, Maurits W. van Tulder, Raymond W. Ostelo

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Background: There is limited information about the agreement and reliability of clinical shoulder tests. Objectives: To assess the interrater agreement and reliability of clinical shoulder tests in patients with shoulder pain treated in primary care. Methods: Patients with a primary report of shoulder pain underwent a set of 21 clinical shoulder tests twice on the same day, performed by pairs of independent physical therapists. The outcome parameters were observed and specific interrater agreement for positive and negative scores, and interrater reliability (Cohen’s kappa (κ)). Positive and negative interrater agreement values of ≥0.75 were regarded as sufficient for clinical use. For Cohen’s κ, the following classification was used: <0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, 0.81–1.00 very good reliability. Participating clinics were randomized into two groups: with or without a brief practical session on how to conduct the tests. Results: A total of 113 patients were assessed in 12 physical therapy practices by 36 physical therapists. Positive and negative interrater agreement values were both sufficient for 1 test (the Full Can Test), sufficient for neither for 5 tests, and sufficient for only positive or only negative agreement for 15 tests. Interrater reliability was fair for 11 tests, moderate for 9 tests, and good for 1 test (the Full Can Test). An additional brief practical session did not result in better agreement or reliability. Conclusion: Clinicians should be aware that interrater agreement and reliability for most shoulder tests are questionable and their value in clinical practice is limited.
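The statistics named in the abstract can be illustrated with a short, hypothetical calculation. This is a sketch, not the authors' code, and the rater scores below are invented for illustration only:

```python
# Illustrative sketch (not the study's code) of the interrater statistics
# reported in the abstract: observed agreement, positive/negative specific
# agreement, and Cohen's kappa for one dichotomous shoulder test scored by
# two independent raters.

def agreement_stats(rater1, rater2):
    """Return (observed, positive, negative agreement, Cohen's kappa)
    for paired scores coded 1 (positive) / 0 (negative)."""
    pairs = list(zip(rater1, rater2))
    n = len(pairs)
    a = sum(1 for x, y in pairs if x == 1 and y == 1)  # both positive
    b = sum(1 for x, y in pairs if x == 1 and y == 0)
    c = sum(1 for x, y in pairs if x == 0 and y == 1)
    d = sum(1 for x, y in pairs if x == 0 and y == 0)  # both negative

    observed = (a + d) / n
    # Specific agreement restricted to positive (or negative) ratings;
    # the study used >= 0.75 as the threshold for clinical use.
    positive = 2 * a / (2 * a + b + c)
    negative = 2 * d / (2 * d + b + c)
    # Cohen's kappa: observed agreement corrected for the chance agreement
    # expected from each rater's marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, positive, negative, kappa

def classify_kappa(k):
    """Kappa bands as classified in the study's abstract."""
    if k <= 0.20:
        return "poor"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "good"
    return "very good"

# Hypothetical scores for 10 patients on one test (1 = positive finding).
obs, pos, neg, kappa = agreement_stats(
    [1, 1, 1, 0, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1, 1, 0, 1, 0],
)
```

With these invented scores, observed, positive, and negative agreement all come out to 0.8 and kappa to about 0.6, i.e. a test that just passes the study's agreement threshold and sits at the top of its "moderate" reliability band.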
Original language: English
Journal: Physiotherapy Theory and Practice
DOI: 10.1080/09593985.2019.1587801
Publication status: Published - 2019

Cite this

Apeldoorn, Adri T.; den Arend, Marjolein C.; Schuitemaker, Ruud; Egmond, Dick; Hekman, Karin; van der Ploeg, Tjeerd; Kamper, Steven J.; van Tulder, Maurits W.; Ostelo, Raymond W. Interrater agreement and reliability of clinical tests for assessment of patients with shoulder pain in primary care. In: Physiotherapy Theory and Practice. 2019.
@article{ddf178868d3346eab77bb8297d4544e5,
title = "Interrater agreement and reliability of clinical tests for assessment of patients with shoulder pain in primary care",
abstract = "Background: There is limited information about the agreement and reliability of clinical shoulder tests. Objectives: To assess the interrater agreement and reliability of clinical shoulder tests in patients with shoulder pain treated in primary care. Methods: Patients with a primary report of shoulder pain underwent a set of 21 clinical shoulder tests twice on the same day, by pairs of independent physical therapists. The outcome parameters were observed and specific interrater agreement for positive and negative scores, and interrater reliability (Cohen’s kappa (κ)). Positive and negative interrater agreement values of ≥0.75 were regarded as sufficient for clinical use. For Cohen’s κ, the following classification was used: <0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, 0.81–1.00 very good reliability. Participating clinics were randomized in two groups; with or without a brief practical session on how to conduct the tests. Results: A total of 113 patients were assessed in 12 physical therapy practices by 36 physical therapists. Positive and negative interrater agreement values were both sufficient for 1 test (the Full Can Test), neither sufficient for 5 tests, and only sufficient for either positive or negative agreement for 15 tests. Interrater reliability was fair for 11 tests, moderate for 9 tests, and good for 1 test (the Full Can Test). An additional brief practical session did not result in better agreement or reliability. Conclusion: Clinicians should be aware that interrater agreement and reliability for most shoulder tests is questionable and their value in clinical practice limited.",
author = "Apeldoorn, {Adri T.} and {den Arend}, {Marjolein C.} and Ruud Schuitemaker and Dick Egmond and Karin Hekman and {van der Ploeg}, Tjeerd and Kamper, {Steven J.} and {van Tulder}, {Maurits W.} and Ostelo, {Raymond W.}",
year = "2019",
doi = "10.1080/09593985.2019.1587801",
language = "English",
journal = "Physiotherapy Theory and Practice",
issn = "0959-3985",
publisher = "Informa Healthcare",
}


TY - JOUR
T1 - Interrater agreement and reliability of clinical tests for assessment of patients with shoulder pain in primary care
AU - Apeldoorn, Adri T.
AU - den Arend, Marjolein C.
AU - Schuitemaker, Ruud
AU - Egmond, Dick
AU - Hekman, Karin
AU - van der Ploeg, Tjeerd
AU - Kamper, Steven J.
AU - van Tulder, Maurits W.
AU - Ostelo, Raymond W.
PY - 2019
Y1 - 2019
N2 - Background: There is limited information about the agreement and reliability of clinical shoulder tests. Objectives: To assess the interrater agreement and reliability of clinical shoulder tests in patients with shoulder pain treated in primary care. Methods: Patients with a primary report of shoulder pain underwent a set of 21 clinical shoulder tests twice on the same day, by pairs of independent physical therapists. The outcome parameters were observed and specific interrater agreement for positive and negative scores, and interrater reliability (Cohen’s kappa (κ)). Positive and negative interrater agreement values of ≥0.75 were regarded as sufficient for clinical use. For Cohen’s κ, the following classification was used: <0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, 0.81–1.00 very good reliability. Participating clinics were randomized in two groups; with or without a brief practical session on how to conduct the tests. Results: A total of 113 patients were assessed in 12 physical therapy practices by 36 physical therapists. Positive and negative interrater agreement values were both sufficient for 1 test (the Full Can Test), neither sufficient for 5 tests, and only sufficient for either positive or negative agreement for 15 tests. Interrater reliability was fair for 11 tests, moderate for 9 tests, and good for 1 test (the Full Can Test). An additional brief practical session did not result in better agreement or reliability. Conclusion: Clinicians should be aware that interrater agreement and reliability for most shoulder tests is questionable and their value in clinical practice limited.
AB - Background: There is limited information about the agreement and reliability of clinical shoulder tests. Objectives: To assess the interrater agreement and reliability of clinical shoulder tests in patients with shoulder pain treated in primary care. Methods: Patients with a primary report of shoulder pain underwent a set of 21 clinical shoulder tests twice on the same day, by pairs of independent physical therapists. The outcome parameters were observed and specific interrater agreement for positive and negative scores, and interrater reliability (Cohen’s kappa (κ)). Positive and negative interrater agreement values of ≥0.75 were regarded as sufficient for clinical use. For Cohen’s κ, the following classification was used: <0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, 0.81–1.00 very good reliability. Participating clinics were randomized in two groups; with or without a brief practical session on how to conduct the tests. Results: A total of 113 patients were assessed in 12 physical therapy practices by 36 physical therapists. Positive and negative interrater agreement values were both sufficient for 1 test (the Full Can Test), neither sufficient for 5 tests, and only sufficient for either positive or negative agreement for 15 tests. Interrater reliability was fair for 11 tests, moderate for 9 tests, and good for 1 test (the Full Can Test). An additional brief practical session did not result in better agreement or reliability. Conclusion: Clinicians should be aware that interrater agreement and reliability for most shoulder tests is questionable and their value in clinical practice limited.
UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85063209561&origin=inward
UR - https://www.ncbi.nlm.nih.gov/pubmed/30900508
U2 - 10.1080/09593985.2019.1587801
DO - 10.1080/09593985.2019.1587801
M3 - Article
JO - Physiotherapy Theory and Practice
JF - Physiotherapy Theory and Practice
SN - 0959-3985
ER -