Introduction: Transition ratings (TRs) are single-item measures that ask patients to report on the change in their health. They allow for a simple assessment of improvement or deterioration and are frequently used as an “anchor” to determine interpretation thresholds on a patient-reported outcome measure (PROM). Despite their widespread use, a routinely applicable method to assess their reliability is lacking. This paper introduces a method to estimate the reliability of TRs based on confirmatory factor analysis (CFA) for categorical data. Method: We modelled longitudinal PROM data as independent factors representing Time 1 and Time 2 in a CFA model. PROM items taken at Time 1 (T1) loaded on the first factor, while the same items taken at Time 2 (T2) loaded on the second. The TR item loaded onto both the T1 and T2 factors. Three models with various constraints on the loadings and thresholds were examined. The communality (R²) of the TR item was used as the measure of TR reliability. The approach was evaluated using simulated data and exemplified in four empirical datasets. Results: The simplest CFA model, without constraints on the item loadings and thresholds, performed equivalently to models with constraints on loadings and thresholds over time. Further constraining the TR item loadings to be equal and opposite over time biased the TR reliability estimates when the T1 and T2 loadings differed in magnitude. In the four empirical datasets, the reliability of TRs ranged from 0.27 to 0.48. In three of the examples, the TR loaded numerically more strongly on T2 than on T1. Discussion and conclusions: The results support the use of the proposed method for understanding the reliability of TRs. The empirical results fall within the typical range of reliability previously reported for single items. Methodological considerations for improving TR reliability are presented, and further developments of the method are proposed.
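As a rough illustration of how a communality can be read as the reliability of an item that loads on two factors, the sketch below computes R² from standardized loadings. It is a minimal example under stated assumptions, not the authors' implementation: the function name `tr_communality`, its parameters, and the loading values are hypothetical, and the loadings are assumed to be fully standardized (for categorical CFA, with respect to the latent response variable underlying the TR item). The optional `factor_corr` argument covers the case where the T1 and T2 factors are allowed to correlate; with the default of 0 the formula reduces to the sum of squared loadings.

```python
def tr_communality(loading_t1: float, loading_t2: float,
                   factor_corr: float = 0.0) -> float:
    """Communality (R^2) of an item loading on two standardized factors.

    Assumes standardized loadings, so that
    R^2 = l1^2 + l2^2 + 2 * l1 * l2 * phi,
    where phi is the correlation between the two factors.
    """
    if not -1.0 <= factor_corr <= 1.0:
        raise ValueError("factor_corr must lie in [-1, 1]")
    return (loading_t1 ** 2 + loading_t2 ** 2
            + 2.0 * loading_t1 * loading_t2 * factor_corr)


if __name__ == "__main__":
    # Hypothetical loadings: negative on T1, larger and positive on T2.
    print(round(tr_communality(-0.3, 0.6), 2))                   # 0.45
    print(round(tr_communality(-0.3, 0.6, factor_corr=0.5), 2))  # 0.27
```

With loadings of opposite sign, a positive correlation between the factors lowers the communality, which is one reason the treatment of the factor correlation matters when interpreting R² as reliability.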