TY - JOUR
T1 - Internal-external cross-validation helped to evaluate the generalizability of prediction models in large clustered datasets
AU - Takada, Toshihiko
AU - Nijman, Steven
AU - Denaxas, Spiros
AU - Snell, Kym I. E.
AU - Uijl, Alicia
AU - Nguyen, Tri-Long
AU - Asselbergs, Folkert W.
AU - Debray, Thomas P. A.
PY - 2021/9/1
Y1 - 2021/9/1
N2 - Objective: To illustrate how to evaluate the need of complex strategies for developing generalizable prediction models in large clustered datasets. Study Design and Setting: We developed eight Cox regression models to estimate the risk of heart failure using a large population-level dataset. These models differed in the number of predictors, the functional form of the predictor effects (non-linear effects and interaction) and the estimation method (maximum likelihood and penalization). Internal-external cross-validation was used to evaluate the models’ generalizability across the included general practices. Results: Among 871,687 individuals from 225 general practices, 43,987 (5.0%) developed heart failure during a median follow-up time of 5.8 years. For discrimination, the simplest prediction model yielded a good concordance statistic, which was not much improved by adopting complex strategies. Between-practice heterogeneity in discrimination was similar in all models. For calibration, the simplest model performed satisfactorily. Although accounting for non-linear effects and interaction slightly improved the calibration slope, it also led to more heterogeneity in the observed/expected ratio. Similar results were found in a second case study involving patients with stroke. Conclusion: In large clustered datasets, prediction model studies may adopt internal-external cross-validation to evaluate the generalizability of competing models, and to identify promising modelling strategies.
AB - Objective: To illustrate how to evaluate the need of complex strategies for developing generalizable prediction models in large clustered datasets. Study Design and Setting: We developed eight Cox regression models to estimate the risk of heart failure using a large population-level dataset. These models differed in the number of predictors, the functional form of the predictor effects (non-linear effects and interaction) and the estimation method (maximum likelihood and penalization). Internal-external cross-validation was used to evaluate the models’ generalizability across the included general practices. Results: Among 871,687 individuals from 225 general practices, 43,987 (5.0%) developed heart failure during a median follow-up time of 5.8 years. For discrimination, the simplest prediction model yielded a good concordance statistic, which was not much improved by adopting complex strategies. Between-practice heterogeneity in discrimination was similar in all models. For calibration, the simplest model performed satisfactorily. Although accounting for non-linear effects and interaction slightly improved the calibration slope, it also led to more heterogeneity in the observed/expected ratio. Similar results were found in a second case study involving patients with stroke. Conclusion: In large clustered datasets, prediction model studies may adopt internal-external cross-validation to evaluate the generalizability of competing models, and to identify promising modelling strategies.
UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85105014589&origin=inward
UR - https://www.ncbi.nlm.nih.gov/pubmed/33836256
U2 - 10.1016/j.jclinepi.2021.03.025
DO - 10.1016/j.jclinepi.2021.03.025
M3 - Article
C2 - 33836256
SN - 0895-4356
VL - 137
SP - 83
EP - 91
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
ER -