Objective: To inform the development of an AGREE II extension specifically tailored for surgical guidelines.

Summary background data: AGREE II was designed to inform the development, reporting, and appraisal of clinical practice guidelines. Previous research has suggested substantial room for improvement in the quality of surgical guidelines.

Methods: A previously published MEDLINE search for clinical practice guidelines issued between 2008 and 2017 by surgical scientific organizations with an international scope identified 67 guidelines. The quality of these guidelines was assessed using AGREE II. We performed a series of statistical analyses (reliability, correlation, and factor analyses, and item response theory) to calibrate AGREE II for use specifically in surgical guidelines.

Results: The reliability, correlation, and factor analyses and item response theory produced similar results and suggested that a structure of 5 domains, rather than the 6 domains of the original instrument, might be more appropriate. Furthermore, excluding certain items and reassigning others to different domains increased the reliability of AGREE II when applied to surgical guidelines.

Conclusions: These findings suggest that statistical calibration of AGREE II might improve the development, reporting, and appraisal of surgical guidelines.