Background: Content validity is the most important measurement property of a patient-reported outcome measure (PROM) and the most challenging to assess. Our aims were to: (1) develop standards for evaluating the quality of PROM development; (2) update the original COSMIN standards for assessing the quality of content validity studies of PROMs; (3) develop criteria for what constitutes good content validity of PROMs; and (4) develop a rating system for summarizing the evidence on a PROM’s content validity and grading the quality of the evidence in systematic reviews of PROMs. Methods: An online four-round Delphi study was performed among 159 experts from 21 countries. Panelists rated the degree to which they (dis)agreed with the proposed standards, criteria, and rating issues on 5-point rating scales (‘strongly disagree’ to ‘strongly agree’), and provided arguments for their ratings. Results: Discussion focused on sample size requirements, recording and field notes, transcribing cognitive interviews, and data coding. After four rounds, the required 67% consensus was reached on all standards, criteria, and rating issues. After pilot-testing, the steering committee made some final changes. Ten criteria for good content validity were defined regarding item relevance, appropriateness of response options and recall period, comprehensiveness, and comprehensibility of the PROM. Discussion: The consensus-based COSMIN methodology for content validity is more detailed, standardized, and transparent than earlier published guidelines, including the previous COSMIN standards. This methodology can contribute to the selection and use of high-quality PROMs in research and clinical practice.