Objectives: To assess the quality and reusability of coded cancer diagnoses in routine primary care data, and to identify factors that influence data quality and areas for improvement.

Methods: A dynamic cohort study was performed in a Dutch network database containing 250,000 anonymized electronic medical records (EMRs) from 52 general practices. Coded data from 2000 to 2011 for the three most common cancer types (breast, colon and prostate cancer) were compared with the Netherlands Cancer Registry.

Measurements: Data quality is expressed as Standardized Incidence Ratios (SIRs): the ratio between the number of coded cases observed in the primary care network database and the number of cases expected on the basis of the Netherlands Cancer Registry. Ratios are reported as percentages for readability.

Results: The overall SIR was 91.5% (95% CI 88.5–94.5) and improved over the years. SIRs differed between cancer types, from 71.5% for colon cancer in males to 103.9% for breast cancer. Data quality also differed by the EMR system used (SIRs 76.2% to 99.7%), with SIRs up to 232.9% for breast cancer. Frequently observed errors in routine healthcare data can be classified as lack of integrity checks, inaccurate use and/or lack of codes, and lack of EMR system functionality.

Conclusions: Re-users of coded routine primary care EMR data should be aware that 30% of cancer cases can be missed and that up to 130% of cancer cases found in the EMR data can be false-positive. The type of EMR system and the type of cancer influence the quality of coded diagnosis registration. While data quality can be improved (e.g. by improving system design and by training EMR system users), re-use should only be undertaken by appropriately trained experts.
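
The SIR defined under Measurements (observed coded cases divided by expected cases, reported as a percentage) can be illustrated with a minimal sketch. The abstract does not state how the 95% confidence intervals were obtained, so the use of Byar's approximation to the Poisson limits here is an assumption. The function name `sir_percent` and the counts in the usage example are hypothetical and are not taken from the study.

```python
import math


def sir_percent(observed: int, expected: float, z: float = 1.96):
    """Return (SIR, lower, upper) as percentages.

    observed: number of coded cases found in the EMR data
    expected: number of cases expected from the reference registry
    z:        standard normal quantile (1.96 gives a 95% interval)

    The interval uses Byar's approximation to the Poisson limits,
    which is an assumption: the source does not name its CI method.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    if observed < 0:
        raise ValueError("observed must be non-negative")

    # Point estimate: observed / expected, expressed as a percentage.
    sir = 100.0 * observed / expected

    # Lower limit of the observed count (0 when nothing was observed).
    if observed == 0:
        lower_count = 0.0
    else:
        lower_count = observed * (
            1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
        ) ** 3

    # Upper limit of the observed count.
    o1 = observed + 1
    upper_count = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3

    return sir, 100.0 * lower_count / expected, 100.0 * upper_count / expected


# Hypothetical counts for illustration only (not study data):
# 80 coded cases in the EMR data against 100 expected cases.
sir, lower, upper = sir_percent(observed=80, expected=100.0)
print(f"SIR = {sir:.1f}% (95% CI {lower:.1f}-{upper:.1f})")
```

Under this reading, an SIR below 100% indicates that fewer cases were coded than the registry predicts (possible missed cases), while an SIR above 100% indicates more coded cases than expected (possible false positives).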