Acute Physiology and Chronic Health Evaluation (APACHE) II scoring is widely used as an index of illness severity, for outcome prediction, in research protocols and for assessing intensive care unit performance and quality of care. Despite this widespread use, little is known about the reliability and validity of APACHE II scores generated in everyday clinical practice. We retrospectively re-assessed APACHE II scores from the charts of 186 randomly selected patients admitted to our medical and surgical intensive care units. These 'new' scores were compared with the original scores calculated by the attending physician. Most retrospectively calculated scores were lower than the original scores: 51% of our patients would have received a lower score, 26% a higher score and only 23% would have remained unchanged. Overall, the original scores changed by an average of 6.4 points. We identified various sources of error and concluded that wide variability exists in APACHE II scoring in everyday clinical practice, with the score generally being overestimated. Accurate use of the APACHE II scoring system requires adherence to strict guidelines and regular training of the medical staff who use it.