The use of depth-sensing computer vision to capture bodily movement is increasingly being exploited in healthcare. Yet there are few descriptions of how real-world practices influence the design of such applications. To this end, we present the development and empirical evaluation of ASSESS MS, a system to support the clinical assessment of Multiple Sclerosis using Kinect. A key issue in developing machine-learning-based systems is the need for standardized data on which statistical inferences can be made. We demonstrate that many aspects of clinical practice are at odds with the need to capture standardized data for a computer vision system. We offer three design guidelines to address these: 1) Standardization is a multi-disciplinary issue and needs to be addressed early in the development process; 2) Tools that provide a view into what the camera “sees” can support the achievement of standardized data capture in real environments; 3) Tools to support standardized data capture should maintain the agency of human interaction. More broadly, we show that when considering everyday contexts, the traditional focus on measurement accuracy is only a small part of the effort needed to make a technology “work” in practice.