Purpose: The computerized Animated Activity Questionnaire (AAQ) for assessing activity limitations in hip and knee osteoarthritis (HKOA) consists of video animations from which patients choose the animation that best matches their own performance. The AAQ has demonstrated good validity and reliability. Application of the AAQ in international studies requires good cross-cultural validity, i.e., minimal Differential Item Functioning (DIF) across countries. The aim of this study was to evaluate the cross-cultural validity of the AAQ. Methods: Patients in 7 European countries completed the AAQ on a computer. Ordinal logistic regression analysis was used to evaluate DIF across languages (Dutch versus 6 other languages). DIF was defined as follows: if a patient in another country has the same level of activity limitation as a patient in the Netherlands (the reference country, in which the AAQ was developed), they should score the same on each item of the AAQ. If there is a statistically significant difference between countries, there is DIF. We used criteria frequently described in the literature to assess DIF between countries. The criterion for non-uniform DIF was a statistically significant (p < 0.001) change in Nagelkerke's pseudo R-square with a magnitude larger than 0.035 between the AAQ total score and the country variable (with Dutch as the reference group). The criterion for uniform DIF was a statistically significant (p < 0.001) odds ratio (OR) of the country variable with a magnitude outside the interval 0.53-1.89. Analyses were adjusted for sex, age, weight, height, and affected joint. The influence of each individual item with DIF on the total score was assessed by calculating the correlation between AAQ scores with and without the DIF item. A Spearman's correlation of 0.95 or less was interpreted as an important influence of the DIF of that item on the total AAQ score. Results: Data of 1239 patients were available.
Compared to Dutch (n = 279), none of the 17 items showed DIF in English (n = 202) or French (n = 193). Uniform DIF occurred in the activity 'walking outside' for Spanish versus Dutch (OR 0.28) and Norwegian versus Dutch (OR 0.16); in both countries, scores represented slightly worse functioning compared to Dutch. For Danish versus Dutch, DIF occurred in two items (walking outside on uneven terrain, OR 0.45; walking inside, OR 0.43), also representing slightly worse functioning compared to Dutch. In all these languages, the occurrence of DIF did not influence the total score, with correlations of 0.98-0.99 between AAQ scores with and without the DIF item(s). For Italian (n = 203) versus Dutch, 6 items showed uniform DIF and 1 item showed non-uniform DIF, which makes it difficult to compare scores obtained in those two countries. Conclusions: Different language versions of the AAQ remain comparable with the original Dutch version, except for the Italian version. The AAQ seems to have great potential for international use in research and daily clinical practice, especially in patients with low literacy and non-native speakers, because of the use of video animations instead of written text. Future research will focus on responsiveness and interpretation of AAQ scores, and should explore explanations for DIF in Italy by means of qualitative research.
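
Illustration: the DIF decision rules stated in the Methods can be written as a short sketch. This is not the authors' analysis code. The function and parameter names are hypothetical, and the inputs (odds ratio, pseudo R-square change, and their p-values) are assumed to come from ordinal logistic regression models fitted elsewhere. Treating the interval bounds 0.53 and 1.89 as "inside" is also an assumption, since the abstract does not say how boundary values were handled.

```python
# Thresholds quoted in the Methods section of the abstract.
P_THRESHOLD = 0.001            # significance level for both DIF tests
OR_LOW, OR_HIGH = 0.53, 1.89   # an OR outside this interval indicates uniform DIF
R2_CHANGE_MIN = 0.035          # Nagelkerke pseudo R-square change for non-uniform DIF
RHO_IMPORTANT = 0.95           # Spearman correlation at or below this is "important"


def classify_dif(odds_ratio, p_uniform, r2_change, p_nonuniform):
    """Return the set of DIF types flagged for one item in one language.

    All four arguments are assumed to be outputs of models fitted elsewhere;
    this function only applies the published cut-offs.
    """
    flags = set()
    # Uniform DIF: significant country effect with an OR outside 0.53-1.89.
    if p_uniform < P_THRESHOLD and not (OR_LOW <= odds_ratio <= OR_HIGH):
        flags.add("uniform")
    # Non-uniform DIF: significant pseudo R-square change larger than 0.035.
    if p_nonuniform < P_THRESHOLD and r2_change > R2_CHANGE_MIN:
        flags.add("non-uniform")
    return flags


def dif_influences_total(rho):
    """True if the correlation between total scores with and without the
    DIF item is 0.95 or less, i.e. the DIF item importantly affects the total."""
    return rho <= RHO_IMPORTANT


# Example: OR 0.28 is the value reported for 'walking outside' (Spanish versus
# Dutch); the p-values and R-square change below are made up for illustration.
print(classify_dif(odds_ratio=0.28, p_uniform=0.0001,
                   r2_change=0.0, p_nonuniform=1.0))
print(dif_influences_total(0.98))
```

With the reported correlations of 0.98-0.99, `dif_influences_total` returns False, which matches the statement that DIF did not influence the total score in those languages.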