Background: Although environmental data are increasingly available, it is often unclear whether or how datasets can be accessed for health research. Exposure metrics are typically developed for specific research initiatives using disparate exposure assessment methods, and no mechanisms are put in place for centralizing, archiving, or distributing environmental datasets. In parallel, potentially vast amounts of environmental data are emerging from new technologies such as high-resolution imagery and machine learning.
Objectives: The Canadian Urban Environmental Health Research Consortium (CANUE) and the Geoscience and Health Cohort Consortium (GECCO) provide a proof of concept that centralizing and disseminating environmental data for health research is valuable and can accelerate discovery. In this essay, we argue that more efficient use of exposure data for environmental epidemiological research over the next decade requires progress in four key areas: metadata and data access portals, linkage with health databases, harmonization of exposure measures and models over large areas, and leveraging "big data" streams for exposure characterization and evaluation of temporal changes.
Discussion: Optimizing the use of existing environmental data and exploiting emerging data streams can provide unprecedented research opportunities in environmental epidemiology through better characterization of individuals' exposures and the ability to study the intersecting impacts of multiple environmental features or urban attributes across different populations around the world. Proper documentation, linkage, and dissemination of new and emerging exposure data lead to better awareness of data availability, less duplication of effort, and increased research output.