The EMIF approach goes from data discovery, to data assessment, to data re-use. It starts with fairly high-level discovery of available data sets, whose metadata descriptions are then catalogued. The next step is to assess the suitability of the resulting databases, says Vannieuwenhuyse – i.e. “Does the database have the subjects I’m interested in? Is there a sufficiently large number of subjects?” Then, a way needs to be found for researchers and data owners to collaborate on the analysis of this data. Finally, all of this must be done while addressing privacy, safety, legal and ethical concerns.
With two research projects running within the EMIF programme – EMIF-AD, which focuses on early onset and progression markers of Alzheimer’s Disease, and EMIF-Metabolics, which focuses on identifying risk markers for obesity to pinpoint co-morbidities – EMIF involves a broad and diverse range of participants. “We have a wide variety of people involved,” says Vannieuwenhuyse, “from data scientists to data owners, epidemiologists, and biologists looking at molecular pathways.” On top of that, a wide variety of data types is represented in the consortium, with data sources ranging from general practitioner networks, hospitals, and research cohorts to biobanks and more. “It’s a challenge because you need to find solutions that fit most of these data sets and users,” says Vannieuwenhuyse – adding, “it’s not always possible”.