“For me, we are talking about all the healthcare data that is not part of randomised controlled trials (RCTs). A lot of relevant data is a by-product of the care process and is falling by the wayside,” said Bart Vannieuwenhuyse, Senior Director Health Information Sciences at Janssen and coordinator of EMIF, welcoming delegates.
The vision is that, at the end of the five-year EMIF programme, there will be an ecosystem in which data sources are clearly mapped, researchers can assess whether a particular source suits the objectives of a specific project, and they can readily engage with data owners to get permission for its re-use.
As Vannieuwenhuyse noted, beyond the confines of RCTs, pharmaceutical companies currently have little exposure to large groups of patients. Yet information generated in the course of their care, including health records, pharmacy, lab tests, claims data, and so on, could provide crucial inputs in the development of new drugs.
To ensure that the work undertaken by EMIF has relevance to, and can shed light on, unmet medical needs, EMIF-AD is assessing whether existing datasets and biobanks can be used to identify early markers of Alzheimer’s disease, while a second project, EMIF-Metabolic, is studying risk markers for developing metabolic complications of obesity.
These elements of EMIF are underpinned by EMIF-Platform, in which a number of tools, for example, to support biomarker discovery, and a common data model, are being developed.
Real world data has applications from biomarker discovery and predictive modelling in discovery, to trial design and recruitment in development, and on to providing evidence of effectiveness and monitoring safety once a drug is approved and on the market. “There are clear benefits for all partners in re-using data,” Vannieuwenhuyse said.
Johan Liwing, Director, Market Access RWE Partnerships, Global Commercial Strategy Organisation at Janssen, described how the company is partnering with leading institutions in the US and Europe to promote the use of real world evidence in drug discovery.
The aim is to analyse multiple data sources to generate evidence of disease pathways, healthcare delivery and the effectiveness of treatments, and use the outputs to improve and advance methodologies and support medical decision-making.
The vision is that this underpins a paradigm shift, in which rather than diagnosing and treating observable symptoms of disease, it will be possible to pinpoint the initiation of a disease-causing process and treat to pre-empt the symptoms.
“We believe this will be a big part of the future. But there are challenges, because it will change drug development,” Liwing said. To achieve this shift it will be necessary to continuously capture and monitor health data, in order to predict the onset of disease processes.
One example of how Janssen is applying this ‘Disease Interception Accelerator’ concept is in childhood type 1 diabetes, where tracking the production of autoantibodies alongside HbA1c (glycated haemoglobin) levels has been shown to be predictive of progression to insulin dependency.
“If we could monitor [these two parameters], we could delay development of symptoms,” Liwing said.
Identifying precise research questions that reflect the interests and concerns of patients and regulators, acknowledge the requirements of academic researchers and companies, and can be addressed by interrogating real world data, is the route to promoting risk-sharing amongst stakeholders, suggested John Gallacher of Oxford University’s Medical Sciences Division and Director of the UK Medical Research Council’s Dementia Platform.
To take one example, the link between blood pressure and heart disease illustrates the need for large study sets: with a sample size of 5,000 there are hints of where the greatest risk may lie; at 50,000 subjects the focus sharpens; and all becomes clear at 500,000 subjects, Gallacher said. “It’s effectively a definitive answer: large datasets equals answers.”
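Gallacher’s point about sample size and precision can be sketched numerically. The simulation below is purely illustrative: the event rates and group split are invented, not taken from any study he cited; it simply shows how the uncertainty around an estimated risk difference shrinks roughly as 1/√n across the 5,000, 50,000 and 500,000 sample sizes he mentioned.

```python
import math

def risk_diff_se(p1: float, p2: float, n_per_group: int) -> float:
    """Standard error of the difference between two event rates,
    each estimated from n_per_group subjects."""
    return math.sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)

# Hypothetical event rates for high- vs normal-blood-pressure groups.
p_high, p_norm = 0.12, 0.08

for total in (5_000, 50_000, 500_000):
    se = risk_diff_se(p_high, p_norm, total // 2)
    # The 95% CI half-width shrinks by a factor of ~sqrt(10) at each step.
    print(f"n={total:>7}: risk difference 0.040 ± {1.96 * se:.4f}")
```

At 5,000 subjects the interval swamps the effect; at 500,000 it is an order of magnitude tighter, which is the sense in which “large datasets equals answers”.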
The challenge in advancing treatments for dementia is to use large datasets to identify early determinants, and to apply these findings to the discovery and development of drugs that delay onset, relieve symptoms and slow progression.
Information that is relevant to dementia and other disorders is held in a range of disparate databases, collected for different purposes, in different countries. There is a huge task of data interpretation to combine these sources in a meaningful way, and use them to answer questions. “Rows of people, with columns of variables is the fundamental challenge.”
Gallacher proposed simplifying data to enable its integration and analysis, pointing to UK Biobank data, from which it is possible to identify participants with memory deficits and APOE4 (apolipoprotein E) markers, as potential recruits to trials of Alzheimer’s drugs targeting APOE4.
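The kind of cohort query Gallacher describes can be sketched as a simple filter over participant records. The records and field names below (`memory_deficit`, `apoe4_carrier`) are hypothetical, chosen only to illustrate selecting potential trial recruits from biobank-style data; they are not UK Biobank variables.

```python
# Hypothetical, simplified participant records; fields are illustrative only.
participants = [
    {"id": 1, "memory_deficit": True,  "apoe4_carrier": True},
    {"id": 2, "memory_deficit": True,  "apoe4_carrier": False},
    {"id": 3, "memory_deficit": False, "apoe4_carrier": True},
]

# Select potential recruits for a trial of an APOE4-targeted Alzheimer's drug:
# those with both a memory deficit and the APOE4 marker.
candidates = [p["id"] for p in participants
              if p["memory_deficit"] and p["apoe4_carrier"]]
print(candidates)  # → [1]
```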
This example illustrates how real world evidence could support the transition from a functional definition of a disease, to defining it by biological mechanisms, providing a far more potent base on which to develop a new medicine.
In contrast to clinical trials, where real world evidence can be used to identify subjects at high risk, in public health it is necessary to look at the population as a whole in order to formulate policies that will provide the greatest benefit.
“The issue is not just throwing data at it, but how you use data to answer the question,” Gallacher said.
Ferran Sanz, Director of the Research Programme on Biomedical Informatics at the Hospital del Mar Medical Research Institute and Universitat Pompeu Fabra in Barcelona, outlined the many, varied and voluminous sources of chemical and molecular biology data in Europe, and described examples of Innovative Medicines Initiative (IMI) projects in which these sources are being integrated and applied to improve drug development and safe use, in areas including toxicology and pharmacovigilance.
Although there may be distinct realms, biomedical research is a continuum in which one element informs another. Each is generating huge volumes of data. For example, there are more than 20 million journal papers in an electronic format, genomics and other ‘omics databases contain many petabytes of genotypic and phenotypic information, there is much freely available information about small molecules and protein structures, millions of electronic health records, digitised medical images and inputs from social media (where health is one of the most aired topics).
“If we are able to interrogate all these heterogeneous sources of information, there would be a better view of diseases and therapies,” Sanz said.
The eTOX project, for example, has mined millions of electronic health records to find associations between particular drugs and adverse events. Once these signals have been picked up, researchers search for the possible biological underpinnings, using computer analyses of known interactions between drugs and proteins, and of how proteins are related to disease pathways.
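One widely used statistic for this kind of drug-adverse-event signal detection is the reporting odds ratio (ROR) computed from a 2x2 contingency table. The source does not say which statistic the project used, and the counts below are invented; this is a minimal sketch of the general technique only.

```python
import math

def reporting_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """ROR from a 2x2 table:
    a = records with drug and event,   b = drug without event,
    c = other drugs with event,        d = other drugs without event."""
    return (a / b) / (c / d)

# Hypothetical counts from mined health records.
a, b, c, d = 40, 960, 200, 48_800
ror = reporting_odds_ratio(a, b, c, d)

# Approximate 95% confidence bound on the log scale; a lower bound
# above 1 is commonly treated as a signal worth investigating.
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
lower = math.exp(math.log(ror) - 1.96 * se)
print(f"ROR = {ror:.1f}, lower 95% bound = {lower:.1f}")
```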
In the project, legacy reports from pharma companies were integrated with public sources to create a combined database of human safety information. It is hoped this will enable reliable in silico prediction of side effects in the critical stages of drug development, reducing attrition and the requirement for animal testing.
Another example of the power of large biomedical databases comes from DisGeNET, which contains 500,000 records of gene-disease associations. Amongst other applications, this can be used to construct diseaseomes and to understand co-morbidities, by sketching a network of relationships between diseases based on common molecular backgrounds, Sanz said.
The above are notable individual examples of the insights that can be unlocked from biomedical data stores. The case of Estonia, where there is public consensus and full legal backing for the use of real world data, underlines the far greater potential value which arises from taking a comprehensive approach that embraces all data sources.
The Estonian Biobank contains data on 52,000 participants, or 5 percent of the adult population. As Tõnu Esko, Deputy Director of Research at the Estonian Genome Centre, noted, while “there are larger biobanks”, the broad informed consent, legislation in the form of the Human Genes Research Act, and the country’s nationwide e-services backbone make Estonia’s biobank a more powerful resource for research.
The e-government services network runs off a common platform, through which it is possible to link all the databases. Researchers using the biobank can integrate public repositories including hospital records, pharmacy, health insurance information, causes of death and other disease-specific registries.
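Record linkage over a shared identifier, as in the Estonian e-services backbone, reduces to a key join across registries. The registries, identifiers and fields below are entirely hypothetical; the sketch only illustrates the linkage step that a common platform makes possible.

```python
# Hypothetical, simplified registry extracts keyed on a shared personal ID.
biobank  = {1001: {"genotype": "APOE4/E3"}, 1002: {"genotype": "E3/E3"}}
pharmacy = {1001: {"statin": True}, 1002: {"statin": False}}

# Because every registry uses the same identifier, integrating them
# is a straightforward join on the key.
linked = {pid: {**biobank[pid], **pharmacy.get(pid, {})} for pid in biobank}
print(linked[1001])  # → {'genotype': 'APOE4/E3', 'statin': True}
```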
Taken in combination, it becomes possible to assess individual risk of developing disease, based on genetics, environment, comorbidities and age, Esko said.
This can inform public health measures to support prevention, for example, sending mobile phone SMS messages to people who have had one heart attack, to encourage them to maintain lifestyle changes.
The Estonian Biobank provides the foundations for the Estonian Programme for Personalised Medicine, in which data from all major databases will be integrated and interrogated to support clinical decision-making and treatment. There are plans to develop an e-health database containing genotypes, e-health records, prescriptions and so on, relating to 500,000 people by 2022, Esko concluded.