Deploying AI, Big Data and machine learning in life sciences - what legal issues arise?

In this article we consider the rise in the use of AI in the search for novel medicines and the development of new ways of treating patients. What legal issues are presented by these new technologies? We consider the ownership and control of inventions, liability for damage and data privacy issues.

A long-term challenge in the innovative pharmaceutical industry has been research and development productivity. The cost of developing a new medicine is currently estimated at around $2.6 billion, taking into account the nine out of ten candidates that end in failure.

A second challenge relates to the ever-increasing pool of data available to researchers. Making sense of the huge proliferation of scientific research results, the products of “omics” analyses and digital patient data presents research teams with an almost impossible task.

Making use of novel approaches to drug candidate generation, and the repurposing of licensed drugs, that harness AI and machine learning is increasingly seen as offering a way of tackling these issues. London-based BenevolentAI raised $115 million in April 2018 to scale up its AI platform and expand its drug discovery activities, and biopharma companies across the sector are teaming up with external AI platform providers to help with pattern spotting in the search for both new treatments and the use of existing treatments to treat new indications.

More and more data

As in many other areas of our lives, the quantities of data being generated and collected are increasing rapidly.

Genomic and other “omic” data

The interest in broad approaches to “omic” data gathering offers a new resource, but also a new challenge. 

“High-throughput platforms are creating terabytes of genomic, transcriptomic, proteomic, and metabolomic data. The challenge is to interpret these data in a meaningful manner – to uncover relationships that are not readily apparent between molecular profiles and states of health or disease.” (Sandhu et al, Panomics for Precision Medicine.)

Genomics England is close to its target of 100,000 human genomes sequenced, reaching 81,000 in early September. Some 21 Petabytes of data will be generated when the target is met. A valuable resource - but at the same time a huge challenge. Recognising the need for collaboration and partnerships to maximise this resource for patient benefit, Genomics England works with others across industry, academia and the NHS, as well as with data analytics specialists through its Discovery Forum.

Patient data

Data from patient consultations and investigations is now widely stored in electronic form. Information from many kinds of scans and tissue analysis can also be stored digitally. Collecting all of this data can be extremely powerful, and it is recognised to be a valuable resource. Innovative businesses like Medicalchain are putting forward blockchain-based models to permit patients to control and gain value from their health information. 

Patient data is also gathered through the use of consumer devices and health apps. This is now being seen as a way of enabling seamless interaction between clinician and patient, as well as a choice for an individual seeking to improve their own health and lifestyle.

Making sense of data

Deploying AI and machine learning to identify new drug candidates

Reports of drug discovery initiatives seeking to harness these new technologies offer exciting prospects for the future. IBM’s Watson for Drug Discovery analyses multiple sets of life sciences knowledge to help to predict and define relationships among genes and drugs. This cloud-based tool is constantly updated and can connect in-house data with publicly available information. Pfizer was an early adopter, announcing a collaboration with IBM Watson in 2016 to use the technology to search for immuno-oncology drugs. Mikael Dolsten, President of Worldwide Research & Development, explained why:

“With the incredible volume of data and literature available in this complex field, we believe that tapping into advanced technologies can help our scientific experts more rapidly identify novel combinations of immune-modulating agents.”

IBM Watson was also used by researchers from the Barrow Neurological Institute to identify RNA-binding proteins showing significant changes in the neurodegenerative disease ALS.

Other pharmaceutical companies have partnered with Boston-based Berg. Berg works with AstraZeneca, Sanofi Pasteur and BD, as well as a series of academic and healthcare institutions, including Genomics England. Berg’s Interrogative Biology platform combines high throughput analysis of patient samples with clinical information to identify new biomarkers for a disease.

Analysing patient data for better diagnosis and treatment

The use of AI and machine learning in digital pathology and medical imaging is also generating increasing interest. A recent study by Moorfields Eye Hospital in London, working with Google DeepMind, demonstrated that a new deep-learning architecture could read optical scans with an accuracy equal to that of leading clinicians. The technology was able to achieve this level of accuracy after training on fewer than 15,000 scans. This kind of research could lead to partially or fully automated systems for diagnosis and treatment plans where patient scans or samples are central to disease assessment.

Legal question marks

Patent protection and licensing arrangements

The traditional approach to drug discovery can benefit from strong patent protection. An in-house team working on a new drug candidate or class of candidates can achieve strong monopoly protection in this way. So important are the patents protecting the most successful drugs that their expiry can cause substantial movements in the manufacturer’s stock price. Celgene’s blood cancer drug Revlimid was one of the top selling drugs of 2017, with sales of $8.2 billion. Key European patents are set to expire in 2022, leading to concerns over the company’s future business. Many blame this patent cliff for contributing to the decline in Celgene’s market capitalisation over the last year.

Of course, it may not be possible to protect the chemical entity of interest because it is already in the public domain. Repurposing an existing molecule for the treatment of a new condition is another area where the use of new analytical techniques can pay dividends. Looking for connections between known molecules and new targets or disease states can be a powerful approach to finding new indications for existing drugs.

Where the molecule itself is known, it cannot itself be patented. Patent protection is, though, often still available for the new medical application of the molecule, either as a method of treatment claim in the US, for example, or a “substance for use” claim in European countries.

The introduction of AI and machine learning to the drug discovery process raises question marks over rights of ownership and access.

Where a therapeutic molecule is developed using these tools, or a new indication discovered for an existing molecule, who can claim rights to that invention? The identification of a previously unknown link between, for example, a particular set of genes or a molecule and a disease state or pathogen might not have been possible without use of the AI platform. If the work is done entirely within a biopharma company, using its own data as well as public data, then presumably inventorship rights would rest within that entity. Where, however, there is a collaborative team approach to the project, or provision of proprietary data sets, that may complicate the picture.

We have worked with businesses and other organisations that are grappling with these issues. In one model we have seen, the service provider accepts that rights to inventions rest with the biopharma client, but seeks to reach through into the work product of its customers using a royalty mechanism on future sales of successful products, as well as upfront and annual fees.

In practical terms, it makes sense to address these issues head on when a research collaboration begins. When setting up relationships like these, all participants need to think through what might be created through the collaboration and how rights to those fruits of research will be allocated and licensed. Setting this out clearly in the relationship agreement is the most effective way of establishing the parties’ rights. Bear in mind, however, that until agreements like these are tested in a courtroom, which could be ten or fifteen years away, it may be difficult to assess how effective they will be when negotiated and entered into.

Liability for injury or damage

An area of law that is often engaged with real-world applications of AI is liability for harm or damage. If we hand over thinking processes to machines, then who is responsible if the result is an injury to a person or destruction of property? These questions have been explored in the context of automated weaponry, robots, driverless vehicles and the Internet of Things. The possibility of malevolent actions is often cited, but more mundane is the prospect of mistakes and errors arising through some unforeseen failure of the system. A European Parliament analysis of robotics and AI produced a series of recommendations for legislation in early 2017. This is being followed up with further analysis, and proposals to update Europe’s product liability legislation.

With a physical product, responsibility for damage or injury caused by failure or defective design usually passes to the manufacturer. And failure to provide a competent service generally means that the service provider is legally responsible. But the use of complex interacting technology, perhaps involving both services and products, can make it difficult to pin responsibility to a link in the chain. AI analysis of retinal scans, for example, could miss a very unusual pattern that a specialist clinician might recognise as different through his or her experience. Could the clinician giving advice to a patient be at fault where the treatment recommended by an AI analytical tool was in error?

In contrast, the drug discovery process is less exposed to concerns like these. Between the identification of a potentially therapeutic molecule and a marketed medicine there are, of course, multiple and expanding stages of pre-clinical and clinical assessment to eliminate unsafe or ineffective products. There is no reason to believe that the possibility of harm would be greater for an AI-generated drug candidate than for one created through existing processes. Indeed, there is some prospect of a lower risk of harm, if the machine learning process has successfully identified a drug candidate with lower risk of toxicity.

Data privacy

Concerns about the disclosure and sharing of patient information have frequently hit the headlines in recent years. In 2017, London’s Royal Free Hospital was sanctioned for sharing the health data of 1.6 million patients with Google DeepMind. Data watchdog the Information Commissioner’s Office recognised the tension between safeguarding confidential information and making available data to improve patient care. But it criticised the lack of transparency in the way the data had been used.

Stricter laws around the use of individuals’ data, like the GDPR in Europe and the narrower California Consumer Privacy Act of 2018, demonstrate the increasing importance of data privacy. Under the GDPR, potential fines for serious shortcomings in safeguarding privacy can be up to €20m or 4% of worldwide annual turnover, whichever is higher, so it is more important than ever to ensure data sharing is compliant. It will be vital for data sets and analytical tools to build in the principles of data privacy. This may mean rigorous anonymisation of information, or methods of identifying and removing information about a particular individual whose consent has been withdrawn.
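
As a rough illustration of how the GDPR ceiling scales with the size of a business, the short Python sketch below computes the upper limit of the higher fine tier as the greater of €20m and 4% of worldwide annual turnover. The function name and the turnover figures are hypothetical, chosen only for the example. The result is the statutory maximum, not a prediction of the fine a regulator would actually impose.

```python
# Illustrative only: the upper limit of the higher GDPR fine tier.
# The function name and the example turnover figures are assumptions
# made for this sketch, not taken from any regulator's guidance.

FIXED_CAP_EUR = 20_000_000   # fixed ceiling of EUR 20m
TURNOVER_PERCENT = 4         # 4% of worldwide annual turnover


def gdpr_max_fine_eur(worldwide_annual_turnover_eur: int) -> int:
    """Return the greater of EUR 20m and 4% of turnover, in whole euros."""
    turnover_based_cap = worldwide_annual_turnover_eur * TURNOVER_PERCENT // 100
    return max(FIXED_CAP_EUR, turnover_based_cap)


if __name__ == "__main__":
    # Hypothetical businesses with EUR 100m, 500m and 10bn turnover.
    for turnover in (100_000_000, 500_000_000, 10_000_000_000):
        print(f"turnover EUR {turnover:,}: ceiling EUR {gdpr_max_fine_eur(turnover):,}")
```

On these assumptions the fixed €20m figure sets the ceiling for any business with turnover up to €500m, and the percentage figure takes over above that level.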

Time to address the legal issues

New examples of AI and machine learning methods being applied in the life sciences arena appear frequently. Some regard this as over-hyped, citing the example of the excitement around computer aided drug design in the 1980s, which did not live up to its early promise. But even if the most optimistic predictions are not realised, it seems that AI drug discovery and patient diagnosis are here to stay and beginning to produce real world benefits. As this picture unfolds, addressing the legal issues at an early stage makes sense. 

