Leveraging Big Data, Artificial Intelligence, and Machine Learning for Drug Discovery, Development, Manufacturing, and More

October 11, 2022
PAO-10-022-NI-05

Digital Transformation to Pharma 4.0

Although the pharmaceutical industry is behind many other sectors in the transition to Industry 4.0 (or more specifically Pharma 4.0), there is a recognition that digitalization is necessary for future success in managing the growing complexity in all aspects of drug discovery, development, and manufacturing. Industry 4.0 is centered around cyber–physical systems, the industrial Internet of things (IIoT), and the concepts of smart manufacturing, facilities of the future, and the importance of having a high level of systems connectivity and integration.1 Pharma 4.0 includes transformation of all aspects of drug discovery, development, and manufacturing, as well as capabilities for accessing and efficiently processing large quantities of different types of healthcare data, including post-marketing and patient data.

The pharma industry as a whole, though, is still in the early stages of its digital transformation journey. Most companies have implemented some automation and online or at-line process and data analytics but have not yet reached a point where those systems communicate with one another.2 They have eliminated paper-based records but not data silos. The widespread use of technologies, such as robotics and other automation solutions, artificial intelligence (AI) and machine learning (ML), big data analytics, cybersecurity, cloud computing, RFID, and predictive analytics and biometrics, in a connected and integrated manner has yet to occur.1

Early adopters are implementing automated systems in laboratories and on the manufacturing floor, reducing human errors, creating efficiencies, and improving data management. Digital solutions are being leveraged by a few contract service providers to provide better transparency and to enhance communication and collaboration. There is growing interest in data capture, sharing, and analytics solutions to enable greater process understanding and ultimately higher-quality processes and products. AI and ML in drug discovery and development allow more informed decisions to be made more quickly and have the potential to enable the identification of completely novel therapies.

Pharma 4.0™, created by the International Society for Pharmaceutical Engineering (ISPE) in 2017, is a framework for helping drug manufacturers incorporate digital technologies holistically.3 It provides practical guidance to accelerate Industry 4.0 transformations in alignment with pharmaceutical regulations and best practices, enabling organizations to leverage the full potential of digitalization and thus deliver innovations to patients more rapidly.

There are, of course, many challenges facing pharma companies as they move toward Pharma 4.0. Digital transformation involves new technologies and new ways of working that require employees to learn a great deal and accept some significant levels of change. The industry must recruit or develop internally specialized expertise in data analytics, AI, ML, and other digital technologies. The perception that new approaches carry higher risk must be overcome.

On a practical level, Good Manufacturing Practices (GMPs) have been established to ensure that drugs are produced using well-understood and well-defined processes, and the use of AI within processing equipment to allow automated changes in response to trending goes against this fundamental tenet.4 Consequently, for any pharmaceutical company, the transition to Pharma 4.0™ requires understanding of existing internal capabilities, access to resources, and preparedness to make significant changes to the organizational structure, processes, information systems, and culture, while keeping in mind how those changes can be accomplished within the current regulatory framework.1,5

Focus on AI and ML

Computer algorithms that “learn” as they analyze data have artificial or augmented intelligence.6 ML or deep learning systems identify predictive patterns and then output results. They can also learn through natural language processing (NLP), which reads diverse types of written information, and via robotic process automation (“bots”). Unsupervised learning finds hidden patterns in unlabeled data, supervised learning uses labeled examples to improve predictions, and reinforcement learning, which learns by trial and error, is used in modeling.

Many Applications of AI and ML in Pharma

Most AI leveraged in the pharma industry can be classified as “narrow AI” — artificial intelligence that performs specific tasks and can learn while doing them — but without self-awareness. Whether augmented or artificial, applications for AI in pharma are numerous and range from drug discovery and development to medical imaging, diagnostics, disease diagnosis, therapy planning, and hospital workflow design.7-9

The global market for AI in healthcare is estimated to be expanding at a compound annual growth rate (CAGR) of 37%, from $11.06 billion in 2021 to $187.95 billion by 2030.10 For AI in drug discovery alone, the global market is estimated to be expanding at a CAGR of 36%, from $1.0 billion in 2021 to $10.8 billion by 2030.11
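As a quick sanity check on growth figures like these, the CAGR implied by two endpoint values can be computed directly. The sketch below is illustrative only: the helper names `implied_cagr` and `project` are hypothetical, the endpoint values are the rounded figures quoted above, and the drug discovery forecast is assumed to run to 2030, following the title of reference 11.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value reached after compounding `rate` annually for `years` years."""
    return start_value * (1.0 + rate) ** years


# Market sizes in billions of US dollars, as quoted in the text.
# Assumption: both forecasts run from 2021 to 2030 (nine compounding periods).
years = 2030 - 2021
healthcare = implied_cagr(11.06, 187.95, years)
discovery = implied_cagr(1.0, 10.8, years)

print(f"AI in healthcare, implied CAGR: {healthcare:.1%}")
print(f"AI in drug discovery, implied CAGR: {discovery:.1%}")
```

Under these assumptions, the healthcare endpoints imply a rate of about 37%, in line with the quoted CAGR. The drug discovery endpoints imply a rate nearer 30%, so the quoted 36% presumably rests on unrounded values or a different base period.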

No AI and ML without Big Data

AI and ML algorithms cannot function without data. Both historical and real-time data from many different sources can be utilized. Fortunately, the pharmaceutical industry is a science- and data-driven sector that generates increasingly vast quantities of data that can be leveraged by AI and ML systems.

Useful data ranges from experimental results during drug discovery and development and academic research to real-time manufacturing data to patient data and data from social media sites. When these large volumes of data of all types, generated at the different stages of the value chain, are processed using AI and ML, they yield insights that were not previously possible without such large and disparate data sets and the computational power that AI and ML require.12,13 In addition, “when properly analyzed, big data can be very powerful in providing insights for business strategy throughout the pharma value chain, including in the acceleration of drug discovery and development, the optimization of manufacturing processes, the management of supply chains, and the creation of innovative sales and marketing strategies.”14 That is where AI and ML come in.


It should be noted that cloud computing technology is critical to the implementation of AI and ML solutions, given the processing limitations of current computers.15

Leveraging AI and ML for Drug Discovery and Development

There are many applications for AI and ML in drug discovery and development, from understanding biological systems and diseases to identifying drug candidates with a high likelihood of success to optimizing clinical trials.15 Some specific examples include structure- and ligand-based virtual screening, library design and high-throughput analysis, drug repurposing and drug sensitivity, de novo design, chemical reactions and synthetic accessibility, ADMET (chemical absorption, distribution, metabolism, excretion, and toxicity) evaluation, and quantum mechanics analysis.16 Indeed, AI can be used in each step of drug design, reducing the time and cost of developing safer and more effective drugs.17-20 It has been estimated that the use of AI could reduce drug discovery costs by as much as 70%.21

AI and ML can also be leveraged in preclinical and clinical drug development, such as in predictive modeling of drug behavior in animals and the correlation of results for humans, as well as identification of patients for clinical trial participation, monitoring, and trending as studies progress.21,22

Overall, the application of AI, ML, and NLP techniques can lead to improved predictive modeling and simulation capabilities while enabling the integration of real-world data and electronic medical records from disparate sources into the drug discovery and development process. When integrated, this data could mean improved candidate screening and trial selection, optimization of clinical trial designs, and better prediction of drug demand.23

All of these activities are made possible by a combination of advances in deep learning, the increased availability of data, and new frameworks for implementing deep neural networks (DNNs), which now are more accurate than the human brain in areas such as image, voice, and text recognition.24 Deep generative models, sometimes described as AI “imagination,” are also enabling new applications.

Benefits of AI and ML in Drug Manufacturing

When used effectively, AI may also be programmed to improve pharmaceutical manufacturing operations. According to Constantin Loghinov, Managing Director of MILS Group, LLC, “To make machine learning effective in biopharmaceutical manufacturing, gargantuan internal mindset and business process change around data collection, analysis, and use is necessary. There are, however, some quicker Band-Aid solutions that can be initially deployed.”25

AI is already being employed for ML-based visual inspection and bacterial culture yield optimization. AI could also be used to leverage data from multiple manufacturing sites within a single organization. Analysis of data generated by IIoT sensors (e.g., cameras, thermostats, chemical sensors) could help achieve more predictable, high-quality, flexible, and low-cost manufacturing processes. The ultimate goal — as with many initiatives in the pharma industry — is to reduce the cost and time required for drug development, commercialization, and production.

Multivariate analyses enabled by AI can help identify and troubleshoot process- and product-quality issues, increase yields, and reduce off-spec product.26 One recent study showed that the analysis of incoming material quality data, in-process data for direct tablet compression, and final product quality results (collected from separate databases) for 1,005 actual production batches completed over several years was useful for achieving effective quality predictions.27

AI is also being used for asset management to create manufacturing efficiencies and for predictive maintenance systems that can analyze failure patterns and provide warnings of possible equipment failures.26 In another example, a drug maker used AI-driven software to predict in advance the failure of its purified water system equipment and thus avoid unplanned downtime.
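The idea of warning about a possible equipment failure before it happens can be illustrated with a deliberately simple statistical stand-in. The sketch below is not the AI-driven software mentioned above; it only flags sensor readings that deviate sharply from their recent history. The function name `flag_anomalies`, the conductivity readings, the window size, and the threshold are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale for comparison; skip it.
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical conductivity readings from a purified water loop.
conductivity = [1.00, 1.02, 0.99, 1.01, 1.00, 1.01, 0.98, 1.02, 1.60, 1.01]
alerts = flag_anomalies(conductivity)
print("Readings to investigate:", alerts)
```

A production system would more likely learn failure patterns from labeled maintenance history across many sensors, but the principle of comparing live readings against expected behavior is the same.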

AI can also be applied to supply chain management and logistics processes to ensure greater security of supply across multiple vendors.22 Both AI and ML have also been shown to help companies reduce regulatory review times by improving the accuracy of the data provided to regulatory authorities.

Many Projects Underway

A GlobalData survey of healthcare industry professionals conducted in 2021 revealed that 23% of respondents worked at companies already using AI to enhance drug discovery and development processes.28 In addition, 28% and 32% of companies will be leveraging AI and big data to optimize drug discovery and development processes and streamline sales and marketing efforts, respectively.

Furthermore, Novartis, Roche, Pfizer, Merck, AstraZeneca, GlaxoSmithKline, Sanofi, Abbvie, Bristol-Myers Squibb, and Johnson & Johnson have collaborated with or acquired AI technologies.29 It is estimated that approximately 100 partnerships were established between pharmaceutical companies and AI vendors from 2015 to 2021.30 These collaborations target a wide variety of activities: accurate diagnosis of lung cancers, diabetes prevention, reduction of drug development costs by up to 15%, predictive analytics for supply chain management and clinical trial design, patient recruitment, investigator and site selection, patient monitoring, and data analysis.

In one example, a new platform called Pharma.AI has been reported to reduce the time from target discovery to preclinical candidate nomination to just 18 months for a drug to treat idiopathic pulmonary fibrosis, a chronic lung disease.31 In another, Hong Kong–based Insilico Medicine used AI and deep learning to design, synthesize, and validate a novel drug candidate in 46 days, 15 times faster than what was previously thought possible.32

Other examples include a collaboration between Quartic.ai and Bright Path Labs to develop an AI platform that supports the continuous manufacturing of active pharmaceutical ingredients (APIs) and small molecule drugs, and a high-performance liquid chromatography (HPLC) system that uses AI to automatically spot and address issues. Aera offers an AI platform for supply chain planning that automatically identifies risks and opportunities and then generates prescriptive recommendations for specific decision-makers and users about what should be done. Sanofi is using an AI solution to optimize inventory, which the company expects will enable a 20-day reduction in inventory levels and a 20% decrease in baseline costs with contract suppliers.31

Some AI startups developing AI, ML, and NLP technologies for pharma applications include BenchSci, BenevolentAI, BioXcel Corporation, Atomwise, Exscientia, and Numerate.6 Others, such as Berg and Verge Genomics, are not only developing AI technology but using it to identify drug candidates and take them to the clinic. Of course, large tech companies have also introduced AI/ML solutions for the healthcare sector, including IBM (Watson Health) and Google’s DeepMind Health.

Maximizing AI and ML Benefits with Quantum Computing

To maximize the benefits of AI and ML, it is necessary to analyze vast quantities of data. Processing big data at an optimum scale with existing computer technology isn’t possible in a pharmaceutical discovery lab or manufacturing plant. The advent of quantum computers will change that, however.33

Quantum computers solve problems in a probabilistic manner, taking many different options into consideration simultaneously, which makes it possible to process much more information much faster than conventional computers.34 Problems that would take classical computers several years or more to solve can be calculated by quantum computers in seconds.

The pharma industry has recognized the potential of quantum computing, with many top biopharmaceutical companies forming the QuPharm alliance, a forum for collaboration on the development of quantum computing solutions for pharmaceutical applications.35 Many companies are getting directly involved with quantum computing as well. In a poll conducted during a webinar hosted by QuPharm, QED-C, and the Pistoia Alliance in late 2020, 82% of participants indicated that they believed quantum computing would impact the pharma industry within the next decade.36

The likely first application for quantum computing in the pharmaceutical industry will be to improve computer-assisted drug discovery (CADD), with specific activities including target identification and validation, hit generation and validation, lead optimization, protein structure prediction, and protein engineering and design.37,38 The ability to rapidly analyze complex data sets should also allow better utilization of high-throughput technologies.39 The expectation is that quantum computing will also enable the development and manufacture of new medications that were not previously thought possible.40

Other potential applications of quantum computing include elucidation of unknown disease mechanisms, optimization of clinical trials and synthetic route development, increasing the efficiency of formulation development, improvement of large-scale manufacturing, and enhancement of supply chain modeling.34,37,38,41

Certainly, the future looks bright for the application of AI and ML as a more effective means for facilitating the discovery, development, and manufacture of novel medicines. The benefits will only be magnified once these advanced algorithms can be leveraged on quantum computers. Despite the challenges that face the adoption and implementation of such novel technologies, the innovative spirit that drives the pharmaceutical industry will ultimately overcome them. In the end, the real winners will be patients.

Benefiting Both Pharma Companies and Patients

Big data, AI, and ML have the potential to revolutionize the way drugs are developed.42,43 McKinsey estimates that applying big data strategies could save the pharmaceutical industry $100 billion annually by increasing the efficiency of clinical trials and enabling better decision making.44 Excitement is building as initial projects leveraging big data, AI, and ML produce promising results. Indeed, many see these advanced technologies as providing a means for accelerating drug development while reducing costs, the two greatest pressures the pharmaceutical industry faces today.



  1. Chiu, Yuk Chun et al. “Applying an Advanced Pharma 4.0 Perspective to Design the Facility of the Future.” Pharma’s Almanac. In Press.
  2. Bose, Aniruddha. “Better Bioprocessing Efficiency Through Centralized Orchestration.” Pharma’s Almanac. 29 Mar. 2022.
  3. Duckworth, Yvonne. “Intro to Pharma 4.0™ and facility digitalization.” CRB Insights. N.d.
  4. Myers, Christa. “New Trends in Tech Impacting Pharma Facility Design and Construction.” Pharma’s Almanac. 28 Oct. 2019.
  5. Trapl, Josef, Wolfgang Winter, Christian Woelbeling, and Thomas Zimmer. “ISPE Accelerating Digital Transformation with Pharma 4.0 Initiative.” iSpeak Blog. 21 Oct. 2021.
  6. Challener, Cynthia A. “Pharma Makes Moves to Leverage Artificial Intelligence.” Pharma’s Almanac. 12 Mar. 2019.
  7. LaMotta, Lisa. “Pharma and AI? Let’s try augmented intelligence first.” 23 Jul. 2018.
  8. Fleming, Nic. “How artificial intelligence is changing drug discovery.” 557: S55–S57 (2018).
  9. Basak, Sayan and Sukant Khurana. “Artificial Intelligence for modern drug development.” 13 May 2018.
  10. “Artificial Intelligence in Healthcare Market Size to Hit US$187.95 Bn By 2030.” Precedence Research. 11 Apr. 2022.
  11. “AI In Drug Discovery Market Size to Reach USD 10.80 Bn by 2030.” 14 Apr. 2022.
  12. “Big data analytics in the pharmaceutical industry: How is Big data analytics revolutionizing the pharma industry?” 13 May 2022.
  13. “Big Data in the pharmaceutical industry: benefits and applications.” Doxee Blog. 7 Apr. 2022.
  14. “Big Data Crucial for Business Strategy Across the Pharma Value Chain.” GlobalData Plc. 18 May 2022.
  15. Buvailo, Andrii. “4 Ways Big Data and Machine Learning Revolutionize Drug Discovery.” Biopharma Trend. 4 Jul. 2022.
  16. Muller, Christophe, Obdulia Rabal, and Constantino Diaz Gonzalez. “Artificial Intelligence, Machine Learning, and Deep Learning in Real-Life Drug Design Cases.” Methods Mol Biol. 2390:383-407 (2022).
  17. Selvaraj, Chandrabose, Ishwar Chandra, and Sanjeev Kumar Singh. “Artificial intelligence and machine learning approaches for drug design: challenges and opportunities for the pharmaceutical industries.” Mol Divers. 26(3):1893-1913 (2022).
  18. Sahu, Adarsh, Jyotika Mishra and Namrata Kushwaha. “Artificial Intelligence (AI) in Drugs and Pharmaceuticals.” Comb Chem High Throughput Screen. 7 Dec.
  19. Gupta, Rohan et al. “Artificial intelligence to deep learning: machine intelligence approach for drug discovery.” Mol Divers. 25(3):1315-1360 (2021).
  20. Nayarisseri, Anuraj et al. “Artificial Intelligence, Big Data and Machine Learning Approaches in Precision Medicine & Drug Discovery.” Curr Drug Targets. 22(6):631-655 (2021).
  21. “Big pharma is using AI and machine learning in drug discovery and development to save lives.” Insider Intelligence. 15 2022.
  22. Ural, Arda. “How Artificial Intelligence and Machine Learning are Transforming the Life Sciences.” Contract Pharma. 25 Jan. 2022.
  23. de Zegher, Isabelle. “Artificial intelligence revolutionizes drug development.” IDG Connect. 4 Oct. 2018.
  24. Zhavoronkov, Alex. “Artificial Intelligence for Drug Discovery, Biomarker Development, and Generation of Novel Chemistry.” Pharmaceutics. 15(10):4311-4313 (2018).
  25. Loghinov, Constantin. “Artificial Intelligence in Biopharmaceutical Manufacturing.” Pharma’s Almanac. 12 Mar. 2018.
  26. Greenfield, David. “Pharmaceutical Industry Applies Artificial Intelligence.” Automation World. 4 Mar. 2022.
  27. Žagar, Janja and Jurij Mihelič. “Big data collection in pharmaceutical manufacturing and its use for product quality predictions.” Scientific Data. 9:99 (2022).
  28. Begley, Anna. “AI and big data will continue to disrupt pharma sector, says survey.” European Pharmaceutical Review. 19 2021.
  29. Bulgaru, Iolanda. “Pharma Industry in the Age of Artificial Intelligence: The Future is Bright.” Healthcare Weekly. 8 Jun. 2021.
  30. “5 ways AI is transforming the Pharmaceutical Industry in 2022 and beyond.” 11 Jan. 2022.
  31. Newton, Emily. “Artificial Intelligence & Pharma Manufacturing.” Contract Pharma. 11 Mar. 2022.
  32. Ayer, Akhilesh and Mark Halford. “Pharma Leverages AI to Elevate Digital-Only Operations.” Gen Eng News. 5 Jan. 2022.
  33. Challener, Cynthia A. “Quantum Computing will Transform Drug Discovery, Development, Manufacturing and Supply Chain Management.” Pharma’s Almanac. 24 Jun. 2022.
  34. “Advanced computing in pharma: 3 reasons why quantum computing could disrupt the pharma R&D.” 30 Aug. 2021.
  35. Hua, Kefeng (Kevin). “QuPharm – Pharmaceutical Companies Form Alliance to Share the Risks and Rewards of Quantum Computing.” LinkedIn Pulse. 13 May 2020.
  36. Nawrat, Allie. “Is quantum computing pharma’s next big disruptor?” Pharmaceutical Technology. 24 Nov. 2020.
  37. Evers, Matthias, Anna Heid, and Ivan Ostojic. “Pharma’s digital Rx: Quantum computing in drug research and development.” 18 Jun. 2021.
  38. “Quantum in Life Sciences: The Future is Now.” D-Wave Systems Inc.
  39. Loren, Brad, Matthew Marrone, and Christopher Singer. “How quantum computing can benefit drug discovery.” Drug Discovery World. 14 Jan. 2022.
  40. “Exploring quantum computing use cases for life sciences.” Rep. IBM Institute for Business Value. 30 Apr. 2020.
  41. Buvailo, Andrii. “Merging AI and Quantum Computing To Boost Drug Discovery.” Biopharma Trend. 26 Apr. 2022.
  42. Ural, Arda. “How Artificial Intelligence and Machine Learning are Transforming the Life Sciences.” Contract Pharma. 25 Jan. 2022.
  43. Chatterjee, “How is AI moving the needle in the pharmaceutical industry?” MedCity News. 8 May 2022.
  44. Cattell, Jamie, Sastry Chilukuri, and Michael Levy. “How big data can revolutionize pharmaceutical R&D.” 1 Apr. 2013.