There is almost no aspect of the life sciences and related industries that Big Data cannot impact.

It can help in disease pattern analysis by bringing information together in new ways; it facilitates drug discovery by enabling advanced search capabilities for analyzing millions of publications, patents and clinical trial documents; it can also help in clinical trials management by profiling patients, evaluating drug readiness or even identifying adverse effects before they are reported.1

Big data could also remove the data sequence bottleneck in large-scale genome sequencing. It is already being used in multiple drug manufacturing and engineering processes and in supply chain management. Other extensive uses are seen in sales and marketing, patient care quality and program analysis, call-center data analytics and archiving.1

Big Savings

Some of the potential savings are mind-boggling. One analysis projects that applying big-data strategies to inform decision-making could generate up to $100 billion/year in value across the U.S. healthcare system, while another, from a 2011 baseline, puts the “opportunity of the value pathways” at $300 billion to $450 billion, 12% to 17% of the $2.6 trillion baseline in U.S. healthcare costs.2,3
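
As a quick sanity check on the percentages quoted above, the short Python sketch below recomputes the share of the $2.6 trillion baseline that $300 billion and $450 billion represent. The variable and function names are illustrative only and do not come from the cited reports.

```python
# Figures quoted in the text, in billions of U.S. dollars.
# Names are hypothetical and chosen only for this illustration.
BASELINE_BN = 2600.0          # $2.6 trillion baseline in U.S. healthcare costs
LOW_BN, HIGH_BN = 300.0, 450.0


def share_of_baseline(value_bn, baseline_bn=BASELINE_BN):
    """Return value_bn as a percentage of baseline_bn."""
    return 100.0 * value_bn / baseline_bn


low_pct = share_of_baseline(LOW_BN)
high_pct = share_of_baseline(HIGH_BN)
print(f"{low_pct:.1f}% to {high_pct:.1f}%")  # prints "11.5% to 17.3%"
```

Rounded to whole numbers, these shares match the 12% to 17% range given in the text.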

The adoption of big data by the pharma industry also has important implications for suppliers. Both consultants agree that for greater use of big data to yield value, pharma companies will need to collaborate more with trusted partners of all kinds by sharing information and working together to make sense of it.1,2

In Translation

So what does it all mean? The term ‘big data’ was coined in the 1990s, but its origins lie in the digitization of knowledge, which began even earlier and gathered pace with the computer revolution. In recent years, mass networking, cloud computing and, specifically in healthcare, the move to electronic record-keeping have accelerated it by several orders of magnitude.

The term is generally applied to data sets too large or complex for traditional data-processing methods to cope with in terms of analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy. Big data is generally characterized by the ‘three Vs’: volume, variety (structured, unstructured and semistructured) and velocity. The term is also used to refer to the methods used to derive value from the data.4

Putting the Data in Pharma

The term has been used enough to become a cliché and be satirized. Yet the challenges big data poses are clearly real enough. The world’s technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s. By 2012, according to IBM, 2.5 exabytes (2.5 × 10^18 bytes) of data was being generated every day. The amount of data is now doubling every 18 months. Some estimate that by 2020, it will double every three months.5,6,7
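
To put these doubling times on a common footing, the Python sketch below converts each one into an equivalent annual growth factor. It assumes smooth exponential growth, which is a simplification, and the function name is illustrative rather than taken from any of the cited sources.

```python
def annual_growth_factor(doubling_months):
    """Factor by which a quantity grows in 12 months if it doubles
    every `doubling_months` months (assumes steady exponential growth)."""
    return 2.0 ** (12.0 / doubling_months)


# Doubling times quoted in the text, in months.
scenarios = [
    ("Storage capacity since the 1980s", 40),
    ("Data volume today", 18),
    ("Estimate for 2020", 3),
]

for label, months in scenarios:
    factor = annual_growth_factor(months)
    print(f"{label}: doubles every {months} months, x{factor:.2f} per year")
```

Under that assumption, a 40-month doubling time corresponds to growth of roughly 23% a year and an 18-month doubling time to roughly 59% a year, while doubling every three months would mean a sixteenfold increase each year.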

Every industry faces challenges from big data, but none more than the pharmaceuticals and wider healthcare industries. Why? Put simply, no industry is more awash in and dependent on data. With drug discovery programs, clinical trials, sales data, healthcare records, medical test results and now genomics research and social media, the industry has masses of data at hand. Indeed, it relied on analytics long before the concept of ‘big data’ was widely used.

Now the volume of data is growing exponentially, posing ever-greater challenges of managing it before analysis can even start. According to software development firm Informatica, 70% of any data project in pharma involves simply managing the data.8

At the same time, pharma has not really progressed with using data when compared to, say, the banking, electronics and retail sectors. Yet, given the massive and ever-growing cost and timelines of developing new drugs, the high odds against success, cost-reduction pressures and ever-more-stringent regulation it faces in bringing new products to market by comparison with these industries, no industry has more to gain from the cost-saving and growth-driving opportunities big data might bring.

According to a recent IMS Institute IT survey, large drug manufacturers need to make over $35 billion in savings from 2016 to 2017 just to maintain current R&D levels and operating margins. Nearly half are planning cuts over 10% in the next three years.9

To date, most of pharma’s big data activity has focused on R&D. Observers generally agree that it is doing better here than further downstream, where it continues to use inexact methods, such as focus groups, one-on-one interviews and basic segmentation, on which to base decisions that can cost hundreds of millions of dollars. The pharmaceutical supply chain is miles behind the curve in terms of knowing where its products actually are at any given point, something that the consumer goods industry would never tolerate.

Understanding Value

Above all, information in the pharmaceutical industry, like the industry as a whole, is organized in silos and hard to integrate. As Shannon Fitzhugh-Mengers and Mark Diamond point out, pharma remains rooted in legacy systems, particularly when it comes to pharmacovigilance. It has a difficult balancing act to achieve, with patient safety on the one hand and the needs of doctors, regulators and other stakeholders on the other. Nonetheless, it has to address the problem that its traditional method of incremental improvements will not even begin to address the explosion of (mostly unstructured) data.7

Another part of the problem is that, with times seemingly good in recent years, there has not been a compelling reason to change the business model. Moreover, many pharma companies have outsourced data management along with so many other functions they deemed noncore. All too many simply do not understand the volume and value of the data they actually have.

Industry executives recognize the problem. A recent survey by PwC shows that two-thirds of pharma professionals believe that their companies could do better in leveraging big data, while in another by Capgemini, nearly all pharma executives ranked their firms as below average for personalization, analytics and responsiveness.10,11


Seeking a Chief Data Officer

So what can they actually do? Accenture recommends starting by appointing a Chief Data Officer (CDO) to work across traditional functions and champion data collection, prioritization, distribution, analysis and security. A CDO should have very wide-ranging management authority across multiple cross-functional tasks.12

This would change some power structures within companies and, undoubtedly, feathers will be ruffled because optimizing analytics will require true collaboration across the whole enterprise. “Executives and management must encourage disruption of the status quo, moving beyond traditional organizational boundaries. Accept no objections (and there will be plenty). The organizations that best harness and share data will become leaders with distinct competitive advantages,” says Valtech.13

To address the challenge of big data, says Bill Drummy, CEO of Heartbeat Ideas, “pharma needs to cultivate the skills and, more critically, the courage to change its management and compensation systems so that risk-taking is rewarded, ‘playing it safe’ is penalized and latent talent is unleashed.” In other words, pharma should be more like the “new value creators,” such as Google, which is easier said than done.14

Fitzhugh-Mengers and Diamond recommend the concept of ‘human-centered design,’ which is based around “understanding people’s needs using in-depth behavioral research.” This approach would use current technology and additional data from wearable devices and other sources to automate database translation and bring together all stakeholders “with common goals and a common approach to solutions,” including those from other sectors that translate to pharma.7

The Difference Maker

The big challenge will remain how to manage the data, how to decide what is and is not relevant, how to access it, and how to draw actionable insights. Many IT companies, such as Integrichain, Medvivo, Medmeme, twoXAR, Cyclica and Schrodinger, to name but a few, are touting specific solutions in drug discovery and related fields. But analysts concede that there will not be a one-size-fits-all answer to this.

Some pharma companies are shying away from acting due to incomprehension at the sheer scale of the challenge, wariness of the limitless potential costs, or fear of pioneer disadvantage. Tata Consultancy Services’ Sanita Garg says that the challenges “are as much cultural as technological.” Some are particular to pharma; others reflect the need for entirely new skill sets that “cannot be acquired in silos or through traditional training methods.”1

AstraZeneca has had a four-year partnership with HealthCore, using HealthCore data alongside its own to guide investment decisions relating to multiple chronic illnesses. It has also used the patient-driven data of PatientsLikeMe to guide its R&D on respiratory diseases, and worked with Practice Fusion to aggregate data on asthma patients. Roche has similarly worked with sequencing and diagnostics firm Foundation Medicine to use genetic data to guide its R&D.3,15

Roche has also teamed up with U.S. technology firm Qualcomm, as have GlaxoSmithKline and Novartis, which used the partnership in developing the Internet-connected emphysema inhaler, Breezhaler. Novartis has also worked with IBM on cloud-linked devices, while Sanofi is working with Google’s life science business. Nine major firms, including Pfizer, Bayer, Sanofi and AstraZeneca, are part of Project DataSphere (built by analytics giant SAS) to share clinical data for cancer research.15,16

All of these are baby steps. Longer term, the stakes could hardly be higher. As Teradata says: “The bottom line: For pharmas, biotechs and other life science firms, the ability to handle Big Data is a difference maker — both on the bottom line and for the billions of people who rely on their products.” This may be the single challenge that will define the pharma industry’s whole future.17


  1. Garg, “The New Frontier for the Pharmaceutical & Life Sciences Industry: Real Big Value from Big Data.” Tata Consultancy Services. Web.
  2. Cattell, Jamie, Sastry Chilukuri, Michael “How Big Data Can Revolutionize Pharma R&D.” McKinsey & Company. Apr. 2013. Web.
  3. Kayyali, Basel, David Knott, Steve Van “The Big-Data Revolution in US Healthcare: Accelerating Value & Innovation.” McKinsey & Company. Apr. 2013. Web.
  4. Buytendijk, Hype Cycle for Big Data, 2014. Rep. Gartner. 4 Aug. 2014. Web.
  5. Hilbert, Martin, Priscila López. “The World’s Technological Capacity to Store, Communicate & Compute ” Science 332.6025 (2011): 60-65. Web.
  6. “What is Big Data?” IBM.
  7. Fitzhugh-Mengers, Shannon, Mark “Combining Human-Centered Design & Big Data in Pharma.” Life Science Leader. 31 Aug. 2016. Web.
  8. Millar, “Leveraging Big Data to Solve Pharma’s Hard to Cure Problems.” Pharmaceutical Technology. 16 June 2015. Web.
  9. Gutierrez, “Bridging the Big Data Gap in Big Pharma with Healthcare Analytics.” Inside Big Data. 2 May 2016. Web.
  10. Dealing with Disruption: 16th Annual Global CEO Survey: Key Findings in the Pharmaceuticals & Life Sciences Rep. PwC. Feb. 2016. Web.
  11. Moore, Tim, Hala Multi-Channel Closed-Loop Marketing: Digitally Transforming the Life Science Industry. Rep. Capgemini Consulting. 25 Oct. 2012. Web.
  12. O’Riordan, Anne, Sunil Rao, Raj Bhasin, Paul Technology Vision: Every Life Sciences Business Is a Digital Business. Rep. Accenture. Web.
  13. Geleden, “How Pharma Can Better Leverage Big Data to Support Business Strategies.” Valtech. Web.
  14. Drummy, “What Is Big Data and How Do We Use it?” PharmExec. 18 July 2012.
  15. Staton, “Roche CEO: Big Data Needs Big Pharma, and Vice-Versa.” Fierce Pharma. 6 Oct. 2015. Web.
  16. Munro, “Big Pharma Opens New Chapter on Big Data Collaboration.” Forbes. 8 Apr. 2014. Web.
  17. “Where Science Meets Data ” Teradata. Web.