Designing More Effective Therapeutic Proteins Using an Integrated Laboratory and Computational Platform

BigHat Biosciences was created to leverage integrated physical and advanced computational technologies to accelerate the design of safer and more effective protein-based therapeutics. The company’s platform has already proven successful in optimizing a challenging drug candidate under development by Amgen. CEO Mark DePristo, Ph.D., and CSO Peyton Greenside, Ph.D., discuss the founding of the company and its vision, the exciting initial success on the Amgen program, and where they see BigHat positioned in the coming years, with Pharma’s Almanac’s Editor in Chief David Alvaro, Ph.D.

David Alvaro (DA): Can you introduce BigHat Biosciences and discuss the company’s genesis, the unmet needs or bottlenecks you identified that kickstarted your work, and the company’s vision?

Mark DePristo (MD): Peyton and I worked together at Google around five years ago and observed how technology companies were rapidly advancing their use of artificial intelligence (AI) and machine learning (ML). But it was very frustrating to not see that happening in the life sciences. Our final diagnosis was that bigger impact required AI/ML technology to be embedded earlier in the research pipeline. At the time, AI/ML was stuck as an analysis tool at the end of the research, as opposed to being integrated early, like it is in the technology sector.

Ultimately, we became so frustrated that we decided to start a company to address this crucial gap in the biopharma industry. So, we went looking for a problem that we could overcome using a unique wet lab designed for the high-speed data generation needed for AI/ML technologies. Protein and antibody engineering proved a perfect fit for this combination of a fit-for-purpose wet lab integrated with advanced AI/ML.

Peyton Greenside (PG): BigHat’s differentiation goes well beyond this combination of lab and ML, though. We always seek to get feedback as quickly as possible, so we work with small to medium-sized data pools rather than “big data,” which is a little bit of an overblown concept. What we view as more useful is agile or smart data that allow us to get feedback as quickly as possible once a hypothesis or design has been posited. The data sets are large but not massive. With this approach, we can completely change the scientific strategy — from one that is essentially akin to throwing a few darts at a couple hundred dartboards, with occasional success, to one that is intelligently and precisely targeted.

To achieve that, our computational models are continuously updated with lab-based feedback in terms of what is working and what isn’t, which is only possible with comprehensive integration between the laboratory and the data processing. That paradigm allows us to leverage the computational work beyond its more conventional use as a simple analysis tool; it becomes the driver, with immediate data feedback allowing continual improvement of the algorithms that then provide even more targeted searching. The built-in rapid iteration and focus on feedback are crucial elements.

MD: Our first focus for these technologies is making high-quality antibodies as quickly as possible. The two technologies — the wet lab and the AI/ML — enable us to realize that vision. Our platform, though, has a very broad surface area: it works not just for antibodies but for really most types of proteins and even nucleic acids, like DNA and RNA. In fact, our first grant came from the National Institute of Standards and Technology (NIST) to develop an optimized DNA template for cell-free protein expression.

With such a big space in which to work, we have adopted a business model combining partnering with the development of internal pipelines. Our first disclosed partnership is with Amgen to use our platform to significantly improve the properties of a biomolecule in a short period of time. In addition to our partnerships, we are pursuing a pipeline of novel antibodies and other protein therapeutics across a wide range of indications, leveraging our platform to create more functional, sophisticated biotherapeutics that will be better medicines for patients.

DA: When it comes to engineering increasingly sophisticated antibodies, is your approach more empirical or agnostic?

PG: We definitely take a more principled approach. The algorithms that we use are designed to explore areas of sequence space with the greatest potential for information gain. This contrasts with more commonplace approaches that look at mutations that occur within a tiny part of sequence space. That approach makes it impossible to determine how to generalize beyond that space. Our active learning algorithms use principled methods to look for areas of uncertainty where designing a sequence can best improve the model.

We define an objective and apply a sequence model that predicts that objective. At the beginning, the results are typically poor, but the system begins to understand the inherent uncertainty in the model. This mapping of what is certain versus uncertain is then used to drive further designs. We have spent a lot of time thinking about the best ways to do that, especially in a multi-objective setting, and are continuing to develop newer and better-informed methods. Our approach is therefore very principled algorithmically, even though it can be applied to any objective.
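The idea of letting model uncertainty drive the next round of designs can be illustrated with a small sketch. This is not BigHat's algorithm, which the interview does not describe; it is a toy stand-in that assumes a nearest-neighbor predictor over Hamming distance and a leave-one-out committee, with the sequences, values, and function names all invented for illustration. A candidate is treated as uncertain when removing a single measurement changes its prediction.

```python
import statistics

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def predict_nearest(train, seq):
    """Toy model: return the measured value of the closest labeled sequence."""
    return min(train, key=lambda pair: hamming(pair[0], seq))[1]

def uncertainty(labeled, seq):
    """Spread of predictions across a leave-one-out committee.

    Each committee member sees all labeled data except one point,
    so `labeled` needs at least two entries.
    """
    committee = [labeled[:i] + labeled[i + 1:] for i in range(len(labeled))]
    predictions = [predict_nearest(train, seq) for train in committee]
    return statistics.pstdev(predictions)

def pick_next(labeled, candidates, batch_size=1):
    """Rank candidates by committee disagreement, most uncertain first."""
    ranked = sorted(candidates, key=lambda s: uncertainty(labeled, s), reverse=True)
    return ranked[:batch_size]

# Hypothetical measurements: three similar sequences agree, one outlier stands alone.
labeled = [("AAAA", 1.0), ("AAAC", 1.0), ("AAAG", 1.0), ("WWWW", 9.0)]
candidates = ["AAAT", "WWWY"]

print(pick_next(labeled, candidates))  # prints ['WWWY']
```

In this toy data, the candidate near the well-sampled cluster gets zero uncertainty, while the candidate whose prediction rests on a single measurement is chosen for the next round, which is the sense in which uncertain regions offer the most information gain.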

MD: We have a class of ML algorithms that are very data efficient and are able to guide the search for better sequences in an intelligent way. And better here means something we can measure in our lab. Today, all of the fundamental biophysics are online: binding affinity, stability, solubility, and the other essential assays that are in effect table stakes for antibody development today.

But ultimately we will use these low-N learning techniques to design molecules using the best proxies for the truly important properties of in vivo efficacy and safety. Since we can’t measure those directly today, we spend a lot of time evaluating and onboarding downstream functional assays to enable even better designs.

The contrast with the rest of the antibody development field is most clear here. Most antibody discovery techniques focus on identifying tight binders, because they use that property to experimentally isolate good molecules from a soup of roughly random ones. Then they screen for all of the other properties that are important for a clinical-grade drug. But at BigHat, our technologies allow us to directly design antibodies for these properties using only a small amount of data.

Doing this requires sophisticated algorithms that make the most use of these other types of data, because those data are always limited. It’s just hard to get functional data or to derive good proxies for efficacy and safety. Running those at microplate scale is already high-throughput for most assays. The exciting thing is that the rapid wet lab and AI/ML tech we’re developing finally allows us to tackle these challenges.

DA: It strikes me that optimization of a protein is an endeavor that could potentially continue indefinitely. For a given optimization project, how do you determine an end point?

PG: This is a question that we definitely have the pleasure of struggling with at BigHat. Unlike a more traditional funnel that screens a large but fixed set of molecules, our approach allows us to optimize toward any objective and then continue to do so in an effectively unbounded space of molecules. So, for all of our programs, it’s critical that we know when a molecule is ready for downstream development versus continuing to search for a better molecule.

Setting these goals requires intense input from clinicians or experts in biomanufacturing. It all comes down to making sure the molecule will have the right effect in the therapeutic context while also being developable. Setting those quantitative benchmarks ahead of time, as was done in the collaboration with Amgen, is a critical exercise.

MD: Consider stability as an example. An amorphous goal would be to make the molecule as stable as possible. A more concrete goal would be enough stability to survive downstream purification, lyophilization, storage, and shipment — maybe even without the need for a cold chain. Those needs translate neatly into fairly quantitative thresholds. But, of course, the goalposts will be different for different products.
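As a rough illustration of turning such needs into quantitative thresholds, the sketch below checks a molecule's measurements against pre-agreed bounds and reports which criteria are still unmet. The property names and numeric limits are invented placeholders, not values from BigHat or Amgen.

```python
def unmet_criteria(measurements, thresholds):
    """Return the names of criteria the molecule does not yet satisfy.

    `thresholds` maps a property name to a (minimum, maximum) pair;
    either bound may be None. A property with no measurement counts as unmet.
    """
    unmet = []
    for name, (low, high) in thresholds.items():
        value = measurements.get(name)
        if value is None:
            unmet.append(name)
        elif low is not None and value < low:
            unmet.append(name)
        elif high is not None and value > high:
            unmet.append(name)
    return unmet

# Hypothetical goalposts agreed before the campaign starts.
thresholds = {
    "melting_temp_c": (70.0, None),              # must survive purification and storage
    "aggregation_pct_after_lyo": (None, 5.0),    # must tolerate lyophilization
}

candidate = {"melting_temp_c": 65.0, "aggregation_pct_after_lyo": 3.0}
print(unmet_criteria(candidate, thresholds))  # prints ['melting_temp_c']
```

An empty list would signal that the molecule has cleared every pre-set bar and the search can stop; different products would simply carry a different `thresholds` table.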

DA: Do you view these projects as tools to bring a given program rapidly to a high existing standard or to go beyond that and set a new standard for how optimized a molecule can be?

MD: The latter, and that opens up real challenges on the drug development side for BigHat. What does it mean to set a real goal when you actually have a machine that can change your molecule to achieve these goals? It’s critical to really consider the different parameters and the tradeoffs that come with improving them. For instance, a certain level of affinity might be desired, but a lower level might be acceptable if a higher level of stability can be achieved. Our challenge is to make these types of decisions more explicit and quantitative.
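One common way to make such tradeoffs explicit is a Pareto front: the set of candidates that no other candidate beats on every objective at once. The sketch below is a generic illustration under assumed data, not a description of BigHat's method; the variant names and the two scores (an affinity score and a melting temperature, both treated as higher-is-better) are hypothetical.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (higher is better)."""
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

def pareto_front(candidates):
    """Names of candidates that are not dominated by any other candidate."""
    return [
        name
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    ]

# Hypothetical variants scored as (affinity score, melting temperature in C).
variants = {
    "v1": (9.0, 60.0),
    "v2": (8.0, 72.0),
    "v3": (7.5, 65.0),
    "v4": (9.5, 55.0),
}

print(pareto_front(variants))  # prints ['v1', 'v2', 'v4']
```

Here `v3` drops out because `v2` is better on both axes, while `v1`, `v2`, and `v4` each represent a different balance of affinity against stability. Choosing among them is the explicit, quantitative decision the answer above describes.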

It is an exciting challenge to face, and it dovetails into our other challenge at BigHat around measurements: more specifically, how to determine that the molecules we identify are actually good. We are obsessed with data quality in our platform and getting more and more proxies that are closer and closer to in vivo efficacy and safety.

DA: Can you tell me a little bit about your collaboration with Amgen?

MD: We reached out to Amgen because of a beautiful review that Philip Tagari, Amgen’s Vice President of Research, wrote about using AI in biologics. Those conversations eventually morphed into this first partnership, once BigHat had our lab online and could start the design–build–test cycle.

We’ve been working with the company to tackle a particularly challenging target and we were delighted to recently report improving this molecule 10x in less time than anticipated. These results really validate BigHat’s vision that it’s possible to iteratively improve antibodies by measuring their properties, learning from the feedback, and designing new rounds of variants based on that information.

The success of this first phase leads into a second phase aimed at delivering a very high-quality molecule; you can view the two phases as relative improvement versus absolute goals.

DA: How important is this first project and the relationship with Amgen in attracting attention from the industry more broadly?

MD: Amgen is a world-leading top-five biopharma, so there is clearly value for us to work with a company that everyone knows has tried everything out there to develop biotherapeutics. It’s a big reason why we’re so excited to publicly announce not only the collaboration but the successful achievement of our first milestone. It tells the industry that there’s a “there-there” at BigHat.

Stepping back, working with partners is central to our long-term strategy. With so many possible applications of our platform, it’s simply impossible for us to pursue them all on our own. If we want to deliver on all these opportunities, we need strong partners to help us carry the load.

Of course, BigHat is a young company, only just two years old now. The industry doesn’t yet know what it is possible to create using our integrated synthetic biology and machine learning technologies. The best way to show them, in our view, is through compelling demonstrations that draw potential partners to us. We are doing just that both through partnerships, like with Amgen, and with our in-house programs.

Let me share a bit about how our in-house programs support these efforts. Internal programs are selected, in part, to push the limits of what’s feasible to create today. For example, creating conditional antibodies, functionally optimized antibodies, multispecific antibodies, and inhalable antibodies are all outstanding challenges in the community. By pursuing these in-house programs, we create an evidence base that our platform can tackle the hardest design challenges out there. That evidence grows more compelling as the programs advance toward the clinic, first with in vitro data, then preclinical, then finally and hopefully validation in people. And all of these data flow back into our platform, improving our next campaigns, internal or partnered, all while bringing novel medicines to patients. It’s win–win–win.

The key questions are which programs are worth pursuing, which ones are best partnered and, if so, when? And, of course, if partnering is the best way to go, then the next crucial factor is choosing the right partner.

PG: That last item is really important. I would like to say that Amgen has been a truly delightful partner, and there is tremendous value in that as well. They are incredible to work with, and we’ve really enjoyed working with their team.

DA: Given that a lot of the details in the work you do are confidential, is it a challenge to communicate to potential partners or customers exactly what your platform can do?

MD: We’ve both been pleasantly surprised by how easy it is. It boils down to the simple fact that we make real therapeutic molecules, and people know what they want to do with those molecules. So, while BigHat uses very different technologies to identify that molecule, there isn’t really a need to explain why all of those steps are the right steps. We can focus on the quality of the molecules themselves, which are independent of us and of the history of their creation.

PG: I feel really strongly about that. This is not an issue of “or”; it is about combining everything we know and then seeing how far we can go by using that knowledge and learning from it. The proof is in the molecule that is made and not necessarily the process.

DA: How do you decide which projects to take on? Do you look for fundamentally similar programs to keep refining your existing models or for diversity to broaden them?

PG: This is actually one of the nicest parts of the platform. It’s not a zero-sum game in the sense that learning can be transferred from project to project regardless of the specific nature of the molecule involved. What the platform learns about stability can be used for other projects, because the algorithmic approach we use learns about the properties in general, and that knowledge can be transferred regardless of the modality. The questions we are asking and the factors we take into consideration when determining if a candidate is a good clinical molecule are broad-reaching, and everything is really synergistic.

MD: It is also important to realize that BigHat’s platform is really a platform in the technology and engineering sense. For instance, we doubled our capacity from 200 to 400 antibodies in January, but that didn’t require a massive increase in personnel, because of our emphasis on engineering, automation, and process optimization. We scale more like a tech company than a biotech company. The ability to increase the platform 2x with essentially no increase in personnel allows us to focus on getting the platform as much scale as possible, doing more work, and learning more efficiently.

DA: Looking forward, what kinds of issues is BigHat looking to investigate, and how do you plan to pursue them?

MD: At a very broad level, our pipeline is not indication-driven. We’re looking for situations where the therapeutic window is too small owing to the design of the currently available molecules, which is surprisingly common throughout drug development. Basically, you’re always trapped between safety and efficacy. If you want to broaden the therapeutic window, making a drug both safer and more efficacious, you will need more sophisticated designs.

One way to achieve this is with molecules that bind to multiple receptors rather than just one, aka multispecifics. For cancer, this likely means developing antibodies that are sensitive to the tumor microenvironment. Oncology and immune and inflammatory disorders are examples of diseases where it is critical to intervene just the right amount.

Another area of interest to us involves problems that require people to search broader and broader data sets to find rare molecules with unusual combinations of properties. Antivirals are a good example, where people are looking for ultra-rare, broadly neutralizing antibodies. BigHat’s approach here, in contrast, is to simply transform good initial antibodies into ones with the desired properties, regardless of how rare those properties might be in nature.

DA: Beyond antibodies, how generalizable is your platform toward other classes of therapeutic proteins and beyond?

PG: It is incredibly generalizable, which is both the beauty and the challenge of our technology. We can work on any protein and optimize towards any objective. The world of opportunities is humungous, which is creating a fun but daunting challenge for us. Our primary focus is to find areas where the molecular engineering that we excel at will solve an unmet medical need and do so efficiently and effectively. Therapeutic antibodies, all sorts of classes of monoclonal and multispecific antibodies — those are just the beginning. We are quite excited to have the opportunity to explore these next-generation biomolecules. Some of our initial R&D efforts have been directed at determining how to choose from all of these exciting design challenges.

MD: The broad applicability of the platform is another reason that we like to partner. It’s ideal to collaborate with organizations that are domain experts in newer areas we’re exploring. In-house, BigHat focuses on antibodies, but we know that there could be many other applications of our platform — e.g., peptides, capsids, and chimeric antigen receptor (CAR) designs — these are all areas where our platform could generate improved designs. We just need to find the right people to work with on those projects.

DA: At this point, is your ongoing development of the platform more a case of refining it and updating your models, or are there new technologies and new algorithms to incorporate and enhance the platform itself?

PG: It’s both. We are constantly expanding the number of assays, the throughput, the amount of automation, and the reliability of the data. As an end-to-end drug developer, we continue to expand our discovery efforts as much as our optimization efforts. It is pretty exciting to think about both upstream technical developments and the optimization that we can do on a given antibody, as well as closing the gap on what other assays we can run downstream that will ensure that our candidates will be good clinical molecules. We have a real end-to-end structure in that sense and continue adding to our platform to facilitate all of those efforts.

DA: When you look forward a bit further into the future — five or 10 years — how do you see BigHat Biosciences evolving?

MD: We expect a lot of exciting news over the coming two years: scaling up the platform and proving our end-to-end discovery capabilities. Today, we are starting to build a deep body of evidence showing that we can make better antibodies faster, and not just for toy examples but for real-world antibody development challenges. We’ll show this through our in-house programs and our partnerships.

Zooming out to the longer term, we see a world where our antibody platform is 10–100x bigger and faster. We will supplement this increased capacity with more assays, from biophysics to more downstream proxies for safety and efficacy. Our by-then massive internal data sets will let us design and optimize molecules extraordinarily quickly. And we’ll use these new capabilities to do even more partnerships and more internal programs, both for therapeutic antibodies and other proteins.

PG: I see BigHat as having the power to redefine what an antibody is. Currently, we have traditional antibodies plus next-generation antibodies — Frankenstein versions of traditional antibodies. But there is much more that can be done given the ability to engineer specific properties into any kind of protein. The protein could be small or large, have multiple heads, some of which are conditionally active, and so on. I am really excited about the idea that we’re ultimately not constrained by what an antibody has been until now. BigHat will play a big role in defining the next generation of the next generation and help bring them to market.

Mark DePristo

Mark is a leader in applying computational and statistical techniques to biomedical challenges in genomics and biochemistry. Before BigHat, he founded the Genomics team in Google Brain, was VP of Informatics at SynapDx, and was Co-Director of Medical and Population Genetics at the Broad Institute. He has a BA in CS and Math from Northwestern, a PhD in Biochemistry from Cambridge as a Marshall Scholar, and was a Damon Runyon Cancer Research Fellow at Harvard. Dr. DePristo’s academic articles have been widely published and have drawn more than 72,000 citations.