April 13, 2023 PAO-04-23-CL-02
Petrina Kamya (PK): Insilico Medicine is an AI (artificial intelligence) drug discovery company. We have two discrete but overlapping business models: we develop generative AI–driven software, and we use that software internally to develop our own assets. We license both the software and the assets that we create.
Insilico was established in 2014, having emerged from Johns Hopkins University in Maryland. Since then, we have grown to be a global company with headquarters in Hong Kong and New York and offices in Abu Dhabi and Montreal, Canada. Our R&D center is in Shanghai, and we have a robotics lab in Suzhou, China. We are now located literally all over the world; I believe that we cover all time zones.
In terms of our mission, we are focused on pursuing diseases for which there is high unmet need and accelerating the discovery of new targets and new therapeutics to get them to patients faster, primarily by using AI.
PK: Initially, we began developing deep learning algorithms to address many of the challenges associated with drug discovery and development. We put these algorithms together and built three specific platforms: PandaOmics, which focuses on target discovery; Chemistry42, which is involved in small molecule discovery and development optimization; and inClinico, the platform we launched most recently, which we developed for clinical trial planning and outcome prediction, to predict the probability that a program will transition from phase II to phase III.
That was the founding goal of the company. However, we quickly realized that we needed to validate the software platforms in order to show that they truly worked as we intended. To do so, we started to build out our own programs. Beyond successfully validating our platforms, we saw clear value in developing these programs to the point where they can be licensed out, since that becomes another revenue-generating business model.
PK: First and foremost, for target discovery and chemistry more broadly, there is a real need to overcome human bias in terms of identifying novel targets and candidate molecules that could become first-in-class therapeutics. In target discovery, AI will take multimodal data and tease out patterns that can help identify novel targets. Beyond the targets themselves, it can help us understand the pathways and the genes that are implicated in the disease, as well as other diseases that are linked to that disease. AI has a very strong ability to uncover patterns in multimodal data that are otherwise quite difficult for us to decipher on our own.
In chemistry, AI is very good at imagining things without the necessary human bias that we have. Many generative AI technologies have emerged that can be leveraged to discover novel chemical molecules. In addition, we are using other techniques to reinforce active learning to improve those molecules and to optimize them, so that they satisfy certain properties that are necessary for drugs.
In the realm of clinical trial outcome prediction, the AI models underlying the inClinico platform again allow us to tease out features that you would otherwise not be able to identify and that affect the probability of success of clinical programs during the critical transition from phase II to phase III. Essentially, the core of all of these platforms is this ability to find patterns in multimodal data that are out of reach of human researchers.
PK: In the simplest terms, you can think of “generative AI” as any use of AI to create something new based on data on which it was trained: essentially, any time AI is asked to generate something. You might be familiar with some of the popular applications that generate text (like ChatGPT), voices, or images; in the same way, you can use AI to generate molecules. It uses neural nets, deep learning algorithms, and so forth, but the aim is to generate something new.
PK: We have quite a few strong differentiators. For example, we have our time machine approach and our iPanda algorithm. Both of those are used to identify the relationships between a gene and a disease in a manner that is unique to PandaOmics. In addition, we recently added a transformer-based knowledge graph, which is a feature that takes available information related to a disease and maps out all of the connections that are found in the literature. A user can then use this knowledge graph to better understand the relationships among genes, diseases, and medications that are used, the pathways that connect them all, and other diseases as well.
Most recently, we have connected that knowledge graph to a chat functionality based on large language models. I believe that we’re one of the only companies, if not the only company, that has done this with a target discovery engine. This chat functionality, which we call ChatPandaGPT, allows the user to query that knowledge graph and identify what these relationships are, based on exactly what the user would like to know. ChatPandaGPT makes the knowledge graph more accessible and more user friendly, and it makes the information more understandable as well.
PK: Things were definitely somewhat disconnected. The knowledge map would be a beautiful image centered on the disease. There would be edges connecting different nodes, which would be a map of other little circles representing genes that are implicated, all of which would be connected. But you’d have to access this information in a piecemeal manner. You’d go: “Oh, this gene is interesting,” and you could click on that and find out more information. You’d be taken to a gene page and find out more information about that gene. And then, depending on what’s written on the edges that connect with the different nodes, the gene might either be upregulated or downregulated. So, the burden would to some extent be on the user to aggregate this piecemeal information and assemble it meaningfully at the end.
In contrast, with ChatPandaGPT, you can just type in as a prompt: “Show me the genes that are implicated in this disease and any other diseases and list the other diseases that are implicated.” And all the information that you are looking for will be listed out for you. Whatever sort of relationship you’re looking to find out more about, you can just type it into the prompt. And the ChatGPT functionality will talk to the knowledge graph, which is very specialized information, and then transform that into a form that is more informative to you and more comprehensive.
PK: In my personal experience using ChatPandaGPT, I was surprised at how useful it actually is. Initially, I thought that it would essentially ingest information and spit it back out in different ways that would be useful. But I found the interface to be so much more informative, and it simplified the whole process a lot more than I thought it would. In addition to that, we’re now looking into how we can incorporate this technology into our other platforms as well. We’ve found that it’s surprisingly useful, and we’ll see what additional uses and benefits evolve as we go.
PK: It should be relatively straightforward. It took our team about a week to do it for PandaOmics. That’s very fast, although I won’t say it was easy, and they were able to integrate this very, very cool technology into the platform, where it has proven to be incredibly useful.
PK: Absolutely. The work that goes into creating the disease page and analyzing the data is not for everyone. But once the disease page has been analyzed and created, the ChatPandaGPT functionality definitely increases the usability and the accessibility of this information.
PK: I definitely think so. There is a lot of information out there already, and biology is still not very well understood. We are always trying to investigate diseases in a much more in-depth way, and the etiology, pathology, and epidemiology are still very much unknown for a lot of diseases. The heterogeneity of disease adds yet another layer of complexity.
All of that is data, and everything can be processed and hopefully be used to train a large language model that will help us better understand the biology of diseases. Ultimately, I think we are just at the very beginning of all of this.
PK: We primarily focus on a few therapeutic areas: fibrosis, oncology, CNS diseases, and immunology. Our CEO is particularly passionate about aging, and so a lot of the diseases and targets that we pursue are implicated in aging, such as fibrosis, inflammation, and some of the key pathways associated with aging. There is great synergy in that many of the diseases that we’re investigating, whether they are chronic diseases or diseases that people are suffering from now, involve targets that are also implicated in aging. What would be really cool is to see whether these drugs that we’re developing have that dual effect on patients: on the disease itself but also on people’s lives and their quality of life. I think there is the potential to unlock a lot of interesting outcomes from our pipeline.
PK: That’s exactly right. Aging is not classified as a disease, but there are a lot of diseases that develop as the result of aging. In essence, if you’re investigating a disease, you’re looking into an underlying pathway that is probably linked to aging anyway, even though aging itself is not a disease. To date, as a pharma company, you still can’t really just say you’re targeting aging.
PK: We are very flexible and adaptable, and we work with different companies in different ways, depending on their needs. Every pharma company works in a unique way, and many are looking for a partner who can enable them to develop their own pipeline of therapeutics in their own way rather than take over their drug discovery programs. Those companies can license our software.
Other companies are more interested in bolstering their internal pipelines with additional programs without diverting their internal resources from their core focus, so they outsource the entire process. We can nominate an initial target and develop everything up to a stage where they are ready to in-license it as a partner.
PK: Right now, I think that exactly what we set out to do is going to continue happening for some time. There are many, many stages of drug discovery and development, going all the way to commercialization. At the moment, we are just at the very beginning. At every single stage, there are definitely bottlenecks and challenges.
I think that what you’re going to see happening is that more and more of these challenges will be addressed using AI techniques. It’s just inevitable. In most processes there are certain things being done that are redundant, repetitive, or lacking in imagination, not through anyone’s fault; that’s just the way it is. For all of those, you can adapt AI algorithms to help alleviate those bottlenecks, address challenges associated with insufficient imagination, and improve and streamline the process. I believe that’s what’s going to happen in our industry, piece by piece.
PK: Both! We have thought leaders in the company who are really, really passionate about the products that we’ve created and about elaborating them further. We also have innovators who are always thinking about the next thing and pushing the envelope. I anticipate many developments on both fronts.
Petrina Kamya, Ph.D., is the Head of AI Platforms and President of Insilico Medicine Canada, overseeing Insilico's end-to-end generative AI-driven drug discovery platform, Pharma.AI, which includes target discovery (PandaOmics), small molecule generation (Chemistry42), and clinical trial outcomes prediction (inClinico). Prior to joining Insilico, Dr. Kamya was at Chemical Computing Group, where she led sales and business development of molecular modeling software for pharma and biotech companies, and then Certara, where she consulted for pharma companies. She holds a BS in biochemistry and a Ph.D. in chemistry.