Our experienced scientists and engineers optimize all phases of drug discovery, including early hit identification, hit-to-lead, lead optimization, patent strategy, and preparation for IND filing, yielding shorter timelines to develop the highest-quality drug candidates.
We use AI-driven drug discovery to design better molecules and reach cures faster. Our mission is to discover, develop, and deliver to patients novel small-molecule drugs, including drugs for targets that were previously considered undruggable and molecules that remained undiscovered until our technology came along.
Binding Modeling
One of the most critical components of our drug discovery pipeline, and one that differentiates us from the rest of the AI drug discovery companies, is our ability to create causal, data-driven, human-interpretable hypotheses for a wide range of targets.
Discovering novel binding modes is the first step towards finding novel therapies
For most therapeutic targets of interest today, there is either no data, limited data, or biased data. For that reason, we start by generating a hypothesis from scratch, using only information about the approximate binding region that gives rise to the desired phenotype. Using PocketExpander™, our proprietary pocket mapping tool that leverages AI/ML as well as computational chemistry methods, we identify potentially relevant binding modes. The output serves as an initial binding hypothesis that is fed into MolGen™. The data obtained from synthesis and physical testing of the molecules predicted by MolGen™ are then used to refine our hypothesis using causal analysis.
We fundamentally believe that discovering novel small molecules requires an understanding of the physics and chemistry of binding, and therefore requires 3D structural models of the protein-ligand complexes. More common methods that focus on the ligand's chemical graph, through SMILES strings or pharmacophores, cannot sufficiently model the information relevant for binding.
To prepare a structure, we use a combination of publicly available crystal structures, internal structures, and predicted structures and run them through our proprietary structure preparation protocols. Using our Magneto platform, we automatically analyze the structures’ quality, align reference frames, build missing pieces, and then manually review them.
Machine learning methods commonly used for drug discovery are inherently focused on correlations. Because data sets around binding are often very small and highly biased, these methods tend to get trapped by the bias and fail to discover truly novel chemical matter. Our Causal ML methods for Hypothesis Generation analyze the 3-dimensional structure from a physics-based perspective and use causal analysis to find just the interactions relevant for binding in an unbiased way, thereby unlocking new indications and safety profiles.
Beyond its strength, the causal hypothesis is designed to be human-interpretable, allowing scientists to update it or experiment further with novel binding and selectivity modes. Causal features can be shown as heatmaps in 3D (or mapped to 2D) and reviewed by medicinal and computational chemists, who can identify irregularities or gaps and customize the features to explore additional interesting regions or interactions. This allows for a feedback cycle that ensures we don’t blindly follow the ML algorithm or a chemist’s intuition.
TCP-Aware Drug Design
Our patent-pending molecular generation tool, MolGen™, is used to design novel, diverse compounds that are readily synthesizable and fit the desired TCP for a given drug program.
The molecular generation process leverages state-of-the-art deep reinforcement learning (RL) to construct synthesizable molecules that efficiently capture all causal interactions for binding and selectivity while maintaining the desired ADME-tox profile.
MolGen™ utilizes hundreds of workers, choreographed in a unified pipeline (leveraging distributed training, hyperparameter optimization, GPU instances, and more), to improve the output quality of molecules as quickly as possible.
Prospective Evaluation of Accuracy
In parallel to our binding and selectivity hypothesis generation models, we have trained several state-of-the-art models (DeepPropR™), leveraging both public and proprietary data sources, to predict important pharmacological properties (ADME-Tox). Our DeepPropR™ models are frequently validated experimentally on molecules we synthesize to ensure an accurate measurement of the models’ performance and generalizability.
The figure (left) depicts the accuracy of our ten most used DeepPropR™ models from a prospective evaluation. Most of these accuracy figures are based on results from at least 200 compounds that were not part of the training set but were generated by our internal pipeline and subsequently evaluated via physical testing. DeepPropR™ models are also available for on-the-fly inference at scale using our custom-built Molecular Property Service (learn more in our related blog post).
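As a rough illustration of what a prospective evaluation involves, the sketch below scores predictions only on compounds that were absent from the training set and later measured physically. It is a minimal example under stated assumptions, not a description of how DeepPropR™ is evaluated: the binary "passes the assay" framing, the `Measurement` record, and the `prospective_accuracy` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One compound with a model prediction and a physical assay result (hypothetical record)."""
    compound_id: str
    predicted_pass: bool  # model's call, assumed here to be a binary pass/fail
    measured_pass: bool   # outcome of physical testing


def prospective_accuracy(measurements, training_ids):
    """Return (accuracy, n) computed only on compounds outside the training set."""
    held_out = [m for m in measurements if m.compound_id not in training_ids]
    if not held_out:
        raise ValueError("no compounds outside the training set to evaluate")
    correct = sum(m.predicted_pass == m.measured_pass for m in held_out)
    return correct / len(held_out), len(held_out)


# Illustrative data only: one compound was in the training set and is excluded.
results = [
    Measurement("cmpd-001", True, True),
    Measurement("cmpd-002", False, False),
    Measurement("cmpd-003", True, False),
    Measurement("cmpd-004", False, False),
    Measurement("cmpd-005", True, True),  # seen during training
]
accuracy, n = prospective_accuracy(results, training_ids={"cmpd-005"})
print(f"accuracy={accuracy:.2f} on {n} prospective compounds")
```

Excluding training compounds before scoring is what separates a prospective number from a retrospective one. Reporting the count alongside the accuracy makes it clear how many compounds support each figure.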
Robotic Synthesis & Automated Physical Testing
Many potentially outstanding compounds designed for a given drug discovery program are discarded: the long timelines and high costs of manual synthesis and assays mean they are either never synthesized or only partially characterized.
DeepCure’s Automated Molecular Foundry seeks to reduce design-build-test-learn cycle times and cost without limiting novelty and diversity of compounds via a fully integrated platform that includes robotic automation software, robotic synthesis, and assay automation.
Using AI and cutting-edge automation, we can reduce cost and cut cycle times. The chemistry is based on robotic synthesizers, MS-triggered purification systems, and liquid-handling systems, while automated screening and in-vitro testing generate up to 100,000 data points per month.
This AI-to-Bench integrated workflow will ensure that novel in-silico target discoveries are immediately and directly validated in-vitro and fed back into the AI pipeline to optimize for success. This streamlined process shortens cycle times, increases efficiency and reduces costs.
At DeepCure, automation isn’t just about reducing the cost and time of compound synthesis – it’s about going beyond the limits of manual synthesis to carry innovation all the way through the design-build-test-learn cycle. Our foundry unlocks the chemical space that AI drug design tools want to explore but can’t because it is not practicably available to most chemists. DeepCure’s Molecular Foundry is built to expand the usable chemical space for drug discovery through increased synthesis success rates, removal of human bias in synthesis, and greater efficiency of custom multi-step synthesis.
Our patent-pending molecular generation tool, MolGen™, designs novel, diverse compounds. Using state-of-the-art deep reinforcement learning (RL), MolGen™ constructs synthesizable compounds with features that capture the important molecular interactions for binding and selectivity and that meet the desired ADME-tox profile set out in the target candidate profile (TCP).
MolGen™ is designed to generate leads, rather than hits, from Day 1, which is made possible by a proprietary set of ADME-tox models (DeepPropR™). The figure shows the accuracy of our 10 most used DeepPropR™ models from a prospective evaluation.
Output of PocketBlueprinter™
MolGen™ – building & iterating compounds
Novel, potent, & selective compound
Unlike other AI drug discovery companies, DeepCure does not use AI to simply match a library of compounds to a known pocket. Instead, we create causal, data-driven, human-interpretable hypotheses for binding to a given protein target. This enables us to go beyond known binding sites and ligands.
We prepare 3D structural models using a combination of publicly available crystal structures, non-public structures, and predicted structures. As part of our proprietary structure preparation protocols, our scientists review a set of data quality metrics for the structures, select reference structures, delete bad structures, and group structures. This ensures the structure(s) used for hypothesis generation is the best representation possible.
For most therapeutic targets, there is no data, limited data, or biased data. PocketBlueprinter™ allows us to generate novel hypotheses by leveraging AI/ML and computational chemistry methods to map the protein surface and identify novel binding modes. The outputs serve as an initial binding hypothesis for our molecular generation tool, i.e. MolGen™.
ML methods for drug discovery typically focus on correlations. As a result, they inherit biases toward the types of compounds that have already failed in discovery and are inadequate for finding truly novel compounds. To overcome these shortcomings, DeepCure uses causal ML to find binding interactions without bias toward failed binding modes or previous compound structures.
DeepCure’s platform is designed to be human-interpretable. Causal features can be shown as heatmaps in 3D (or mapped to 2D) for review by medicinal and computational chemists, enabling identification of irregularities or gaps in the models. By seeing how molecules are predicted to interact with the protein, scientists can make rational design changes to the molecule and explore interesting molecular interactions. Human interpretability allows for a feedback cycle that ensures scientists don’t blindly follow the ML algorithm or chemists’ intuition.