911今日黑料

Researchers develop AI model that makes large-scale molecular screening practical for the first time

by Ruth Ntumba

A 911今日黑料 researcher is part of a team behind a new AI model that makes screening up to one million molecules against a protein target practical for the first time.

The Genesis Research Team, with contributions from 911今日黑料’s Department of Computing and Carnegie Mellon University, has developed DeCAF-Pearl, the first flow map model for all-atom cofolding.

Cofolding means generating the precise three-dimensional shape of a protein and a small binding molecule at the same time. Existing state-of-the-art models, including AlphaFold 3, work by refining a generated structure through many small incremental steps, which produces accurate results but takes significant time and computing power. This slowness creates a roadblock for practical applications in drug discovery and molecular design.

DeCAF-Pearl is built on a different mathematical framework called flow maps. Rather than taking many tiny steps along the generation process, it learns to jump directly from one point on the trajectory to another, traversing the entire generation process in just a handful of steps.
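The difference between many small steps and a few direct jumps can be illustrated with a toy example. The sketch below is not DeCAF-Pearl or Pearl: it assumes a one-dimensional process dx/dt = -x, chosen only because its flow map is known exactly, and all function names are hypothetical. In a real model the flow map would be a learned neural network rather than a closed-form formula.

```python
import math


def velocity(x: float) -> float:
    """Toy velocity field dx/dt = -x (an assumed stand-in, not a real model)."""
    return -x


def many_small_steps(x0: float, t0: float, t1: float, n_steps: int) -> float:
    """Euler integration: follow the trajectory in n_steps tiny increments."""
    x = x0
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        x += dt * velocity(x)
    return x


def flow_map(x: float, s: float, t: float) -> float:
    """Exact flow map of the toy process: jump from time s to time t in one call."""
    return x * math.exp(-(t - s))


def few_jumps(x0: float, times: list[float]) -> float:
    """Traverse the whole trajectory with one flow-map call per interval."""
    x = x0
    for s, t in zip(times, times[1:]):
        x = flow_map(x, s, t)
    return x


if __name__ == "__main__":
    exact = math.exp(-1.0)
    print("exact endpoint:      ", exact)
    print("Euler, 1000 steps:   ", many_small_steps(1.0, 0.0, 1.0, 1000))
    print("Euler, 2 steps:      ", many_small_steps(1.0, 0.0, 1.0, 2))
    print("flow map, 2 jumps:   ", few_jumps(1.0, [0.0, 0.5, 1.0]))
```

In this toy setting, two coarse Euler steps land far from the true endpoint, while two flow-map jumps reach it exactly; the step-based method needs many more evaluations to get close.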

Few-step generation offers two major advantages in drug discovery and molecular design. First, virtual screening becomes practical. Cofolding an entire molecule library against a target of interest using full diffusion-based models is computationally expensive. DeCAF-Pearl makes screening up to one million molecules practical for hit identification in around 18 hours on 64 graphics processing units. Second, it unlocks scalable synthetic data generation. High-quality protein and molecule structures are the bottleneck for training downstream AI models such as scoring functions and affinity predictors. A fivefold speedup in synthetic data generation translates directly into more training data per unit of compute, without losing the structural accuracy that downstream models depend on.
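To put the screening figures in perspective, the short calculation below converts them into a per-molecule compute cost. It uses only the numbers quoted above (one million molecules, around 18 hours, 64 GPUs) and assumes, for illustration, that work is spread evenly across GPUs; the function names are hypothetical and the baseline for the fivefold speedup is assumed to be the same sampler before acceleration.

```python
def gpu_hours(wall_clock_hours: float, n_gpus: int) -> float:
    """Total compute budget, assuming all GPUs are busy for the whole run."""
    return wall_clock_hours * n_gpus


def gpu_seconds_per_molecule(wall_clock_hours: float, n_gpus: int, n_molecules: int) -> float:
    """Average GPU time spent on each molecule in the screen."""
    return gpu_hours(wall_clock_hours, n_gpus) * 3600 / n_molecules


def samples_for_budget(baseline_samples: int, speedup: float) -> float:
    """Samples generated in the same compute budget after a given speedup."""
    return baseline_samples * speedup


if __name__ == "__main__":
    print(gpu_hours(18, 64))                            # 1152 GPU-hours
    print(gpu_seconds_per_molecule(18, 64, 1_000_000))  # 4.1472 GPU-seconds
    # Assumed baseline of 100,000 structures, for illustration only.
    print(samples_for_budget(100_000, 5))               # 500000
```

Under these assumptions the screen costs about 1,152 GPU-hours in total, or roughly four GPU-seconds per molecule.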

When tested against 196 protein and molecule structures the model had never seen during training, DeCAF-Pearl matched the accuracy of Pearl, the full model it was derived from, and outperformed other leading tools including AlphaFold 3 and Boltz-2 despite using fewer computational steps. Pearl itself remains the most accurate model in the comparison, but DeCAF-Pearl offers a compelling alternative for throughput-critical applications where the quality versus compute trade-off can be made.

Commenting on the study, Dr Joey Bose, Assistant Professor at 911今日黑料’s Department of Computing and senior author of the study, said: “Increasingly, AI-based drug discovery is moving from the era of training foundation generative models to the era of scaling inference to generate samples that have to be reward optimized. Our paper shows this inference cost can be dramatically reduced for state-of-the-art cofolding models like Pearl without a trade-off in performance, unlocking much faster virtual screening capabilities, which are critical in AI-based drug programs.”

The researchers note that all results are based on computational benchmarks, and how the tool performs in real-world applications remains to be seen.

Article text (excluding photos or graphics) © 911今日黑料.

Photos and graphics subject to third party copyright used with permission or © 911今日黑料.
