Rapid traversal of vast chemical space using machine learning-guided docking screens to accelerate drug discovery

November 21, 10:00 – 11:00
Event Tags:
AI DDLS NBIS SciLifeLab Data Centre

Venue

  • Online event via Zoom

Virtual Event

November 21, 2025 @ 10:00 – 11:00 CET

Speaker: Israel Cabeza de Vaca, Uppsala University (Jens Carlsson group).

NBIS and SciLifeLab Data Centre arrange the open SciLifeLab AI Seminar Series, which aims to share knowledge about Artificial Intelligence and its applications within the Life Science community. The seminar series is open to everyone. Seminars are held over Zoom on the third Friday of the month during academic terms, typically between 10:00 and 11:00, with a presentation of approx. 45 min followed by 15 min of discussion.

Abstract
The rapid expansion of make-on-demand chemical libraries now offers access to tens of billions of synthetically accessible molecules, creating unprecedented opportunities for structure-based drug discovery. However, screening such ultra-large libraries remains computationally prohibitive, even with state-of-the-art docking methods.
In this presentation, I will describe a hybrid virtual screening strategy that integrates molecular docking with machine learning to efficiently navigate chemical space at the billion-compound scale. The workflow involves docking approximately one million compounds to a target protein and training a classification model to recognize high-scoring molecules. Using conformal prediction, the model guides compound selection from multi-billion-scale libraries, drastically reducing the number of molecules requiring explicit docking.
Among several algorithms evaluated, CatBoost provided the best balance between accuracy and computational efficiency, enabling large-scale applications. When applied to a 3.5-billion-compound library, this approach reduced the computational cost of virtual screening by over three orders of magnitude. Experimental validation identified novel ligands for multiple G protein–coupled receptors, including compounds exhibiting designed multi-target activity.
These results demonstrate how combining AI-based prediction with physics-based docking can make ultra-large-scale virtual screening a practical and powerful tool for modern drug discovery.
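
To make the selection step in the abstract concrete, the following is a minimal sketch of class-conditional (Mondrian) inductive conformal prediction in plain Python. It is one common formulation and not necessarily the speaker's implementation. It assumes a classifier has already been trained on docked compounds and outputs, for each molecule, a probability of being "high-scoring". The function names, the toy calibration data, and the significance level `epsilon` are all hypothetical.

```python
def mondrian_p_values(cal_probs, cal_labels, test_prob):
    """Class-conditional conformal p-values for a single compound.

    cal_probs  : predicted probability of class 1 ("high-scoring") for
                 held-out calibration compounds that were actually docked
    cal_labels : their true labels (1 = high-scoring, 0 = not)
    test_prob  : predicted probability of class 1 for an undocked compound
    """
    p_values = {}
    for label in (0, 1):
        # Nonconformity score: 1 minus the probability given to the assumed class.
        cal_alphas = [
            (1.0 - pr) if label == 1 else pr
            for pr, y in zip(cal_probs, cal_labels)
            if y == label
        ]
        test_alpha = (1.0 - test_prob) if label == 1 else test_prob
        # Fraction of calibration compounds at least as nonconforming.
        n_ge = sum(a >= test_alpha for a in cal_alphas)
        p_values[label] = (n_ge + 1) / (len(cal_alphas) + 1)
    return p_values


def select_for_docking(cal_probs, cal_labels, library_probs, epsilon):
    """Return indices of library compounds whose prediction set is exactly {1}."""
    selected = []
    for i, prob in enumerate(library_probs):
        p = mondrian_p_values(cal_probs, cal_labels, prob)
        if p[1] > epsilon and p[0] <= epsilon:
            selected.append(i)
    return selected


# Toy data (hypothetical): eight calibration compounds, three library compounds.
cal_probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
cal_labels = [1, 1, 1, 0, 1, 0, 0, 0]
library_probs = [0.95, 0.5, 0.05]
to_dock = select_for_docking(cal_probs, cal_labels, library_probs, epsilon=0.25)
```

Here only the first library compound is confidently predicted as high-scoring, so only it would be passed on to explicit docking. The ambiguous compound (prediction set {0, 1}) and the confidently low-scoring one are skipped. In a real screen the calibration set would contain many thousands of compounds, and `epsilon` would set the trade-off between how many true high scorers are recovered and how many molecules must be docked.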

Contact
For questions, contact bengt.sennblad@scilifelab.se

Last updated: 2025-11-17

Content Responsible: Bengt Sennblad (bengt.sennblad@scilifelab.se)