TuberDock: A Web-Based Platform for Machine Learning-Driven QSAR Modeling and pIC50 Prediction for Mycobacterium Tuberculosis Treatment


Tuberculosis (TB), caused by Mycobacterium tuberculosis, is the world’s leading infectious killer. In 2023, 10.8 million people fell ill with TB and 1.25 million died from it. The goal of this study is to use computational drug discovery (CDD) techniques to identify lead compounds that inhibit enoyl-ACP reductase (EAR), a potential drug target for TB. Quantitative structure-activity relationship (QSAR) modeling is used to compare machine learning models for predicting the potency (pIC50) of candidate compounds, which are then validated by molecular docking based on their binding affinity. A non-redundant dataset of 373 compounds tested against EAR was retrieved from the ChEMBL database. Eight eighty substructure fingerprints were used to describe these compounds, and multiple machine learning models were then built and compared using LazyPredict. Nu Support Vector Regression achieved the highest score, with a performance of 77.2%, as determined by its root mean square error (RMSE). A Gini-index feature-importance ranking identified SubFP385, SubFP712, and SubFP150 among the most important features for predicting EAR inhibitors. The Kennard–Stone algorithm was applied to select a diverse set of 10 compounds from the list of active inhibitors for testing. The web platform, TuberDock, was developed and deployed using Streamlit. Molecular docking identified the lead compound (CHEMBL571677), with a pIC50 of 8.069, which exhibited the lowest binding energy of -7.4 kcal/mol. This ML-based QSAR modeling and docking approach is an effective strategy for accelerating the design of novel and robust EAR inhibitors in drug discovery.
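Two of the steps named above, the pIC50 transform and Kennard–Stone selection, can be illustrated with a minimal sketch. It is not the study's implementation: the function names are hypothetical, the toy fingerprints are invented for illustration, and the use of Tanimoto distance on binary fingerprints is an assumption, since the abstract does not state which distance was used.

```python
import math


def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # -log10(x * 1e-9) rewritten as 9 - log10(x) to avoid tiny float products.
    return 9.0 - math.log10(ic50_nm)


def tanimoto_distance(a, b):
    """Return 1 - Tanimoto similarity for two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def kennard_stone(fingerprints, k):
    """Pick k diverse items: seed with the most distant pair, then repeatedly
    add the item whose nearest already-selected neighbour is farthest away."""
    n = len(fingerprints)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of items")
    dist = [[tanimoto_distance(fingerprints[i], fingerprints[j]) for j in range(n)]
            for i in range(n)]
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda p: dist[p[0]][p[1]])
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < k:
        nxt = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected


# Toy binary fingerprints (assumption: invented, not taken from the study's data).
toy_fps = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 0, 1],
]
diverse = kennard_stone(toy_fps, 3)
print(diverse)
print(pic50_from_ic50_nm(1000.0))
```

In this sketch the selection is deterministic for a given input order; ties are broken by taking the first candidate encountered. The pairwise distance matrix is quadratic in the number of compounds, which is acceptable at the scale of a few hundred compounds.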
