Machine Learning in Leukemia Treatment: QSAR Modeling for Lead Compound Identification

Date: April 03, 2024

Aim: To use computational drug discovery techniques to identify lead compounds targeting tyrosine kinase, a potential target for acute myeloid leukemia. This study applies QSAR modeling, comparing several machine learning models on how well they predict the potency of candidate compounds as pIC50 values. Molecular docking is then used to validate the lead compounds based on binding affinity.

Materials and methods: Bioactivity data for tyrosine kinase inhibitors (TKIs) were retrieved from the ChEMBL database, resulting in a curated dataset of 3000 compounds. PaDEL and Lipinski descriptors were computed for the compounds, followed by machine learning model comparison via LazyPredict. Model performance was visualized using Seaborn and Matplotlib. The application interface was deployed using Streamlit. Molecular docking studies were conducted using PyRx to assess binding affinity, with visualizations generated using Chimera. ADMETlab 2.0 was used to assess the drug-likeness of the lead compound. Molecular dynamics simulation was performed using the iMODS server.

Results: Box plots were generated to visualize the relationship between the Lipinski descriptors and the bioactivity class of the compounds. The ML algorithms were compared using plots of RMSE, R-squared, and time taken to assess model fit and accuracy. A scatter plot was used to explore the relationship between experimental and predicted pIC50 values.

Discussion: The Random Forest Regressor demonstrated the highest accuracy (92%) among the ML algorithms tested for QSAR modeling.

Conclusion: This study identifies a novel lead compound that can inhibit the activity of tyrosine kinase, which could help stop the spread of leukemia.
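To make the quantities in the abstract concrete, here is a minimal sketch of the pIC50 conversion and the two regression metrics (RMSE and R-squared) used to compare models. It is not the study's code: the function names and the sample values are hypothetical, and it assumes IC50 values are reported in nanomolar, as is common for ChEMBL records.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nm)


def rmse(y_true, y_pred):
    """Root mean squared error between experimental and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical IC50 values (nM) and model predictions, for illustration only.
    experimental = [ic50_nm_to_pic50(v) for v in (10.0, 250.0, 1000.0, 8000.0)]
    predicted = [7.8, 6.5, 6.1, 5.3]
    print("experimental pIC50:", [round(v, 2) for v in experimental])
    print("RMSE:", round(rmse(experimental, predicted), 3))
    print("R-squared:", round(r_squared(experimental, predicted), 3))
```

A lower RMSE and an R-squared closer to 1 both indicate that predicted pIC50 values track the experimental ones more closely, which is what the scatter plot of experimental versus predicted values shows visually.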



Sohith Reddy
