Ensemble learning-guided discovery of anti-tuberculosis phytochemicals: computational prediction and mechanistic insights.


Sajal, Harshit; Mohan, Aswin; Raju, Rajesh; Nadh, Anuroopa G
Journal of computer-aided molecular design 2026 Vol. 40
9
sajal2026ensemble

Abstract

Tuberculosis (TB) remains a major global health challenge driven by persistent Mycobacterium tuberculosis infection and increasing drug resistance. Phytochemicals represent a structurally diverse and underexplored chemical space for anti-TB drug discovery, yet systematic prioritization strategies integrating machine learning and structure-based validation are limited. A curated phenotypic anti-TB dataset of 425,180 compounds was used to train ensemble ExtraTrees models based on ECFP4 fingerprints and physicochemical descriptors. The models achieved strong predictive performance (ROC-AUC up to 0.983; MCC up to 0.871). SHAP analysis enabled mechanistic interpretation by identifying the key molecular descriptors and fingerprint features driving anti-TB activity predictions. The validated ensemble was applied to screen 4,707 phytochemicals, yielding 3,209 predicted actives, of which 778 satisfied applicability domain criteria. High-confidence candidates were subsequently evaluated by molecular docking against twelve structurally validated essential M. tuberculosis targets spanning cell wall biosynthesis, energy metabolism, nucleotide synthesis, and cofactor pathways. Docking analysis identified 486 phytochemicals with favorable predicted binding affinities, including 193 compounds exhibiting multi-target engagement. Several top-ranked candidates reproduced canonical interaction patterns of co-crystallized inhibitors, supporting mechanistic plausibility. This integrated chemoinformatics and structure-based framework enables robust prioritization of phytochemicals with biologically meaningful and multi-target antitubercular potential. The study provides a computationally grounded strategy for accelerating lead identification against drug-resistant TB.
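
The abstract reports the Matthews correlation coefficient (MCC) and an applicability domain filter without giving their definitions. The sketch below illustrates both in plain Python. The MCC formula is the standard one computed from confusion-matrix counts. The applicability domain check is only an assumption: it uses a nearest-neighbour Tanimoto similarity rule, which is one common choice, and the abstract does not say which criterion the authors applied. The function names, the set-of-on-bits fingerprint representation, and the 0.3 threshold are all hypothetical placeholders.

```python
from math import sqrt


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when a row or column of the matrix is empty,
    # since the coefficient is undefined in that case.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def in_applicability_domain(query: frozenset, training: list, threshold: float = 0.3) -> bool:
    """Assumed rule: the most similar training compound must reach the threshold.

    The 0.3 default is an illustrative placeholder, not a value from the study.
    """
    best = max((tanimoto(query, t) for t in training), default=0.0)
    return best >= threshold


if __name__ == "__main__":
    # Toy confusion matrix and toy fingerprints, invented for illustration only.
    print(matthews_corrcoef(tp=90, fp=10, tn=85, fn=15))
    train = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
    print(in_applicability_domain(frozenset({2, 3, 4, 5}), train))
```

In practice the fingerprints would be ECFP4 bit vectors produced by a cheminformatics toolkit; sets of integers stand in for them here so that the example has no external dependencies.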

Citation

ID: 7906
Ref Key: sajal2026ensemble

DOI: 10.1007/s10822-026-00833-2