Authors
Milisavljević, Andrej, Pražnikar, Jure, Bren, Urban, Jukič, Marko
Publication
Future medicinal chemistry, 2025
Abstract

Aims: Understanding protein–ligand binding site behavior is central to structure-based drug design. We analyzed amino acid composition and interactions in small-molecule binding sites of protein–ligand complexes and developed a novel method for binding site prediction.

Materials and methods: We analyzed the PDBBind+ database, which contains the largest protein–ligand binding site dataset known to us, using existing cheminformatics packages and in-house code. We used the resulting data to train a binding site prediction model.

Results: Within solvent-accessible binding regions, tryptophan, phenylalanine, tyrosine, methionine, and glycine were enriched. Interaction analysis revealed hydrophobic contacts as the most frequent, followed by hydrogen bonds, water-bridged hydrogen bonds, salt bridges, π–π, π–cation, and occasional halogen interactions. We introduced the amino acid binding site enrichment index (ABSE) to support small-molecule binding site detection, and developed a model that discriminates binding site sequences from protein surface patches with 0.91 accuracy.

Conclusions: This work offers interpretable composition–interaction relationships and a practical tool for binding site characterization. To facilitate application, we provide a free, open-source, fast binding site identification tool (AABS), available at https://gitlab.com/Jukic/aabs. We anticipate that these findings and this tool will advance binding site prediction and accelerate computationally intensive drug discovery within medicinal chemistry.
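
The abstract names an enrichment index (ABSE) but does not give its formula. Purely as an illustration of what a residue enrichment score can look like, the sketch below computes a log2 ratio of each amino acid's frequency in binding sites to its frequency on the protein surface. This is an assumed, generic definition, not the published ABSE; the function name `enrichment_index`, the `pseudocount` parameter, and the toy sequences are all hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def enrichment_index(site_residues, surface_residues, pseudocount=1.0):
    """Illustrative log2 enrichment per residue (NOT the published ABSE).

    site_residues / surface_residues: strings of one-letter residue codes
    pooled from binding sites and from surface patches, respectively.
    A pseudocount keeps the ratio finite for residues that never occur.
    """
    site = Counter(site_residues)
    surface = Counter(surface_residues)
    site_total = sum(site[a] for a in ALPHABET) + pseudocount * len(ALPHABET)
    surface_total = sum(surface[a] for a in ALPHABET) + pseudocount * len(ALPHABET)

    scores = {}
    for a in ALPHABET:
        f_site = (site[a] + pseudocount) / site_total
        f_surface = (surface[a] + pseudocount) / surface_total
        scores[a] = math.log2(f_site / f_surface)
    return scores


# Toy input, invented for this example only.
scores = enrichment_index("WWFYMG", "KKEEDDSSWG")
```

Under this assumed definition, a positive score means the residue is more frequent in binding sites than on the surface, and a negative score means the opposite. In the toy input, tryptophan scores positive and lysine negative.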