Interaction fingerprints (IFP) have been repeatedly shown to be valuable tools in virtual screening to identify novel hit compounds that can subsequently be optimized to drug candidates. As a complementary method to ligand docking, IFPs can be applied to quantify the similarity of predicted binding poses to a reference binding pose. For this purpose, a large number of similarity metrics can be applied, and various parameters of the IFPs themselves can be customized.

In a large-scale comparison, we have assessed the effect of similarity metrics and IFP configurations on a number of virtual screening scenarios with ten different protein targets and thousands of molecules. In particular, we studied the effect of considering general interaction definitions (such as Any Contact, Backbone Interaction and Sidechain Interaction), the effect of filtering methods, and the different groups of similarity metrics. The performances were primarily compared based on AUC values, but we have also used the original similarity data to compare the similarity metrics with several statistical tests and the novel, robust sum of ranking differences (SRD) algorithm. With SRD, we can evaluate the consistency (or concordance) of the various similarity metrics with an ideal reference metric, which is provided by data fusion from the existing metrics. Different aspects of IFP configurations and similarity metrics were examined based on SRD values with analysis of variance (ANOVA) tests.

Conclusion

A general approach is provided that can be applied for the reliable interpretation and usage of similarity measures with interaction fingerprints. Metrics that are viable alternatives to the commonly used Tanimoto coefficient were identified based on a comparison with an ideal reference metric (consensus). A careful selection of the applied bits (interaction definitions) and IFP filtering rules can improve the results of virtual screening (in terms of their agreement with the consensus metric). The open-source Python package FPKit was introduced for the similarity calculations and IFP filtering; it is available at.

Interaction fingerprints are a relatively new concept in cheminformatics and molecular modeling. Just as molecular fingerprints are binary (or bitstring) representations of molecular structure, interaction fingerprints are binary (or bitstring) representations of 3D protein–ligand complexes. Each bit position of an interaction fingerprint corresponds to a specific amino acid of the protein and a specific interaction type. A value of 1 (“on”) denotes that the given interaction is established between the given amino acid and the small-molecule ligand; a value of 0 (“off”) denotes the lack of that specific interaction. Two such fingerprints are most commonly compared with the Tanimoto similarity metric (taking a value between 0 and 1, with 1 corresponding to identical fingerprints). In the most common setting, the Tanimoto similarity is calculated between a reference fingerprint (usually belonging to a known active molecule) and many query fingerprints.

Despite the straightforward definition, interaction fingerprints have been implemented by various research groups and commercial software developers with slight differences in the specifics. The first interaction fingerprint was termed structural interaction fingerprint (SIFt) and was introduced by Deng et al. This implementation originally contained seven interaction types (any contact, backbone contact, sidechain contact, polar contact, hydrophobic contact, H-bond donor and acceptor), and was later extended to include aromatic and charged interactions as well. This modified version is implemented in the popular Schrödinger molecular modeling suite, which we also applied in this work, see Table 1. A widely applied variant, simply termed interaction fingerprint (IFP), was introduced by Marcou and Rognan, containing seven interactions per residue. A similar implementation was published by Cao and Wang, containing 10 interactions per residue, and termed ligand-based interaction fingerprint (LIFt). A marked difference between SIFt and IFP is that IFP differentiates aromatic interactions by their orientations (face-to-face vs. …).
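The bitstring layout and the reference-versus-query Tanimoto comparison described in this section can be illustrated with a minimal sketch. It is not the FPKit or Schrödinger implementation: the function names `encode` and `tanimoto`, the residue labels, and the interaction-type identifiers (loosely modeled on the seven original SIFt types) are all assumptions made for illustration, as is the convention that two all-zero fingerprints count as identical.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical interaction-type labels, loosely modeled on the seven
# original SIFt types named in the text (an assumption, not a real API).
INTERACTION_TYPES = [
    "any_contact", "backbone", "sidechain", "polar",
    "hydrophobic", "hbond_donor", "hbond_acceptor",
]

Interaction = Tuple[str, str]  # (residue label, interaction type)


def encode(interactions: Iterable[Interaction], residues: List[str]) -> List[int]:
    """Return one bit per (residue, interaction type) pair, residue-major."""
    observed: Set[Interaction] = set(interactions)
    return [
        1 if (res, itype) in observed else 0
        for res in residues
        for itype in INTERACTION_TYPES
    ]


def tanimoto(fp_a: List[int], fp_b: List[int]) -> float:
    """Tanimoto similarity c / (a + b - c) of two equal-length bit vectors."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    common = sum(x & y for x, y in zip(fp_a, fp_b))   # bits on in both
    total = sum(fp_a) + sum(fp_b) - common            # bits on in either
    # Convention chosen here: two all-zero fingerprints count as identical.
    return common / total if total else 1.0


# Made-up residue labels and interactions, for illustration only.
residues = ["ASP86", "LYS33", "PHE80"]
reference = encode(
    [("ASP86", "any_contact"), ("ASP86", "hbond_acceptor"), ("PHE80", "hydrophobic")],
    residues,
)
queries = {
    "pose_1": encode([("ASP86", "any_contact"), ("ASP86", "hbond_acceptor")], residues),
    "pose_2": encode([("LYS33", "polar")], residues),
}

# Score every query pose against the reference and rank by similarity.
scores = {name: tanimoto(reference, fp) for name, fp in queries.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy case, `pose_1` shares two of the three reference interactions and adds none of its own, so it scores 2/3, while `pose_2` shares none and scores 0. Any other similarity metric can be swapped in for `tanimoto` without changing the encoding step.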