mcs_similarity
- skfp.distances.mcs_similarity(mol_a: Mol, mol_b: Mol, timeout: int = 3600) → float
MCS similarity between molecules.
Computes the Maximum Common Substructure (MCS) similarity [1] between two RDKit Mol objects, using the formula:
\[sim(mol_a, mol_b) = \frac{numAtoms(MCS(mol_a, mol_b))}{numAtoms(mol_a) + numAtoms(mol_b) - numAtoms(MCS(mol_a, mol_b))}\]
The number of atoms in the MCS measures the structural overlap between the molecules. The FMCS algorithm [2] [3] [4] [5] is used for MCS computation. This measure penalizes differences in size (number of atoms) between the molecules.
The calculated similarity falls within the range \([0, 1]\).
- Parameters:
mol_a (RDKit Mol object) – First molecule.
mol_b (RDKit Mol object) – Second molecule.
timeout (int, default=3600) – MCS computation timeout.
- Returns:
similarity – MCS similarity between mol_a and mol_b.
- Return type:
float
References
Examples
>>> from rdkit.Chem import MolFromSmiles
>>> from skfp.distances import mcs_similarity
>>> mol_a = MolFromSmiles("COc1cc(CN2CCC(NC(=O)c3cncc(C)c3)CC2)c(OC)c2ccccc12")
>>> mol_b = MolFromSmiles("COc1ccccc1")
>>> sim = mcs_similarity(mol_a, mol_b)
>>> sim
0.25806451612903225
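The value in the example can be checked by hand from the formula. The sketch below is not part of skfp: `mcs_similarity_from_counts` is a hypothetical helper that applies the formula to raw atom counts. The heavy-atom counts (31 for `mol_a`, 8 for `mol_b`) are read off the SMILES strings. The MCS size of 8 is an assumption, namely that the whole of `mol_b` is contained in `mol_a`, which is consistent with the reported similarity.

```python
def mcs_similarity_from_counts(n_a: int, n_b: int, n_mcs: int) -> float:
    """Hypothetical helper: apply the MCS similarity formula to atom counts."""
    # Shared atoms divided by the atoms in the "union" of both molecules.
    return n_mcs / (n_a + n_b - n_mcs)


# Heavy-atom counts read off the SMILES strings in the example.
n_a = 31   # mol_a
n_b = 8    # mol_b
n_mcs = 8  # assumption: the MCS covers all atoms of mol_b

sim = mcs_similarity_from_counts(n_a, n_b, n_mcs)
print(sim)

# Size penalty: the same 8-atom overlap with a partner twice as large
# (62 atoms, an illustrative number) gives a lower similarity.
sim_larger = mcs_similarity_from_counts(62, n_b, n_mcs)
print(sim_larger < sim)
```

Under these assumptions the first value is 8/31, which should agree with the similarity reported in the example. The second comparison illustrates the size penalty: the overlap is unchanged, but the denominator grows with the larger molecule.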