mcs_distance

skfp.distances.mcs_distance(mol_a: Mol, mol_b: Mol, timeout: int = 3600) → float

MCS distance between molecules.

Computes the Maximum Common Substructure (MCS) distance [1] between two RDKit Mol objects by subtracting the similarity value from 1:

\[dist(a, b) = 1 - sim(a, b)\]

The number of atoms in the MCS measures the structural overlap between the molecules. The FMCS algorithm [2] [3] [4] [5] is used for the MCS computation. This measure penalizes differences in size (number of atoms) between the molecules.

See also mcs_binary_similarity(). The calculated distance falls within the range \([0, 1]\).

Parameters:
  • mol_a (RDKit Mol object) – First molecule.

  • mol_b (RDKit Mol object) – Second molecule.

  • timeout (int, default=3600) – MCS computation timeout, in seconds.

Returns:

distance – MCS distance between mol_a and mol_b.

Return type:

float

References

Examples

>>> from rdkit.Chem import MolFromSmiles
>>> from skfp.distances import mcs_distance
>>> mol_a = MolFromSmiles("COc1cc(CN2CCC(NC(=O)c3cncc(C)c3)CC2)c(OC)c2ccccc12")
>>> mol_b = MolFromSmiles("COc1ccccc1")
>>> dist = mcs_distance(mol_a, mol_b)
>>> dist
0.7419354838709677
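
The value above can be checked by hand. The following is a minimal sketch, not the library's implementation: it assumes that the similarity is the MCS atom count divided by the number of atoms in the "union" of both molecules, a definition that is not spelled out in the text above. The helper name `mcs_distance_from_counts` and its parameters are hypothetical. In practice the counts could be obtained with RDKit's `Mol.GetNumAtoms()` and the `numAtoms` attribute of the result of `rdFMCS.FindMCS`.

```python
def mcs_distance_from_counts(n_a: int, n_b: int, n_mcs: int) -> float:
    """Distance from atom counts, under an ASSUMED Tanimoto-like similarity.

    n_a, n_b: heavy-atom counts of the two molecules.
    n_mcs: atom count of their maximum common substructure.
    """
    if n_mcs < 0 or n_mcs > min(n_a, n_b):
        raise ValueError("n_mcs must lie between 0 and min(n_a, n_b)")
    union = n_a + n_b - n_mcs
    if union == 0:
        raise ValueError("both molecules are empty")
    # Assumption: sim(a, b) = |MCS| / (|a| + |b| - |MCS|)
    sim = n_mcs / union
    # dist(a, b) = 1 - sim(a, b), as in the formula above
    return 1 - sim


# mol_a has 31 heavy atoms and mol_b (anisole) has 8; anisole is fully
# contained in mol_a, so the MCS has 8 atoms.
dist = mcs_distance_from_counts(n_a=31, n_b=8, n_mcs=8)
print(dist)  # 23/31, approximately 0.742
```

Under this assumption the distance is large even though mol_b is entirely contained in mol_a, which illustrates how the measure penalizes the difference in size.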