sokal_sneath_2_binary_similarity
- skfp.distances.sokal_sneath_2_binary_similarity(vec_a: ndarray | csr_array, vec_b: ndarray | csr_array) → float
Sokal-Sneath similarity 2 for vectors of binary values.
Computes the Sokal-Sneath similarity 2 [1] [2] [3] for binary data between two input arrays or sparse matrices, using the formula:
\[sim(a, b) = \frac{|a \cap b|}{|a \cup b| + |a \Delta b|} = \frac{|a \cap b|}{2 |a| + 2 |b| - 3 |a \cap b|}\]
where \(|a \Delta b|\) is the size of the symmetric difference (XOR), i.e. the number of bits that are “on” in one vector and “off” in the other.
The calculated similarity falls within the range \([0, 1]\). Passing all-zero vectors to this function results in a similarity of 0.
- Parameters:
vec_a ({ndarray, sparse matrix}) – First binary input array or sparse matrix.
vec_b ({ndarray, sparse matrix}) – Second binary input array or sparse matrix.
- Returns:
similarity – Sokal-Sneath similarity 2 between vec_a and vec_b.
- Return type:
float
References
Examples
>>> from skfp.distances import sokal_sneath_2_binary_similarity
>>> import numpy as np
>>> vec_a = np.array([1, 1, 1, 1])
>>> vec_b = np.array([1, 1, 0, 0])
>>> sim = sokal_sneath_2_binary_similarity(vec_a, vec_b)
>>> sim
0.3333333333333333

>>> from scipy.sparse import csr_array
>>> vec_a = csr_array([[1, 1, 1, 1]])
>>> vec_b = csr_array([[1, 1, 0, 0]])
>>> sim = sokal_sneath_2_binary_similarity(vec_a, vec_b)
>>> sim
0.3333333333333333
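The value in the examples can be checked by hand against the formula in the description. The following is a minimal sketch using only NumPy; the helper name `sokal_sneath_2_by_hand` is hypothetical and is not part of skfp, and the zero-denominator branch is an assumption that simply mirrors the documented result of 0 for all-zero vectors, not a description of how the library handles it internally.

```python
import numpy as np


def sokal_sneath_2_by_hand(vec_a: np.ndarray, vec_b: np.ndarray) -> float:
    """Illustrative re-computation of the formula; not the skfp implementation."""
    a = np.asarray(vec_a, dtype=bool)
    b = np.asarray(vec_b, dtype=bool)

    intersection = int(np.sum(a & b))  # |a ∩ b|: bits on in both vectors
    union = int(np.sum(a | b))         # |a ∪ b|: bits on in at least one vector
    sym_diff = int(np.sum(a ^ b))      # |a Δ b|: bits on in exactly one vector

    denominator = union + sym_diff
    # The second form of the formula gives the same denominator.
    assert denominator == 2 * int(a.sum()) + 2 * int(b.sum()) - 3 * intersection

    # Assumption: return 0 for all-zero inputs, as the description states.
    if denominator == 0:
        return 0.0
    return intersection / denominator


# Same inputs as the examples above: |a ∩ b| = 2, |a ∪ b| = 4, |a Δ b| = 2.
sim = sokal_sneath_2_by_hand(np.array([1, 1, 1, 1]), np.array([1, 1, 0, 0]))
zero_sim = sokal_sneath_2_by_hand(np.zeros(4, dtype=int), np.zeros(4, dtype=int))
```

Here `sim` works out to 2 / (4 + 2), which matches the 0.3333333333333333 shown in the examples, and `zero_sim` is 0.0. Identical non-zero vectors give 1.0, the upper end of the documented range, because the symmetric difference is empty and the union equals the intersection.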