enrichment_factor
- skfp.metrics.enrichment_factor(y_true: ndarray | list[int], y_score: ndarray | list[float], fraction: float = 0.05) → float
Enrichment factor (EF).
EF at fraction X is calculated as the number of actives found by the model, divided by the expected number of actives from a random ranking. See also [1] for details.

We have n actives, N test molecules, and the fraction of top-ranked molecules to consider, given by fraction and denoted X (e.g. 0.05). Take the fraction * N compounds with the highest y_score values, and denote the number of actives among them as a. A random classifier would find fraction * n actives on average. The enrichment factor EF(X) is then defined as the ratio:

\[EF(X) = \frac{a}{X \cdot n}\]

The minimal value is 0. The maximal value depends on the fraction of actives in the dataset, and is equal to 1/X if X >= n/N, and N/n otherwise. A model as good as random guessing would get 1. Note that the values depend on both the ratio of actives in the dataset and the fraction value.

- Parameters:
y_true (array-like of shape (n_samples,)) – Ground truth (correct) target values.
y_score (array-like of shape (n_samples,)) – Target scores, e.g. probability of the positive class, or similarities to active compounds.
fraction (float, default=0.05) – Fraction of the dataset used for calculating the enrichment. Common values are 0.01 and 0.05. Note that this value affects the possible value range.
- Returns:
score – Enrichment factor value.
- Return type:
float
References
Examples
>>> import numpy as np
>>> from skfp.metrics import enrichment_factor
>>> y_true = [0, 0, 1]
>>> y_score = [0.1, 0.2, 0.7]
>>> enrichment_factor(y_true, y_score)
3.0
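
To make the definition above concrete, the following is a minimal pure-Python sketch of the formula EF(X) = a / (X * n) and of the stated upper bound. It is not the library's implementation: the helper names `enrichment_factor_sketch` and `max_enrichment_factor` are hypothetical, and the way the top-fraction size is rounded is an assumption. Results may therefore differ from `skfp.metrics.enrichment_factor` when fraction * N is not an integer, so the example uses a dataset where it is.

```python
def enrichment_factor_sketch(y_true, y_score, fraction=0.05):
    """Hypothetical helper: EF(X) = a / (X * n), as defined in the text."""
    n_total = len(y_true)                                 # N: test molecules
    n_actives = sum(1 for label in y_true if label == 1)  # n: actives
    if n_actives == 0:
        raise ValueError("at least one active is required")

    # Assumption: the top-fraction size is rounded to the nearest integer,
    # and at least one compound is always selected.
    k = max(1, round(fraction * n_total))

    # Rank compounds by descending score; sorted() is stable, so ties keep
    # their input order.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)

    # a: number of actives among the k top-ranked compounds
    a = sum(1 for _, label in ranked[:k] if label == 1)

    return a / (fraction * n_actives)


def max_enrichment_factor(n_actives, n_total, fraction):
    """Hypothetical helper: upper bound 1/X if X >= n/N, else N/n."""
    if fraction >= n_actives / n_total:
        return 1.0 / fraction
    return n_total / n_actives


# 10 molecules, 2 actives, top 20% = 2 compounds, one of which is active.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]

ef = enrichment_factor_sketch(y_true, y_score, fraction=0.2)
ef_max = max_enrichment_factor(n_actives=2, n_total=10, fraction=0.2)
print(ef, ef_max)
```

Here one of the two top-ranked compounds is active, so a = 1 and EF = 1 / (0.2 * 2) = 2.5, against an upper bound of 1 / 0.2 = 5 for this dataset and fraction.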