load_pampa_approved_drugs
- skfp.datasets.tdc.adme.load_pampa_approved_drugs(data_dir: str | PathLike | None = None, as_frame: bool = False, verbose: bool = False) → DataFrame | tuple[list[str]] | ndarray
Load the approved drugs subset of the PAMPA dataset.
PAMPA (parallel artificial membrane permeability assay) is an assay that evaluates drug permeability across the cell membrane. The task models only passive membrane diffusion [1] [2]. This is the “approved drugs” subset, which contains 142 marketed approved drugs assessed by NCATS [1].
This dataset is part of the “absorption” subset of ADME tasks.
Tasks: 1
Task type: classification
Total samples: 142
Recommended split: scaffold
Recommended metric: AUROC
- Parameters:
data_dir ({None, str, path-like}, default=None) – Path to the root data directory. If None, the currently set scikit-learn directory is used, by default $HOME/scikit_learn_data.
as_frame (bool, default=False) – If True, returns the raw DataFrame with columns: “SMILES”, “label”. Otherwise, returns SMILES as a list of strings, and labels as a NumPy array (1D integer binary vector).
verbose (bool, default=False) – If True, a progress bar is shown while downloading or loading files.
- Returns:
data – Depending on the as_frame argument, one of:
- Pandas DataFrame with columns: “SMILES”, “label”
- tuple of: list of strings (SMILES), NumPy array (labels)
- Return type:
pd.DataFrame or tuple(list[str], np.ndarray)
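Examples
The two return layouts described above carry the same information, so one can be derived from the other. The following is a minimal sketch, not part of the library: the helpers frame_to_xy and load_both_layouts are hypothetical names introduced here, and the small DataFrame uses placeholder values that are not taken from the dataset. It only assumes the documented column names “SMILES” and “label”. The actual loader call is kept inside a function because it needs scikit-fingerprints installed and may download files on first use.

```python
import numpy as np
import pandas as pd


def frame_to_xy(df: pd.DataFrame) -> tuple[list[str], np.ndarray]:
    """Convert the as_frame=True layout into the (SMILES, labels) tuple layout.

    Hypothetical helper; relies only on the documented "SMILES" and "label" columns.
    """
    smiles = df["SMILES"].tolist()
    labels = df["label"].to_numpy(dtype=int)  # 1D integer binary vector
    return smiles, labels


def load_both_layouts():
    """Call the loader in both modes (hypothetical wrapper, not executed here).

    Requires scikit-fingerprints; the first call may download the dataset.
    """
    from skfp.datasets.tdc.adme import load_pampa_approved_drugs

    smiles, labels = load_pampa_approved_drugs()   # as_frame=False (default)
    df = load_pampa_approved_drugs(as_frame=True)  # raw DataFrame
    return smiles, labels, df


# Placeholder frame with the documented columns; the rows are illustrative
# values, NOT entries from the PAMPA approved drugs subset.
toy = pd.DataFrame(
    {"SMILES": ["CCO", "c1ccccc1", "CC(=O)O"], "label": [1, 0, 1]}
)
toy_smiles, toy_labels = frame_to_xy(toy)
print(len(toy_smiles), toy_labels.shape)
```

With the real data, calling load_both_layouts() and then frame_to_xy(df) should give the same SMILES list and label vector as the default tuple output, assuming the loader behaves as documented above.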
References