mosaicmpi.dataset.Dataset.cross_validate_imputation


Dataset.cross_validate_imputation(imputer: KNNImputer | SimpleImputer, n_folds: int = 100)

Perform k-fold cross-validation of imputation on the dataset. The data is not modified.

Parameters:
  • imputer (Union[KNNImputer, SimpleImputer]) – imputer object from scikit-learn

  • n_folds (int, optional) – number of folds, defaults to 100

Returns:

DataFrame with statistics for each gene, including pre-imputation log-mean and log-variance, and the mean and variance of the NRMSD across all folds.

Return type:

pandas.DataFrame
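
Example:

The following is a minimal sketch of the general idea behind k-fold cross-validation of imputation, written against scikit-learn and pandas only. It is not mosaicmpi's implementation and does not call this method. The helper name cv_imputation_sketch, the seed argument, the output column names, and the choice to normalize the RMSD by each column's standard deviation are all assumptions made for illustration. The log-mean and log-variance columns described above are omitted.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import KFold


def cv_imputation_sketch(df, imputer, n_folds=5, seed=0):
    """Hypothetical helper: hide known entries fold by fold, impute, score per column."""
    values = df.to_numpy(dtype=float)
    n_cols = values.shape[1]
    # (row, column) coordinates of every entry that is actually observed
    observed = np.argwhere(~np.isnan(values))
    # Assumption: normalize RMSD by the per-column standard deviation
    col_std = np.nanstd(values, axis=0)

    kfold = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    fold_scores = []
    for _, held_out in kfold.split(observed):
        rows, cols = observed[held_out].T
        masked = values.copy()          # work on a copy so the input stays untouched
        masked[rows, cols] = np.nan     # hide this fold's entries
        imputed = imputer.fit_transform(masked)

        # Squared error at the hidden entries, averaged within each column
        sq_err = pd.Series((imputed[rows, cols] - values[rows, cols]) ** 2)
        rmsd = np.sqrt(sq_err.groupby(cols).mean()).reindex(range(n_cols))
        fold_scores.append(rmsd.to_numpy() / col_std)

    fold_scores = np.array(fold_scores)
    return pd.DataFrame(
        {
            "nrmsd_mean": np.nanmean(fold_scores, axis=0),
            "nrmsd_var": np.nanvar(fold_scores, axis=0),
        },
        index=df.columns,
    )


# Small synthetic table with a few missing values (made-up data)
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(10, 2, size=(40, 3)), columns=["g1", "g2", "g3"])
demo.iloc[[1, 5, 9], [0, 1, 2]] = np.nan
stats = cv_imputation_sketch(demo, SimpleImputer(strategy="mean"), n_folds=5)
```

Because each fold masks only entries whose true values are known, the imputed values can be compared directly with the originals. Note that scikit-learn imputers drop columns that are entirely missing during fitting by default, so this sketch assumes every column keeps at least one observed value in every fold.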