mosaicmpi.dataset.Dataset.impute_knn
- Dataset.impute_knn(n_neighbors: int = 5, weights: Literal['distance', 'uniform'] = 'distance', cross_validate: bool = True, n_folds: int = 100)
- Imputation for completing missing values using k-Nearest Neighbors.
Each sample’s missing values are imputed using the mean value of its n_neighbors nearest neighbors, weighted according to weights. Two samples are considered close if the features that both of them have observed are close.
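The behavior described above can be illustrated with scikit-learn's KNNImputer, whose documentation uses very similar wording. That impute_knn relies on it internally is an assumption here, and the small matrix X is invented for illustration. This is only a sketch of the technique, not the method's implementation:

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy matrix (rows = samples, columns = features); np.nan marks missing values.
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

# 'uniform': plain mean of the nearest neighbors' values for the missing feature.
uniform = KNNImputer(n_neighbors=2, weights="uniform").fit_transform(X)

# 'distance': neighbors are weighted by inverse distance, so closer samples count more.
distance = KNNImputer(n_neighbors=2, weights="distance").fit_transform(X)

print(uniform)
print(distance)
```

Observed entries are left unchanged; only the NaN positions are filled. With 'distance' weighting, the imputed value for the first sample's third feature is pulled toward its closest neighbor rather than sitting at the plain average.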
- Parameters:
n_neighbors (int, optional) – Number of neighboring samples to use for imputation, defaults to 5
weights (Literal["distance", "uniform"], optional) – Weight function used in prediction, defaults to ‘distance’. Possible values:
  - ‘uniform’: uniform weights. All points in each neighborhood are weighted equally.
  - ‘distance’: weight points by the inverse of their distance. In this case, closer neighbors of a query point have a greater influence than neighbors that are farther away.
cross_validate (bool, optional) – Whether to perform k-fold cross-validation, defaults to True
n_folds (int, optional) – Number of folds for k-fold cross-validation, defaults to 100
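This page does not say how the cross-validation is scored. One common way to cross-validate an imputer is to hide a fold of the observed entries, impute them, and compare against the known values. The sketch below follows that approach using scikit-learn; the helper masked_cv_rmse, the synthetic data, and the RMSE metric are all hypothetical and are not taken from mosaicmpi:

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.model_selection import KFold


def masked_cv_rmse(X, n_neighbors=5, weights="distance", n_folds=5, seed=0):
    """Hypothetical helper: k-fold RMSE of KNN imputation on held-out observed entries."""
    X = np.asarray(X, dtype=float)
    observed = np.argwhere(~np.isnan(X))  # (row, col) index of every known entry
    folds = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    errors = []
    for _, held_out in folds.split(observed):
        rows, cols = observed[held_out].T
        X_masked = X.copy()
        X_masked[rows, cols] = np.nan  # hide this fold's known values
        imputer = KNNImputer(n_neighbors=n_neighbors, weights=weights)
        imputed = imputer.fit_transform(X_masked)
        errors.append(imputed[rows, cols] - X[rows, cols])
    return float(np.sqrt(np.mean(np.concatenate(errors) ** 2)))


# Synthetic data (assumption): 40 samples, 3 features, third feature tracks the first.
rng = np.random.default_rng(0)
X_full = rng.normal(size=(40, 3))
X_full[:, 2] = X_full[:, 0] + 0.1 * rng.normal(size=40)

X = X_full.copy()
X[::10, 0] = np.nan   # knock out a few entries in the first column
X[5::10, 1] = np.nan  # and a few in the second

rmse = masked_cv_rmse(X, n_neighbors=5, weights="distance", n_folds=5)
print(f"held-out RMSE: {rmse:.3f}")
```

The fold size here is kept smaller than the number of observed entries in any column, so no column is ever fully masked. Comparing the score across several n_neighbors or weights settings is one way such a procedure could guide parameter choice.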