cleands.Classification.knn module
k-Nearest Neighbors (kNN) classification models.
This module provides:
A standard kNN classifier that estimates class probabilities from the label frequencies of the k nearest training points.
A cross-validated kNN classifier that selects k by maximizing average accuracy across K folds.
Both classes integrate with the cleands classification framework and
expose the usual classification API (for example, predict_proba,
accuracy).
- class cleands.Classification.knn.k_nearest_neighbors_classifier(x, y, k=1)[source]
Bases: classification_model, k_nearest_neighbors_regressor
k-Nearest Neighbors (kNN) classifier.
Combines the classification interface with the kNN neighbor search implemented for regression. Class probabilities are computed as the empirical frequency of labels among the k nearest training samples.
- Variables:
k (int) – Number of neighbors used for prediction.
norms_train (np.ndarray) – Precomputed squared norms of training rows for fast distance computation.
- Parameters:
x (ndarray)
y (ndarray)
k (int)
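The norms_train attribute suggests the common expansion ||a - b||^2 = ||a||^2 - 2 a·b + ||b||^2, where the training-row norms are computed once and reused for every query. The following is a minimal NumPy sketch of that idea, not the cleands implementation; the function name squared_distances and the sample arrays are hypothetical.

```python
import numpy as np

def squared_distances(target, x_train, norms_train):
    """Squared Euclidean distances between rows of target and x_train.

    Hypothetical helper: expands ||a - b||^2 as ||a||^2 - 2 a.b + ||b||^2
    so that the training norms can be precomputed once.
    """
    norms_target = np.sum(target ** 2, axis=1)   # shape (n_samples,)
    cross = target @ x_train.T                   # shape (n_samples, n_train)
    d2 = norms_target[:, None] - 2.0 * cross + norms_train[None, :]
    # Rounding can leave tiny negative values; clip them to zero.
    return np.maximum(d2, 0.0)

x_train = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]])
norms_train = np.sum(x_train ** 2, axis=1)  # computed once, at fit time
target = np.array([[0.0, 0.0], [1.0, 0.0]])
d2 = squared_distances(target, x_train, norms_train)
```

Ranking neighbors by squared distance gives the same order as ranking by Euclidean distance, so the square root can be skipped.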
- predict_proba(target)[source]
Predict class probabilities for new samples.
Uses Euclidean distance in the original feature space. For each sample, the class probabilities are the label frequencies among the k nearest neighbors.
- Parameters:
target (np.ndarray) – Feature matrix of shape (n_samples, n_features).
- Returns:
Predicted probabilities of shape (n_samples, n_classes).
- Return type:
np.ndarray
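The frequency rule described for predict_proba can be sketched in plain NumPy as follows. This is an illustration under stated assumptions rather than the library's code: labels are assumed to be integers 0..n_classes-1, distance ties are broken by training-row order, and knn_predict_proba is a hypothetical name.

```python
import numpy as np

def knn_predict_proba(x_train, y_train, target, k, n_classes):
    # Pairwise squared Euclidean distances, shape (n_samples, n_train).
    diff = target[:, None, :] - x_train[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    # Indices of the k nearest training rows for each target row.
    # A stable sort breaks distance ties by training-row order (an assumption).
    nearest = np.argsort(d2, axis=1, kind="stable")[:, :k]
    neighbor_labels = y_train[nearest]           # shape (n_samples, k)
    proba = np.zeros((target.shape[0], n_classes))
    for c in range(n_classes):
        # Share of the k neighbors that carry label c.
        proba[:, c] = np.mean(neighbor_labels == c, axis=1)
    return proba

x_train = np.array([[0.0], [1.0], [2.0], [10.0], [11.0]])
y_train = np.array([0, 0, 1, 1, 1])
target = np.array([[0.5], [10.5]])
proba = knn_predict_proba(x_train, y_train, target, k=3, n_classes=2)
# proba is [[2/3, 1/3], [0, 1]]: the first point has neighbors labeled 0, 0, 1.
```

With k=1 every row of the result is a one-hot vector, so the probabilities become smoother only as k grows.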
- class cleands.Classification.knn.k_nearest_neighbors_cross_validation_classifier(x, y, k_max=25, folds=5, seed=None)[source]
Bases: k_nearest_neighbors_classifier
kNN classifier with cross-validated k.
Selects the number of neighbors k by K-fold cross-validation, maximizing mean accuracy over the validation folds, and then fits a final kNN classifier using the selected k.
- Variables:
k (int) – Selected number of neighbors after cross-validation.
- Parameters:
x (ndarray)
y (ndarray)
k_max (int)
folds (int)
seed (int | None)
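The selection procedure described above (try each k from 1 to k_max, score it by mean validation accuracy over the folds, keep the best) might look like the sketch below. It is not the cleands implementation: the helper names, the shuffling with numpy.random.default_rng, the cap on k by training-split size, and the tie-break toward the smallest k are all assumptions made for illustration.

```python
import numpy as np

def knn_classify(x_train, y_train, target, k):
    # Majority vote among the k nearest training rows (integer labels assumed).
    diff = target[:, None, :] - x_train[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    nearest = np.argsort(d2, axis=1, kind="stable")[:, :k]
    votes = y_train[nearest]
    n_classes = int(y_train.max()) + 1
    counts = np.stack([(votes == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)

def select_k(x, y, k_max=25, folds=5, seed=None):
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(y)), folds)
    # k cannot usefully exceed the size of the smallest training split.
    k_limit = min(k_max, len(y) - max(len(p) for p in parts))
    mean_acc = []
    for k in range(1, k_limit + 1):
        accs = []
        for i in range(folds):
            val = parts[i]
            train = np.concatenate([parts[j] for j in range(folds) if j != i])
            pred = knn_classify(x[train], y[train], x[val], k)
            accs.append(np.mean(pred == y[val]))
        mean_acc.append(np.mean(accs))
    best_k = int(np.argmax(mean_acc)) + 1  # ties go to the smallest k
    return best_k, np.array(mean_acc)

# Two well-separated, deterministic clusters of 30 points each.
line = np.linspace(0.0, 1.0, 30)
x = np.vstack([np.column_stack([line, line]), np.column_stack([line, line]) + 10.0])
y = np.repeat([0, 1], 30)
best_k, mean_acc = select_k(x, y, k_max=10, folds=5, seed=0)
```

After selection, a final classifier would be fit on all of x and y with the chosen k, which matches the description of the class above.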
- class cleands.Classification.knn.kNearestNeighborsClassifier(formula, data, *args, **kwargs)[source]
Bases: ClassificationModel
Convenience wrapper for k-nearest neighbors classification.
Provides a formula/DataFrame interface for the k_nearest_neighbors_classifier, which predicts class labels based on the majority vote among the nearest neighbors.
- Variables:
MODEL_TYPE (ClassVar[Type[cleands.base.supervised_model]]) – Underlying model type, fixed to k_nearest_neighbors_classifier.
- Parameters:
formula (str)
data (DataFrame)
Example
>>> model = kNearestNeighborsClassifier.from_formula("y ~ x1 + x2", data=df, k=5)
>>> model.classify(df[["x1", "x2"]])
>>> model.predict_proba(df[["x1", "x2"]])
- class cleands.Classification.knn.kNearestNeighborsCrossValidationClassifier(formula, data, *args, **kwargs)[source]
Bases: ClassificationModel
Convenience wrapper for cross-validated k-nearest neighbors classification.
Selects the number of neighbors k via K-fold cross-validation and provides a formula/DataFrame interface for the resulting k_nearest_neighbors_cross_validation_classifier.
- Variables:
MODEL_TYPE (ClassVar[Type[cleands.base.supervised_model]]) – Underlying model type, fixed to k_nearest_neighbors_cross_validation_classifier.
- Parameters:
formula (str)
data (DataFrame)
Example
>>> model = kNearestNeighborsCrossValidationClassifier.from_formula("y ~ x1 + x2", data=df)
>>> model.classify(df[["x1", "x2"]])
>>> model.predict_proba(df[["x1", "x2"]])