cleands.Classification.lda module
Linear and Quadratic Discriminant Analysis (LDA/QDA) classifiers.
This module implements classical discriminant analysis methods:
- linear_discriminant_analysis (LDA): assumes shared covariance across classes. Provides a low-dimensional discriminant projection and class posterior probabilities under a Gaussian generative model with equal covariance.
- quadratic_discriminant_analysis (QDA): allows class-specific covariance matrices with optional ridge-type regularization for numerical stability.
- Utility: _quad_form_rows(X, A) efficiently computes row-wise quadratic forms xᵢᵀ A xᵢ using numpy.einsum, which is used by QDA.
- Factory Aliases:
  - LinearDiscriminantAnalysis: Wrapper for linear_discriminant_analysis via ClassificationModel.
  - QuadraticDiscriminantAnalysis: Wrapper for quadratic_discriminant_analysis via ClassificationModel.
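The row-wise quadratic form mentioned for the utility can be sketched with numpy.einsum as follows. This is a minimal illustration, not the module's own code: the public name quad_form_rows and the subscript string are assumptions, and the private helper may differ in detail.

```python
import numpy as np

def quad_form_rows(X, A):
    """Return x_i^T A x_i for every row x_i of X (hypothetical helper)."""
    # "ij,jk,ik->i": for each row i, sum X[i, j] * A[j, k] * X[i, k] over j and k.
    return np.einsum("ij,jk,ik->i", X, A, X)

X = np.array([[1.0, 2.0],
              [0.0, 3.0]])
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
q = quad_form_rows(X, A)  # one scalar per row of X
```

The einsum form avoids building the full (n_samples, n_samples) product X A Xᵀ just to read off its diagonal.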
Typical usage example:
>>> from cleands.Classification.lda import LinearDiscriminantAnalysis
>>> model = LinearDiscriminantAnalysis(x, y)
>>> model.tidy
>>> model.glance
- class cleands.Classification.lda.linear_discriminant_analysis(x, y)
Bases: classification_model
Linear Discriminant Analysis (LDA) classifier.
Fits class means and a pooled within-class covariance (shared across classes) to derive a linear discriminant subspace and compute class posterior probabilities under a Gaussian generative model.
- Variables:
mean_vectors (list[np.ndarray]) – Per-class mean vectors of shape (n_features, 1).
priors (np.ndarray) – Class prior probabilities of shape (n_classes,).
Sigma_within (np.ndarray) – Pooled within-class covariance matrix of shape (n_features, n_features).
overall_mean (np.ndarray) – Overall mean vector of shape (n_features, 1).
Sigma_between (np.ndarray) – Between-class scatter matrix of shape (n_features, n_features).
eigenvalues (np.ndarray) – Eigenvalues of inv(Sigma_within) Sigma_between, i.e. of the generalized eigenproblem Sigma_between v = λ Sigma_within v, sorted in descending order.
eigenvectors (np.ndarray) – Top n_classes - 1 eigenvectors forming the discriminant projection matrix of shape (n_features, n_classes-1).
- Parameters:
x (array)
y (array)
- discriminant(target)
Project data into the discriminant space.
- Parameters:
target (np.ndarray or pd.DataFrame) – Feature matrix (n_samples, n_features).
- Returns:
Discriminant scores of shape (n_samples, n_classes-1).
- Return type:
np.ndarray
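The discriminant projection can be sketched from the documented attributes: a pooled within-class covariance, a between-class scatter, and the leading n_classes - 1 eigenvectors of inv(S_w) S_b. The function name lda_projection, the n - K normalization of the pooled covariance, and projecting without centering are assumptions made for illustration; the class itself may differ on each point.

```python
import numpy as np

def lda_projection(X, y):
    """Return (eigenvalues, projection matrix) for a Fisher-style LDA sketch."""
    classes = np.unique(y)
    n, p = X.shape
    overall_mean = X.mean(axis=0)
    S_w = np.zeros((p, p))  # pooled within-class scatter
    S_b = np.zeros((p, p))  # between-class scatter
    for k in classes:
        Xk = X[y == k]
        mu_k = Xk.mean(axis=0)
        centered = Xk - mu_k
        S_w += centered.T @ centered
        d = (mu_k - overall_mean)[:, None]
        S_b += Xk.shape[0] * (d @ d.T)
    S_w /= n - len(classes)  # assumed normalization
    # Eigen-decomposition of inv(S_w) S_b, computed without forming the inverse.
    evals, evecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(evals.real)[::-1]   # descending eigenvalues
    keep = order[: len(classes) - 1]       # at most n_classes - 1 directions
    return evals.real[keep], evecs.real[:, keep]

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0, 0.0],
                    [2.0, 0.0, 1.0, 0.0],
                    [0.0, 3.0, 0.0, 1.0]])
X = np.vstack([rng.normal(loc=c, size=(30, 4)) for c in centers])
y = np.repeat([0, 1, 2], 30)
evals, W = lda_projection(X, y)
scores = X @ W  # shape (n_samples, n_classes - 1)
```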
- predict_proba(target)
Compute posterior probabilities for each class.
Implements the LDA log-posterior up to a constant and returns softmax-normalized probabilities.
- Parameters:
target (np.ndarray or pd.DataFrame) – Feature matrix (n_samples, n_features).
- Returns:
Class probabilities of shape (n_samples, n_classes).
- Return type:
np.ndarray
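Under the shared-covariance Gaussian model, the log-posterior up to a constant is δ_k(x) = xᵀ Σ⁻¹ μ_k − 0.5 μ_kᵀ Σ⁻¹ μ_k + log π_k, and a softmax over k gives the class probabilities. The sketch below illustrates that computation; the function name lda_posteriors and the (n_classes, n_features) layout of means are assumptions, not the class's internal representation.

```python
import numpy as np

def lda_posteriors(X, means, Sigma, priors):
    """Softmax of linear discriminant scores (illustrative sketch)."""
    Sinv_mu = np.linalg.solve(Sigma, means.T)          # (n_features, n_classes)
    quad = 0.5 * np.sum(means.T * Sinv_mu, axis=0)     # 0.5 * mu_k^T Sigma^-1 mu_k
    scores = X @ Sinv_mu - quad + np.log(priors)       # (n_samples, n_classes)
    scores -= scores.max(axis=1, keepdims=True)        # stabilize the softmax
    probs = np.exp(scores)
    return probs / probs.sum(axis=1, keepdims=True)

means = np.array([[0.0, 0.0],
                  [2.0, 0.0]])
Sigma = np.eye(2)
priors = np.array([0.5, 0.5])
X = np.array([[1.0, 0.0],    # midpoint between the two class means
              [0.0, 0.0]])   # exactly at the first class mean
P = lda_posteriors(X, means, Sigma, priors)
```

With equal priors and identity covariance, the midpoint row receives probability 0.5 for each class, and the second row favors the first class.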
- class cleands.Classification.lda.quadratic_discriminant_analysis(x, y, priors=None, reg=1e-6, sample_weight=None)
Bases: classification_model
Quadratic Discriminant Analysis (QDA) classifier.
QDA models each class with its own Gaussian distribution: x | y = k ~ N(μ_k, Σ_k). Predictions use the quadratic log-density with class-specific covariance, allowing non-linear decision boundaries.
- Variables:
classes (np.ndarray) – Sorted unique class labels of shape (n_classes,).
n_classes (int) – Number of classes.
priors (np.ndarray) – Class prior probabilities of shape (n_classes,).
means (np.ndarray) – Per-class mean vectors, shape (n_classes, n_features).
covs (np.ndarray) – Per-class covariance matrices, shape (n_classes, n_features, n_features).
inv_covs (np.ndarray) – Inverses of covariance matrices, same shape as covs.
log_dets (np.ndarray) – Log-determinants of covariance matrices, shape (n_classes,).
- Parameters:
x (ndarray)
y (ndarray)
priors (ndarray | None)
reg (float)
sample_weight (ndarray | None)
- decision_function(x)
Compute unnormalized class scores (log-posterior up to a constant).
- For each class k:
score_k(x) = log π_k - 0.5 * log|Σ_k| - 0.5 * (x - μ_k)ᵀ Σ_k⁻¹ (x - μ_k)
- Parameters:
x (np.ndarray or pd.DataFrame) – Feature matrix (n_samples, n_features).
- Returns:
Scores of shape (n_samples, n_classes).
- Return type:
np.ndarray
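The score formula above can be sketched directly in NumPy. This is an illustration under stated assumptions: the function name qda_scores is hypothetical, and applying the regularization as Σ_k + reg·I is one common ridge-type choice that the class may or may not use.

```python
import numpy as np

def qda_scores(X, means, covs, priors, reg=1e-6):
    """score_k(x) = log pi_k - 0.5 log|Sigma_k| - 0.5 (x - mu_k)^T Sigma_k^-1 (x - mu_k)."""
    n, p = X.shape
    n_classes = means.shape[0]
    scores = np.empty((n, n_classes))
    for k in range(n_classes):
        cov_k = covs[k] + reg * np.eye(p)            # assumed form of the ridge term
        _, log_det = np.linalg.slogdet(cov_k)        # stable log-determinant
        diff = X - means[k]
        # Row-wise quadratic form (x - mu_k)^T Sigma_k^-1 (x - mu_k)
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov_k), diff)
        scores[:, k] = np.log(priors[k]) - 0.5 * log_det - 0.5 * maha
    return scores

means = np.array([[0.0, 0.0],
                  [3.0, 3.0]])
covs = np.stack([np.eye(2), 4.0 * np.eye(2)])
priors = np.array([0.5, 0.5])
X = np.array([[0.0, 0.0],
              [3.0, 3.0]])
S = qda_scores(X, means, covs, priors)
labels = S.argmax(axis=1)  # predicted class index per row
```

Taking the row-wise argmax of the scores gives the predicted class; a softmax over each row would give posterior probabilities.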
- class cleands.Classification.lda.LinearDiscriminantAnalysis(formula, data, *args, **kwargs)
Bases: ClassificationModel
Convenience wrapper for Linear Discriminant Analysis (LDA).
LDA projects data into a lower-dimensional space that maximizes class separability, assuming normally distributed features with equal covariance matrices across classes. Provides a formula/DataFrame interface for linear_discriminant_analysis.
- Variables:
MODEL_TYPE (ClassVar[Type[cleands.base.supervised_model]]) – Underlying model type, fixed to linear_discriminant_analysis.
- Parameters:
formula (str)
data (DataFrame)
Example
>>> model = LinearDiscriminantAnalysis.from_formula("y ~ x1 + x2", data=df)
>>> model.classify(df[["x1", "x2"]])
>>> model.predict_proba(df[["x1", "x2"]])
- class cleands.Classification.lda.QuadraticDiscriminantAnalysis(formula, data, *args, **kwargs)
Bases: ClassificationModel
Convenience wrapper for Quadratic Discriminant Analysis (QDA).
QDA is similar to LDA but allows each class to have its own covariance matrix, resulting in quadratic rather than linear decision boundaries. Provides a formula/DataFrame interface for quadratic_discriminant_analysis.
- Variables:
MODEL_TYPE (ClassVar[Type[cleands.base.supervised_model]]) – Underlying model type, fixed to quadratic_discriminant_analysis.
- Parameters:
formula (str)
data (DataFrame)
Example
>>> model = QuadraticDiscriminantAnalysis.from_formula("y ~ x1 + x2 + x3", data=df)
>>> model.classify(df[["x1", "x2", "x3"]])
>>> model.predict_proba(df[["x1", "x2", "x3"]])