cleands.Classification.glm module
Classification models based on generalized linear models (GLMs).
This module implements classifiers built on top of GLM regression models: logistic regression for binary classification and multinomial (one-vs-rest) logistic regression for multi-class classification. Both models integrate with the broader cleands framework and expose tidy and glance properties for summarized model output.
- Classes:
- logistic_classifier:
Binary classification model using logistic regression.
- multinomial_classifier:
Multi-class classification model using a one-vs-rest approach with multiple logistic regressions.
- Factory Aliases:
- LogisticClassifier:
Wrapper for constructing a logistic_classifier via ClassificationModel.
- MultinomialClassifier:
Wrapper for constructing a multinomial_classifier via ClassificationModel.
Typical usage example:
>>> from cleands.Classification.glm import LogisticClassifier
>>> import numpy as np
>>> x = np.random.randn(100, 3)
>>> y = (x[:, 0] + x[:, 1] > 0).astype(int)
>>> model = LogisticClassifier(x, y)
>>> model.tidy
>>> model.glance
- class cleands.Classification.glm.logistic_classifier(x, y, probability=0.5)[source]
Bases:
classification_model
Logistic regression classifier.
Wraps a logistic regression model for binary classification tasks.
- Variables:
model (logistic_regressor) – Underlying logistic regression estimator.
params (np.ndarray) – Estimated model parameters.
probability (float) – Threshold probability for classification.
- Parameters:
probability (float)
- predict_proba(target)[source]
Predict class probabilities for new observations.
- Parameters:
target (np.ndarray or pd.DataFrame) – Feature matrix for predictions.
- Returns:
Predicted probabilities with shape (n_samples, 2). Column 0 is the probability of class 0; column 1 is the probability of class 1.
- Return type:
np.ndarray
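The relationship between the two probability columns and the probability threshold can be sketched in plain Python. This is an illustration only, not the cleands implementation: the helper names, the intercept-first parameter layout, and the strict `>` comparison against the threshold are all assumptions.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(rows, params):
    """Return [P(class 0), P(class 1)] for each row.

    Assumption: params[0] is the intercept and params[1:] are slopes.
    """
    out = []
    for row in rows:
        z = params[0] + sum(b * v for b, v in zip(params[1:], row))
        p1 = sigmoid(z)
        out.append([1.0 - p1, p1])
    return out


def classify(proba, probability=0.5):
    """Label an observation 1 when P(class 1) exceeds the threshold (assumed rule)."""
    return [int(p[1] > probability) for p in proba]


params = [0.0, 1.0, 1.0]  # hypothetical fitted coefficients
rows = [[2.0, 1.0], [-1.0, -2.0], [0.5, -0.5]]
proba = predict_proba(rows, params)
labels = classify(proba, probability=0.5)
```

Lowering the threshold moves borderline observations into class 1, which is the role the probability argument plays in the constructor.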
- property tidy: DataFrame
Return parameter estimates in tidy format without confidence intervals.
- Returns:
Table of variables, estimates, standard errors, test statistics, and p-values.
- Return type:
pd.DataFrame
- tidyci(level=0.95, ci=True)[source]
Return parameter estimates with optional confidence intervals.
- Parameters:
level (float, optional) – Confidence level. Defaults to 0.95.
ci (bool, optional) – If True, include confidence intervals. Defaults to True.
- Returns:
Table of coefficient estimates and statistics.
- Return type:
pd.DataFrame
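As a rough sketch of what a coefficient table with optional confidence intervals contains, the following builds Wald-style rows from estimates and standard errors. The normal approximation, the function name, and the column names are assumptions made for illustration; the library's actual columns may differ.

```python
from statistics import NormalDist


def tidy_rows(names, estimates, std_errors, level=0.95, ci=True):
    """Build one dict per coefficient; add interval bounds when ci is True."""
    norm = NormalDist()
    # Two-sided critical value, e.g. about 1.96 for level=0.95.
    crit = norm.inv_cdf(0.5 + level / 2.0)
    rows = []
    for name, est, se in zip(names, estimates, std_errors):
        z = est / se
        row = {
            "variable": name,      # hypothetical column names
            "estimate": est,
            "std_error": se,
            "statistic": z,
            "p_value": 2.0 * (1.0 - norm.cdf(abs(z))),
        }
        if ci:
            row["ci_lower"] = est - crit * se
            row["ci_upper"] = est + crit * se
        rows.append(row)
    return rows


with_ci = tidy_rows(["x1"], [1.0], [0.5], level=0.95, ci=True)
without_ci = tidy_rows(["x1"], [1.0], [0.5], ci=False)
```

With ci=False the rows carry only the estimate, standard error, test statistic, and p-value, which matches the description of the tidy property.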
- property vcov_params: ndarray
Variance–covariance matrix of parameters.
- Returns:
Estimated covariance matrix.
- Return type:
np.ndarray
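A covariance matrix of this kind is typically read in two ways: the square roots of its diagonal give the standard errors, and the off-diagonal entries give the correlations between estimates. The sketch below uses a made-up 2x2 matrix as a nested list rather than output from the library.

```python
import math

# Hypothetical covariance matrix for two parameters (not real model output).
vcov = [
    [0.04, 0.01],
    [0.01, 0.09],
]

# Standard errors are the square roots of the diagonal entries.
std_errors = [math.sqrt(vcov[i][i]) for i in range(len(vcov))]

# Correlation between the two estimates from the off-diagonal entry.
correlation = vcov[0][1] / (std_errors[0] * std_errors[1])
```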
- property glance: DataFrame
Return model-level summary statistics.
- Returns:
Summary with fit metrics, information criteria, and classification accuracy.
- Return type:
pd.DataFrame
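The kinds of quantities such a summary reports can be sketched from the observed labels and fitted class-1 probabilities. The function name, the dictionary keys, and the exact set of metrics are assumptions; the formulas are the standard Bernoulli log-likelihood, AIC, and BIC.

```python
import math


def glance_summary(y, p1, n_params, probability=0.5):
    """Compute log-likelihood, AIC, BIC, and accuracy for a binary model."""
    n = len(y)
    # Bernoulli log-likelihood of the observed labels.
    loglik = sum(
        math.log(p) if yi == 1 else math.log(1.0 - p)
        for yi, p in zip(y, p1)
    )
    predicted = [int(p > probability) for p in p1]
    accuracy = sum(int(a == b) for a, b in zip(y, predicted)) / n
    return {
        "log_likelihood": loglik,  # hypothetical key names
        "aic": 2 * n_params - 2 * loglik,
        "bic": n_params * math.log(n) - 2 * loglik,
        "accuracy": accuracy,
    }


y = [1, 0, 1, 0]
p1 = [0.9, 0.2, 0.4, 0.1]  # made-up fitted probabilities
summary = glance_summary(y, p1, n_params=2)
```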
- class cleands.Classification.glm.multinomial_classifier(x, y)[source]
Bases:
classification_model
Multinomial logistic regression classifier.
Fits a one-vs-rest set of logistic regressions for multi-class classification.
- predict_proba(target)[source]
Predict class probabilities for new observations.
- Parameters:
target (np.ndarray or pd.DataFrame) – Feature matrix for predictions.
- Returns:
Predicted probabilities of shape (n_samples, n_classes).
- Return type:
np.ndarray
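A one-vs-rest scheme can be sketched as one logistic score per class, followed by a normalization so each row sums to 1. Whether cleands normalizes this way is an assumption, as are the helper names and the intercept-first parameter layout.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def ovr_predict_proba(rows, class_params):
    """Return one probability per class for each row.

    Each entry of class_params is [intercept, slope_1, ...] for the
    "this class vs. the rest" logistic regression (assumed layout).
    """
    out = []
    for row in rows:
        scores = [
            sigmoid(p[0] + sum(b * v for b, v in zip(p[1:], row)))
            for p in class_params
        ]
        total = sum(scores)
        # Rescale so the per-class scores sum to 1 (assumed normalization).
        out.append([s / total for s in scores])
    return out


# Hypothetical coefficients for three classes and two features.
class_params = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, -1.0, -1.0]]
rows = [[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]
proba = ovr_predict_proba(rows, class_params)

# The predicted class is the column with the largest probability.
labels = [max(range(len(r)), key=r.__getitem__) for r in proba]
```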
- tidyci(level=0.95, ci=True)[source]
Return parameter estimates for each logistic regression.
- Parameters:
level (float, optional) – Confidence level. Defaults to 0.95.
ci (bool, optional) – If True, include confidence intervals. Defaults to True.
- Returns:
Stacked table of coefficient estimates for all models.
- Return type:
pd.DataFrame
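Stacking per-model tables amounts to tagging each coefficient row with the class its regression belongs to and concatenating the results. The sketch below uses plain dictionaries; the "class" column and the other key names are hypothetical.

```python
def stack_tables(tables_by_class):
    """Concatenate per-class coefficient rows, tagging each with its class."""
    stacked = []
    for label, rows in tables_by_class.items():
        for row in rows:
            stacked.append({"class": label, **row})
    return stacked


# Made-up coefficient rows for two one-vs-rest models.
tables_by_class = {
    "a": [{"variable": "x1", "estimate": 0.5}],
    "b": [{"variable": "x1", "estimate": -0.3}],
}
stacked = stack_tables(tables_by_class)
```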
- property tidy: DataFrame
Return parameter estimates without confidence intervals.
- Returns:
Table of estimates for all models.
- Return type:
pd.DataFrame
- property glance: DataFrame
Return model-level summary statistics for each logistic regression.
- Returns:
Concatenated DataFrame of summaries.
- Return type:
pd.DataFrame
- class cleands.Classification.glm.LogisticClassifier(formula, data, *args, **kwargs)[source]
Bases:
ClassificationModel
Convenience wrapper for binary logistic regression classifier.
Provides a formula/DataFrame interface for the logistic_classifier, which fits a logistic regression model for binary outcomes.
- Variables:
MODEL_TYPE (ClassVar[Type[cleands.base.supervised_model]]) – Underlying model type, fixed to logistic_classifier.
- Parameters:
formula (str)
data (DataFrame)
Example
>>> model = LogisticClassifier.from_formula("y ~ x1 + x2", data=df)
>>> model.classify(df[["x1", "x2"]])
>>> model.predict_proba(df[["x1", "x2"]])
- class cleands.Classification.glm.MultinomialClassifier(formula, data, *args, **kwargs)[source]
Bases:
ClassificationModel
Convenience wrapper for multinomial logistic regression classifier.
Provides a formula/DataFrame interface for the multinomial_classifier, which fits a multinomial (one-vs-rest) logistic regression model for multi-class classification problems.
- Variables:
MODEL_TYPE (ClassVar[Type[cleands.base.supervised_model]]) – Underlying model type, fixed to multinomial_classifier.
- Parameters:
formula (str)
data (DataFrame)
Example
>>> model = MultinomialClassifier.from_formula("y ~ x1 + x2 + x3", data=df)
>>> model.classify(df[["x1", "x2", "x3"]])
>>> model.predict_proba(df[["x1", "x2", "x3"]])