cleands.Classification.glm module

Classification models based on generalized linear models (GLMs).

This module implements classifiers built on top of GLM regression models: logistic regression for binary classification and one-vs-rest logistic regression for multi-class classification. Both models integrate with the broader cleands framework and expose tidy and glance properties that summarize the fitted model.

Classes:
logistic_classifier:

Binary classification model using logistic regression.

multinomial_classifier:

Multi-class classification model using a one-vs-rest approach with multiple logistic regressions.

Factory Aliases:
LogisticClassifier:

Wrapper for constructing a logistic_classifier via ClassificationModel.

MultinomialClassifier:

Wrapper for constructing a multinomial_classifier via ClassificationModel.

Typical usage example:

>>> from cleands.Classification.glm import logistic_classifier
>>> import numpy as np
>>> x = np.random.randn(100, 3)
>>> # Add noise so the two classes are not perfectly separable.
>>> y = (x[:, 0] + x[:, 1] + np.random.randn(100) > 0).astype(int)
>>> model = logistic_classifier(x, y)
>>> model.tidy
>>> model.glance

class cleands.Classification.glm.logistic_classifier(x, y, probability=0.5)[source]

Bases: classification_model

Logistic regression classifier.

Wraps a logistic regression model for binary classification tasks.

Variables:
  • model (logistic_regressor) – Underlying logistic regression estimator.

  • params (np.ndarray) – Estimated model parameters.

  • probability (float) – Threshold probability for classification.

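Parameters:
  • x – Feature matrix.

  • y – Binary target vector.

  • probability (float) – Threshold probability for classification. Defaults to 0.5.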

predict_proba(target)[source]

Predict class probabilities for new observations.

Parameters:

target (np.ndarray or pd.DataFrame) – Feature matrix for predictions.

Returns:

Predicted probabilities with shape (n_samples, 2).

Column 0 = probability of class 0, Column 1 = probability of class 1.

Return type:

np.ndarray
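To illustrate how the two columns relate to the probability threshold, here is a minimal NumPy sketch. It does not call cleands: the array proba is made-up data, labels_from_proba is a hypothetical helper, and the rule that class 1 is assigned when column 1 meets or exceeds the threshold is an assumption about how the classifier applies probability.

```python
import numpy as np

# Made-up probabilities in the documented layout:
# column 0 = P(class 0), column 1 = P(class 1).
proba = np.array([
    [0.9, 0.1],
    [0.4, 0.6],
    [0.5, 0.5],
    [0.2, 0.8],
])

# Each row is a probability distribution over the two classes.
assert np.allclose(proba.sum(axis=1), 1.0)


def labels_from_proba(proba, probability=0.5):
    """Assign class 1 where P(class 1) >= probability (assumed rule)."""
    return (proba[:, 1] >= probability).astype(int)


default_labels = labels_from_proba(proba)       # threshold 0.5
strict_labels = labels_from_proba(proba, 0.7)   # a stricter threshold
```

Raising the threshold can only move observations from class 1 to class 0, which trades recall on class 1 for fewer false positives.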

property tidy: DataFrame

Return parameter estimates in tidy format without confidence intervals.

Returns:

Table of variables, estimates, standard errors, test statistics, and p-values.

Return type:

pd.DataFrame

tidyci(level=0.95, ci=True)[source]

Return parameter estimates with optional confidence intervals.

Parameters:
  • level (float, optional) – Confidence level. Defaults to 0.95.

  • ci (bool, optional) – If True, include confidence intervals. Defaults to True.

Returns:

Table of coefficient estimates and statistics.

Return type:

pd.DataFrame
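The reference above does not state how the intervals are computed. A common choice for GLM coefficients is a Wald interval based on the normal approximation, and the sketch below assumes that choice. wald_interval is a hypothetical helper written for illustration, not part of cleands, and the estimate and standard error are made-up numbers.

```python
from statistics import NormalDist


def wald_interval(estimate, std_error, level=0.95):
    """Two-sided normal-approximation interval for one coefficient."""
    # Critical value z with P(-z < Z < z) = level for a standard normal Z.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * std_error, estimate + z * std_error


lower, upper = wald_interval(estimate=1.2, std_error=0.3)
lower99, upper99 = wald_interval(estimate=1.2, std_error=0.3, level=0.99)
```

A higher level gives a wider interval around the same point estimate.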

property vcov_params: ndarray

Variance–covariance matrix of parameters.

Returns:

Estimated covariance matrix.

Return type:

np.ndarray
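As a reminder of how a covariance matrix of this kind is typically read, the sketch below derives standard errors and a correlation from a made-up 2x2 matrix. It uses only NumPy and does not call cleands. The variable vcov stands in for the value of vcov_params.

```python
import numpy as np

# Made-up covariance matrix for two parameters (e.g. intercept and slope).
vcov = np.array([
    [0.04, 0.01],
    [0.01, 0.09],
])

# Standard errors are the square roots of the diagonal (the variances).
std_errors = np.sqrt(np.diag(vcov))

# Correlation between the two estimates, from the off-diagonal covariance.
corr = vcov[0, 1] / (std_errors[0] * std_errors[1])
```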

property glance: DataFrame

Return model-level summary statistics.

Returns:

Summary with fit metrics, information criteria, and classification accuracy.

Return type:

pd.DataFrame
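The reference does not list which information criteria or fit metrics are reported. Assuming the usual AIC and BIC definitions and plain accuracy, the quantities could be computed as in the sketch below. information_criteria and accuracy are hypothetical helpers, and all inputs are made-up numbers.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return (AIC, BIC) from a maximized log-likelihood."""
    aic = 2 * n_params - 2 * log_likelihood
    bic = n_params * math.log(n_obs) - 2 * log_likelihood
    return aic, bic


def accuracy(y_true, y_pred):
    """Share of observations whose predicted label matches the true label."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


aic, bic = information_criteria(log_likelihood=-40.0, n_params=4, n_obs=100)
acc = accuracy([0, 1, 1, 0, 1], [0, 1, 0, 0, 1])
```

With 100 observations, BIC penalizes each parameter by ln(100), roughly 4.6, compared with 2 for AIC, so it favors smaller models.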

class cleands.Classification.glm.multinomial_classifier(x, y)[source]

Bases: classification_model

Multinomial logistic regression classifier.

Fits one binary logistic regression per class (one-vs-rest) for multi-class classification.
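The one-vs-rest idea can be sketched without cleands: each class gets its own binary target that marks membership in that class. The labels below are made-up, and the dictionary layout is an illustrative choice, not the library's internal representation.

```python
import numpy as np

# Made-up labels with three classes.
y = np.array([0, 2, 1, 2, 0, 1])
classes = np.unique(y)  # sorted distinct labels

# One binary target per class: 1 where the label equals that class, else 0.
binary_targets = {int(k): (y == k).astype(int) for k in classes}

# Every observation belongs to exactly one class, so the targets sum to 1.
total = sum(binary_targets.values())
```

Each binary target would then be passed to its own logistic regression.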

predict_proba(target)[source]

Predict class probabilities for new observations.

Parameters:

target (np.ndarray or pd.DataFrame) – Feature matrix for predictions.

Returns:

Predicted probabilities of shape (n_samples, n_classes).

Return type:

np.ndarray
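To show how per-class outputs can form an array of shape (n_samples, n_classes), the sketch below stacks made-up one-vs-rest probabilities column by column. The row normalization step is an assumption: the reference does not say whether the returned rows sum to 1. No cleands call is involved.

```python
import numpy as np

# Made-up P(class k vs. rest) from three separate binary models, 4 samples.
p_class0 = np.array([0.7, 0.1, 0.2, 0.4])
p_class1 = np.array([0.2, 0.8, 0.3, 0.4])
p_class2 = np.array([0.3, 0.2, 0.9, 0.1])

# Stack as columns to get shape (n_samples, n_classes).
scores = np.column_stack([p_class0, p_class1, p_class2])

# Assumed step: rescale each row so it sums to 1.
proba = scores / scores.sum(axis=1, keepdims=True)

# Predicted class = column with the largest probability.
# np.argmax returns the first index when there is a tie.
predicted = proba.argmax(axis=1)
```

Normalizing rows does not change which column is largest, so the predicted class is the same with or without that step.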

tidyci(level=0.95, ci=True)[source]

Return parameter estimates for each logistic regression.

Parameters:
  • level (float, optional) – Confidence level. Defaults to 0.95.

  • ci (bool, optional) – If True, include confidence intervals. Defaults to True.

Returns:

Stacked table of coefficient estimates for all models.

Return type:

pd.DataFrame

property tidy: DataFrame

Return parameter estimates without confidence intervals.

Returns:

Table of estimates for all models.

Return type:

pd.DataFrame

property glance: DataFrame

Return model-level summary statistics for each logistic regression.

Returns:

Concatenated DataFrame of summaries.

Return type:

pd.DataFrame

class cleands.Classification.glm.LogisticClassifier(formula, data, *args, **kwargs)[source]

Bases: ClassificationModel

Convenience wrapper for the binary logistic regression classifier.

Provides a formula/DataFrame interface for the logistic_classifier, which fits a logistic regression model for binary outcomes.

Variables:

MODEL_TYPE (ClassVar[Type[cleands.base.supervised_model]]) – Underlying model type, fixed to logistic_classifier.

Parameters:
  • formula (str) – Model formula, for example "y ~ x1 + x2".

  • data (DataFrame) – Data containing the variables named in the formula.

Example

>>> model = LogisticClassifier.from_formula("y ~ x1 + x2", data=df)
>>> model.classify(df[["x1", "x2"]])
>>> model.predict_proba(df[["x1", "x2"]])

class cleands.Classification.glm.MultinomialClassifier(formula, data, *args, **kwargs)[source]

Bases: ClassificationModel

Convenience wrapper for the multinomial logistic regression classifier.

Provides a formula/DataFrame interface for the multinomial_classifier, which fits one-vs-rest logistic regressions for multi-class classification problems.

Variables:

MODEL_TYPE (ClassVar[Type[cleands.base.supervised_model]]) – Underlying model type, fixed to multinomial_classifier.

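Parameters:
  • formula (str) – Model formula, for example "y ~ x1 + x2 + x3".

  • data (DataFrame) – Data containing the variables named in the formula.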

Example

>>> model = MultinomialClassifier.from_formula("y ~ x1 + x2 + x3", data=df)
>>> model.classify(df[["x1", "x2", "x3"]])
>>> model.predict_proba(df[["x1", "x2", "x3"]])