cleands.Prediction.ldv module
ldv.py — Limited Dependent Variable (LDV) Models
This module implements models for limited dependent variables where the outcome is only partially observed due to censoring, truncation, or selection processes. These models extend standard regression by explicitly accounting for restricted observability of the dependent variable.
Models currently implemented
- Tobit regression (two-limit censored normal)
- Latent variable:
y* = Xβ + ε, ε ~ N(0, σ²)
- Observed variable:
y = L if y* ≤ L (left-censored)
y = y* if L < y* < R (uncensored)
y = R if y* ≥ R (right-censored)
- Features:
Supports left- and/or right-censoring (finite or infinite).
Fits parameters by maximum likelihood (L-BFGS-B).
Returns estimates of β and σ with variance-covariance matrix.
Provides log-likelihood, AIC, BIC, deviance, convergence info.
- Includes:
predict() for latent mean μ = Xβ
expected_observed() for E[y | X] under censoring
censoring_probs() for P_left, P_uncensored, P_right.
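The maximum-likelihood fit described above can be sketched outside the library with NumPy and SciPy. Only the likelihood form and the L-BFGS-B optimizer come from the description above. The function name `tobit_negloglik`, the simulated data, the OLS starting values, and the lower bound on σ are illustrative assumptions, not the module's internals.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def tobit_negloglik(params, X, y, L=0.0, R=None):
    """Negative log-likelihood of a two-limit Tobit model; params = [beta, sigma]."""
    beta, sigma = params[:-1], params[-1]
    mu = X @ beta
    left = np.zeros(len(y), dtype=bool) if L is None else y <= L
    right = np.zeros(len(y), dtype=bool) if R is None else y >= R
    mid = ~(left | right)
    ll = np.empty(len(y))
    # Uncensored observations contribute the normal density of y.
    ll[mid] = norm.logpdf(y[mid], loc=mu[mid], scale=sigma)
    # Left-censored observations contribute P(y* <= L).
    if left.any():
        ll[left] = norm.logcdf((L - mu[left]) / sigma)
    # Right-censored observations contribute P(y* >= R).
    if right.any():
        ll[right] = norm.logsf((R - mu[right]) / sigma)
    return -ll.sum()


# Simulated data (hypothetical): intercept 1.0, slope 0.5, sigma 1.0, left-censored at 0.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = np.maximum(X @ np.array([1.0, 0.5]) + rng.normal(size=n), 0.0)

# OLS on the censored outcome gives rough starting values.
beta0, *_ = np.linalg.lstsq(X, y, rcond=None)
start = np.append(beta0, np.std(y - X @ beta0))

# Keep sigma strictly positive through a box constraint.
bounds = [(None, None)] * X.shape[1] + [(1e-6, None)]
res = minimize(tobit_negloglik, start, args=(X, y, 0.0, None),
               method="L-BFGS-B", bounds=bounds)
beta_hat, sigma_hat = res.x[:-1], res.x[-1]
```

Bounding σ from below is one way to enforce σ > 0; optimizing over log σ is a common alternative that removes the need for bounds.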
Planned models
- Truncated regression
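Similar to Tobit, but observations with outcomes outside [L, R] are missing from the sample entirely rather than recorded at the limit.
The log-likelihood uses only the observed cases, with each density renormalized by the probability of falling inside [L, R].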
Useful for survey data where only responses in a restricted range are collected.
- Heckman selection model (two-step / full MLE)
Jointly models outcome and selection equations to correct for sample selection bias.
Outcome observed only if selection variable exceeds threshold.
Widely applied in labor economics, health economics, and marketing.
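Since truncated regression is only planned, the following is a minimal sketch of the likelihood such a model would maximize, not code from this module. The function name `truncated_loglik` and the simulated data are hypothetical. Each retained observation contributes the normal density divided by the probability of landing inside the truncation interval.

```python
import numpy as np
from scipy.stats import norm


def truncated_loglik(beta, sigma, X, y, L=0.0, R=None):
    """Log-likelihood of a truncated-normal regression on the retained sample."""
    mu = X @ beta
    # Standardized limits; an absent limit is treated as infinite.
    a = np.full_like(mu, -np.inf) if L is None else (L - mu) / sigma
    b = np.full_like(mu, np.inf) if R is None else (R - mu) / sigma
    # Density of each retained observation ...
    log_density = norm.logpdf(y, loc=mu, scale=sigma)
    # ... renormalized by the probability of falling inside the limits.
    log_inside = np.log(norm.cdf(b) - norm.cdf(a))
    return float(np.sum(log_density - log_inside))


# Simulated sample in which only outcomes above 0 are recorded at all.
rng = np.random.default_rng(1)
X_all = np.column_stack([np.ones(500), rng.normal(size=500)])
beta, sigma = np.array([1.0, 0.5]), 1.0
y_star = X_all @ beta + sigma * rng.normal(size=500)
keep = y_star > 0.0
X_obs, y_obs = X_all[keep], y_star[keep]

ll_truncated = truncated_loglik(beta, sigma, X_obs, y_obs, L=0.0)
ll_untruncated = truncated_loglik(beta, sigma, X_obs, y_obs, L=None, R=None)
```

With both limits set to None the renormalizing term is zero and the function reduces to the ordinary normal log-likelihood.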
Classes
- tobit_regressor
Core implementation of two-limit Tobit regression with MLE fitting, prediction, and inference utilities.
Factory Aliases
- TobitRegressor
Partial wrapper that exposes tobit_regressor through the PredictionModel interface for pandas DataFrame/formula use.
Examples
>>> import numpy as np, pandas as pd
>>> from cleands.Prediction.ldv import TobitRegressor
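>>> rng = np.random.default_rng(0)
>>> x1 = rng.standard_normal(100)
>>> y = np.maximum(1.0 + 0.5 * x1 + rng.standard_normal(100), 0.0)  # left-censored at L = 0
>>> df = pd.DataFrame({"x1": x1, "y": y})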
>>> model = TobitRegressor(x_vars=["x1"], y_var="y", data=df, L=0.0)
>>> model.glance
- class cleands.Prediction.ldv.tobit_regressor(x, y, L=0.0, R=None, add_intercept=False, start=None, tol=1e-8)[source]
Bases:
prediction_model, prediction_likelihood_model, variance_model
Two-limit Tobit (censored normal) regression model.
- Latent model:
y* = Xβ + ε, ε ~ N(0, σ²)
- Observed:
y = L if y* ≤ L (left-censored)
y = y* if L < y* < R (uncensored)
y = R if y* ≥ R (right-censored)
- Parameters are stored as:
params = [β, σ] with σ > 0
predict(X) returns the latent mean μ = Xβ. Use expected_observed(X) to compute E[y | X] under censoring.
- Parameters:
x (ndarray)
y (ndarray)
L (float | None)
R (float | None)
add_intercept (bool)
start (Tuple[ndarray, float] | None)
tol (float)
- predict(target)[source]
Predict latent mean μ = Xβ.
- Parameters:
target (np.ndarray) – New design matrix.
- Returns:
Latent means.
- Return type:
np.ndarray
- expected_observed(target)[source]
Predict expected observed y under censoring.
- Parameters:
target (np.ndarray) – New design matrix.
- Returns:
Expected observed values, E[y | X].
- Return type:
np.ndarray
- censoring_probs(target)[source]
Compute probabilities of being left-censored, uncensored, or right-censored.
- Parameters:
target (np.ndarray) – New design matrix.
- Returns:
(P_left, P_uncensored, P_right) for each observation.
- Return type:
Tuple[np.ndarray, np.ndarray, np.ndarray]
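The quantities returned by expected_observed and censoring_probs follow from the normal latent model in closed form. The sketch below is a standalone illustration that takes the latent mean μ and σ directly. Its function signatures are assumptions and differ from the methods above, which take a design matrix.

```python
import numpy as np
from scipy.stats import norm


def censoring_probs(mu, sigma, L=0.0, R=None):
    """Return (P_left, P_uncensored, P_right) for latent means mu."""
    mu = np.asarray(mu, dtype=float)
    p_left = np.zeros_like(mu) if L is None else norm.cdf((L - mu) / sigma)
    p_right = np.zeros_like(mu) if R is None else norm.sf((R - mu) / sigma)
    return p_left, 1.0 - p_left - p_right, p_right


def expected_observed(mu, sigma, L=0.0, R=None):
    """E[y | X] = L * P_left + R * P_right + E[y* on the uncensored region]."""
    mu = np.asarray(mu, dtype=float)
    p_left, p_mid, p_right = censoring_probs(mu, sigma, L, R)
    # Standard normal density at each finite limit; zero at an infinite one.
    phi_a = np.zeros_like(mu) if L is None else norm.pdf((L - mu) / sigma)
    phi_b = np.zeros_like(mu) if R is None else norm.pdf((R - mu) / sigma)
    # Contribution of the uncensored region: mu * P_mid + sigma * (phi_a - phi_b).
    out = mu * p_mid + sigma * (phi_a - phi_b)
    if L is not None:
        out = out + L * p_left
    if R is not None:
        out = out + R * p_right
    return out


mu = np.array([-1.0, 0.0, 1.0])
p_left, p_mid, p_right = censoring_probs(mu, sigma=1.0, L=0.0)
ey = expected_observed(mu, sigma=1.0, L=0.0)
```

With left-censoring at 0, the expected observed value is always at least as large as the latent mean, which is why predict() and expected_observed() can differ substantially near the limit.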
- evaluate_lnL(pred)[source]
Evaluate the log-likelihood at a given latent mean μ.
- Parameters:
pred (np.ndarray) – Latent mean predictions μ = Xβ.
- Returns:
Log-likelihood value.
- Return type:
float
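For reference, the two-limit Tobit log-likelihood implied by the model definition above can be written as follows, where μ_i is the i-th element of the latent mean, φ is the standard normal density, and Φ is the standard normal CDF. This is the textbook form; the parameterization used internally is not stated here.

```latex
\ln L(\beta, \sigma)
  = \sum_{i:\, L < y_i < R}
      \left[ \ln \phi\!\left( \frac{y_i - \mu_i}{\sigma} \right) - \ln \sigma \right]
  + \sum_{i:\, y_i \le L}
      \ln \Phi\!\left( \frac{L - \mu_i}{\sigma} \right)
  + \sum_{i:\, y_i \ge R}
      \ln \left[ 1 - \Phi\!\left( \frac{R - \mu_i}{\sigma} \right) \right]
```

When a limit is infinite, the corresponding sum is empty and drops out.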
- property vcov_params: ndarray
Variance-covariance matrix of parameter estimates.
- Returns:
(r+1) × (r+1) covariance matrix for [β, σ], where r is the number of coefficients in β.
- Return type:
np.ndarray
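How vcov_params is computed is not stated here. One common estimator for maximum-likelihood models is the inverse of the observed information, that is, the inverse Hessian of the negative log-likelihood at the estimate. The sketch below illustrates that approach with a finite-difference Hessian; the helper names and the toy objective are hypothetical.

```python
import numpy as np


def numerical_hessian(f, x, eps=1e-4):
    """Central-difference Hessian of a scalar function f at x."""
    x = np.asarray(x, dtype=float)
    k = x.size
    H = np.empty((k, k))
    for i in range(k):
        ei = np.zeros(k)
        ei[i] = eps
        for j in range(i, k):
            ej = np.zeros(k)
            ej[j] = eps
            # Four-point stencil for the mixed second derivative.
            H[i, j] = H[j, i] = (
                f(x + ei + ej) - f(x + ei - ej) - f(x - ei + ej) + f(x - ei - ej)
            ) / (4.0 * eps**2)
    return H


def vcov_from_negloglik(negloglik, params_hat):
    """Inverse observed information evaluated at the parameter estimate."""
    return np.linalg.inv(numerical_hessian(negloglik, params_hat))


# Toy check: a quadratic "negative log-likelihood" whose Hessian is exactly A.
A = np.array([[2.0, 0.5], [0.5, 1.0]])


def toy_negloglik(p):
    return 0.5 * p @ A @ p


vcov = vcov_from_negloglik(toy_negloglik, np.array([0.3, -0.2]))
std_errors = np.sqrt(np.diag(vcov))
```

Standard errors for [β, σ] are the square roots of the diagonal of the covariance matrix, as in the last line.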
- class cleands.Prediction.ldv.TobitRegressor(formula, data, *args, **kwargs)[source]
Bases:
PredictionModel
Convenience wrapper for Tobit regression.
The Tobit model is used for censored dependent variables, where outcomes below (or above) a threshold are recorded at the threshold rather than at their true value. This wrapper provides a formula/DataFrame interface for tobit_regressor.
- Variables:
MODEL_TYPE (ClassVar[Type[cleands.base.supervised_model]]) – Underlying model type, fixed to tobit_regressor.
- Parameters:
formula (str)
data (DataFrame)
Example
>>> model = TobitRegressor.from_formula("y ~ x1 + x2", data=df)
>>> model.predict(df[["x1", "x2"]])