cleands.Distribution.dist module

dist.py

Implements parametric probability distribution models and a two-sample wrapper.

Classes:
  • two_sample – Generic wrapper to fit and compare the same distribution type on two samples.

  • multinomial – Multinomial distribution model for categorical data.

  • normal – Normal (Gaussian) distribution model with optional weighting.

  • uniform – Uniform distribution model.

Notes

  • Each distribution extends parametric_distribution_model from base.

  • Models expose .params (fitted parameters), .pdf, .cdf, and log-likelihood utilities.

class cleands.Distribution.dist.multinomial(x, w_x=None, classes=None)[source]

Bases: parametric_distribution_model

Multinomial distribution model.

Parameters:
  • x (np.ndarray) – Discrete class labels (integers 0..C-1).

  • w_x (np.ndarray, optional) – Weights. Defaults to None.

  • classes (int, optional) – Number of classes. Defaults to None, in which case max(x)+1 is used.

Variables:
  • n_classes (int) – Number of classes.

  • bins (np.ndarray) – Bin counts.

  • params (np.ndarray) – Class probabilities.
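The relationship between bins and params can be illustrated with a minimal NumPy sketch. It assumes that class probabilities are normalized (optionally weighted) label counts; the source does not confirm this is how cleands computes them, and fit_class_probabilities is a hypothetical name used only for illustration.

```python
import numpy as np

def fit_class_probabilities(x, w_x=None, classes=None):
    """Illustrative estimator, not the cleands implementation."""
    x = np.asarray(x, dtype=int)
    # Mirror the documented default: infer the class count from the largest label.
    n_classes = int(x.max()) + 1 if classes is None else int(classes)
    # Count (or sum the weights of) each label; minlength pads unseen classes with 0.
    bins = np.bincount(x, weights=w_x, minlength=n_classes)
    # Assumption: probabilities are the normalized counts.
    params = bins / bins.sum()
    return n_classes, bins, params

n_classes, bins, params = fit_class_probabilities(np.array([0, 1, 1, 2, 2, 2]), classes=4)
```

Passing classes=4 here leaves the fourth class with a count and probability of zero, which matters later when taking logarithms.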

pdf(x)[source]

Multinomial probability mass function.

Parameters:

x (ndarray)

Return type:

ndarray

cdf(x)[source]

Multinomial cumulative distribution function.

Parameters:

x (ndarray)

Return type:

ndarray
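As a rough illustration of what pdf and cdf could return for integer labels, the sketch below indexes a probability vector for the mass function and its running sum for the cumulative function. Accumulating in label order is an assumption, and the function names and example probabilities are hypothetical rather than taken from cleands.

```python
import numpy as np

params = np.array([0.2, 0.5, 0.3])  # hypothetical fitted class probabilities

def pmf(x, params):
    # Probability of each label: index the probability vector by label.
    return params[np.asarray(x, dtype=int)]

def cdf(x, params):
    # P(X <= k) for each label k, accumulating probabilities in label order.
    return np.cumsum(params)[np.asarray(x, dtype=int)]

labels = np.array([0, 2, 1])
p = pmf(labels, params)
c = cdf(labels, params)
```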

out_of_sample_log_likelihood(target)[source]

Log-likelihood for out-of-sample targets under fitted params.

Parameters:

target (ndarray)

Return type:

ndarray

out_of_sample_null_likelihood(target)[source]

Log-likelihood for out-of-sample targets under a uniform null model.

Parameters:

target (ndarray)

Return type:

ndarray

likelihood_helper(target, probs)[source]

Helper for computing multinomial log-likelihoods.

Parameters:
  • target (ndarray)

  • probs (ndarray)

Return type:

ndarray
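The fitted and null log-likelihoods can share one helper that differs only in the probability vector passed in. The sketch below is a guess at that pattern, assuming per-observation log-probabilities that the caller may sum; categorical_log_likelihood and the example values are hypothetical, not the library's code.

```python
import numpy as np

def categorical_log_likelihood(target, probs):
    target = np.asarray(target, dtype=int)
    # Per-observation log-probabilities; log(0) yields -inf for impossible labels.
    with np.errstate(divide="ignore"):
        return np.log(probs[target])

target = np.array([0, 1, 1, 2])
fitted = np.array([0.25, 0.5, 0.25])   # hypothetical fitted class probabilities
null = np.full(3, 1.0 / 3.0)           # uniform null: every class equally likely

ll_fitted = categorical_log_likelihood(target, fitted).sum()
ll_null = categorical_log_likelihood(target, null).sum()
```

Comparing ll_fitted with ll_null shows how much better the fitted probabilities explain the targets than the equal-probability baseline.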

class cleands.Distribution.dist.normal(x, w_x=None)[source]

Bases: parametric_distribution_model

Normal (Gaussian) distribution model.

Parameters:
  • x (np.ndarray) – Sample data.

  • w_x (np.ndarray, optional) – Observation weights used to compute the weighted mean and standard deviation. Defaults to None.

Variables:

params (np.ndarray) – [mean, std].
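A weighted fit producing [mean, std] might look like the following sketch. It assumes the population form of the weighted variance (no degrees-of-freedom correction), which the source does not specify; fit_normal is a hypothetical name.

```python
import numpy as np

def fit_normal(x, w_x=None):
    """Illustrative estimator, not the cleands implementation."""
    x = np.asarray(x, dtype=float)
    # Equal weights reproduce the ordinary mean and standard deviation.
    w = np.ones_like(x) if w_x is None else np.asarray(w_x, dtype=float)
    mean = np.average(x, weights=w)
    # Assumption: weighted population variance, without a bias correction.
    var = np.average((x - mean) ** 2, weights=w)
    return np.array([mean, np.sqrt(var)])

params = fit_normal([1.0, 2.0, 3.0, 4.0])
weighted = fit_normal([1.0, 2.0, 3.0, 4.0], w_x=[1, 1, 1, 5])
```

Upweighting the last observation pulls the estimated mean toward it, from 2.5 to 3.25.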

pdf(target)[source]

Normal probability density function.

Parameters:

target (ndarray)

Return type:

ndarray

cdf(target)[source]

Normal cumulative distribution function.

Parameters:

target (ndarray)

Return type:

ndarray
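For reference, the density and cumulative function of a normal distribution with parameters [mean, std] can be evaluated with NumPy and the standard library's math.erf. This is a sketch of the textbook formulas with hypothetical function names, not a reproduction of the cleands methods.

```python
import math
import numpy as np

def normal_pdf(target, mu, sigma):
    # Standardize, then apply the Gaussian density formula.
    z = (np.asarray(target, dtype=float) - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(target, mu, sigma):
    # Phi(x) = 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2)))).
    z = (np.asarray(target, dtype=float) - mu) / (sigma * math.sqrt(2.0))
    erf = np.vectorize(math.erf)  # math.erf is scalar-only, so vectorize it
    return 0.5 * (1.0 + erf(z))

density_at_mean = normal_pdf(np.array([0.0]), 0.0, 1.0)
cdf_values = normal_cdf(np.array([-1.0, 0.0, 1.0]), 0.0, 1.0)
```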

out_of_sample_log_likelihood(target)[source]

Log-likelihood of target under fitted parameters.

Parameters:

target (ndarray)

Return type:

float

out_of_sample_null_likelihood(target)[source]

Log-likelihood of target under a null model that uses the mean and standard deviation of target itself.

Parameters:

target (ndarray)

Return type:

float

likelihood_helper(target, mu, sigma)[source]

Helper for normal log-likelihood computation.

Parameters:
  • target (ndarray)

  • mu (float)

  • sigma (float)

Return type:

float
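The normal log-likelihood of n observations is -n/2 · log(2πσ²) - Σ(xᵢ - μ)² / (2σ²). The sketch below evaluates it once with externally supplied parameters and once with the target's own mean and standard deviation, which is one plausible reading of the null model; normal_log_likelihood and the example values are hypothetical.

```python
import numpy as np

def normal_log_likelihood(target, mu, sigma):
    target = np.asarray(target, dtype=float)
    n = target.size
    # Sum of log N(x_i | mu, sigma^2) over all observations, returned as a float.
    return float(-0.5 * n * np.log(2.0 * np.pi * sigma ** 2)
                 - np.sum((target - mu) ** 2) / (2.0 * sigma ** 2))

target = np.array([0.5, 1.5, 2.5])
# Hypothetical parameters fitted on some other sample.
ll_model = normal_log_likelihood(target, mu=0.0, sigma=1.0)
# Assumed null: mean and (population) std estimated from the target itself.
ll_null = normal_log_likelihood(target, target.mean(), target.std())
```

Because the target's own mean and population standard deviation maximize this likelihood, ll_null is an upper bound on what any other normal parameters can achieve for the same target.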

class cleands.Distribution.dist.uniform(x)[source]

Bases: parametric_distribution_model

Uniform distribution model.

Parameters:

x (np.ndarray) – Sample data.

Variables:

params (np.ndarray) – [lower, upper].

pdf(target)[source]

Uniform probability density function.

Parameters:

target (ndarray)

Return type:

ndarray

cdf(target)[source]

Uniform cumulative distribution function.

Parameters:

target (ndarray)

Return type:

ndarray
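With params of the form [lower, upper], the uniform density and cumulative function reduce to a constant and a clipped ramp. The sketch below assumes the bounds are the sample minimum and maximum, which the source does not state for the fitted model; the function names are hypothetical.

```python
import numpy as np

def uniform_pdf(target, lower, upper):
    target = np.asarray(target, dtype=float)
    # Constant density inside [lower, upper], zero outside.
    inside = (target >= lower) & (target <= upper)
    return np.where(inside, 1.0 / (upper - lower), 0.0)

def uniform_cdf(target, lower, upper):
    target = np.asarray(target, dtype=float)
    # Linear ramp from 0 at lower to 1 at upper, clipped outside the support.
    return np.clip((target - lower) / (upper - lower), 0.0, 1.0)

x = np.array([2.0, 3.0, 6.0])
lower, upper = x.min(), x.max()  # assumed estimator for [lower, upper]
pdf_vals = uniform_pdf(np.array([1.0, 4.0]), lower, upper)
cdf_vals = uniform_cdf(np.array([1.0, 4.0, 7.0]), lower, upper)
```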

out_of_sample_log_likelihood(target)[source]

Log-likelihood of target under fitted parameters.

Parameters:

target (ndarray)

Return type:

float

out_of_sample_null_likelihood(target)[source]

Log-likelihood of target under a null model that uses the minimum and maximum of target itself.

Parameters:

target (ndarray)

Return type:

float

likelihood_helper(target, lower, upper)[source]

Helper for uniform log-likelihood computation.

Parameters:
  • target (ndarray)

  • lower (float)

  • upper (float)

Return type:

float
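For a uniform model, the log-likelihood of n observations is -n · log(upper - lower) when every observation lies in [lower, upper], and negative infinity otherwise. The sketch below shows that calculation under stated assumptions; how cleands treats out-of-range targets is not documented here, and uniform_log_likelihood is a hypothetical name.

```python
import numpy as np

def uniform_log_likelihood(target, lower, upper):
    target = np.asarray(target, dtype=float)
    # Any observation outside the support has zero density, so the log is -inf.
    if np.any(target < lower) or np.any(target > upper):
        return float("-inf")
    return float(-target.size * np.log(upper - lower))

target = np.array([2.5, 3.0, 5.0])
# Hypothetical bounds fitted on some other sample.
ll_model = uniform_log_likelihood(target, 2.0, 6.0)
# Assumed null: bounds taken from the target's own minimum and maximum.
ll_null = uniform_log_likelihood(target, target.min(), target.max())
ll_outside = uniform_log_likelihood(np.array([1.0, 3.0]), 2.0, 6.0)
```

The tightest interval containing the target gives the largest attainable uniform log-likelihood, so ll_null is at least ll_model whenever the fitted bounds cover the target.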