doubt.models.glm package

Submodules

doubt.models.glm.quantile_loss module

Implementation of the quantile loss function

doubt.models.glm.quantile_loss.quantile_loss(predictions: Sequence[float], targets: Sequence[float], quantile: float) float

Quantile loss function.

Parameters
  • predictions (sequence of floats) – Model predictions, of shape [n_samples,].

  • targets (sequence of floats) – Target values, of shape [n_samples,].

  • quantile (float) – The quantile we are seeking. Must be between 0 and 1.

Returns

The quantile loss.

Return type

float
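As a rough illustration of what this loss measures, the sketch below implements the standard pinball (quantile) loss in NumPy. Residuals where the target is at or above the prediction are weighted by quantile, and the remaining ones by 1 - quantile. The helper name pinball_loss and the choice to average over samples are assumptions made for this example; doubt may reduce the per-sample losses differently.

```python
import numpy as np


def pinball_loss(predictions, targets, quantile):
    """Mean pinball loss (hypothetical helper, not part of doubt)."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    residuals = targets - predictions
    # quantile * r when r >= 0, and (1 - quantile) * (-r) otherwise
    losses = np.where(residuals >= 0, quantile * residuals, (quantile - 1) * residuals)
    return float(losses.mean())


# At a high quantile, predicting too low costs more than predicting too high
under = pinball_loss([0.0], [1.0], quantile=0.9)  # target above prediction
over = pinball_loss([1.0], [0.0], quantile=0.9)   # target below prediction
```

With quantile=0.9, an under-prediction by one unit costs roughly nine times as much as an over-prediction by one unit, which is what pushes the minimiser towards the 90th percentile.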

doubt.models.glm.quantile_loss.smooth_quantile_loss(predictions: Sequence[float], targets: Sequence[float], quantile: float, alpha: float = 0.4) float

The smooth quantile loss function from [1].

Parameters
  • predictions (sequence of floats) – Model predictions, of shape [n_samples,].

  • targets (sequence of floats) – Target values, of shape [n_samples,].

  • quantile (float) – The quantile we are seeking. Must be between 0 and 1.

  • alpha (float, optional) – Smoothing parameter. Defaults to 0.4.

Returns

The smooth quantile loss.

Return type

float

Sources:
[1]: Songfeng Zheng (2011). Gradient Descent Algorithms for Quantile Regression With Smooth Approximation. International Journal of Machine Learning and Cybernetics.
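To make the role of alpha concrete, here is a minimal sketch of the smoothed check function as we read it from [1], namely quantile * r + alpha * log(1 + exp(-r / alpha)) for a residual r = target - prediction. The helper names smooth_check and check are hypothetical, and the formula should be verified against the paper before being relied on; this is not taken from doubt's source.

```python
import numpy as np


def smooth_check(residuals, quantile, alpha=0.4):
    """Smoothed check function, as we read it from [1] (hypothetical helper)."""
    residuals = np.asarray(residuals, dtype=float)
    # logaddexp(0, x) computes log(1 + exp(x)) without overflow
    return quantile * residuals + alpha * np.logaddexp(0.0, -residuals / alpha)


def check(residuals, quantile):
    """Exact (non-smooth) check function, for comparison."""
    residuals = np.asarray(residuals, dtype=float)
    return np.where(residuals >= 0, quantile * residuals, (quantile - 1) * residuals)


residuals = np.linspace(-3.0, 3.0, 13)  # includes the kink at 0
gap_default = np.max(smooth_check(residuals, 0.9, alpha=0.4) - check(residuals, 0.9))
gap_small = np.max(smooth_check(residuals, 0.9, alpha=0.01) - check(residuals, 0.9))
```

Under this formula the smoothed loss sits above the exact loss, with the largest gap of alpha * log(2) at a residual of zero. A smaller alpha therefore tracks the exact loss more closely, at the price of a sharper bend for the optimiser to handle.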

doubt.models.glm.quantile_regressor module

Quantile regression for generalised linear models

class doubt.models.glm.quantile_regressor.QuantileRegressor(model: Union[sklearn.linear_model._base.LinearRegression, sklearn.linear_model._glm.glm.GeneralizedLinearRegressor], max_iter: Optional[int] = None, uncertainty: float = 0.05, quantiles: Optional[Sequence[float]] = None, alpha: float = 0.4)

Bases: doubt.models._model.BaseModel

Quantile regression for generalised linear models.

This uses BFGS optimisation of the smooth quantile loss from [1].

Parameters
  • max_iter (int) – The maximal number of iterations to train the model for. Defaults to 10,000.

  • uncertainty (float) – The uncertainty in the prediction intervals. Must be between 0 and 1. Defaults to 0.05.

  • quantiles (sequence of floats or None, optional) – List of quantiles to output, as an alternative to the uncertainty argument. Ignored if uncertainty is set. If None then uncertainty is used. Defaults to None.

  • alpha (float, optional) – Smoothing parameter. Defaults to 0.4.

Examples

Fitting and predicting follows scikit-learn syntax:

>>> from doubt.datasets import Concrete
>>> from doubt.models.glm.quantile_regressor import QuantileRegressor
>>> from sklearn.linear_model import PoissonRegressor
>>> X, y = Concrete().split(random_seed=42)
>>> model = QuantileRegressor(PoissonRegressor(max_iter=10_000),
...                           uncertainty=0.05)
>>> model.fit(X, y).predict(X)[0].shape
(1030,)
>>> x = [500, 0, 0, 100, 2, 1000, 500, 20]
>>> pred, interval = model.predict(x)
>>> pred, interval
(78.50224243713622, array([ 19.27889844, 172.71408196]))

Sources:
[1]: Songfeng Zheng (2011). Gradient Descent Algorithms for Quantile Regression With Smooth Approximation. International Journal of Machine Learning and Cybernetics.
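The idea of minimising a smoothed quantile loss with BFGS can be sketched on a plain linear model using scipy.optimize.minimize. This is an illustration only and not doubt's implementation: the objective name smooth_quantile_objective, the synthetic data, the identity link, and the quantile levels 0.025 and 0.975 (chosen to match an uncertainty of 0.05) are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import minimize


def smooth_quantile_objective(weights, X, y, quantile, alpha=0.4):
    """Mean smoothed check loss of a linear model (illustrative only)."""
    residuals = y - X @ weights
    return np.mean(quantile * residuals + alpha * np.logaddexp(0.0, -residuals / alpha))


# Synthetic data: intercept column plus one feature, with Gaussian noise
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.uniform(0.0, 10.0, size=500)])
y = 2.0 + 3.0 * X[:, 1] + rng.normal(scale=2.0, size=500)

# Fit one weight vector per quantile; the two fits bound the interval
fits = {
    q: minimize(smooth_quantile_objective, x0=np.zeros(2), args=(X, y, q), method="BFGS")
    for q in (0.025, 0.975)
}
lower = X @ fits[0.025].x
upper = X @ fits[0.975].x

# Fraction of training targets that fall inside the fitted interval
coverage = np.mean((y >= lower) & (y <= upper))
```

Because the smoothed loss is differentiable everywhere, a quasi-Newton method such as BFGS can be applied directly, which would not be the case for the exact pinball loss with its kink at zero.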

fit(X: Sequence[Sequence[float]], y: Sequence[float], random_seed: Optional[int] = None)

Fit the model.

Parameters
  • X (float matrix) – The array containing the data set, either of shape (n,) or (n, f), with n being the number of samples and f being the number of features.

  • y (float array) – The target array, of shape (n,).

predict(X: Sequence[Sequence[float]]) Tuple[Union[float, numpy.ndarray], numpy.ndarray]

Compute model predictions.

Parameters

X (float matrix) – The array containing the data set, either of shape (n,) or (n, f), with n being the number of samples and f being the number of features.

Returns

The predictions, of shape (n,), and the prediction intervals, of shape (n, 2).

Return type

pair of float arrays

score(X: Sequence[float], y: Sequence[float]) float

Compute either the R^2 value or the negative pinball loss.

If uncertainty is not set in the constructor then the R^2 value will be returned, and otherwise the mean of the two negative pinball losses corresponding to the two quantiles will be returned.

The pinball loss is computed as quantile * (target - prediction) if target >= prediction, and (1 - quantile) * (prediction - target) otherwise.

Parameters
  • X (float array) – The array containing the data set, either of shape (n,) or (n, f), with n being the number of samples and f being the number of features.

  • y (float array) – The target array, of shape (n,).

Returns

The R^2 value if uncertainty is not set, and otherwise the negative pinball loss.

Return type

float
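The interval-based score described above can be sketched as follows: compute the pinball loss of the lower and upper bounds at their respective quantiles, average the two, and negate. The helper names pinball_loss and interval_score are hypothetical, and taking the quantile levels to be uncertainty / 2 and 1 - uncertainty / 2 is an assumption about how the two quantiles are derived.

```python
import numpy as np


def pinball_loss(predictions, targets, quantile):
    """Mean pinball loss (hypothetical helper, not part of doubt)."""
    residuals = np.asarray(targets, dtype=float) - np.asarray(predictions, dtype=float)
    losses = np.where(residuals >= 0, quantile * residuals, (quantile - 1) * residuals)
    return float(np.mean(losses))


def interval_score(intervals, targets, uncertainty=0.05):
    """Mean negative pinball loss of the two interval bounds (hypothetical helper)."""
    intervals = np.asarray(intervals, dtype=float)
    lower_q, upper_q = uncertainty / 2, 1 - uncertainty / 2  # assumed quantile levels
    losses = [
        pinball_loss(intervals[:, 0], targets, lower_q),
        pinball_loss(intervals[:, 1], targets, upper_q),
    ]
    return -float(np.mean(losses))


targets = [1.0, 2.0, 3.0]
tight = interval_score([[0.5, 1.5], [1.5, 2.5], [2.5, 3.5]], targets)
wide = interval_score([[-9.0, 11.0], [-8.0, 12.0], [-7.0, 13.0]], targets)
```

Both sets of intervals contain every target, but the tight ones score higher (closer to zero), so under this score a model is rewarded for intervals that are no wider than they need to be.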

Module contents