mapie.regression.ConformalizedQuantileRegressor
- class mapie.regression.ConformalizedQuantileRegressor(estimator: Optional[Union[RegressorMixin, Pipeline, List[Union[RegressorMixin, Pipeline]]]] = None, confidence_level: float = 0.9, prefit: bool = False)
Computes prediction intervals using the conformalized quantile regression technique:
The fit method fits three models to the training data using the provided regressor: a model to predict the target, and models to predict upper and lower quantiles around the target.
The conformalize method estimates the uncertainty of the quantile models using the conformalization set.
The predict_interval method computes point predictions and prediction intervals.
- Parameters
- estimator: Union[RegressorMixin, Pipeline, List[Union[RegressorMixin, Pipeline]]]
The regressor used to predict points and quantiles.
When prefit=False (default), a single regressor that supports the quantile loss must be passed. Valid options:
sklearn.linear_model.QuantileRegressor
sklearn.ensemble.GradientBoostingRegressor
sklearn.ensemble.HistGradientBoostingRegressor
lightgbm.LGBMRegressor
When prefit=True, a list of three fitted quantile regressors predicting the lower, upper, and median quantiles must be passed (in that order). These quantiles must be:
lower quantile = (1 - confidence_level) / 2
upper quantile = (1 + confidence_level) / 2
median quantile = 0.5
- confidence_level: float, default=0.9
The confidence level for the prediction intervals, that is, their desired coverage probability.
- prefit: bool, default=False
If True, three fitted quantile regressors must be provided, and the fit method must be skipped.
If False, the three regressors are fitted when the fit method is called.
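The quantile levels required with prefit=True follow directly from confidence_level. A minimal sketch of that arithmetic is shown here; the helper name prefit_quantile_levels is hypothetical and not part of MAPIE.

```python
from typing import Tuple


def prefit_quantile_levels(confidence_level: float) -> Tuple[float, float, float]:
    """Return the (lower, upper, median) quantile levels for a confidence level.

    Hypothetical helper for illustration only; it mirrors the formulas
    listed for the prefit=True case.
    """
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must lie strictly between 0 and 1")
    lower = (1 - confidence_level) / 2
    upper = (1 + confidence_level) / 2
    return lower, upper, 0.5


# Same order as the list expected when prefit=True: lower, upper, median.
lower, upper, median = prefit_quantile_levels(0.9)
print(lower, upper, median)  # approximately 0.05 0.95 0.5
```

With the default confidence_level of 0.9, the three prefitted regressors would therefore target roughly the 5%, 95%, and 50% quantiles.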
Examples
>>> from mapie.regression import ConformalizedQuantileRegressor
>>> from mapie.utils import train_conformalize_test_split
>>> from sklearn.datasets import make_regression
>>> from sklearn.linear_model import QuantileRegressor
>>> X, y = make_regression(n_samples=500, n_features=2, noise=1.0)
>>> (
...     X_train, X_conformalize, X_test,
...     y_train, y_conformalize, y_test
... ) = train_conformalize_test_split(
...     X, y, train_size=0.6, conformalize_size=0.2, test_size=0.2, random_state=1
... )
>>> mapie_regressor = ConformalizedQuantileRegressor(
...     estimator=QuantileRegressor(),
...     confidence_level=0.95,
... ).fit(X_train, y_train).conformalize(X_conformalize, y_conformalize)
>>> predicted_points, predicted_intervals = mapie_regressor.predict_interval(X_test)
- __init__(estimator: Optional[Union[RegressorMixin, Pipeline, List[Union[RegressorMixin, Pipeline]]]] = None, confidence_level: float = 0.9, prefit: bool = False) -> None
- conformalize(X_conformalize: ArrayLike, y_conformalize: ArrayLike, predict_params: Optional[dict] = None) -> ConformalizedQuantileRegressor
Estimates the uncertainty of the quantile regressors by computing conformity scores on the conformalization set.
- Parameters
- X_conformalize: ArrayLike
Features of the conformalization set.
- y_conformalize: ArrayLike
Targets of the conformalization set.
- predict_params: Optional[dict], default=None
Parameters to pass to the predict method of the regressors. These parameters will also be used in the predict_interval and predict methods of this ConformalizedQuantileRegressor.
- Returns
- Self
The ConformalizedQuantileRegressor instance.
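To make the idea of conformity scores concrete, here is a minimal NumPy sketch of the textbook conformalized quantile regression score and the resulting single correction constant. It is an illustration under stated assumptions, not MAPIE's internal code: the function name cqr_correction is hypothetical, and the score max(q_lo - y, y - q_hi) with a finite-sample rank of ceil((n + 1) * confidence_level) is the standard formulation, which the library may implement differently.

```python
import math

import numpy as np


def cqr_correction(y, q_lo, q_hi, confidence_level):
    """Return one correction constant from conformalization-set predictions.

    Hypothetical helper: y holds the true targets, q_lo and q_hi the lower
    and upper quantile predictions on the same conformalization samples.
    """
    y, q_lo, q_hi = (np.asarray(a, dtype=float) for a in (y, q_lo, q_hi))
    # Positive score: the target fell outside [q_lo, q_hi] by that amount.
    # Negative score: the target lies inside with that much margin.
    scores = np.maximum(q_lo - y, y - q_hi)
    n = scores.size
    # Finite-sample rank; beyond n there is not enough data for a finite bound.
    rank = math.ceil((n + 1) * confidence_level)
    if rank > n:
        return math.inf
    return float(np.sort(scores)[rank - 1])


y_conf = [1.0, 2.0, 3.0, 4.0]
q_lo_conf = [0.0, 1.5, 3.5, 3.0]
q_hi_conf = [2.0, 2.5, 4.0, 3.5]
correction = cqr_correction(y_conf, q_lo_conf, q_hi_conf, confidence_level=0.5)
```

In this formulation the interval for a new sample would be [q_lo(x) - correction, q_hi(x) + correction], so a larger conformalization set allows a finite correction at higher confidence levels.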
- fit(X_train: ArrayLike, y_train: ArrayLike, fit_params: Optional[dict] = None) -> ConformalizedQuantileRegressor
Fits three models using the regressor provided at initialisation:
a model to predict the target
a model to predict the upper quantile of the target
a model to predict the lower quantile of the target
- Parameters
- X_train: ArrayLike
Training data features.
- y_train: ArrayLike
Training data targets.
- fit_params: Optional[dict], default=None
Parameters to pass to the fit method of the regressors.
- Returns
- Self
The fitted ConformalizedQuantileRegressor instance.
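As a rough picture of what fitting three quantile models involves, the sketch below trains three scikit-learn GradientBoostingRegressor instances with the quantile loss. It is an assumption-laden illustration rather than MAPIE's own fitting code: the variable names are hypothetical, and the list is arranged in the lower, upper, median order described for prefit=True.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

confidence_level = 0.9
# Lower, upper, and median quantile levels, in the order used with prefit=True.
quantile_levels = [
    (1 - confidence_level) / 2,
    (1 + confidence_level) / 2,
    0.5,
]

X, y = make_regression(n_samples=200, n_features=2, noise=1.0, random_state=0)

# One regressor per quantile level; alpha is the target quantile
# when loss="quantile".
fitted = [
    GradientBoostingRegressor(
        loss="quantile", alpha=q, n_estimators=50, random_state=0
    ).fit(X, y)
    for q in quantile_levels
]

lower_pred, upper_pred, median_pred = (model.predict(X) for model in fitted)
```

In practice the data used here would be the training split only, kept separate from the conformalization set passed to conformalize.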
- predict(X: ArrayLike) -> NDArray
Predicts points.
- Parameters
- X: ArrayLike
Features.
- Returns
- NDArray
Array of point predictions with shape (n_samples,).
- predict_interval(X: ArrayLike, minimize_interval_width: bool = False, allow_infinite_bounds: bool = False, symmetric_correction: bool = False) -> Tuple[NDArray, NDArray]
Predicts points (using the base regressor) and intervals.
The returned NDArray containing the prediction intervals is of shape (n_samples, 2, 1). The third dimension is unnecessary, but kept for consistency with the other conformal regression methods available in MAPIE.
- Parameters
- X: ArrayLike
Features.
- minimize_interval_width: bool, default=False
If True, attempts to minimize the interval width.
- allow_infinite_bounds: bool, default=False
If True, allows prediction intervals with infinite bounds.
- symmetric_correction: bool, default=False
To produce prediction intervals, the conformalized quantile regression technique corrects the predictions of the upper and lower quantile regressors by adding a constant.
If symmetric_correction is set to False, this constant is different for the upper and the lower quantile predictions. If set to True, this constant is the same for both.
- Returns
- Tuple[NDArray, NDArray]
Two arrays:
Prediction points, of shape (n_samples,)
Prediction intervals, of shape (n_samples, 2, 1)
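The following NumPy sketch illustrates the difference between an asymmetric and a symmetric correction, and how an interval array of shape (n_samples, 2, 1) can be unpacked. It uses made-up quantile predictions and correction constants instead of a fitted model. The helper apply_correction is hypothetical, and it assumes that index 0 along the second axis holds the lower bound and index 1 the upper bound, with the lower bound shifted down and the upper bound shifted up.

```python
import numpy as np


def apply_correction(q_lo, q_hi, corr_lo, corr_hi):
    """Shift quantile predictions and stack them to shape (n_samples, 2, 1).

    Hypothetical helper for illustration; not part of MAPIE.
    """
    lower = np.asarray(q_lo, dtype=float) - corr_lo
    upper = np.asarray(q_hi, dtype=float) + corr_hi
    return np.stack([lower, upper], axis=1)[:, :, np.newaxis]


q_lo = np.array([0.0, 1.0, 2.0])
q_hi = np.array([1.0, 2.5, 4.0])

# Asymmetric: a different constant for each bound.
asymmetric = apply_correction(q_lo, q_hi, corr_lo=0.2, corr_hi=0.5)
# Symmetric: the same constant for both bounds.
symmetric = apply_correction(q_lo, q_hi, corr_lo=0.3, corr_hi=0.3)

# Drop the trailing dimension to get one-dimensional bounds.
lower, upper = asymmetric[:, 0, 0], asymmetric[:, 1, 0]

# Empirical coverage and mean width on some illustrative targets.
y_test = np.array([0.5, 3.5, 3.0])
coverage = np.mean((lower <= y_test) & (y_test <= upper))
mean_width = np.mean(upper - lower)
```

The same slicing applies to the second array returned by predict_interval, after which coverage and width can be compared against the chosen confidence_level.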