mapie.regression.SplitConformalRegressor

class mapie.regression.SplitConformalRegressor(estimator: RegressorMixin = LinearRegression(), confidence_level: Union[float, Iterable[float]] = 0.9, conformity_score: Union[str, BaseRegressionScore] = 'absolute', prefit: bool = True, n_jobs: Optional[int] = None, verbose: int = 0)

Computes prediction intervals using the split conformal regression technique:

  1. The fit method (optional) fits the base regressor to the training data.

  2. The conformalize method estimates the uncertainty of the base regressor by computing conformity scores on the conformalization set.

  3. The predict_interval method predicts points and intervals.
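The three steps can be mimicked with NumPy alone. The block below is a minimal sketch under stated assumptions, not MAPIE's implementation: it uses a least-squares line as a stand-in base regressor, absolute residuals as conformity scores, and the usual finite-sample order statistic as the interval half-width. The synthetic data and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3x + noise (illustrative only).
X = rng.uniform(-1.0, 1.0, size=500)
y = 3.0 * X + rng.normal(scale=0.5, size=500)

# Split into train / conformalization / test subsets.
X_train, X_conf, X_test = X[:300], X[300:400], X[400:]
y_train, y_conf, y_test = y[:300], y[300:400], y[400:]

# Step 1 (fit): a least-squares line stands in for the base regressor.
slope, intercept = np.polyfit(X_train, y_train, deg=1)


def predict(x):
    """Point predictions of the stand-in base regressor."""
    return slope * x + intercept


# Step 2 (conformalize): absolute residuals on the held-out set.
scores = np.abs(y_conf - predict(X_conf))

# Finite-sample corrected quantile: the ceil((n + 1) * level)-th
# smallest score (1-based rank). Assumed rule, see the lead-in.
confidence_level = 0.9
n = scores.size
rank = int(np.ceil((n + 1) * confidence_level))
q_hat = np.sort(scores)[rank - 1]

# Step 3 (predict_interval): points plus symmetric intervals.
points = predict(X_test)
intervals = np.stack([points - q_hat, points + q_hat], axis=1)

# Fraction of test targets that fall inside their interval.
coverage = np.mean((y_test >= intervals[:, 0]) & (y_test <= intervals[:, 1]))
```

On exchangeable data, the empirical coverage should land near the requested confidence level, up to sampling noise.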

Parameters
estimator : RegressorMixin, default=LinearRegression()

The base regressor used to predict points.

confidence_level : Union[float, Iterable[float]], default=0.9

The confidence level(s) for the prediction intervals, i.e. the desired coverage probability. A float specifies a single confidence level. A list yields one prediction interval per specified confidence level.

conformity_score : Union[str, BaseRegressionScore], default="absolute"

The method used to compute conformity scores.

Valid options:

  • "absolute"

  • "gamma"

  • "residual_normalized"

  • Any subclass of BaseRegressionScore

A custom score function inheriting from BaseRegressionScore may also be provided.

See Theoretical Description for Conformity Scores.
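For intuition, the first two options can be sketched as plain NumPy functions. This is an illustration under assumptions, not MAPIE's code: it takes "absolute" to mean the absolute residual and "gamma" to mean the absolute residual divided by the prediction (which requires strictly positive predictions). The function names are hypothetical, and the linked Theoretical Description remains the authority on the exact definitions.

```python
import numpy as np


def absolute_score(y, y_pred):
    """Assumed form of the "absolute" score: |y - y_pred|."""
    return np.abs(y - y_pred)


def gamma_score(y, y_pred):
    """Assumed form of the "gamma" score: |y - y_pred| / y_pred.

    Only meaningful when every prediction is strictly positive.
    """
    return np.abs(y - y_pred) / y_pred


y = np.array([10.0, 20.0, 40.0])
y_pred = np.array([8.0, 25.0, 40.0])

abs_scores = absolute_score(y, y_pred)  # residuals in target units
gam_scores = gamma_score(y, y_pred)     # residuals relative to the prediction
```

A relative score of this kind produces intervals whose width scales with the predicted value, whereas the absolute score gives the same width for every sample.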

prefit : bool, default=True

If True, the base regressor must already be fitted, and the fit method must be skipped.

If False, the base regressor will be fitted when the fit method is called.

n_jobs : Optional[int], default=None

The number of jobs to run in parallel when applicable.

verbose : int, default=0

Controls the verbosity level. Higher values increase the output details.

Examples

>>> from mapie.regression import SplitConformalRegressor
>>> from mapie.utils import train_conformalize_test_split
>>> from sklearn.datasets import make_regression
>>> from sklearn.linear_model import Ridge
>>> X, y = make_regression(n_samples=500, n_features=2, noise=1.0)
>>> (
...     X_train, X_conformalize, X_test,
...     y_train, y_conformalize, y_test
... ) = train_conformalize_test_split(
...     X, y, train_size=0.6, conformalize_size=0.2, test_size=0.2, random_state=1
... )
>>> mapie_regressor = SplitConformalRegressor(
...     estimator=Ridge(),
...     confidence_level=0.95,
...     prefit=False,
... ).fit(X_train, y_train).conformalize(X_conformalize, y_conformalize)
>>> predicted_points, predicted_intervals = mapie_regressor.predict_interval(X_test)
__init__(estimator: RegressorMixin = LinearRegression(), confidence_level: Union[float, Iterable[float]] = 0.9, conformity_score: Union[str, BaseRegressionScore] = 'absolute', prefit: bool = True, n_jobs: Optional[int] = None, verbose: int = 0) -> None
conformalize(X_conformalize: ArrayLike, y_conformalize: ArrayLike, predict_params: Optional[dict] = None) -> SplitConformalRegressor

Estimates the uncertainty of the base regressor by computing conformity scores on the conformalization set.

Parameters
X_conformalize : ArrayLike

Features of the conformalization set.

y_conformalize : ArrayLike

Targets of the conformalization set.

predict_params : Optional[dict], default=None

Parameters to pass to the predict method of the base regressor. These parameters will also be used in the predict_interval and predict methods of this SplitConformalRegressor.

Returns
Self

The conformalized SplitConformalRegressor instance.

fit(X_train: ArrayLike, y_train: ArrayLike, fit_params: Optional[dict] = None) -> SplitConformalRegressor

Fits the base regressor to the training data.

Parameters
X_train : ArrayLike

Training data features.

y_train : ArrayLike

Training data targets.

fit_params : Optional[dict], default=None

Parameters to pass to the fit method of the base regressor.

Returns
Self

The fitted SplitConformalRegressor instance.

predict(X: ArrayLike) -> NDArray

Predicts points.

Parameters
X : ArrayLike

Features.

Returns
NDArray

Array of point predictions, with shape (n_samples,).

predict_interval(X: ArrayLike, minimize_interval_width: bool = False, allow_infinite_bounds: bool = False) -> Tuple[NDArray, NDArray]

Predicts points (using the base regressor) and intervals.

If several confidence levels were provided during initialisation, several intervals will be predicted for each sample. See the return signature.

Parameters
X : ArrayLike

Features.

minimize_interval_width : bool, default=False

If True, attempts to minimize the interval width.

allow_infinite_bounds : bool, default=False

If True, allows prediction intervals with infinite bounds.
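Infinite bounds typically arise when the conformalization set is too small for the requested confidence level. The sketch below illustrates this with the standard split conformal rank rule, ceil((n + 1) * confidence_level). Whether MAPIE applies exactly this rule is an assumption, and both helper names are hypothetical.

```python
import math


def conformal_rank(n_conformalize: int, confidence_level: float) -> int:
    """1-based rank of the conformity score used as the interval half-width."""
    return math.ceil((n_conformalize + 1) * confidence_level)


def bound_is_infinite(n_conformalize: int, confidence_level: float) -> bool:
    """True when the required rank exceeds the number of available scores.

    In that case no finite score can guarantee the requested coverage,
    so the only valid interval is unbounded.
    """
    return conformal_rank(n_conformalize, confidence_level) > n_conformalize


small_set = bound_is_infinite(10, 0.95)   # rank 11 needed, only 10 scores
large_set = bound_is_infinite(100, 0.95)  # rank 96 needed, 100 scores
```

Under this rule, enlarging the conformalization set or lowering the confidence level restores finite bounds.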

Returns
Tuple[NDArray, NDArray]

Two arrays:

  • Prediction points, of shape (n_samples,)

  • Prediction intervals, of shape (n_samples, 2, n_confidence_levels)
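The three-dimensional interval array can be sliced per confidence level. The following sketch builds a made-up array of the documented shape and extracts the bounds. It assumes that index 0 on the second axis holds the lower bound and index 1 the upper bound, and that the last axis follows the order in which the confidence levels were given. All numbers are illustrative.

```python
import numpy as np

confidence_levels = [0.8, 0.95]
points = np.array([1.0, 2.0, 3.0])   # shape (n_samples,)
half_widths = np.array([0.5, 1.2])   # one made-up half-width per level

# Build an array of shape (n_samples, 2, n_confidence_levels).
lower = points[:, None] - half_widths[None, :]
upper = points[:, None] + half_widths[None, :]
intervals = np.stack([lower, upper], axis=1)

# Slice out the bounds for each confidence level.
bounds_by_level = {}
for k, level in enumerate(confidence_levels):
    lower_k = intervals[:, 0, k]  # assumed: index 0 is the lower bound
    upper_k = intervals[:, 1, k]  # assumed: index 1 is the upper bound
    bounds_by_level[level] = (lower_k, upper_k)

# Interval widths, shape (n_samples, n_confidence_levels).
widths = intervals[:, 1, :] - intervals[:, 0, :]
```

When a single confidence level is used, the last axis has length 1, so `intervals[:, 0, 0]` and `intervals[:, 1, 0]` give the two bounds.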