mapie.conformity_scores.ResidualNormalisedScore

class mapie.conformity_scores.ResidualNormalisedScore(residual_estimator: Optional[sklearn.base.RegressorMixin] = None, prefit: bool = False, split_size: Optional[Union[int, float]] = None, random_state: Optional[Union[int, numpy.random.mtrand.RandomState]] = None, sym: bool = True, consistency_check: bool = False)

Residual Normalised score.

The conformity score is |y - y_pred| / r_pred, where r_pred is the predicted residual |y - y_pred| of the base estimator. r_pred is computed by a model that learns to predict these residuals. The model is trained on the log of the residual, and the exponential of its prediction is used so that r_pred is never negative.
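As a rough illustration of the formula above, the sketch below reproduces the calculation with plain NumPy rather than MAPIE itself. The least-squares fit only stands in for the residual estimator, and the helpers fit_log_residual_model and predict_residual are hypothetical names, not part of the library.

```python
import numpy as np

def fit_log_residual_model(X, abs_residuals, eps=1e-8):
    """Least-squares fit of log(|y - y_pred|) on X (hypothetical stand-in)."""
    # Clip residuals away from zero so that the log stays finite.
    log_res = np.log(np.maximum(abs_residuals, eps))
    # Prepend an intercept column and solve the linear least-squares problem.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, log_res, rcond=None)
    return coef

def predict_residual(coef, X):
    """Exponentiate the linear prediction so that r_pred is always positive."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.exp(A @ coef)

rng = np.random.RandomState(0)
X = rng.uniform(1.0, 5.0, size=(200, 1))
# The noise scale grows with X, so the residuals are heteroscedastic.
y = 2.0 * X[:, 0] + rng.normal(scale=0.5 * X[:, 0])
y_pred = 2.0 * X[:, 0]  # assume the base estimator recovered the trend

coef = fit_log_residual_model(X, np.abs(y - y_pred))
r_pred = predict_residual(coef, X)
scores = np.abs(y - y_pred) / r_pred  # normalised conformity scores
```

Whatever the linear model predicts on the log scale, the exponential maps it back to a strictly positive r_pred, so the division is always well defined.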

The conformity score is symmetrical and allows the calculation of adaptive prediction intervals (taking X into account). It can only be used with split and prefit methods (not with cross methods).

Warning: if the provided estimator is not fitted, a subset of the calibration data (20% by default) is used to fit it.

Parameters
residual_estimator: Optional[RegressorMixin]

The model that learns to predict the residuals of the base estimator. It can be any regressor with a scikit-learn API (i.e. with fit and predict methods). If None, the estimator defaults to a LinearRegression instance.

prefit: bool

Specifies whether the residual_estimator is already fitted. By default False.

split_size: Optional[Union[int, float]]

The proportion of data that is used to fit the residual_estimator. By default it is the default value of sklearn.model_selection.train_test_split, i.e. 0.2.

random_state: Optional[Union[int, np.random.RandomState]]

Seed or pseudo-random number generator used for random sampling. Pass an int for reproducible output across multiple function calls. By default None.
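To make the interplay between split_size and random_state more concrete, the sketch below mimics the hold-out step with NumPy only. The helper split_for_residual_fit is hypothetical and merely illustrates the idea; MAPIE's internal implementation may differ.

```python
import numpy as np

def split_for_residual_fit(n_samples, split_size=0.2, random_state=None):
    """Return (fit_idx, cal_idx): rows used to fit the residual model,
    and rows kept to compute the conformity scores.

    In this sketch random_state must be an int or None.
    """
    rng = np.random.RandomState(random_state)
    perm = rng.permutation(n_samples)
    # A float is read as a proportion, an int as an absolute sample count.
    if isinstance(split_size, float):
        n_fit = int(round(split_size * n_samples))
    else:
        n_fit = int(split_size)
    return perm[:n_fit], perm[n_fit:]

fit_idx, cal_idx = split_for_residual_fit(100, split_size=0.2, random_state=42)
fit_again, _ = split_for_residual_fit(100, split_size=0.2, random_state=42)
```

With an integer random_state the two calls return the same indices, which is what makes the resulting intervals reproducible. When prefit is True, no such split is needed because the residual estimator is already fitted.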

__init__(residual_estimator: Optional[sklearn.base.RegressorMixin] = None, prefit: bool = False, split_size: Optional[Union[int, float]] = None, random_state: Optional[Union[int, numpy.random.mtrand.RandomState]] = None, sym: bool = True, consistency_check: bool = False) → None

get_estimation_distribution(X: ArrayLike, y_pred: ArrayLike, conformity_scores: ArrayLike) → NDArray

Compute samples of the estimation distribution from the predicted values and the conformity scores, using the formula y_pred + conformity_scores * r_pred.

Because the residual estimator was trained on the log of the residual, r_pred is the exponential of its prediction, which avoids negative values.

conformity_scores can be either the conformity scores or the quantile of the conformity scores.
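A small numeric sketch of this formula, using made-up values and plain NumPy (not MAPIE's own code): a single quantile of the conformity scores is rescaled by r_pred and added to the prediction.

```python
import numpy as np

y_pred = np.array([10.0, 20.0])             # point predictions for two samples
log_r_pred = np.log(np.array([0.5, 2.0]))   # illustrative log-residual predictions
r_pred = np.exp(log_r_pred)                 # back to positive residual estimates

quantile = 1.5                              # assumed quantile of calibration scores
upper = y_pred + quantile * r_pred          # y_pred + conformity_scores * r_pred
lower = y_pred + (-quantile) * r_pred       # same formula with the negated quantile
width = upper - lower
```

Because r_pred depends on X, the same quantile yields a wider interval for the second sample than for the first, which is what makes the intervals adaptive.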

get_signed_conformity_scores(X: ArrayLike, y: ArrayLike, y_pred: ArrayLike) → NDArray

Computes the signed conformity score (y - y_pred) / r_pred, where r_pred is the predicted residual |y - y_pred| of the estimator. r_pred is computed by a model (residual_estimator_) that learns to predict this residual.

The model is trained on the log of the residual, and the exponential of its prediction is used afterwards to avoid negative values.
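The relationship between the signed score and the estimation distribution can be checked with a few made-up numbers. This is only a NumPy sketch with an assumed r_pred, not a call into MAPIE.

```python
import numpy as np

y = np.array([3.0, 5.0, 9.0])
y_pred = np.array([2.5, 6.0, 9.0])
r_pred = np.array([0.5, 2.0, 1.0])   # assumed positive residual predictions

# The sign is kept: positive when the model under-predicts, negative otherwise.
signed_scores = (y - y_pred) / r_pred

# Plugging the scores back into y_pred + conformity_scores * r_pred recovers y.
recovered = y_pred + signed_scores * r_pred
```

The round trip shows that get_estimation_distribution applies the inverse of the normalisation performed here.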