mapie.conformity_scores.ResidualNormalisedScore
- class mapie.conformity_scores.ResidualNormalisedScore(residual_estimator: Optional[RegressorMixin] = None, prefit: bool = False, split_size: Optional[Union[int, float]] = None, random_state: Optional[Union[int, RandomState]] = None, sym: bool = True, consistency_check: bool = False)
Residual Normalised score.
The conformity score is |y - y_pred| / r_pred, where r_pred is the predicted absolute residual |y - y_pred| of the base estimator. r_pred is computed by a separate model that learns to predict these residuals. That model is trained on the log of the residuals, and its predictions are exponentiated so that r_pred is never negative.
The conformity score is symmetrical and allows the calculation of adaptive prediction intervals (taking X into account). It can only be used with the split and prefit methods (not with cross methods).
Warning: if the provided residual estimator is not fitted, a subset of the calibration data (20% by default) is used to fit it.
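The score described above can be illustrated with a minimal NumPy-only sketch. It does not call MAPIE and is not the library's implementation: the least-squares fits stand in for the base estimator and the residual estimator, and the names base_predict, residual_predict and eps, as well as the synthetic data, are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic heteroscedastic data: the noise grows with x (illustrative only).
n = 400
x = rng.uniform(1.0, 5.0, size=n)
y = 2.0 * x + rng.normal(0.0, 0.3 * x)

# Stand-in base model: a least-squares line fitted on the first half.
train, calib = np.arange(0, 200), np.arange(200, 400)
A_train = np.column_stack([np.ones(train.size), x[train]])
base_coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)


def base_predict(x_new):
    return base_coef[0] + base_coef[1] * x_new


# Hold out 20% of the calibration data to fit the residual model.
perm = rng.permutation(calib)
n_res = int(0.2 * calib.size)
res_idx, cal_idx = perm[:n_res], perm[n_res:]

# Fit the residual model on log|y - y_pred|.
eps = 1e-8  # hypothetical floor so that log() is always defined
abs_res = np.abs(y[res_idx] - base_predict(x[res_idx]))
A_res = np.column_stack([np.ones(res_idx.size), x[res_idx]])
res_coef, *_ = np.linalg.lstsq(
    A_res, np.log(np.maximum(abs_res, eps)), rcond=None
)


def residual_predict(x_new):
    # Exponentiate the log-scale prediction so r_pred is strictly positive.
    return np.exp(res_coef[0] + res_coef[1] * x_new)


# Normalised score on the remaining calibration points.
r_pred = residual_predict(x[cal_idx])
scores = np.abs(y[cal_idx] - base_predict(x[cal_idx])) / r_pred
```

Because the residual model is fitted in log space, r_pred cannot be negative, so the division is always well defined and the scores are non-negative.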
- Parameters
- residual_estimator: Optional[RegressorMixin]
The model that learns to predict the residuals of the base estimator. It can be any regressor with the scikit-learn API (i.e. with fit and predict methods). If None, the estimator defaults to a LinearRegression instance.
- prefit: bool
Specifies whether the residual_estimator is already fitted. By default False.
- split_size: Optional[Union[int, float]]
The proportion of data that is used to fit the residual_estimator. By default it is the default value of sklearn.model_selection.train_test_split, i.e. 0.2.
- random_state: Optional[Union[int, np.random.RandomState]]
Pseudo-random number used for random sampling. Pass an int for reproducible output across multiple function calls. By default None.
- __init__(residual_estimator: Optional[RegressorMixin] = None, prefit: bool = False, split_size: Optional[Union[int, float]] = None, random_state: Optional[Union[int, RandomState]] = None, sym: bool = True, consistency_check: bool = False) -> None
- get_estimation_distribution(X: ArrayLike, y_pred: ArrayLike, conformity_scores: ArrayLike) -> NDArray
Compute samples of the estimation distribution from the predicted values and the conformity scores, from the following formula: y_pred + conformity_scores * r_pred.
The learning has been done with the log of the residual, so we use the exponential of the prediction to avoid negative values.
conformity_scores can be either the conformity scores or the quantile of the conformity scores.
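As a worked illustration of the formula above, the following sketch applies y_pred + conformity_scores * r_pred with a single quantile q to obtain interval bounds. The helper estimation_distribution is hypothetical and only mirrors the stated formula; r_pred is assumed to be already exponentiated, and using -q for the lower bound assumes the symmetric case.

```python
import numpy as np


def estimation_distribution(y_pred, conformity_scores, r_pred):
    """Return y_pred + conformity_scores * r_pred, element-wise."""
    y_pred = np.asarray(y_pred, dtype=float)
    r_pred = np.asarray(r_pred, dtype=float)
    scores = np.asarray(conformity_scores, dtype=float)
    return y_pred + scores * r_pred


y_pred = np.array([10.0, 20.0])  # point predictions of the base model
r_pred = np.array([1.0, 3.0])    # predicted residuals (already exponentiated)
q = 2.0                          # a quantile of the calibration scores

upper = estimation_distribution(y_pred, q, r_pred)   # [12., 26.]
lower = estimation_distribution(y_pred, -q, r_pred)  # [ 8., 14.]
```

The second point has a larger predicted residual, so its interval is three times wider than the first. This is what makes the intervals adaptive to X.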
- get_signed_conformity_scores(X: ArrayLike, y: ArrayLike, y_pred: ArrayLike) -> NDArray
Computes the conformity score |y - y_pred| / r_pred, where r_pred is the predicted absolute residual |y - y_pred| of the estimator. r_pred is computed by a model (residual_estimator_) that learns to predict this residual.
The learning is done with the log of the residual, and the prediction is later exponentiated to avoid negative values.