mapie.conformity_scores.APSConformityScore
- class mapie.conformity_scores.APSConformityScore
Adaptive Prediction Sets (APS) method-based non-conformity score. It is based on the sum of the softmax outputs of the labels until the true label is reached, on the conformalization set. See [1] for more details.
- Attributes:
- classes: Optional[ArrayLike]
Names of the classes.
- random_state: Optional[Union[int, np.random.RandomState]]
Pseudo random number generator state.
- quantiles_: ArrayLike of shape (n_alpha,)
The quantiles estimated from get_sets method.
References
[1] Yaniv Romano, Matteo Sesia and Emmanuel J. Candès. “Classification with Valid and Adaptive Coverage.” NeurIPS 2020 (spotlight).
- get_conformity_score_quantiles(conformity_scores: ndarray[tuple[Any, ...], dtype[_ScalarT]], alpha_np: ndarray[tuple[Any, ...], dtype[_ScalarT]], cv: int | str | BaseCrossValidator | None, agg_scores: str | None = 'mean', **kwargs) -> ndarray[tuple[Any, ...], dtype[_ScalarT]]
Get the quantiles of the conformity scores for each uncertainty level.
- Parameters:
- conformity_scores: NDArray of shape (n_samples,)
Conformity scores for each sample.
- alpha_np: NDArray of shape (n_alpha,)
NDArray of floats between 0 and 1, representing the uncertainty levels of the prediction sets; 1 - alpha is the target coverage.
- cv: Optional[Union[int, str, BaseCrossValidator]]
Cross-validation strategy used by the estimator.
- agg_scores: Optional[str]
Method to aggregate the scores from the base estimators. If “mean”, the scores are averaged. If “crossval”, the scores are obtained from cross-validation.
By default “mean”.
- Returns:
- NDArray
Array of quantiles with respect to alpha_np.
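As an illustration of what "one quantile per uncertainty level" means, the sketch below computes the standard split-conformal empirical quantile with NumPy. It is a minimal sketch under assumptions, not MAPIE's implementation: the helper name `conformal_quantiles` is hypothetical, it covers only the simple case where all scores come from a single conformalization set, and it ignores `cv` and `agg_scores`.

```python
import numpy as np


def conformal_quantiles(conformity_scores, alpha_np):
    """Hypothetical helper: one conformal quantile per alpha.

    For n scores, the quantile for a given alpha is the k-th smallest
    score with k = ceil((n + 1) * (1 - alpha)), capped at n.
    """
    scores = np.sort(np.asarray(conformity_scores, dtype=float).ravel())
    n = scores.size
    quantiles = []
    for alpha in np.asarray(alpha_np, dtype=float):
        k = int(np.ceil((n + 1) * (1 - alpha)))
        k = min(max(k, 1), n)  # keep the rank inside [1, n]
        quantiles.append(scores[k - 1])
    return np.array(quantiles)


scores = np.array([0.2, 0.5, 0.7, 0.9, 0.95, 0.4, 0.6, 0.8, 0.3])
q = conformal_quantiles(scores, np.array([0.1, 0.5]))
print(q)
```

Smaller values of alpha select higher quantiles, which in turn produce larger prediction sets.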
- get_conformity_scores(y: ndarray[tuple[Any, ...], dtype[_ScalarT]], y_pred: ndarray[tuple[Any, ...], dtype[_ScalarT]], y_enc: ndarray[tuple[Any, ...], dtype[_ScalarT]] | None = None, **kwargs) -> ndarray[tuple[Any, ...], dtype[_ScalarT]]
Get the conformity scores: for each sample, the sum of the predicted probabilities of the labels until the true label is reached.
- Parameters:
- y: NDArray of shape (n_samples,)
Observed target values.
- y_pred: NDArray of shape (n_samples, n_classes)
Predicted probabilities for each class.
- y_enc: Optional[NDArray] of shape (n_samples,)
Target values as normalized encodings.
- Returns:
- NDArray of shape (n_samples,)
Conformity scores.
- get_prediction_sets(y_pred_proba: ndarray[tuple[Any, ...], dtype[_ScalarT]], conformity_scores: ndarray[tuple[Any, ...], dtype[_ScalarT]], alpha_np: ndarray[tuple[Any, ...], dtype[_ScalarT]], cv: int | str | BaseCrossValidator | None, agg_scores: str | None = 'mean', include_last_label: bool | str | None = True, **kwargs) -> ndarray[tuple[Any, ...], dtype[_ScalarT]]
Generate prediction sets based on the probability predictions, the conformity scores and the uncertainty level.
- Parameters:
- y_pred_proba: NDArray of shape (n_samples, n_classes)
Predicted probabilities for each class.
- conformity_scores: NDArray of shape (n_samples,)
Conformity scores for each sample.
- alpha_np: NDArray of shape (n_alpha,)
NDArray of floats between 0 and 1, representing the uncertainty levels of the prediction sets; 1 - alpha is the target coverage (not used here).
- cv: Optional[Union[int, str, BaseCrossValidator]]
Cross-validation strategy used by the estimator.
- agg_scores: Optional[str]
Method to aggregate the scores from the base estimators. If “mean”, the scores are averaged. If “crossval”, the scores are obtained from cross-validation.
By default “mean”.
- include_last_label: Optional[Union[bool, str]]
Whether to include the last label in prediction sets for the “aps” method. Choose among:
False: does not include the label whose cumulative score is just over the quantile.
True: includes the label whose cumulative score is just over the quantile, unless there is only one label in the prediction set.
“randomized”: randomly includes the label whose cumulative score is just over the quantile, based on comparing a uniform random number with the difference between the cumulative score of the last label and the quantile.
Setting it to True or False may result in a coverage higher than 1 - alpha, because, contrary to the “randomized” setting, neither option creates empty prediction sets. See [1] and [2] for more details.
By default True.
- Returns:
- NDArray
Array of prediction sets with respect to alpha_np.
References
[1] Yaniv Romano, Matteo Sesia and Emmanuel J. Candès. “Classification with Valid and Adaptive Coverage.” NeurIPS 2020 (spotlight).
[2] Anastasios Nikolas Angelopoulos, Stephen Bates, Michael Jordan and Jitendra Malik. “Uncertainty Sets for Image Classifiers using Conformal Prediction.” International Conference on Learning Representations 2021.
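To make the effect of include_last_label concrete, here is a minimal NumPy sketch of the True and False settings for a single quantile. It is an assumption-laden illustration rather than MAPIE's code: the helper `aps_sets` is hypothetical, the “randomized” option is omitted, and the rule "never return an empty set" is implemented by always keeping the most probable label.

```python
import numpy as np


def aps_sets(y_pred_proba, quantile, include_last_label=True):
    """Hypothetical helper: boolean prediction sets, shape (n_samples, n_classes)."""
    proba = np.asarray(y_pred_proba, dtype=float)
    n_classes = proba.shape[1]
    # Sort classes by decreasing probability and accumulate.
    order = np.argsort(-proba, axis=1, kind="stable")
    cumsum = np.cumsum(np.take_along_axis(proba, order, axis=1), axis=1)
    # Sorted position of the first label whose cumulative score reaches
    # the quantile (fall back to the last position if none does).
    reached = cumsum >= quantile
    last = np.where(reached.any(axis=1), reached.argmax(axis=1), n_classes - 1)
    if include_last_label:
        n_keep = last + 1            # keep the label that crosses the quantile
    else:
        n_keep = np.maximum(last, 1)  # drop it, but never return an empty set
    keep_sorted = np.arange(n_classes)[None, :] < n_keep[:, None]
    # Map the decision back from sorted positions to class columns.
    sets = np.zeros_like(keep_sorted)
    np.put_along_axis(sets, order, keep_sorted, axis=1)
    return sets


proba = np.array([[0.3, 0.6, 0.1], [0.25, 0.4, 0.35]])
with_last = aps_sets(proba, quantile=0.8, include_last_label=True)
without_last = aps_sets(proba, quantile=0.8, include_last_label=False)
print(with_last)
print(without_last)
```

With include_last_label=True every set has cumulative probability at least equal to the quantile, which is why the resulting coverage tends to be conservative.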
- get_predictions(X: ndarray[tuple[Any, ...], dtype[_ScalarT]], alpha_np: ndarray[tuple[Any, ...], dtype[_ScalarT]], y_pred_proba: ndarray[tuple[Any, ...], dtype[_ScalarT]], cv: int | str | BaseCrossValidator | None, agg_scores: str | None = 'mean', **kwargs) -> ndarray[tuple[Any, ...], dtype[_ScalarT]]
Process the passed y_pred_proba; no new predictions are computed from X.
- Parameters:
- X: NDArray of shape (n_samples, n_features)
Observed feature values (not used since predictions are passed).
- alpha_np: NDArray of shape (n_alpha,)
NDArray of floats between 0 and 1, representing the uncertainty levels of the prediction sets; 1 - alpha is the target coverage.
- y_pred_proba: NDArray
Predicted probabilities from the estimator.
- cv: Optional[Union[int, str, BaseCrossValidator]]
Cross-validation strategy used by the estimator (not used here).
- agg_scores: Optional[str]
Method to aggregate the scores from the base estimators. If “mean”, the scores are averaged. If “crossval”, the scores are obtained from cross-validation.
By default “mean”.
- Returns:
- NDArray
Array of predictions.
- static get_true_label_cumsum_proba(y: ArrayLike, y_pred_proba: ndarray[tuple[Any, ...], dtype[_ScalarT]], classes: ArrayLike) -> Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]]
Compute the cumulative probability of the true label, that is, the sum of the predicted probabilities, taken in sorted order, until the true label is reached.
- Parameters:
- y: ArrayLike of shape (n_samples, )
Array with the labels.
- y_pred_proba: NDArray of shape (n_samples, n_classes)
Predictions of the model.
- classes: ArrayLike of shape (n_classes, )
Array with the classes.
- Returns:
- Tuple[NDArray, NDArray] of shapes (n_samples, 1) and (n_samples, ).
The first element is the cumulative probability of the true label. The second is the 1-based rank of the true label in the sorted probabilities.
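The computation described above can be sketched with plain NumPy. This is an illustrative reimplementation under assumptions, not the library's source: the function name `true_label_cumsum_proba` is hypothetical, probabilities are assumed to be sorted in decreasing order, and every label in `y` is assumed to appear in `classes`.

```python
import numpy as np


def true_label_cumsum_proba(y, y_pred_proba, classes):
    """Hypothetical helper mirroring the documented return shapes."""
    y = np.asarray(y)
    classes = np.asarray(classes)
    proba = np.asarray(y_pred_proba, dtype=float)
    # Column index of each true label within `classes`.
    true_col = np.array([np.flatnonzero(classes == label)[0] for label in y])
    # Sort classes by decreasing probability, row by row, and accumulate.
    order = np.argsort(-proba, axis=1, kind="stable")
    cumsum = np.cumsum(np.take_along_axis(proba, order, axis=1), axis=1)
    # 0-based position of the true label in the sorted order.
    pos = np.argmax(order == true_col[:, None], axis=1)
    scores = np.take_along_axis(cumsum, pos[:, None], axis=1)  # (n_samples, 1)
    return scores, pos + 1  # rank is 1-based, shape (n_samples,)


proba = np.array([[0.1, 0.6, 0.3], [0.5, 0.2, 0.3]])
scores, ranks = true_label_cumsum_proba([2, 0], proba, classes=[0, 1, 2])
print(scores.ravel(), ranks)
```

In the first row the true label (class 2) is the second most probable, so its score is 0.6 + 0.3; in the second row the true label is the most probable, so its score is simply 0.5.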