mapie.conformity_scores.APSConformityScore

class mapie.conformity_scores.APSConformityScore

Non-conformity score for the Adaptive Prediction Sets (APS) method. For each sample of the conformalization set, the score is the sum of the softmax outputs of the labels, taken in decreasing order of probability, until the true label is reached. See [1] for more details.

Attributes:
classes: Optional[ArrayLike]

Names of the classes.

random_state: Optional[Union[int, np.random.RandomState]]

Pseudo random number generator state.

quantiles_: ArrayLike of shape (n_alpha,)

The quantiles estimated by the get_sets method.

References

[1] Yaniv Romano, Matteo Sesia and Emmanuel J. Candès. “Classification with Valid and Adaptive Coverage.” NeurIPS 2020 (spotlight).

__init__() → None
get_conformity_score_quantiles(conformity_scores: NDArray, alpha_np: NDArray, cv: int | str | BaseCrossValidator | None, agg_scores: str | None = 'mean', **kwargs) → NDArray

Get the quantiles of the conformity scores for each uncertainty level.

Parameters:
conformity_scores: NDArray of shape (n_samples,)

Conformity scores for each sample.

alpha_np: NDArray of shape (n_alpha,)

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval.

cv: Optional[Union[int, str, BaseCrossValidator]]

Cross-validation strategy used by the estimator.

agg_scores: Optional[str]

Method to aggregate the scores from the base estimators. If “mean”, the scores are averaged. If “crossval”, the scores are obtained from cross-validation.

By default “mean”.

Returns:
NDArray

Array of quantiles with respect to alpha_np.
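
As an illustration of what "quantiles of the conformity scores for each uncertainty level" means, here is a minimal NumPy sketch of the textbook split-conformal quantile. It is not MAPIE's implementation: the helper name `conformal_quantiles` is hypothetical, and the sketch ignores `cv` and `agg_scores` entirely.

```python
import numpy as np

def conformal_quantiles(conformity_scores, alpha_np):
    """Hypothetical helper: one finite-sample-corrected quantile per alpha."""
    scores = np.sort(np.asarray(conformity_scores, dtype=float).ravel())
    n = scores.size
    quantiles = []
    for alpha in np.asarray(alpha_np, dtype=float):
        # Rank of the order statistic that gives 1 - alpha coverage.
        k = int(np.ceil((n + 1) * (1 - alpha)))
        # With too few calibration samples no finite quantile is valid.
        quantiles.append(scores[k - 1] if k <= n else np.inf)
    return np.array(quantiles)

# Seven calibration scores and three uncertainty levels.
scores = np.array([0.05, 0.10, 0.20, 0.35, 0.50, 0.70, 0.90])
quantiles = conformal_quantiles(scores, np.array([0.25, 0.5, 0.1]))
```

The returned array has one entry per value of `alpha_np`; smaller values of alpha select larger quantiles, and the last level in this example is too demanding for seven samples, so the sketch returns infinity for it.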

get_conformity_scores(y: NDArray, y_pred: NDArray, y_enc: NDArray | None = None, **kwargs) → NDArray

Get the conformity score.

Parameters:
y: NDArray of shape (n_samples,)

Observed target values.

y_pred: NDArray of shape (n_samples,)

Predicted target values.

y_enc: Optional[NDArray] of shape (n_samples,)

Target values as normalized encodings.

Returns:
NDArray of shape (n_samples,)

Conformity scores.

get_prediction_sets(y_pred_proba: NDArray, conformity_scores: NDArray, alpha_np: NDArray, cv: int | str | BaseCrossValidator | None, agg_scores: str | None = 'mean', include_last_label: bool | str | None = True, **kwargs) → NDArray

Generate prediction sets based on the probability predictions, the conformity scores and the uncertainty level.

Parameters:
y_pred_proba: NDArray of shape (n_samples, n_classes)

Target prediction.

conformity_scores: NDArray of shape (n_samples,)

Conformity scores for each sample.

alpha_np: NDArray of shape (n_alpha,)

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval (not used here).

cv: Optional[Union[int, str, BaseCrossValidator]]

Cross-validation strategy used by the estimator.

agg_scores: Optional[str]

Method to aggregate the scores from the base estimators. If “mean”, the scores are averaged. If “crossval”, the scores are obtained from cross-validation.

By default “mean”.

include_last_label: Optional[Union[bool, str]]

Whether to include the last label in prediction sets for the “aps” method. Choose among:

  • False, does not include the label whose cumulative score is just over the quantile.

  • True, includes the label whose cumulative score is just over the quantile, unless there is only one label in the prediction set.

  • “randomized”, randomly includes the label whose cumulative score is just over the quantile, based on the comparison of a uniform random number and the difference between the cumulative score of the last label and the quantile.

Setting this to True or False may result in a coverage higher than 1 - alpha because, unlike the “randomized” setting, neither option creates empty prediction sets. See [1] and [2] for more details.

By default True.

Returns:
NDArray

Array of prediction sets, one for each uncertainty level in alpha_np.

References

[1] Yaniv Romano, Matteo Sesia and Emmanuel J. Candès. “Classification with Valid and Adaptive Coverage.” NeurIPS 2020 (spotlight).

[2] Anastasios Nikolas Angelopoulos, Stephen Bates, Michael Jordan and Jitendra Malik. “Uncertainty Sets for Image Classifiers using Conformal Prediction.” International Conference on Learning Representations 2021.
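
The three include_last_label options can be sketched as follows for a single quantile. This is a simplified illustration under stated assumptions, not the library's code: the function name `aps_sets` and its `rng` argument are hypothetical, the randomized rule follows the description in [1] (the last label is dropped when a uniform draw falls below the overshoot ratio), and the False branch keeps the top label so that no set is empty, as the text above states.

```python
import numpy as np

def aps_sets(y_pred_proba, quantile, include_last_label=True, rng=None):
    """Hypothetical helper: boolean prediction sets of shape (n_samples, n_classes)."""
    proba = np.asarray(y_pred_proba, dtype=float)
    n_samples, n_classes = proba.shape

    # Sort classes by decreasing probability and accumulate.
    order = np.argsort(-proba, axis=1, kind="stable")
    sorted_p = np.take_along_axis(proba, order, axis=1)
    cumsum = np.cumsum(sorted_p, axis=1)

    # Position of the "last label": first cumulative score reaching the quantile.
    reached = cumsum >= quantile
    last = np.where(reached.any(axis=1), reached.argmax(axis=1), n_classes - 1)

    ranks = np.arange(n_classes)[None, :]
    keep = ranks < last[:, None]  # labels strictly before the last one
    is_last = ranks == last[:, None]

    if include_last_label is True:
        keep |= is_last
    elif include_last_label is False:
        keep[:, 0] = True  # assumption: never return an empty set
    elif include_last_label == "randomized":
        rng = np.random.default_rng(rng)
        rows = np.arange(n_samples)
        # Fraction of the last label's mass lying beyond the quantile.
        overshoot = (cumsum[rows, last] - quantile) / sorted_p[rows, last]
        add_last = rng.uniform(size=n_samples) > overshoot
        keep |= is_last & add_last[:, None]
    else:
        raise ValueError("include_last_label must be True, False or 'randomized'")

    # Map the decisions back from sorted order to the original class order.
    sets = np.zeros_like(keep)
    np.put_along_axis(sets, order, keep, axis=1)
    return sets

proba = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
sets_true = aps_sets(proba, 0.7, include_last_label=True)
sets_false = aps_sets(proba, 0.7, include_last_label=False)
sets_rand = aps_sets(proba, 0.7, include_last_label="randomized", rng=0)
```

In this example the True sets contain two labels per sample and the False sets contain one; the randomized sets always lie between the two, which is why the deterministic options can only raise coverage relative to the randomized one.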

get_predictions(X: NDArray, alpha_np: NDArray, y_pred_proba: NDArray, cv: int | str | BaseCrossValidator | None, agg_scores: str | None = 'mean', **kwargs) → NDArray

Just processes the passed y_pred_proba.

Parameters:
X: NDArray of shape (n_samples, n_features)

Observed feature values (not used since predictions are passed).

alpha_np: NDArray of shape (n_alpha,)

NDArray of floats between 0 and 1, representing the uncertainty of the confidence interval.

y_pred_proba: NDArray

Predicted probabilities from the estimator.

cv: Optional[Union[int, str, BaseCrossValidator]]

Cross-validation strategy used by the estimator (not used here).

agg_scores: Optional[str]

Method to aggregate the scores from the base estimators. If “mean”, the scores are averaged. If “crossval”, the scores are obtained from cross-validation.

By default “mean”.

Returns:
NDArray

Array of predictions.

static get_true_label_cumsum_proba(y: ArrayLike, y_pred_proba: NDArray, classes: ArrayLike) → Tuple[NDArray, NDArray]

Compute the cumulative probability of the true label.

Parameters:
y: ArrayLike of shape (n_samples,)

Array with the labels.

y_pred_proba: NDArray of shape (n_samples, n_classes)

Predictions of the model.

classes: ArrayLike of shape (n_classes,)

Array with the classes.

Returns:
Tuple[NDArray, NDArray] of shapes (n_samples, 1) and (n_samples,).

The first element is the cumulative probability of the true label. The second is the 1-based rank of the true label in the sorted probabilities.
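
To make the two returned arrays concrete, here is a minimal NumPy sketch that computes the same quantities as described above. It is an illustration rather than the library's source: the function name `true_label_cumsum_proba` is hypothetical, and it assumes the columns of y_pred_proba follow the order of classes.

```python
import numpy as np

def true_label_cumsum_proba(y, y_pred_proba, classes):
    """Hypothetical helper: cumulative probability and 1-based rank of the true label."""
    y = np.asarray(y)
    proba = np.asarray(y_pred_proba, dtype=float)
    classes = np.asarray(classes)

    # One-hot mask marking the true label of each sample.
    is_true = y[:, None] == classes[None, :]

    # Sort each row by decreasing probability, carrying the mask along.
    order = np.argsort(-proba, axis=1, kind="stable")
    sorted_p = np.take_along_axis(proba, order, axis=1)
    sorted_true = np.take_along_axis(is_true, order, axis=1)

    # Cumulative probability up to and including each sorted position.
    cumsum = np.cumsum(sorted_p, axis=1)

    # Zero-based position of the true label in the sorted row.
    position = sorted_true.argmax(axis=1)
    true_cumsum = np.take_along_axis(cumsum, position[:, None], axis=1)
    return true_cumsum, position + 1

classes = np.array(["a", "b", "c"])
y = np.array(["a", "c"])
proba = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
cumsum_true, rank_true = true_label_cumsum_proba(y, proba, classes)
```

For the first sample the true label "a" is the most probable class, so its rank is 1 and its cumulative probability is 0.6. For the second sample the true label "c" is ranked second, after "b", giving a cumulative probability of 0.5 + 0.3.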