mapie.risk_control.MultiLabelClassificationController

class mapie.risk_control.MultiLabelClassificationController(predict_function: Callable[[ArrayLike], list[ndarray[tuple[Any, ...], dtype[_ScalarT]]] | ndarray[tuple[Any, ...], dtype[_ScalarT]]], risk: str = 'recall', method: str | None = None, target_level: float | Iterable[float] = 0.9, confidence_level: float | None = None, rcps_bound: str | None = None, predict_params: ArrayLike = np.arange(0, 1, 0.01), n_jobs: int | None = None, random_state: int | RandomState | None = None, verbose: int = 0)[source]

Prediction sets for multilabel-classification.

This class implements conformal risk-control methods for estimating prediction sets for multilabel classification: CRC and RCPS when the controlled metric is recall, and LTT when it is precision. Under the hypothesis of exchangeability, it guarantees that the controlled metric is at least 1 - alpha, where alpha is a user-specified parameter (equivalently, that the metric is at least target_level).

Parameters:

predict_function: Callable[[ArrayLike], Union[list[NDArray], NDArray]]

predict_proba method of a fitted multi-label classifier. It can return either:
- a list of arrays of length n_classes, where each array is of shape (n_samples, 2) and holds the probabilities of the negative and positive class (as output by MultiOutputClassifier), or
- an ndarray of shape (n_samples, n_classes) containing positive probabilities, or of shape (n_samples, n_classes, 2) containing negative and positive probabilities (assuming the last dimension is [neg, pos]).

risk: str

The risk metric to control (“precision” or “recall”). The selected risk determines which conformal prediction methods are valid:
- “precision” implies that method must be “ltt”
- “recall” implies that method can be “crc” (default) or “rcps”

method: Optional[str]

Method to use for the prediction. If risk is “recall”, the method can be either “crc” (default) or “rcps”. If risk is “precision”, the method used is “ltt”. If None, the default is “crc” for recall and “ltt” for precision.

target_level: Optional[Union[float, Iterable[float]]]

The minimum performance level for the metric. Must be between 0 and 1. Can be a float or any iterable of floats. By default 0.9.

confidence_level: Optional[float]

The level of certainty, strictly between 0 and 1, at which the Upper Confidence Bound of the average risk is computed. Can be a float or None. When method=“rcps” or method=“ltt” (precision control), it cannot be None and must lie in (0, 1). A higher confidence_level produces larger (more conservative) prediction sets. By default None.

rcps_bound: Optional[Union[str, None]]

Method used to compute the Upper Confidence Bound of the average risk. Only necessary with the RCPS method. If provided when using CRC or LTT it is ignored and a warning is raised. By default None.

predict_params: Optional[ArrayLike]

Array of parameters (thresholds λ) to consider for controlling the risk. Defaults to np.arange(0, 1, 0.01). Length is used to set n_predict_params.

n_jobs: Optional[int]

Number of jobs for parallel processing using joblib via the “loky” backend. For the moment, parallel processing is disabled. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) CPUs are used. None is a marker for “unset” that will be interpreted as n_jobs=1 (sequential execution).

By default None.

random_state: Optional[Union[int, RandomState]]

Pseudo random number generator state used for random uniform sampling to evaluate quantiles and prediction sets. Pass an int for reproducible output across multiple function calls.

By default None.

verbose: int, optional

The verbosity level, used with joblib for parallel processing. For the moment, parallel processing is disabled. The frequency of the messages increases with the verbosity level. If it is more than 10, all iterations are reported. Above 50, the output is sent to stdout.

By default 0.
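The output layouts accepted for predict_function can all be reduced to a single array of positive-class probabilities. The following is a minimal sketch of that reduction using only NumPy; the helper name to_positive_proba is hypothetical and is not part of MAPIE, and the library's internal handling may differ.

```python
import numpy as np


def to_positive_proba(raw):
    """Hypothetical helper (not a MAPIE function): reduce predict_proba-style
    output to an array of shape (n_samples, n_classes) of positive-class
    probabilities."""
    if isinstance(raw, list):
        # MultiOutputClassifier style: n_classes arrays of shape (n_samples, 2).
        return np.stack([np.asarray(p)[:, 1] for p in raw], axis=1)
    arr = np.asarray(raw)
    if arr.ndim == 3:
        # Shape (n_samples, n_classes, 2), last axis ordered [neg, pos].
        return arr[:, :, 1]
    # Shape (n_samples, n_classes): already positive probabilities.
    return arr


# Illustrative probabilities for 2 samples and 2 classes.
pos = np.array([[0.9, 0.2], [0.4, 0.7]])
as_list = [np.column_stack([1 - pos[:, k], pos[:, k]]) for k in range(2)]
as_3d = np.stack([1 - pos, pos], axis=-1)

# All three layouts describe the same positive probabilities.
print(to_positive_proba(as_list))
print(to_positive_proba(as_3d))
print(to_positive_proba(pos))
```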

Attributes:

valid_methods: List[str]

List of all valid methods. Either CRC or RCPS

valid_bounds: List[Union[str, None]]

List of all valid bound computations, for RCPS only.

n_predict_params: int

Number of thresholds on which we compute the risk.

predict_params: NDArray

Array of parameters (denoted λ in [3]) to consider for controlling the risk.

risks: ArrayLike of shape (n_samples_cal, n_predict_params)

The risk for each observation for each threshold.

r_hat: ArrayLike of shape (n_predict_params)

Average risk for each predict_param.

r_hat_plus: ArrayLike of shape (n_predict_params)

Upper confidence bound for each predict_param, computed with the bound selected through rcps_bound. Only relevant when method=“rcps”.

best_predict_param: NDArray of shape (n_alpha)

Optimal threshold for a given alpha.

valid_index: List[List[Any]]

List of lists of all indices that satisfy family-wise error rate (FWER) control. This attribute is computed when the user wants to control the precision score. Only relevant when risk=“precision”, as it uses the learn then test (LTT) procedure. Contains n_alpha lists.

valid_predict_params: List[List[Any]]

List of lists of all thresholds that satisfy family-wise error rate (FWER) control. This attribute is computed when the user wants to control the precision score. Only relevant when risk=“precision”, as it uses the learn then test (LTT) procedure. Contains n_alpha lists.

sigma_init: Optional[float]

First variance in the sigma_hat array. The default value is the same as in the paper implementation [1].
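To make the shapes of risks, r_hat and best_predict_param concrete, here is a minimal NumPy sketch for recall control in the spirit of conformal risk control [2]. It is an illustration under stated assumptions, not the library's implementation: the toy arrays y_true and y_proba are invented, the prediction set is assumed to keep labels whose probability is strictly above the threshold, the per-sample risk is assumed to be 1 - recall, and alpha is assumed to equal 1 - target_level.

```python
import numpy as np

# Toy calibration data: 4 samples, 3 labels (values invented for illustration).
y_true = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]])
y_proba = np.array([
    [0.905, 0.105, 0.605],
    [0.805, 0.305, 0.205],
    [0.205, 0.705, 0.405],
    [0.105, 0.905, 0.305],
])
predict_params = np.arange(0, 1, 0.01)  # same grid as the documented default
alpha = 0.35  # assumption: alpha = 1 - target_level

# Prediction sets for every threshold,
# shape (n_samples, n_classes, n_predict_params).
y_sets = y_proba[:, :, np.newaxis] > predict_params[np.newaxis, np.newaxis, :]

# Per-sample risk = 1 - recall, shape (n_samples, n_predict_params).
true_pos = (y_sets & (y_true[:, :, np.newaxis] == 1)).sum(axis=1)
risks = 1 - true_pos / y_true.sum(axis=1)[:, np.newaxis]

# Average risk for each threshold, shape (n_predict_params,).
r_hat = risks.mean(axis=0)

# CRC-style selection: largest threshold whose finite-sample-adjusted
# average risk does not exceed alpha.
n = len(y_true)
adjusted = (n / (n + 1)) * r_hat + 1 / (n + 1)
valid = np.flatnonzero(adjusted <= alpha)
best_predict_param = predict_params[valid.max()] if valid.size else None

print(risks.shape, r_hat.shape, best_predict_param)
```

Because larger thresholds shrink the prediction sets, the risk is non-decreasing along predict_params, which is why picking the largest valid threshold is well defined in this sketch.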

References

[1] Stephen Bates, Anastasios Angelopoulos, Lihua Lei, Jitendra Malik, and Michael I. Jordan. Distribution-free, risk-controlling prediction sets. CoRR, abs/2101.02703, 2021. URL https://arxiv.org/abs/2101.02703

[2] Anastasios N. Angelopoulos, Stephen Bates, Adam Fisch, Lihua Lei, and Tal Schuster. “Conformal Risk Control.” (2022).

[3] Anastasios N. Angelopoulos, Stephen Bates, Emmanuel J. Candès, Michael I. Jordan, and Lihua Lei. “Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control.” (2021).

Examples

>>> import numpy as np
>>> from sklearn.multioutput import MultiOutputClassifier
>>> from sklearn.linear_model import LogisticRegression
>>> from mapie.risk_control import MultiLabelClassificationController
>>> X_toy = np.arange(4).reshape(-1, 1)
>>> y_toy = np.stack([[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]])
>>> clf = MultiOutputClassifier(LogisticRegression()).fit(X_toy, y_toy)
>>> mapie_clf = MultiLabelClassificationController(predict_function=clf.predict_proba, target_level=0.7).calibrate(X_toy, y_toy)
>>> y_pi_mapie = mapie_clf.predict(X_toy)
>>> print(y_pi_mapie[:, :, 0])
[[ True False  True]
 [ True False  True]
 [False  True  True]
 [False  True False]]

__init__(predict_function: Callable[[ArrayLike], list[ndarray[tuple[Any, ...], dtype[_ScalarT]]] | ndarray[tuple[Any, ...], dtype[_ScalarT]]], risk: str = 'recall', method: str | None = None, target_level: float | Iterable[float] = 0.9, confidence_level: float | None = None, rcps_bound: str | None = None, predict_params: ArrayLike = np.arange(0, 1, 0.01), n_jobs: int | None = None, random_state: int | RandomState | None = None, verbose: int = 0) → None[source]

calibrate(X: ArrayLike, y: ArrayLike) → MultiLabelClassificationController[source]

Use the fitted base estimator to compute risks and predict_params. Note that for high dimensional data, you can instead use the compute_risks method to compute risks batch by batch, followed by compute_best_predict_param.

Parameters:

X: ArrayLike of shape (n_samples, n_features): Training data.
y: NDArray of shape (n_samples, n_classes): Training labels.

Returns:

MultiLabelClassificationController: The model itself.

compute_best_predict_param() → MultiLabelClassificationController[source]: Compute optimal predict_params based on the computed risks.
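When method is “rcps”, the threshold is chosen from the upper confidence bound r_hat_plus rather than from r_hat. The sketch below illustrates the idea with a Hoeffding bound, one classical choice for losses bounded in [0, 1] discussed in [1]. The helper hoeffding_ucb, the toy values, and the assumption that alpha = 1 - target_level are all hypothetical; the bounds actually available through rcps_bound and the exact selection rule used by the library may differ.

```python
import numpy as np


def hoeffding_ucb(r_hat, n_samples, confidence_level):
    """Hypothetical helper (not a MAPIE function): Hoeffding upper confidence
    bound for the mean of n_samples losses bounded in [0, 1]."""
    delta = 1 - confidence_level
    return r_hat + np.sqrt(np.log(1 / delta) / (2 * n_samples))


# Illustrative thresholds and average risks (invented values).
predict_params = np.array([0.2, 0.4, 0.6, 0.8])
r_hat = np.array([0.02, 0.05, 0.15, 0.40])
alpha = 0.25  # assumption: alpha = 1 - target_level

r_hat_plus = hoeffding_ucb(r_hat, n_samples=500, confidence_level=0.9)

# Keep the largest threshold such that the bound is below alpha for that
# threshold and for every smaller one.
below = r_hat_plus < alpha
n_ok = len(below) if below.all() else int(np.argmin(below))
best_predict_param = predict_params[n_ok - 1] if n_ok > 0 else None

print(r_hat_plus, best_predict_param)
```

Raising confidence_level widens the gap between r_hat and r_hat_plus, so fewer thresholds pass the test and the selected threshold moves toward larger prediction sets.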

compute_risks(X: ArrayLike, y: ArrayLike, _refit: bool | None = False) → MultiLabelClassificationController[source]

Fit the base estimator or use the fitted base estimator on batch data to compute risks. All the computed risks will be concatenated each time the compute_risks method is called.
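The batch-by-batch accumulation described above can be pictured as stacking per-batch risk matrices along the sample axis and averaging only at the end. The following NumPy sketch mimics that pattern with random data; the helper recall_risks is hypothetical, and the strict comparison and the 1 - recall risk are assumptions rather than the library's exact computation.

```python
import numpy as np


def recall_risks(y_true, y_proba, thresholds):
    """Hypothetical helper (not a MAPIE function): per-sample (1 - recall)
    for each threshold, shape (n_samples, n_thresholds)."""
    y_sets = y_proba[:, :, np.newaxis] > thresholds
    hits = (y_sets & (y_true[:, :, np.newaxis] == 1)).sum(axis=1)
    return 1 - hits / y_true.sum(axis=1)[:, np.newaxis]


rng = np.random.default_rng(0)
thresholds = np.arange(0, 1, 0.01)
risks = np.empty((0, len(thresholds)))  # nothing accumulated yet

for _ in range(3):  # three batches of 5 samples with 4 labels each
    y_proba = rng.uniform(size=(5, 4))
    y_true = (rng.uniform(size=(5, 4)) < 0.5).astype(int)
    y_true[:, 0] = 1  # guarantee at least one positive label per sample
    batch_risks = recall_risks(y_true, y_proba, thresholds)
    # Each call appends the new rows to what was computed before.
    risks = np.concatenate([risks, batch_risks], axis=0)

# Average risk over all accumulated samples, one value per threshold.
r_hat = risks.mean(axis=0)
print(risks.shape, r_hat.shape)
```

Only the risk matrix has to be kept between batches, which is what makes this pattern suitable for data that does not fit in memory at once.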

Parameters:

X: ArrayLike of shape (n_samples, n_features)

Training data.

y: NDArray of shape (n_samples, n_classes)

Training labels.

_refit: bool

Whether or not to refit from scratch.

By default False

Returns: