mapie.classification.SplitConformalClassifier
- class mapie.classification.SplitConformalClassifier(estimator: ClassifierMixin = LogisticRegression(), confidence_level: Union[float, Iterable[float]] = 0.9, conformity_score: Union[str, BaseClassificationScore] = 'lac', prefit: bool = True, n_jobs: Optional[int] = None, verbose: int = 0, random_state: Optional[Union[int, RandomState]] = None)
Computes prediction sets using the split conformal classification technique:
- The fit method (optional) fits the base classifier to the training data.
- The conformalize method estimates the uncertainty of the base classifier by computing conformity scores on the conformalization set.
- The predict_set method predicts labels and sets of labels.
- Parameters
- estimator : ClassifierMixin, default=LogisticRegression()
The base classifier used to predict labels.
- confidence_level : Union[float, Iterable[float]], default=0.9
The confidence level(s) for the prediction sets, that is, the target coverage probability. If a float is provided, it represents a single confidence level. If a list is provided, one prediction set is returned per sample for each specified confidence level.
- conformity_score : Union[str, BaseClassificationScore], default="lac"
The method used to compute conformity scores.
Valid options:
  - "lac"
  - "top_k"
  - "aps"
  - "raps"
  - Any subclass of BaseClassificationScore
A custom score function inheriting from BaseClassificationScore may also be provided.
- prefit : bool, default=True
If True, the base classifier must already be fitted, and the fit method must be skipped. If False, the base classifier will be fitted during the fit method.
- n_jobs : Optional[int], default=None
The number of jobs to run in parallel when applicable.
- verbose : int, default=0
Controls the verbosity level. Higher values increase the output details.
Examples
>>> from mapie.classification import SplitConformalClassifier
>>> from mapie.utils import train_conformalize_test_split
>>> from sklearn.datasets import make_classification
>>> from sklearn.neighbors import KNeighborsClassifier

>>> X, y = make_classification(n_samples=500)
>>> (
...     X_train, X_conformalize, X_test,
...     y_train, y_conformalize, y_test
... ) = train_conformalize_test_split(
...     X, y, train_size=0.6, conformalize_size=0.2, test_size=0.2, random_state=1
... )

>>> mapie_classifier = SplitConformalClassifier(
...     estimator=KNeighborsClassifier(),
...     confidence_level=0.95,
...     prefit=False,
... ).fit(X_train, y_train).conformalize(X_conformalize, y_conformalize)

>>> predicted_labels, predicted_sets = mapie_classifier.predict_set(X_test)
- __init__(estimator: ClassifierMixin = LogisticRegression(), confidence_level: Union[float, Iterable[float]] = 0.9, conformity_score: Union[str, BaseClassificationScore] = 'lac', prefit: bool = True, n_jobs: Optional[int] = None, verbose: int = 0, random_state: Optional[Union[int, RandomState]] = None) -> None
- conformalize(X_conformalize: ArrayLike, y_conformalize: ArrayLike, predict_params: Optional[dict] = None) -> SplitConformalClassifier
Estimates the uncertainty of the base classifier by computing conformity scores on the conformalization set.
- Parameters
- X_conformalize : ArrayLike
Features of the conformalization set.
- y_conformalize : ArrayLike
Targets of the conformalization set.
- predict_params : Optional[dict], default=None
Parameters to pass to the predict and predict_proba methods of the base classifier. These parameters will also be used in the predict_set and predict methods of this SplitConformalClassifier.
- Returns
- Self
The conformalized SplitConformalClassifier instance.
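To make "computing conformity scores on the conformalization set" concrete, here is a minimal NumPy sketch of the textbook LAC procedure: the score of a sample is one minus the probability assigned to its true label, and a finite-sample quantile of those scores becomes the threshold for building prediction sets. The function names lac_threshold and lac_prediction_sets and the toy probabilities are hypothetical, and MAPIE's own implementation may differ in its details.

```python
import math

import numpy as np


def lac_threshold(proba_conf, y_conf, confidence_level):
    """Return the conformal quantile of the LAC scores 1 - p(true label)."""
    n = len(y_conf)
    # Conformity score of each conformalization sample: one minus the
    # probability the classifier assigned to the true label.
    scores = 1.0 - proba_conf[np.arange(n), y_conf]
    # Finite-sample corrected rank: the ceil((n + 1) * level)-th smallest score.
    rank = math.ceil((n + 1) * confidence_level)
    if rank > n:
        # Too few samples for this level: every label has to be kept.
        return float("inf")
    return float(np.sort(scores)[rank - 1])


def lac_prediction_sets(proba_test, threshold):
    """Keep every label whose score 1 - p(label) does not exceed the threshold."""
    return (1.0 - proba_test) <= threshold


# Toy predicted probabilities and true labels for four conformalization samples.
proba_conf = np.array([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]])
y_conf = np.array([0, 0, 1, 0])
threshold = lac_threshold(proba_conf, y_conf, confidence_level=0.5)

# Boolean membership mask of shape (n_samples, n_class) for two test samples.
proba_test = np.array([[0.7, 0.3], [0.55, 0.45]])
sets = lac_prediction_sets(proba_test, threshold)
print(threshold)
print(sets)
```

In this toy run the second test sample receives an empty set, because neither class is predicted with enough probability to pass the threshold.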
- fit(X_train: ArrayLike, y_train: ArrayLike, fit_params: Optional[dict] = None) -> SplitConformalClassifier
Fits the base classifier to the training data.
- Parameters
- X_train : ArrayLike
Training data features.
- y_train : ArrayLike
Training data targets.
- fit_params : Optional[dict], default=None
Parameters to pass to the fit method of the base classifier.
- Returns
- Self
The fitted SplitConformalClassifier instance.
- predict(X: ArrayLike) -> NDArray
For each sample in X, returns the label predicted by the base classifier.
- Parameters
- X : ArrayLike
Features.
- Returns
- NDArray
Array of predicted labels, with shape (n_samples,).
- predict_set(X: ArrayLike, conformity_score_params: Optional[dict] = None) -> Tuple[NDArray, NDArray]
For each sample in X, predicts a label (using the base classifier), and a set of labels.
If several confidence levels were provided during initialisation, several sets will be predicted for each sample. See the return signature.
- Parameters
- X : ArrayLike
Features.
- conformity_score_params : Optional[dict], default=None
Parameters specific to conformity scores, used at prediction time.
The only example for now is include_last_label, available for the aps and raps conformity scores. For detailed information on include_last_label, see the docstring of conformity_scores.sets.aps.APSConformityScore.get_prediction_sets().
- Returns
- Tuple[NDArray, NDArray]
Two arrays:
- Prediction labels, of shape (n_samples,)
- Prediction sets, of shape (n_samples, n_class, n_confidence_levels)
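The three-dimensional shape of the prediction sets is easier to work with once sliced. The NumPy sketch below builds a small hand-written array of that shape and reads it three ways: labels in one set, set sizes, and empirical coverage. It assumes the sets are boolean membership masks, which this page does not state explicitly; the array contents and y_true are made-up illustration data.

```python
import numpy as np

# Hand-written stand-in for predicted_sets with 3 samples, 3 classes and
# 2 confidence levels; True means "this class is in the set" (an assumption).
predicted_sets = np.array([
    [[True, True], [False, True], [False, False]],   # sample 0: {0} then {0, 1}
    [[False, False], [False, False], [True, True]],  # sample 1: {2} then {2}
    [[False, True], [True, True], [False, True]],    # sample 2: {1} then {0, 1, 2}
])
y_true = np.array([1, 2, 1])

# Labels contained in the set of sample 0 at the second confidence level.
labels_sample0_level1 = np.flatnonzero(predicted_sets[0, :, 1])

# Number of labels per set, of shape (n_samples, n_confidence_levels).
set_sizes = predicted_sets.sum(axis=1)

# Fraction of samples whose true label lies in the set, per confidence level.
covered = predicted_sets[np.arange(len(y_true)), y_true, :]
coverage = covered.mean(axis=0)

print(labels_sample0_level1)
print(set_sizes)
print(coverage)
```

Comparing coverage against the requested confidence levels on held-out data is a quick sanity check on a conformalized classifier.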