
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples_calibration/1-quickstart/plot_calibration_venn_abers_binary.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_calibration_1-quickstart_plot_calibration_venn_abers_binary.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_calibration_1-quickstart_plot_calibration_venn_abers_binary.py:


=================================================
Calibrating a binary classifier with Venn-ABERS
=================================================

This example shows how to calibrate a binary classifier with
:class:`~mapie.calibration.VennAbersCalibrator` and visualize the impact on
predicted probabilities. We compare an uncalibrated model to its Venn-ABERS
calibrated version using reliability diagrams and Brier scores.

.. GENERATED FROM PYTHON SOURCE LINES 12-24

.. code-block:: Python


    from __future__ import annotations

    import matplotlib.pyplot as plt
    from sklearn.calibration import CalibrationDisplay
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import brier_score_loss

    from mapie.calibration import VennAbersCalibrator
    from mapie.utils import train_conformalize_test_split


.. GENERATED FROM PYTHON SOURCE LINES 25-31

1. Build a miscalibrated binary classifier
---------------------------------------------------

We generate a toy binary dataset and fit a random forest, a model whose raw
probabilities are often miscalibrated out of the box (they can be
systematically over- or under-confident). We use a fairly large dataset
(5,000 samples) so that the calibration set contains enough data for proper
calibration.

.. GENERATED FROM PYTHON SOURCE LINES 31-53

.. code-block:: Python


    X, y = make_classification(
        n_samples=5000,
        n_features=20,
        n_informative=10,
        n_redundant=2,
        class_sep=0.8,
        random_state=42,
    )

    # Split into train, calibration, and test sets
    (X_train, X_calib, X_test, y_train, y_calib, y_test) = train_conformalize_test_split(
        X, y, train_size=0.5, conformalize_size=0.2, test_size=0.3, random_state=42
    )

    # Use Random Forest which tends to be miscalibrated
    base_model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42)
    base_model.fit(X_train, y_train)

    probs_raw = base_model.predict_proba(X_test)[:, 1]
    raw_brier = brier_score_loss(y_test, probs_raw)


.. GENERATED FROM PYTHON SOURCE LINES 54-59

2. Calibrate with Venn-ABERS
----------------------------

We wrap a random forest with the same hyperparameters as the base model in
:class:`~mapie.calibration.VennAbersCalibrator`, using the inductive mode
(default). The calibrator uses the calibration set to learn a calibration
mapping that adjusts the probability estimates.

.. GENERATED FROM PYTHON SOURCE LINES 59-69

.. code-block:: Python


    va_calibrator = VennAbersCalibrator(
        estimator=RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42),
        inductive=True,
        random_state=42,
    )
    va_calibrator.fit(X_train, y_train, X_calib=X_calib, y_calib=y_calib)

    probs_va = va_calibrator.predict_proba(X_test)[:, 1]
    va_brier = brier_score_loss(y_test, probs_va)


.. GENERATED FROM PYTHON SOURCE LINES 70-75

3. Reliability diagrams and Brier scores
----------------------------------------

Reliability diagrams plot predicted probabilities against observed
frequencies; a perfectly calibrated model lies on the diagonal. We also
display Brier scores (lower is better) to quantify the difference between
the two models.

.. GENERATED FROM PYTHON SOURCE LINES 75-95

.. code-block:: Python


    fig, axes = plt.subplots(1, 2, figsize=(12, 5))

    CalibrationDisplay.from_predictions(
        y_test,
        probs_raw,
        name=f"Uncalibrated (Brier={raw_brier:.3f})",
        n_bins=10,
        ax=axes[0],
    )
    CalibrationDisplay.from_predictions(
        y_test,
        probs_va,
        name=f"Venn-ABERS (Brier={va_brier:.3f})",
        n_bins=10,
        ax=axes[1],
    )

    axes[0].set_title("Before calibration")
    axes[1].set_title("After Venn-ABERS calibration")

    plt.tight_layout()
    plt.show()



.. image-sg:: /examples_calibration/1-quickstart/images/sphx_glr_plot_calibration_venn_abers_binary_001.png
   :alt: Before calibration, After Venn-ABERS calibration
   :srcset: /examples_calibration/1-quickstart/images/sphx_glr_plot_calibration_venn_abers_binary_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 2.487 seconds)


.. _sphx_glr_download_examples_calibration_1-quickstart_plot_calibration_venn_abers_binary.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_calibration_venn_abers_binary.ipynb <plot_calibration_venn_abers_binary.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_calibration_venn_abers_binary.py <plot_calibration_venn_abers_binary.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_calibration_venn_abers_binary.zip <plot_calibration_venn_abers_binary.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
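
As a supplement to the reliability diagrams and Brier scores in this example, the following pure-Python sketch spells out what the two quantities measure. The helper names ``brier_score`` and ``reliability_bins`` are hypothetical and are not part of MAPIE or scikit-learn, the toy data is made up, and equal-width binning is an assumption chosen for simplicity. In practice, ``sklearn.metrics.brier_score_loss`` and ``sklearn.calibration.calibration_curve`` provide library versions of these computations.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def reliability_bins(y_true, y_prob, n_bins=10):
    """Per non-empty bin: (mean predicted probability, observed positive rate, count)."""
    stats = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of probs, positives, count]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        stats[idx][0] += p
        stats[idx][1] += y
        stats[idx][2] += 1
    return [(s / n, pos / n, n) for s, pos, n in stats if n > 0]


# Hypothetical toy data, unrelated to the dataset used in this example.
y_true = [0, 0, 1, 1, 1, 0]
y_prob = [0.1, 0.3, 0.7, 0.9, 0.8, 0.2]

score = brier_score(y_true, y_prob)
bins = reliability_bins(y_true, y_prob, n_bins=2)

print(f"Brier score: {score:.4f}")
for mean_pred, observed, count in bins:
    print(f"mean predicted={mean_pred:.2f}  observed rate={observed:.2f}  n={count}")
```

Each tuple returned by ``reliability_bins`` corresponds to one point on a reliability diagram: the closer the mean predicted probability is to the observed positive rate, the closer that point sits to the diagonal.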