.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples_exchangeability_testing/1-quickstart/plot_exchangeability_fixed_dataset.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_exchangeability_testing_1-quickstart_plot_exchangeability_fixed_dataset.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_exchangeability_testing_1-quickstart_plot_exchangeability_fixed_dataset.py:


Exchangeability testing on a fixed dataset
==========================================

This quickstart demonstrates how to test exchangeability on a fixed dataset.

Guarantees provided by conformal prediction and risk control depend on the
hypothesis that data is exchangeable. Verifying exchangeability before applying
methods from MAPIE is therefore important.
Typically, (split) conformal prediction, risk control, and calibration require data
not seen during training, such as a split of the test data.

Note that for the exchangeability test to be valid, the order of samples in the
fixed dataset must be representative of what will happen after deployment.
Shuffling the data beforehand would trivially render the dataset exchangeable and
hide any potential distribution shift.

.. GENERATED FROM PYTHON SOURCE LINES 20-24

Prepare the data
----------------

We first prepare the data and fit a classifier on the training data.

.. GENERATED FROM PYTHON SOURCE LINES 24-51

.. code-block:: Python


    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    from mapie._example_utils import generate_gaussian_stream, plot_dataset
    from mapie.classification import SplitConformalClassifier
    from mapie.exchangeability_testing import FixedDatasetExchangeabilityTest

    random_state = 42

    X, y = generate_gaussian_stream(
        shift_type="stable",
        random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state, shuffle=False
    )

    plot_dataset(
        X_test,
        y_test,
        title="Exchangeable fixed dataset",
    )

    classifier = LogisticRegression(random_state=random_state)
    classifier.fit(X_train, y_train)



.. image-sg:: /examples_exchangeability_testing/1-quickstart/images/sphx_glr_plot_exchangeability_fixed_dataset_001.png
   :alt: Exchangeable fixed dataset
   :srcset: /examples_exchangeability_testing/1-quickstart/images/sphx_glr_plot_exchangeability_fixed_dataset_001.png
   :class: sphx-glr-single-img


.. raw:: html

    LogisticRegression(random_state=42)

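The split above uses ``shuffle=False`` so that the test set keeps its original
order. The sketch below illustrates, under simplifying assumptions, why that
matters. It uses only NumPy and does not call MAPIE: the synthetic stream, the
``half_mean_gap`` statistic, and the ``permutation_pvalue`` helper are all
hypothetical names introduced for illustration, and they are not the tests
implemented by ``FixedDatasetExchangeabilityTest``. The idea is simply that an
order-sensitive statistic flags a mid-stream shift, while the same data shuffled
beforehand looks unremarkable.

.. code-block:: Python

    import numpy as np


    def half_mean_gap(values):
        """Absolute gap between the means of the first and second half."""
        mid = len(values) // 2
        return abs(values[mid:].mean() - values[:mid].mean())


    def permutation_pvalue(values, n_permutations=1000, seed=0):
        """Permutation p-value for the half-mean gap (an order-sensitive statistic)."""
        rng = np.random.default_rng(seed)
        observed = half_mean_gap(values)
        permuted = np.array(
            [half_mean_gap(rng.permutation(values)) for _ in range(n_permutations)]
        )
        # The add-one correction keeps the p-value strictly positive.
        return (1 + np.sum(permuted >= observed)) / (1 + n_permutations)


    rng = np.random.default_rng(42)
    # Hypothetical stream: the mean jumps from 0 to 2 halfway through.
    stream = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])
    # Shuffling first destroys the ordering that carries the shift.
    shuffled_stream = rng.permutation(stream)

    gap_ordered = half_mean_gap(stream)
    gap_shuffled = half_mean_gap(shuffled_stream)
    p_ordered = permutation_pvalue(stream)
    p_shuffled = permutation_pvalue(shuffled_stream)

    print(f"ordered:  gap={gap_ordered:.2f}, p-value={p_ordered:.3f}")
    print(f"shuffled: gap={gap_shuffled:.2f}, p-value={p_shuffled:.3f}")

In the ordered stream the gap is large and the p-value is small, so the shift is
flagged. Once the stream has been shuffled, the gap shrinks to noise level and the
test has nothing left to detect, even though the underlying data are the same.
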
.. GENERATED FROM PYTHON SOURCE LINES 52-58

Run the exchangeability test
----------------------------

Now we can test the exchangeability of the test dataset.
By default, we use all available test methods.
Method-specific parameters can be passed as a dictionary.

.. GENERATED FROM PYTHON SOURCE LINES 58-66

.. code-block:: Python


    exchangeability_test = FixedDatasetExchangeabilityTest()
    exchangeability_test.run(X_test, y_test)

    print("Is the test dataset exchangeable?")
    for test_name, is_exchangeable in exchangeability_test.is_exchangeable.items():
        print(f"{test_name}: {is_exchangeable}")



.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Is the test dataset exchangeable?
    pvalue_permutation: True
    permutation_binomial: True
    permutation_binomial_mixture: True
    permutation_aggressive: True
    plugin_martingale: True
    jumper_martingale: True




.. GENERATED FROM PYTHON SOURCE LINES 67-73

Continue with MAPIE
-------------------

The test dataset is exchangeable. We can continue with MAPIE.
Conformalization with a split of the test dataset will provide coverage
guarantees on the remaining test data.

.. GENERATED FROM PYTHON SOURCE LINES 73-85

.. code-block:: Python


    X_conformalize, X_test_new, y_conformalize, y_test_new = train_test_split(
        X_test, y_test, test_size=0.5, random_state=random_state
    )

    confidence_level = 0.95
    mapie_classifier = SplitConformalClassifier(
        estimator=classifier, confidence_level=confidence_level, prefit=True
    )
    mapie_classifier.conformalize(X_conformalize, y_conformalize)
    y_pred, y_pred_set = mapie_classifier.predict_set(X_test_new)



.. GENERATED FROM PYTHON SOURCE LINES 86-91

Create a non-exchangeable dataset
---------------------------------

Now let us see what happens for a non-exchangeable fixed dataset.
Here, an abrupt shift happens in the second part of the dataset.

.. GENERATED FROM PYTHON SOURCE LINES 91-113

.. code-block:: Python


    X_test_abrupt, y_test_abrupt = generate_gaussian_stream(
        n_samples=len(X_test),
        shift_type="abrupt",
        prop_shift=0.5,
        random_state=random_state + 1,
    )
    shift_start_abrupt = int(len(y_test_abrupt) * 0.5)

    plot_dataset(
        X_test_abrupt,
        y_test_abrupt,
        title="Non-exchangeable fixed dataset",
        shift_start=shift_start_abrupt,
    )

    exchangeability_test = FixedDatasetExchangeabilityTest()
    exchangeability_test.run(X_test_abrupt, y_test_abrupt)

    print("Is the shifted dataset exchangeable?")
    for test_name, is_exchangeable in exchangeability_test.is_exchangeable.items():
        print(f"{test_name}: {is_exchangeable}")



.. image-sg:: /examples_exchangeability_testing/1-quickstart/images/sphx_glr_plot_exchangeability_fixed_dataset_002.png
   :alt: Non-exchangeable fixed dataset
   :srcset: /examples_exchangeability_testing/1-quickstart/images/sphx_glr_plot_exchangeability_fixed_dataset_002.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Is the shifted dataset exchangeable?
    pvalue_permutation: False
    permutation_binomial: False
    permutation_binomial_mixture: False
    permutation_aggressive: False
    plugin_martingale: False
    jumper_martingale: True




.. GENERATED FROM PYTHON SOURCE LINES 114-128

Interpret the result
--------------------

The shifted test dataset is not exchangeable: MAPIE cannot provide statistical
guarantees on it, and more generally the classifier itself should not be trusted
without further investigation.

Note that the jumper martingale fails to detect the non-exchangeability in this
case. It mostly reacts to *one-sided* p-value distortions (many consistently small
p-values, or many consistently large ones). Because the shift creates both many
very low and many very high p-values, the two effects cancel out for the jumper
martingale.

More generally, this illustrates that no single test is perfect, and that it is
important to use multiple tests to get a complete picture.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 5.384 seconds)


.. _sphx_glr_download_examples_exchangeability_testing_1-quickstart_plot_exchangeability_fixed_dataset.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_exchangeability_fixed_dataset.ipynb <plot_exchangeability_fixed_dataset.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_exchangeability_fixed_dataset.py <plot_exchangeability_fixed_dataset.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_exchangeability_fixed_dataset.zip <plot_exchangeability_fixed_dataset.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_