.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples_regression/2-advanced-analysis/plot_conformal_predictive_distribution.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_examples_regression_2-advanced-analysis_plot_conformal_predictive_distribution.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_regression_2-advanced-analysis_plot_conformal_predictive_distribution.py:


================================================================================
Conformal Predictive Distribution with MAPIE
================================================================================

.. GENERATED FROM PYTHON SOURCE LINES 9-19

In this advanced analysis, we use MAPIE to build a Conformal Predictive
Distribution (CPD) in a few steps.

Here are some reference papers for more information about CPD:

[1] Schweder, T., & Hjort, N. L. (2016). Confidence, likelihood,
probability (Vol. 41). Cambridge University Press.

[2] Vovk, V., Shen, J., Manokhin, V., & Xie, M. G. (2017, May).
Nonparametric predictive distributions based on conformal prediction.
In Conformal and probabilistic prediction and applications (pp. 82-102). PMLR.

.. GENERATED FROM PYTHON SOURCE LINES 19-37

.. code-block:: default


    import warnings

    import numpy as np
    from matplotlib import pyplot as plt
    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression

    from mapie.conformity_scores import (AbsoluteConformityScore,
                                         ResidualNormalisedScore)
    from mapie.regression import SplitConformalRegressor
    from mapie.utils import train_conformalize_test_split

    warnings.filterwarnings('ignore')

    RANDOM_STATE = 15


.. GENERATED FROM PYTHON SOURCE LINES 38-42

1. Generating toy dataset
-------------------------

Here, we generate data for a regression task, then split it into training,
conformalization, and test sets.

.. GENERATED FROM PYTHON SOURCE LINES 42-62

.. code-block:: default


    # One-feature regression problem with Gaussian noise.
    X, y = make_regression(
        n_samples=1000, n_features=1, noise=20, random_state=RANDOM_STATE
    )

    # 60% for training, 20% for conformalization, 20% for testing.
    (
        X_train, X_conformalize, X_test, y_train, y_conformalize, y_test
    ) = train_conformalize_test_split(
        X, y,
        train_size=0.6, conformalize_size=0.2, test_size=0.2,
        random_state=RANDOM_STATE
    )

    plt.xlabel("x")
    plt.ylabel("y")
    plt.scatter(X_train, y_train, alpha=0.3)
    plt.show()



.. image-sg:: /examples_regression/2-advanced-analysis/images/sphx_glr_plot_conformal_predictive_distribution_001.png
   :alt: plot conformal predictive distribution
   :srcset: /examples_regression/2-advanced-analysis/images/sphx_glr_plot_conformal_predictive_distribution_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 63-70

2. Defining a Conformal Predictive Distribution class with MAPIE
------------------------------------------------------------------

To obtain the cumulative distribution function of a prediction with MAPIE,
we subclass :class:`~mapie.regression.SplitConformalRegressor` and add a new
method named `get_cumulative_distribution_function`.

.. GENERATED FROM PYTHON SOURCE LINES 70-86

.. code-block:: default


    class MapieConformalPredictiveDistribution(SplitConformalRegressor):

        def __init__(self, **kwargs) -> None:
            super().__init__(**kwargs)

        def get_cumulative_distribution_function(self, X):
            # Point predictions for X (the intervals are not needed here).
            y_pred, _ = self.predict_interval(X)
            # Conformity scores from the conformalization step, without NaNs.
            cs = self._mapie_regressor.conformity_scores_[
                ~np.isnan(self._mapie_regressor.conformity_scores_)]
            # Combine each prediction with every conformity score.
            res = self._conformity_score.get_estimation_distribution(
                y_pred.reshape((-1, 1)), cs, X=X
            )
            return res



.. GENERATED FROM PYTHON SOURCE LINES 87-91

We now use it with two different conformity scores,
:class:`~mapie.conformity_scores.AbsoluteConformityScore` and
:class:`~mapie.conformity_scores.ResidualNormalisedScore`, in split-conformal
inference.

.. GENERATED FROM PYTHON SOURCE LINES 91-121

.. code-block:: default


    # Regressor 1: absolute conformity score.
    mapie_regressor_1 = MapieConformalPredictiveDistribution(
        estimator=LinearRegression(),
        conformity_score=AbsoluteConformityScore(sym=False),
        prefit=False
    )

    mapie_regressor_1.fit(X_train, y_train)
    mapie_regressor_1.conformalize(X_conformalize, y_conformalize)
    y_pred_1, _ = mapie_regressor_1.predict_interval(X_test)
    y_cdf_1 = mapie_regressor_1.get_cumulative_distribution_function(X_test)

    # Regressor 2: residual normalised conformity score.
    mapie_regressor_2 = MapieConformalPredictiveDistribution(
        estimator=LinearRegression(),
        conformity_score=ResidualNormalisedScore(sym=False, random_state=RANDOM_STATE),
        prefit=False
    )

    mapie_regressor_2.fit(X_train, y_train)
    mapie_regressor_2.conformalize(X_conformalize, y_conformalize)
    y_pred_2, _ = mapie_regressor_2.predict_interval(X_test)
    y_cdf_2 = mapie_regressor_2.get_cumulative_distribution_function(X_test)

    # Test data with the point predictions of the first regressor.
    plt.xlabel("x")
    plt.ylabel("y")
    plt.scatter(X_test, y_test, alpha=0.3)
    plt.plot(X_test, y_pred_1, color="C1")
    plt.show()



.. image-sg:: /examples_regression/2-advanced-analysis/images/sphx_glr_plot_conformal_predictive_distribution_002.png
   :alt: plot conformal predictive distribution
   :srcset: /examples_regression/2-advanced-analysis/images/sphx_glr_plot_conformal_predictive_distribution_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 122-127

3. Visualizing the cumulative distribution function
---------------------------------------------------

We now plot the cumulative distribution functions of the predictive
distributions for the first test point in order to compare the two methods.

.. GENERATED FROM PYTHON SOURCE LINES 127-154

.. code-block:: default


    nb_bins = 100


    def plot_cdf(data, bins, **kwargs):
        # Histogram the samples, then accumulate counts into an empirical CDF.
        counts, bins = np.histogram(data, bins=bins)
        cdf = np.cumsum(counts)/np.sum(counts)

        # Duplicate bin edges and CDF values to draw a step function.
        plt.plot(
            np.vstack((bins, np.roll(bins, -1))).T.flatten()[:-2],
            np.vstack((cdf, cdf)).T.flatten(),
            **kwargs
        )


    plot_cdf(
        y_cdf_1[0], bins=nb_bins, label='Absolute Residual Score', alpha=0.8
    )
    plot_cdf(
        y_cdf_2[0], bins=nb_bins, label='Normalized Residual Score', alpha=0.8
    )
    plt.vlines(
        y_pred_1[0], 0, 1, label='Prediction', color="C2", linestyles='dashed'
    )
    plt.legend(loc=2)
    plt.show()



.. image-sg:: /examples_regression/2-advanced-analysis/images/sphx_glr_plot_conformal_predictive_distribution_003.png
   :alt: plot conformal predictive distribution
   :srcset: /examples_regression/2-advanced-analysis/images/sphx_glr_plot_conformal_predictive_distribution_003.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.142 seconds)


.. _sphx_glr_download_examples_regression_2-advanced-analysis_plot_conformal_predictive_distribution.py:

.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example

  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_conformal_predictive_distribution.py <plot_conformal_predictive_distribution.py>`

  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_conformal_predictive_distribution.ipynb <plot_conformal_predictive_distribution.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
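
As a complement to the wrapper class defined in section 2, the following is a
minimal NumPy-only sketch of the idea behind a split-conformal predictive
distribution. It assumes, as an illustration rather than a description of
MAPIE internals, that the predictive samples for a new point are the point
prediction plus the signed residuals measured on the conformalization set.
The toy data, the ``predict`` helper, and all function names are hypothetical,
and no finite-sample correction is applied to the quantiles.

.. code-block:: default

    import numpy as np

    rng = np.random.default_rng(15)

    # Hypothetical toy data: linear signal plus Gaussian noise.
    x = rng.uniform(-3, 3, size=1000)
    y = 30.0 * x + rng.normal(scale=20.0, size=x.size)

    # 60/20/20 split into training, conformalization, and test parts.
    x_train, x_conf, x_test = x[:600], x[600:800], x[800:]
    y_train, y_conf, y_test = y[:600], y[600:800], y[800:]

    # Fit a straight line on the training part only.
    slope, intercept = np.polyfit(x_train, y_train, deg=1)


    def predict(x_new):
        """Point prediction of the fitted line."""
        return slope * np.asarray(x_new) + intercept


    # Signed residuals on the conformalization set (assumed conformity scores).
    residuals = y_conf - predict(x_conf)


    def predictive_samples(x_new):
        """Array of shape (n_points, n_residuals): prediction + each residual."""
        return predict(np.atleast_1d(x_new))[:, None] + residuals[None, :]


    def empirical_cdf(samples, t):
        """Fraction of samples less than or equal to t."""
        return float(np.mean(samples <= t))


    samples = predictive_samples(x_test)

    # Empirical CDF of the first test point, evaluated at its point prediction.
    cdf_at_pred = empirical_cdf(samples[0], predict(x_test[0]))

    # Central 90% intervals read off the predictive distribution quantiles.
    lower, upper = np.quantile(samples, [0.05, 0.95], axis=1)
    coverage = float(np.mean((y_test >= lower) & (y_test <= upper)))

Under these assumptions, each row of ``samples`` plays the same role as a row
of ``y_cdf_1`` in the example above: its empirical CDF is the predictive
distribution for one test point, and prediction intervals follow from its
quantiles.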