mapie.subsample.Subsample

class mapie.subsample.Subsample(n_resamplings: int = 30, n_samples: Optional[Union[int, float]] = None, replace: bool = True, random_state: Optional[Union[int, RandomState]] = None)

Generate a sampling method that resamples the training set, optionally with replacement (bootstrap). It can be used in place of KFold or LeaveOneOut as the cv argument of a MAPIE class.

Parameters
n_resamplings: int

Number of resamplings. By default 30.

n_samples: Optional[Union[int, float]]

Number of samples in each resampling. By default None, which uses the size of the training set. If it is a float between 0 and 1, it is interpreted as the fraction of the training set to draw.

replace: bool

Whether to draw samples with replacement in each resampling. By default True.

random_state: Optional[Union[int, RandomState]]

Seed (int) or RandomState instance used to make the resamplings reproducible. By default None.
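The way n_samples is described above can be sketched as a small helper. This is only an illustration: resolve_n_samples is a hypothetical name, not part of MAPIE, and rounding the fraction down is an assumption rather than documented behavior.

```python
from typing import Optional, Union

import numpy as np


def resolve_n_samples(n_samples: Optional[Union[int, float]], n_train: int) -> int:
    """Hypothetical helper: turn the n_samples argument into a sample count."""
    if n_samples is None:
        # Default: draw as many samples as the training set contains.
        return n_train
    if isinstance(n_samples, float):
        if not 0 < n_samples < 1:
            raise ValueError("a float n_samples must lie strictly between 0 and 1")
        # Assumption: the fraction is rounded down to a whole number of samples.
        return int(np.floor(n_samples * n_train))
    # An integer is taken as the sample count itself.
    return int(n_samples)
```

Under these assumptions, a training set of 10 rows with n_samples=0.5 would give 5 samples per resampling, while n_samples=None would give 10.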

Examples

>>> import numpy as np
>>> from mapie.subsample import Subsample
>>> cv = Subsample(n_resamplings=2, random_state=0)
>>> X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> for train_index, test_index in cv.split(X):
...    print(f"train index is {train_index}, test index is {test_index}")
train index is [5 0 3 3 7 9 3 5 2 4], test index is [1 6 8]
train index is [7 6 8 8 1 6 7 7 8 1], test index is [0 2 3 4 5 9]
__init__(n_resamplings: int = 30, n_samples: Optional[Union[int, float]] = None, replace: bool = True, random_state: Optional[Union[int, RandomState]] = None) -> None
get_n_splits(*args: Any, **kargs: Any) -> int

Returns the number of splitting iterations in the cross-validator.

Returns
int

The number of splitting iterations, i.e. n_resamplings.

split(X: ndarray[Any, dtype[_ScalarType_co]], *args: Any, **kargs: Any) -> Generator[Tuple[ndarray[Any, dtype[_ScalarType_co]], ndarray[Any, dtype[_ScalarType_co]]], None, None]

Generate indices to split data into training and test sets.

Parameters
X: NDArray of shape (n_samples, n_features)

Training data.

Yields
train: NDArray of shape (n_indices_training,)

The training set indices for that split.

test: NDArray of shape (n_indices_test,)

The testing set indices for that split.
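The splitting behavior can be approximated with plain NumPy. The sketch below is not MAPIE's implementation: bootstrap_split and its seed argument are hypothetical names, and it assumes, as the example output above suggests, that the test indices are those never drawn into the training set. It uses NumPy's default_rng, so the indices it produces will not match those of Subsample for the same seed.

```python
from typing import Iterator, Optional, Tuple

import numpy as np


def bootstrap_split(
    n_total: int,
    n_resamplings: int = 30,
    n_samples: Optional[int] = None,
    replace: bool = True,
    seed: Optional[int] = None,
) -> Iterator[Tuple[np.ndarray, np.ndarray]]:
    """Yield (train, test) index arrays; hypothetical helper, not MAPIE code."""
    rng = np.random.default_rng(seed)
    all_indices = np.arange(n_total)
    size = n_total if n_samples is None else n_samples
    for _ in range(n_resamplings):
        # Draw the training indices, with or without replacement.
        train = rng.choice(all_indices, size=size, replace=replace)
        # Test indices are those never drawn into the training set.
        test = np.setdiff1d(all_indices, train)
        yield train, test


splits = list(bootstrap_split(n_total=10, n_resamplings=3, seed=0))
for train, test in splits:
    print(f"train index is {train}, test index is {test}")
```

With replace=True some indices typically appear several times in train, which is why the test set is usually non-empty even when n_samples equals the size of the training set.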