mapie.subsample.BlockBootstrap

class mapie.subsample.BlockBootstrap(n_resamplings: int = 30, length: Optional[int] = None, n_blocks: Optional[int] = None, overlapping: bool = False, random_state: Optional[Union[int, RandomState]] = None)

Cross-validator that block-bootstraps the training set. It can be passed as the cv argument of the MapieRegressor class in place of KFold, LeaveOneOut, or Subsample.

Parameters
n_resamplings: int

Number of resamplings. By default 30.

length: Optional[int]

Length of each block. By default None, in which case it is set to the length of the training set divided by n_blocks.

overlapping: bool

Whether the blocks can overlap or not. By default False.

n_blocks: Optional[int]

Number of blocks in each resampling. By default None, in which case it is set to the size of the training set divided by length.

random_state: Optional[Union[int, RandomState]]

int or RandomState instance controlling the random draws. By default None.

Raises
ValueError

If both length and n_blocks are None.
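
To make the overlapping parameter concrete, here is a minimal pure-Python sketch of how candidate blocks could be built in each mode. It is not MAPIE's implementation: the helper name make_blocks is hypothetical, and both the stride-1 sliding windows and the skipped leading remainder are assumptions.

```python
from typing import List


def make_blocks(n_samples: int, length: int, overlapping: bool) -> List[List[int]]:
    """Build candidate blocks of consecutive indices (illustrative only)."""
    if overlapping:
        # Assumption: every window of `length` consecutive indices, stride 1.
        starts = range(0, n_samples - length + 1)
    else:
        # Assumption: disjoint chunks, skipping the leading remainder
        # so that all blocks have the same length.
        starts = range(n_samples % length, n_samples, length)
    return [list(range(s, s + length)) for s in starts]


print(make_blocks(6, 3, overlapping=False))  # [[0, 1, 2], [3, 4, 5]]
print(make_blocks(6, 3, overlapping=True))   # [[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5]]
```

Overlapping blocks give many more candidates to draw from, at the cost of stronger dependence between resamplings.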

Examples

>>> import numpy as np
>>> from mapie.subsample import BlockBootstrap
>>> cv = BlockBootstrap(n_resamplings=2, length=3, random_state=0)
>>> X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>> for train_index, test_index in cv.split(X):
...    print(f"train index is {train_index}, test index is {test_index}")
train index is [1 2 3 4 5 6 1 2 3 4 5 6], test index is [8 9 7]
train index is [4 5 6 7 8 9 1 2 3 7 8 9], test index is []
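
The resampling shown in the example above can be approximated with the standard library alone. The sketch below is not the library's code: block_bootstrap_indices is a hypothetical name, and the choices to skip the leading remainder and to draw one block more than fits exactly are assumptions inferred from the example output (10 samples, length 3, 12 training indices, index 0 never used).

```python
import random
from typing import Iterator, List, Tuple


def block_bootstrap_indices(
    n_samples: int,
    length: int,
    n_resamplings: int,
    seed: int = 0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; illustrative, not MAPIE's code.

    Assumes 0 < length <= n_samples.
    """
    # Assumption: skip the leading remainder so blocks have equal length.
    start = n_samples % length
    blocks = [
        list(range(s, s + length)) for s in range(start, n_samples, length)
    ]
    # Assumption: draw one block more than fits exactly in the data.
    n_blocks = n_samples // length + 1
    rng = random.Random(seed)
    for _ in range(n_resamplings):
        # Draw whole blocks with replacement to keep local ordering intact.
        drawn = [rng.choice(blocks) for _ in range(n_blocks)]
        train = [i for block in drawn for i in block]
        # Candidate indices that were never drawn form the test set.
        test = sorted(set(range(start, n_samples)) - set(train))
        yield train, test


for train, test in block_bootstrap_indices(10, length=3, n_resamplings=2):
    print(train, test)
```

The blocks drawn here will differ from the output above because a different random generator is used; only the structure of the splits is comparable.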
__init__(n_resamplings: int = 30, length: Optional[int] = None, n_blocks: Optional[int] = None, overlapping: bool = False, random_state: Optional[Union[int, RandomState]] = None) -> None

get_n_splits(*args: Any, **kargs: Any) -> int

Returns the number of splitting iterations in the cross-validator.

Returns
int

Number of splitting iterations in the cross-validator.

split(X: NDArray, *args: Any, **kargs: Any) -> Generator[Tuple[NDArray, NDArray], None, None]

Generate indices to split data into training and test sets.

Parameters
X: NDArray of shape (n_samples, n_features)

Training data.

Yields
train: NDArray of shape (n_indices_training,)

The training set indices for that split.

test: NDArray of shape (n_indices_test,)

The testing set indices for that split.

Raises
ValueError

If length is not positive or is greater than the size of the training set.
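
The condition behind this error can be written as a one-line check. The helper below is a hypothetical stand-in that mirrors the documented rule; the function name and the error message are assumptions, not the library's own.

```python
def validate_block_length(length: int, n_samples: int) -> None:
    """Hypothetical helper mirroring the documented check on `length`."""
    if length <= 0 or length > n_samples:
        raise ValueError(
            f"length must satisfy 0 < length <= {n_samples}, got {length}."
        )


validate_block_length(3, 10)  # valid: no exception is raised
for bad in (0, -2, 11):
    try:
        validate_block_length(bad, 10)
    except ValueError as err:
        print(err)
```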