mapie.metrics.regression.coverage_width_based
- mapie.metrics.regression.coverage_width_based(y_true: ArrayLike, y_pred_low: ArrayLike, y_pred_up: ArrayLike, eta: float, confidence_level: float) → float [source]
Coverage Width-based Criterion (CWC) obtained by the prediction intervals.
The effective coverage score is a criterion used to evaluate the quality of prediction intervals (PIs) based on their coverage and width.
Khosravi, Abbas, Saeid Nahavandi, and Doug Creighton. “Construction of optimal prediction intervals for load forecasting problems.” IEEE Transactions on Power Systems 25.3 (2010): 1496-1503.
- Parameters
- y_true: ArrayLike
Observed (true) target values.
- y_pred_low: ArrayLike
Lower bounds of the prediction intervals.
- y_pred_up: ArrayLike
Upper bounds of the prediction intervals.
- eta: float
A user-defined parameter that balances the contributions of the Mean Width Score and the Coverage score in the CWC calculation.
- confidence_level: float
A user-defined parameter representing the designed confidence level of the prediction intervals.
- Returns
- float
Effective coverage score (CWC) obtained by the prediction intervals.
Notes
The effective coverage score (CWC) is calculated using the following formula: CWC = (1 - Mean Width Score) * exp(-eta * (Coverage score - (1 - alpha))**2), where the Coverage score is the estimated fraction of true labels that lie within the prediction intervals, the Mean Width Score is the normalized average width of the prediction intervals, and (1 - alpha) is the designed confidence level (the confidence_level argument).
The CWC penalizes under- and overcoverage in the same way and summarizes the quality of the prediction intervals in a single value.
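As an illustration of this formula, the CWC can be recomputed directly with NumPy from the empirical coverage and a normalized average width. This is a minimal sketch, not the library's internal implementation; in particular, normalizing the average width by the range of y_true is an assumption chosen here so that the sketch reproduces the documented example below.

>>> import numpy as np
>>> y_true = np.array([5, 7.5, 9.5, 10.5, 12.5])
>>> y_pred_low = np.array([4, 6, 9, 8.5, 10.5])
>>> y_pred_up = np.array([6, 9, 10, 12.5, 12])
>>> eta, confidence_level = 0.01, 0.9
>>> # Coverage score: fraction of true values inside their interval
>>> coverage = np.mean((y_true >= y_pred_low) & (y_true <= y_pred_up))
>>> # Mean Width Score: average interval width, normalized by the range of y_true
>>> mean_width = np.mean(y_pred_up - y_pred_low) / (y_true.max() - y_true.min())
>>> cwc = (1 - mean_width) * np.exp(-eta * (coverage - confidence_level) ** 2)
>>> print(np.round(cwc, 2))
0.69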
High Eta (Large Positive Value):
When eta is a large positive value, the exponential term np.exp(-eta * (Coverage score - (1 - alpha))**2) declines sharply as the Coverage score deviates from (1 - alpha). Deviations from the designed coverage are therefore heavily penalized: only prediction intervals whose empirical coverage is close to the target keep a high CWC, and among those, narrower intervals (lower Mean Width Score) score higher. The practical effect is a strong incentive to hit the target coverage before optimizing interval width.
Low Eta (Small Positive Value):
When eta is a small positive value, the exponential term is much flatter, so deviations of the Coverage score from (1 - alpha) are only mildly penalized. The CWC is then driven mostly by the Mean Width Score, i.e. by how narrow the prediction intervals are, and the exact balance between coverage and width depends on the specific value of eta.
Negative Eta (Any Negative Value):
When eta is negative, the exponential term np.exp(-eta * (Coverage score - (1 - alpha))**2) becomes larger than 1 and grows as the Coverage score deviates from (1 - alpha). Deviations from the designed coverage are then rewarded rather than penalized, which inverts the intended behaviour of the criterion; negative values of eta should therefore be used with caution, if at all.
Null Eta (Eta = 0):
When eta is zero, the exponential term equals 1 and the CWC reduces to (1 - Mean Width Score), i.e. one minus the normalized average width of the prediction intervals. The coverage no longer contributes, and the score depends only on the size of the intervals. The sketch after these notes illustrates numerically how the penalty term varies with eta.
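As a quick numerical illustration (a hypothetical sketch, not part of the library), the behaviour described above can be seen by evaluating only the exponential penalty term for a fixed coverage miss of 0.05 (empirical coverage 0.85 against a target confidence_level of 0.90):

>>> import numpy as np
>>> coverage, confidence_level = 0.85, 0.90
>>> for eta in (0, 10, 100, 1000, -100):
...     print(eta, np.round(np.exp(-eta * (coverage - confidence_level) ** 2), 4))
0 1.0
10 0.9753
100 0.7788
1000 0.0821
-100 1.284

For the same coverage miss, eta = 0 leaves the width term untouched, large positive values of eta shrink the score sharply, and a negative eta inflates it.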
Examples
>>> from mapie.metrics.regression import coverage_width_based
>>> import numpy as np
>>> y_true = np.array([5, 7.5, 9.5, 10.5, 12.5])
>>> y_preds_low = np.array([4, 6, 9, 8.5, 10.5])
>>> y_preds_up = np.array([6, 9, 10, 12.5, 12])
>>> eta = 0.01
>>> confidence_level = 0.9
>>> cwb = coverage_width_based(
...     y_true, y_preds_low, y_preds_up, eta, confidence_level
... )
>>> print(np.round(cwb, 2))
0.69