autots.evaluator package¶
Submodules¶
autots.evaluator.anomaly_detector module¶
autots.evaluator.auto_model module¶
autots.evaluator.auto_ts module¶
autots.evaluator.benchmark module¶
autots.evaluator.event_forecasting module¶
autots.evaluator.feature_detector module¶
autots.evaluator.metrics module¶
Tools for calculating forecast errors.
- Some common args:
A or actual (np.array): actuals, ndim 2 (timesteps, series)
F or forecast (np.array): forecast values, ndim 2 (timesteps, series)
ae (np.array): precalculated np.abs(A - F)
- autots.evaluator.metrics.array_last_val(arr)¶
- autots.evaluator.metrics.chi_squared_hist_distribution_loss(F, A, bins='auto', plot=False)¶
Distribution loss, chi-squared distance from histograms.
- autots.evaluator.metrics.containment(lower_forecast, upper_forecast, actual)¶
Expects two 2-D numpy arrays of shape (forecast_length, n series).
Returns a 1-D array of results, one per series.
- Parameters:
actual (numpy.array) – known true values
forecast (numpy.array) – predicted values
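A minimal usage sketch with illustrative random data; shapes follow the documented (forecast_length, n series) convention:

    import numpy as np
    from autots.evaluator.metrics import containment

    # illustrative data: 30 forecast steps for 3 series
    actual = np.random.rand(30, 3)
    lower_forecast = actual - 0.1
    upper_forecast = actual + 0.1

    # one result per series, per the docstring
    result = containment(lower_forecast, upper_forecast, actual)
    print(result.shape)  # expected: (3,)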
- autots.evaluator.metrics.contour(A, F)¶
A measure of how well the actual and forecast follow the same pattern of change. Note: if actual values are unchanging, they will match positively changing forecasts. This keeps the calculation fast, and if the actuals are a flat line, contour usually isn't a meaningful concern anyway.
# bluff tops follow the shape of the river below, at different elevation
Expects two 2-D numpy arrays of shape (forecast_length, n series). Returns a 1-D array of results, one per series.
NaN diffs are filled with 0, essentially equivalent to assuming a forward fill of NaN.
The last row of history is concatenated to the head of both A and F (required for 1-step forecasts).
- Parameters:
A (numpy.array) – known true values
F (numpy.array) – predicted values
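A rough sketch of the idea, not the library implementation: compare the direction of change step by step after anchoring on the last historical row. The function name, the last_row argument, and the exact handling of flat actuals are assumptions here.

    import numpy as np

    def contour_sketch(A, F, last_row):
        # prepend the most recent historical values so the first forecast step has a diff
        A_ = np.concatenate([last_row[None, :], A], axis=0)
        F_ = np.concatenate([last_row[None, :], F], axis=0)
        dA = np.nan_to_num(np.diff(A_, axis=0))  # NaN diffs treated as 0 (forward fill)
        dF = np.nan_to_num(np.diff(F_, axis=0))
        # flat actuals (dA == 0) count as matching rising forecasts, per the note above
        return np.mean((dA >= 0) == (dF > 0), axis=0)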
- autots.evaluator.metrics.default_scaler(df_train)¶
- autots.evaluator.metrics.dwae(A, F, last_of_array)¶
Directional Weighted Absolute Error, the accuracy of growth or decline relative to the most recent data.
- autots.evaluator.metrics.full_metric_evaluation(A, F, upper_forecast, lower_forecast, df_train, prediction_interval, columns=None, scaler=None, return_components=False, cumsum_A=None, diff_A=None, last_of_array=None, custom_metric=None, **kwargs)¶
Create a pd.DataFrame of metrics per series given actuals, forecast, and precalculated errors. Several of the extra args are precomputed values passed in for efficiency inside loops and can usually be left at their defaults.
- Parameters:
A (np.array) – array or df of actuals
F (np.array) – array or df of forecasts
return_components (bool) – if True, return tuple of detailed errors
custom_metric (callable) – a function to generate a custom metric. Expects func(A, F, df_train, prediction_interval), where the first three are 2-D wide-style np arrays.
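A minimal usage sketch with random data; the precomputed args (cumsum_A, diff_A, last_of_array) and custom_metric are left at their defaults, and whether plain numpy arrays are accepted for every argument may depend on the library version:

    import numpy as np
    from autots.evaluator.metrics import full_metric_evaluation

    horizon, n_series = 14, 3
    df_train = np.random.rand(100, n_series)        # history, most recent row last
    A = np.random.rand(horizon, n_series)           # actuals over the forecast window
    F = A + np.random.normal(0, 0.05, A.shape)      # point forecast
    upper, lower = F + 0.2, F - 0.2

    per_series = full_metric_evaluation(
        A, F, upper_forecast=upper, lower_forecast=lower,
        df_train=df_train, prediction_interval=0.9,
    )
    print(per_series)  # pd.DataFrame of metrics per series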
- autots.evaluator.metrics.kde(actuals, forecasts, bandwidth, x)¶
- autots.evaluator.metrics.kde_kl_distance(F, A, bandwidth=0.5, x=None)¶
Distribution loss by means of KDE and KL Divergence.
- autots.evaluator.metrics.kl_divergence(p, q, epsilon=1e-10)¶
Compute KL Divergence between two distributions.
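The conventional discrete form that the signature suggests, with epsilon guarding the log against zeros; a sketch, not necessarily the exact library code:

    import numpy as np

    def kl_divergence_sketch(p, q, epsilon=1e-10):
        p = np.asarray(p, dtype=float) + epsilon
        q = np.asarray(q, dtype=float) + epsilon
        # D_KL(p || q) = sum_i p_i * log(p_i / q_i)
        return np.sum(p * np.log(p / q))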
- autots.evaluator.metrics.linearity(arr)¶
Score the percentage of an np.array with linear progression, along the index (0) axis.
- autots.evaluator.metrics.mae(ae)¶
Accepting abs error already calculated
- autots.evaluator.metrics.mda(A, F)¶
A measure of how well the actual and forecast follow the same pattern of change. Expects two 2-D numpy arrays of shape (forecast_length, n series). Returns a 1-D array of results, one per series.
NaN diffs are filled with 0, essentially equivalent to assuming a forward fill of NaN.
The last row of history is concatenated to the head of both A and F (required for 1-step forecasts).
- Parameters:
A (numpy.array) – known true values
F (numpy.array) – predicted values
- autots.evaluator.metrics.mean_absolute_differential_error(A, F, order: int = 1, df_train=None, scaler=None)¶
Expects two 2-D numpy arrays of shape (forecast_length, n series).
Returns a 1-D array of results, one per series.
- Parameters:
A (numpy.array) – known true values
F (numpy.array) – predicted values
order (int) – order of differential
df_train (np.array) – if provided, used as the starting point for the first diff step; tail(1) must be the most recent historical point before the forecast. Must be a numpy array, not a DataFrame. Highly recommended if using this as the sole optimization metric; without it, this is an “unanchored” shape-fitting metric. Providing it also allows this to work on forecast_length = 1 forecasts.
scaler (np.array) – if provided, metrics are scaled by this. 1d array of shape (num_series,)
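A sketch of the described calculation, assuming the metric is the mean absolute error of the order-n differenced series; made_sketch and its last_row argument (the tail(1) of df_train) are illustrative names, not the library API:

    import numpy as np

    def made_sketch(A, F, order=1, last_row=None, scaler=None):
        if last_row is not None:
            # anchor the first diff step on the most recent historical values
            A = np.concatenate([last_row[None, :], A], axis=0)
            F = np.concatenate([last_row[None, :], F], axis=0)
        err = np.mean(
            np.abs(np.diff(A, n=order, axis=0) - np.diff(F, n=order, axis=0)), axis=0
        )
        return err if scaler is None else err / scaler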
- autots.evaluator.metrics.mean_absolute_error(A, F)¶
Expects two 2-D numpy arrays of shape (forecast_length, n series).
Returns a 1-D array of results, one per series.
- Parameters:
A (numpy.array) – known true values
F (numpy.array) – predicted values
- autots.evaluator.metrics.medae(ae, nan_flag=True)¶
Accepting abs error already calculated
- autots.evaluator.metrics.median_absolute_error(A, F)¶
Expects two 2-D numpy arrays of shape (forecast_length, n series).
Returns a 1-D array of results, one per series.
- Parameters:
A (numpy.array) – known true values
F (numpy.array) – predicted values
- autots.evaluator.metrics.mlvb(A, F, last_of_array)¶
Mean last value baseline, the % difference of forecast vs last value naive forecast. Does poorly with near-zero values.
- Parameters:
A (np.array) – actuals
F (np.array) – forecast values
last_of_array (np.array) – the last row of the historic training data, most recent values
- autots.evaluator.metrics.mqae(ae, q=0.85, nan_flag=True)¶
Return the mean of errors below the q quantile of the errors, per series. np.nan values count as the largest values, and so are removed as part of the > q group.
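A sketch of the described trimming, under the assumption that “removed” means excluded from the mean; illustrative only, not the library implementation:

    import numpy as np

    def mqae_sketch(ae, q=0.85):
        # per-series cutoff at the q quantile of the absolute errors
        cutoff = np.nanquantile(ae, q, axis=0)
        # NaN and anything above the cutoff fall into the excluded (> q) group
        trimmed = np.where(ae <= cutoff, ae, np.nan)
        return np.nanmean(trimmed, axis=0)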
- autots.evaluator.metrics.msle(full_errors, ae, le, nan_flag=True)¶
Input is an array of y_pred - y_true in order to over-penalize underestimates; pass y_true - y_pred instead to over-penalize overestimates. AE is used inside the log only to avoid divide-by-zero warnings (those values aren’t used either way).
- autots.evaluator.metrics.numpy_ffill(arr)¶
Fill np.nan forward down the zero axis.
- autots.evaluator.metrics.oda(A, F, last_of_array)¶
Origin Directional Accuracy, the accuracy of growth or decline relative to the most recent data.
- autots.evaluator.metrics.pinball_loss(A, F, quantile)¶
Lower is better; larger values indicate worse forecasts.
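The conventional per-series pinball (quantile) loss for reference; a sketch, not necessarily the exact library code:

    import numpy as np

    def pinball_loss_sketch(A, F, quantile):
        diff = A - F
        # under-forecasts weighted by quantile, over-forecasts by (1 - quantile)
        return np.mean(np.where(diff >= 0, quantile * diff, (quantile - 1) * diff), axis=0)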
- autots.evaluator.metrics.precomp_wasserstein(F, cumsum_A)¶
- autots.evaluator.metrics.qae(ae, q=0.9, nan_flag=True)¶
Return the q quantile of the errors per series. np.nans count as smallest values and will push more values into the exclusion group.
- autots.evaluator.metrics.rmse(sqe)¶
Accepting squared error already calculated
- autots.evaluator.metrics.root_mean_square_error(actual, forecast)¶
Expects two 2-D numpy arrays of shape (forecast_length, n series).
Returns a 1-D array of results, one per series.
- Parameters:
actual (numpy.array) – known true values
forecast (numpy.array) – predicted values
- autots.evaluator.metrics.rps(predictions, observed)¶
Vectorized version of Ranked Probability Score. A lower value is a better score. From: Colin Catlin, https://syllepsis.live/2022/01/22/ranked-probability-score-in-python/
- Parameters:
predictions (pd.DataFrame) – each column is an outcome category, with values as the 0 to 1 probability of that category
observed (pd.DataFrame) – each column is an outcome category, with values of 0 OR 1 with 1 being that category occurred
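A sketch of the usual cumulative-probability form of RPS; normalization by the number of categories minus one is one common convention and may differ from the library’s:

    import numpy as np
    import pandas as pd

    def rps_sketch(predictions: pd.DataFrame, observed: pd.DataFrame):
        cum_pred = predictions.to_numpy().cumsum(axis=1)   # cumulative forecast probability
        cum_obs = observed.to_numpy().cumsum(axis=1)       # cumulative observed outcome (0/1)
        return ((cum_pred - cum_obs) ** 2).sum(axis=1) / (predictions.shape[1] - 1)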
- autots.evaluator.metrics.scaled_pinball_loss(A, F, df_train, quantile)¶
Scaled pinball loss.
- Parameters:
A (np.array) – actual values
F (np.array) – forecast values
df_train (np.array) – values of historic data for scaling
quantile (float) – which bound of upper/lower forecast this is
- autots.evaluator.metrics.smape(actual, forecast, ae, nan_flag=True)¶
Accepting abs error already calculated
- autots.evaluator.metrics.smoothness(arr)¶
A gradient measure of linearity, where 0 is linear and larger values are more volatile.
- autots.evaluator.metrics.spl(precomputed_spl, scaler)¶
Accepting most of it already calculated
- autots.evaluator.metrics.symmetric_mean_absolute_percentage_error(actual, forecast)¶
Expects two 2-D numpy arrays of shape (forecast_length, n series). Allows NaN in actuals and corresponding NaN in forecast, but not unmatched NaN in the forecast. Also doesn’t handle zeroes well in either forecast or actual: these produce a poor error value even if the forecast is accurate.
Returns a 1-D array of results, one per series.
- Parameters:
actual (numpy.array) – known true values
forecast (numpy.array) – predicted values
References
https://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error
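The conventional definition, expressed in percent; a sketch that mirrors the zero-denominator caveat above (the library’s scaling and NaN handling may differ):

    import numpy as np

    def smape_sketch(actual, forecast):
        denom = np.abs(actual) + np.abs(forecast)
        denom = np.where(denom == 0, np.nan, denom)  # zeros in both arrays are undefined
        return np.nanmean(200 * np.abs(actual - forecast) / denom, axis=0)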
- autots.evaluator.metrics.threshold_loss(actual, forecast, threshold, penalty_threshold=None)¶
Run once for overestimates, then again for underestimates; add both for a combined view.
- Parameters:
actual/forecast – 2D wide style data DataFrame or np.array
threshold – a value in (0, 2); e.g., 0.9 penalizes underestimates of 10% or more, and 1.1 penalizes overestimates over 10%
penalty_threshold – defaults to the same value as threshold; adjust to change the strength of the penalty
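A minimal usage sketch following the two-pass pattern described above (random data; keyword names per the documented signature):

    import numpy as np
    from autots.evaluator.metrics import threshold_loss

    actual = np.random.rand(30, 3) + 1.0
    forecast = actual * np.random.uniform(0.8, 1.2, actual.shape)

    under = threshold_loss(actual, forecast, threshold=0.9)   # penalize 10%+ underestimates
    over = threshold_loss(actual, forecast, threshold=1.1)    # penalize 10%+ overestimates
    combined = under + over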
- autots.evaluator.metrics.unsorted_wasserstein(F, A)¶
Also known as earth mover’s distance.
- autots.evaluator.metrics.wasserstein(F, A)¶
This version sorts the values first, which is perhaps less relevant on average than the unsorted version.
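A sketch of the sorted vs. unsorted relationship suggested by precomp_wasserstein’s cumsum_A argument; these helper names and the exact normalization are assumptions, not the library implementation:

    import numpy as np

    def unsorted_wasserstein_sketch(F, A):
        # mean absolute gap between running (cumulative) sums, per series
        return np.mean(np.abs(np.cumsum(F, axis=0) - np.cumsum(A, axis=0)), axis=0)

    def wasserstein_sketch(F, A):
        # same idea, but on values sorted within each series first
        return unsorted_wasserstein_sketch(np.sort(F, axis=0), np.sort(A, axis=0))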
autots.evaluator.validation module¶
Module contents¶
Model Evaluators