autots.evaluator package

Submodules

autots.evaluator.anomaly_detector module

autots.evaluator.auto_model module

autots.evaluator.auto_ts module

autots.evaluator.benchmark module

autots.evaluator.event_forecasting module

autots.evaluator.feature_detector module

autots.evaluator.metrics module

Tools for calculating forecast errors.

Some common args:

  • A or actual (np.array) – actuals, ndim 2 (timesteps, series)

  • F or forecast (np.array) – forecast values, ndim 2 (timesteps, series)

  • ae (np.array) – precalculated np.abs(A - F)
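
For orientation, a minimal sketch of these conventions (shapes and values are illustrative only):

    import numpy as np

    # 5 timesteps, 3 series: rows are forecast steps, columns are series
    A = np.random.rand(5, 3)   # actuals
    F = np.random.rand(5, 3)   # forecasts
    ae = np.abs(A - F)         # precalculated absolute error, reused by mae, medae, smape, etc.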

autots.evaluator.metrics.array_last_val(arr)
autots.evaluator.metrics.chi_squared_hist_distribution_loss(F, A, bins='auto', plot=False)

Distribution loss, chi-squared distance from histograms.

autots.evaluator.metrics.containment(lower_forecast, upper_forecast, actual)

Expects two 2-D numpy arrays of shape (forecast_length, n_series).

Returns a 1-D array of results of length n_series.

Parameters:
  • lower_forecast (numpy.array) – lower bound forecast values

  • upper_forecast (numpy.array) – upper bound forecast values

  • actual (numpy.array) – known true values
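
A minimal sketch of the idea, assuming containment is the per-series fraction of actual values falling inside the forecast interval (not necessarily the exact implementation):

    import numpy as np

    def containment_sketch(lower_forecast, upper_forecast, actual):
        # fraction of timesteps, per series, where the actual lies within the bounds
        inside = (actual >= lower_forecast) & (actual <= upper_forecast)
        return inside.mean(axis=0)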

autots.evaluator.metrics.contour(A, F)

A measure of how well the actual and forecast follow the same pattern of change. Note: if actual values are unchanging, they are counted as matching forecasts that change positively. This keeps the calculation fast, and if the actuals are a flat line, contour probably isn't a meaningful concern anyway.

# bluff tops follow the shape of the river below, at different elevation

Expects two 2-D numpy arrays of shape (forecast_length, n_series). Returns a 1-D array of results of length n_series.

NaN diffs are filled with 0, which is essentially equivalent to assuming a forward fill of NaNs.

The last row of history is concatenated onto the head of both A and F (required for forecast_length of 1).

Parameters:
  • A (numpy.array) – known true values

  • F (numpy.array) – predicted values
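
A hedged sketch of the direction-of-change comparison (illustrative only; the library version also prepends the last historical row and handles NaNs as described above):

    import numpy as np

    def contour_sketch(A, F):
        # agreement of direction of change between actuals and forecasts, per series
        dA = np.nan_to_num(np.diff(A, axis=0))
        dF = np.nan_to_num(np.diff(F, axis=0))
        # flat actuals (dA == 0) count as agreeing with rising forecasts, per the note above
        agree = (dA >= 0) == (dF > 0)
        return agree.mean(axis=0)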

autots.evaluator.metrics.default_scaler(df_train)
autots.evaluator.metrics.dwae(A, F, last_of_array)

Directional Weighted Absolute Error, the accuracy of growth or decline relative to the most recent data.

autots.evaluator.metrics.full_metric_evaluation(A, F, upper_forecast, lower_forecast, df_train, prediction_interval, columns=None, scaler=None, return_components=False, cumsum_A=None, diff_A=None, last_of_array=None, custom_metric=None, **kwargs)

Create a pd.DataFrame of metrics per series given actuals, forecasts, and precalculated errors. Some extra args are precomputed values passed in for efficiency inside loops and can normally be left as defaults.

Parameters:
  • A (np.array) – array or df of actuals

  • F (np.array) – array or df of forecasts

  • return_components (bool) – if True, return tuple of detailed errors

  • custom_metric (callable) – a function generating a custom metric. Expects func(A, F, df_train, prediction_interval), where the first three are 2-D wide-style np.arrays.
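
A hypothetical custom_metric illustrating the expected signature (the function name and metric choice are invented for illustration):

    import numpy as np

    def my_custom_metric(A, F, df_train, prediction_interval):
        # example: mean absolute error on the final forecast step only, per series
        return np.abs(A[-1] - F[-1])

It would then be passed as full_metric_evaluation(..., custom_metric=my_custom_metric).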

autots.evaluator.metrics.kde(actuals, forecasts, bandwidth, x)
autots.evaluator.metrics.kde_kl_distance(F, A, bandwidth=0.5, x=None)

Distribution loss by means of KDE and KL Divergence.

autots.evaluator.metrics.kl_divergence(p, q, epsilon=1e-10)

Compute KL Divergence between two distributions.
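
Presumably the standard discrete form with epsilon guarding zeros; a sketch (axis handling in the library version may differ):

    import numpy as np

    def kl_divergence_sketch(p, q, epsilon=1e-10):
        # D_KL(p || q) = sum(p * log(p / q)), with epsilon added to avoid log(0)
        p = np.asarray(p, dtype=float) + epsilon
        q = np.asarray(q, dtype=float) + epsilon
        return np.sum(p * np.log(p / q))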

autots.evaluator.metrics.linearity(arr)

Score the percentage of an np.array showing linear progression along the index (0) axis.

autots.evaluator.metrics.mae(ae)

Accepts the absolute error already calculated.

autots.evaluator.metrics.mda(A, F)

A measure of how well the actual and forecast follow the same pattern of change. Expects two 2-D numpy arrays of shape (forecast_length, n_series). Returns a 1-D array of results of length n_series.

NaN diffs are filled with 0, which is essentially equivalent to assuming a forward fill of NaNs.

The last row of history is concatenated onto the head of both A and F (required for forecast_length of 1).

Parameters:
  • A (numpy.array) – known true values

  • F (numpy.array) – predicted values

autots.evaluator.metrics.mean_absolute_differential_error(A, F, order: int = 1, df_train=None, scaler=None)

Expects two 2-D numpy arrays of shape (forecast_length, n_series).

Returns a 1-D array of results of length n_series.

Parameters:
  • A (numpy.array) – known true values

  • F (numpy.array) – predicted values

  • order (int) – order of differential

  • df_train (np.array) – if provided, used as the starting point for the first diff step. Its tail(1) must be the most recent historical point before the forecast. Must be a numpy array, not a DataFrame. Highly recommended if using this as the sole optimization metric; without it, this is an “unanchored” shape-fitting metric. Providing it also allows this metric to work on forecast_length = 1 forecasts.

  • scaler (np.array) – if provided, metrics are scaled by this. 1d array of shape (num_series,)
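
A minimal sketch of the first-order case without df_train or scaler (illustrative only; the library version handles higher orders, the df_train anchor, and scaling):

    import numpy as np

    def made_sketch(A, F, order=1):
        # mean absolute error of the order-th difference, per series
        dA = np.diff(A, n=order, axis=0)
        dF = np.diff(F, n=order, axis=0)
        return np.mean(np.abs(dA - dF), axis=0)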

autots.evaluator.metrics.mean_absolute_error(A, F)

Expects two 2-D numpy arrays of shape (forecast_length, n_series).

Returns a 1-D array of results of length n_series.

Parameters:
  • A (numpy.array) – known true values

  • F (numpy.array) – predicted values

autots.evaluator.metrics.medae(ae, nan_flag=True)

Accepts the absolute error already calculated.

autots.evaluator.metrics.median_absolute_error(A, F)

Expects two 2-D numpy arrays of shape (forecast_length, n_series).

Returns a 1-D array of results of length n_series.

Parameters:
  • A (numpy.array) – known true values

  • F (numpy.array) – predicted values

autots.evaluator.metrics.mlvb(A, F, last_of_array)

Mean last value baseline: the percent difference of the forecast vs. a last-value naive forecast. Does poorly with near-zero values.

Parameters:
  • A (np.array) – actuals

  • F (np.array) – forecast values

  • last_of_array (np.array) – the last row of the historic training data, most recent values

autots.evaluator.metrics.mqae(ae, q=0.85, nan_flag=True)

Return the mean of the errors below the q quantile of the errors, per series. np.nans count as the largest values and so are removed as part of the > q group.
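
A sketch of the trimmed-mean idea (illustrative only; NaN handling in the library version may differ):

    import numpy as np

    def mqae_sketch(ae, q=0.85):
        # mean of the errors at or below the q quantile of each series (column)
        cutoff = np.nanquantile(ae, q, axis=0)
        trimmed = np.where(ae <= cutoff, ae, np.nan)
        return np.nanmean(trimmed, axis=0)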

autots.evaluator.metrics.msle(full_errors, ae, le, nan_flag=True)

Input is an array of y_pred - y_true to over-penalize underestimates; pass y_true - y_pred instead to over-penalize overestimates. AE is used here for the log only to avoid divide-by-zero warnings (those values aren't used either way).

autots.evaluator.metrics.numpy_ffill(arr)

Fill np.nan forward down the zero axis.

autots.evaluator.metrics.oda(A, F, last_of_array)

Origin Directional Accuracy, the accuracy of growth or decline relative to most recent data.
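
Presumably this checks whether actuals and forecasts move in the same direction relative to the last known value; a hedged sketch:

    import numpy as np

    def oda_sketch(A, F, last_of_array):
        # per series: share of timesteps where A and F sit on the same side of the last known value
        same_direction = np.sign(A - last_of_array) == np.sign(F - last_of_array)
        return same_direction.mean(axis=0)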

autots.evaluator.metrics.pinball_loss(A, F, quantile)

Pinball (quantile) loss; a larger value is worse.
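
The standard quantile (pinball) loss, as a per-series sketch (the library version may differ in NaN handling):

    import numpy as np

    def pinball_sketch(A, F, quantile):
        # underestimates are weighted by quantile, overestimates by (1 - quantile)
        diff = A - F
        loss = np.where(diff >= 0, quantile * diff, (quantile - 1) * diff)
        return np.nanmean(loss, axis=0)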

autots.evaluator.metrics.precomp_wasserstein(F, cumsum_A)
autots.evaluator.metrics.qae(ae, q=0.9, nan_flag=True)

Return the q quantile of the errors per series. np.nans count as smallest values and will push more values into the exclusion group.

autots.evaluator.metrics.rmse(sqe)

Accepts the squared error already calculated.

autots.evaluator.metrics.root_mean_square_error(actual, forecast)

Expects two 2-D numpy arrays of shape (forecast_length, n_series).

Returns a 1-D array of results of length n_series.

Parameters:
  • actual (numpy.array) – known true values

  • forecast (numpy.array) – predicted values

autots.evaluator.metrics.rps(predictions, observed)

Vectorized version of Ranked Probability Score. A lower value is a better score. From: Colin Catlin, https://syllepsis.live/2022/01/22/ranked-probability-score-in-python/

Parameters:
  • predictions (pd.DataFrame) – each column is an outcome category, with values as the 0 to 1 probability of that category

  • observed (pd.DataFrame) – each column is an outcome category, with values of 0 OR 1 with 1 being that category occurred
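
A hedged sketch of the cumulative-probability comparison behind RPS (the linked post and the library version may also normalize by the number of categories minus one):

    import pandas as pd

    def rps_sketch(predictions: pd.DataFrame, observed: pd.DataFrame) -> pd.Series:
        # squared differences of cumulative probabilities across ordered outcome categories
        cum_pred = predictions.cumsum(axis=1)
        cum_obs = observed.cumsum(axis=1)
        return ((cum_pred - cum_obs) ** 2).sum(axis=1)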

autots.evaluator.metrics.scaled_pinball_loss(A, F, df_train, quantile)

Scaled pinball loss.

Parameters:
  • A (np.array) – actual values

  • F (np.array) – forecast values

  • df_train (np.array) – values of historic data for scaling

  • quantile (float) – which bound of upper/lower forecast this is

autots.evaluator.metrics.smape(actual, forecast, ae, nan_flag=True)

Accepts the absolute error already calculated.

autots.evaluator.metrics.smoothness(arr)

A gradient measure of linearity, where 0 is linear and larger values are more volatile.

autots.evaluator.metrics.spl(precomputed_spl, scaler)

Accepts most of the calculation already precomputed.

autots.evaluator.metrics.symmetric_mean_absolute_percentage_error(actual, forecast)

Expects two 2-D numpy arrays of shape (forecast_length, n_series). Allows NaN in actuals, and corresponding NaN in the forecast, but not unmatched NaN in the forecast. Also performs poorly with zeroes in either the forecast or the actuals: these produce a poor error value even when the forecast is accurate.

Returns a 1-D array of results of length n_series.

Parameters:
  • actual (numpy.array) – known true values

  • forecast (numpy.array) – predicted values

References

https://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error
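
One common SMAPE formulation as a per-series sketch (the exact scaling and NaN handling used by the library may differ):

    import numpy as np

    def smape_sketch(actual, forecast):
        # 200 * mean(|A - F| / (|A| + |F|)); both-zero timesteps divide by zero,
        # which is the weakness with zero values noted above
        denom = np.abs(actual) + np.abs(forecast)
        return 200 * np.nanmean(np.abs(actual - forecast) / denom, axis=0)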

autots.evaluator.metrics.threshold_loss(actual, forecast, threshold, penalty_threshold=None)

Run once for overestimates and again for underestimates; add the two for a combined view (see the usage sketch below).

Parameters:
  • actual/forecast – 2D wide style data DataFrame or np.array

  • threshold (float) – in the range (0, 2); e.g. 0.9 penalizes underestimates of 10% or more, while 1.1 penalizes overestimates of more than 10%

  • penalty_threshold – defaults to the same value as threshold; adjust to change the strength of the penalty
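
A usage sketch following the description above (data values are illustrative):

    import numpy as np
    from autots.evaluator.metrics import threshold_loss

    actual = np.random.rand(5, 3)
    forecast = np.random.rand(5, 3)

    # penalize underestimates of 10% or more, then overestimates of more than 10%, and combine
    under_loss = threshold_loss(actual, forecast, threshold=0.9)
    over_loss = threshold_loss(actual, forecast, threshold=1.1)
    combined = under_loss + over_loss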

autots.evaluator.metrics.unsorted_wasserstein(F, A)

Also known as earth mover's distance.

autots.evaluator.metrics.wasserstein(F, A)

This version sorts the values first, which on average is perhaps less relevant than the unsorted version.
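
For reference, the textbook one-dimensional Wasserstein (earth mover) distance between two equally sized samples is the mean absolute difference of their sorted values; a sketch of that formulation, per series (the AutoTS implementations may differ in detail, and the unsorted variant skips the sort):

    import numpy as np

    def wasserstein_1d_sketch(F, A):
        # classic 1-D earth mover distance per series: compare sorted samples
        return np.nanmean(np.abs(np.sort(F, axis=0) - np.sort(A, axis=0)), axis=0)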

autots.evaluator.validation module

Module contents

Model Evaluators