autots.evaluator.feature_detector package

Submodules

autots.evaluator.feature_detector.detector module

TimeSeriesFeatureDetector - Main orchestrator class.

Composes functionality from component mixins for decomposition, seasonality, trend, holidays, anomalies, rescaling, and formatting.

class autots.evaluator.feature_detector.detector.TimeSeriesFeatureDetector(seasonality_params=None, rough_seasonality_params=None, holiday_params=None, anomaly_params=None, changepoint_params=None, level_shift_params=None, level_shift_validation=None, general_transformer_params=None, smoothing_window=None, standardize=True, detection_mode='multivariate', global_holiday_anomaly_suppression=True, extended_anomaly_params=None, event_dag_params=None, holiday_country=None, holiday_countries=None)

Bases: DecompositionMixin, SeasonalityMixin, TrendMixin, HolidayMixin, AnomalyMixin, ExtendedAnomalyMixin, RescalingMixin, FormattingMixin

Comprehensive feature detection pipeline for time series.

TODO:
  • Upstream more of this code into the component classes (e.g., HolidayDetector, AnomalyRemoval, ChangepointDetector).
  • Handle multiplicative seasonality.
  • Handle time-varying seasonality using fast_kalman.
  • Improve holiday “splash” effect and weekend interactions.
  • Support identifying regressor impacts and Granger lag impacts.
  • Build upon the JSON template so that it can be converted to a fixed-size embedding (probably a 2D embedding). The size may vary by parameters, but for a given parameter set it should always be the same. The embedding does not need to fully reconstruct the time series, only represent it.
  • Model the trend with a fast Kalman state-space approach, ideally aligned with changepoints where possible.
  • Consider “deviation from group” anomaly detection for multivariate series.
  • Improve anomaly typing in univariate mode (currently defaults to point_outlier) and incorporate detector scores into type confidence.
  • Detect and expose non-holiday regressor impacts (not just holiday coefficients), and persist them in template/features output.

Parameters

rough_seasonality_params : dict, optional
    Parameters for DatepartRegressionTransformer used in the initial rough seasonality decomposition (to improve holiday and anomaly detection).

holiday_params : dict, optional
    Parameters for HolidayDetector.

anomaly_params : dict, optional
    Parameters for AnomalyRemoval.

changepoint_params : dict, optional
    Parameters for ChangepointDetector.

level_shift_params : dict, optional
    Parameters for LevelShiftMagic.

level_shift_validation : dict, optional
    Validation parameters for level shifts.

general_transformer_params : dict, optional
    Parameters for GeneralTransformer applied before trend detection.

smoothing_window : int, optional
    Window size for smoothing before trend detection.

standardize : bool, default=True
    Whether to standardize series before processing.

detection_mode : str, default='multivariate'
    Controls whether detections are unique per series ('multivariate') or shared across all series ('univariate').
      • 'multivariate': each series gets unique anomalies, holidays, changepoints, and level shifts.
      • 'univariate': all series share common anomalies, holidays, changepoints, and level shifts (level shifts are detected on an aggregated signal and scaled appropriately per series).

global_holiday_anomaly_suppression : bool, default=True
    If True, anomaly detection suppresses holiday-proximate flags using a merged holiday date set from all series. Set False to disable this suppression.
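A minimal sketch of the expected input and a typical construction, assuming a wide-format DataFrame (DatetimeIndex rows, one column per series). The column names and data are invented for illustration; the detector calls are shown commented since they require autots to be installed:

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format input: DatetimeIndex rows, one column per series
idx = pd.date_range("2023-01-01", periods=365, freq="D")
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "sales": 100 + 5 * np.sin(2 * np.pi * np.arange(365) / 7) + rng.normal(0, 1, 365),
        "revenue": 200 + 0.1 * np.arange(365) + rng.normal(0, 2, 365),
    },
    index=idx,
)

# detector = TimeSeriesFeatureDetector(detection_mode="multivariate")
# detector.fit(df)
# features = detector.get_detected_features()
```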

TEMPLATE_VERSION = '1.2'
fit(df)

Fit the feature detector to time series data.

Decomposition follows this sequential removal strategy:

  1. INITIAL DECOMPOSITION (for detection only):
     • Remove rough seasonality → rough_residual
     • Detect holidays on rough_residual
     • Detect anomalies on rough_residual

  2. FINAL SEASONALITY FIT:
     • Fit on: original - anomalies
     • Holidays fitted simultaneously as regressors
     • Output: final_residual (seasonality and holidays removed)

  3. LEVEL SHIFT DETECTION:
     • Detect on: original - anomalies - seasonality - holidays (this is final_residual)

  4. TREND DETECTION:
     • Detect on: original - anomalies - seasonality - holidays - level_shifts

  5. NOISE & ANOMALY COMPONENTS:
     • Noise: original - trend - level_shifts - seasonality - holidays - anomalies
     • Anomalies: difference between the original and the de-anomalied version
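The removal order can be sketched in array form. This is a toy illustration of the arithmetic only (the components here are synthetic stand-ins for what each mixin would estimate, not the detector's actual code):

```python
import numpy as np

n = 200
t = np.arange(n)
rng = np.random.default_rng(1)

# Synthetic stand-in components
trend = 0.05 * t
seasonality = 2.0 * np.sin(2 * np.pi * t / 7)
holidays = np.zeros(n)
level_shifts = np.where(t >= 120, 3.0, 0.0)
anomalies = np.zeros(n)
anomalies[50] = 8.0
true_noise = rng.normal(0, 0.5, n)

original = trend + seasonality + holidays + level_shifts + anomalies + true_noise

# Step 2: final seasonality is fit on the de-anomalied series
deanomalied = original - anomalies
# Step 3: level shifts are detected on the final residual
final_residual = deanomalied - seasonality - holidays
# Step 4: trend is detected after level shifts are also removed
trend_input = final_residual - level_shifts
# Step 5: noise is what remains after every structured component
noise = original - trend - level_shifts - seasonality - holidays - anomalies
```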

forecast(forecast_length, frequency=None)

Generate a simple forward projection similar to BasicLinearModel. This detector is not optimized for forecasting; dedicated forecasting models may provide better results.

get_cleaned_data(series_name=None)

Return cleaned time series data with anomalies, noise, and level shifts removed.

The cleaned data consists of:
  • Trend (with mean included)
  • Seasonality
  • Holiday effects

Level shifts are corrected by removing the cumulative shift effect, returning the data to its baseline level. Anomalies and noise are excluded entirely.
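The level-shift correction amounts to subtracting the cumulative shift effect from each point onward. A minimal sketch, with hypothetical detected shifts given as (index, magnitude) pairs:

```python
import numpy as np

n = 100
baseline = np.full(n, 10.0)
series = baseline.copy()
series[40:] += 3.0   # level shift of +3 at t=40
series[70:] += -1.0  # second shift of -1 at t=70

# Hypothetical detected shifts: (index, magnitude) pairs
shifts = [(40, 3.0), (70, -1.0)]

# Remove the cumulative shift effect to restore the baseline level
cumulative = np.zeros(n)
for idx, mag in shifts:
    cumulative[idx:] += mag
corrected = series - cumulative
```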

Parameters:

series_name (str, optional) – If provided, return cleaned data for only this series. If None, return cleaned data for all series.

Returns:

Cleaned time series data with the same index as the original data. If series_name is specified, returns a DataFrame with a single column.

Return type:

pd.DataFrame

Raises:
  • RuntimeError – If fit() has not been called yet.

  • ValueError – If series_name is provided but not found in the original data.

Examples

>>> detector = TimeSeriesFeatureDetector()
>>> detector.fit(df)
>>> cleaned = detector.get_cleaned_data()
>>> cleaned_single = detector.get_cleaned_data('series_1')
get_detected_features(series_name=None, include_components=False, include_metadata=True)
get_event_dag(deep=True)

Return Event DAG metadata derived from detector outputs.

static get_new_params(method='random')

Sample random parameters for detector optimization.

get_template(deep=True)
plot(series_name=None, figsize=(16, 14), save_path=None, show=True, separate_noise_anomaly_panels=True, dual_axis_seasonality_holidays=True, dual_axis_trend_level_shift=True)
plot_event_dag(series=None, start_date=None, end_date=None, show_members=False, figsize=(14, 6), save_path=None, show=True)

Plot Event DAG macro-events on a timeline-first layout.

query_features(dates=None, series=None, include_components=False, include_metadata=False, include_event_dag=False, include_event_members=False, return_json=False)

Query a specific slice of detected features with minimal token usage.

Designed for LLM-friendly output with compact representation.

Parameters:
  • dates (str, datetime, list, slice) – Date(s) to query for features.
      - Single date: “2024-01-15” or a datetime object
      - Date range: slice(“2024-01-01”, “2024-01-31”)
      - List of dates: [“2024-01-15”, “2024-01-20”]
      - None: return all features (not filtered by date)

  • series (str, list) – Series name(s) to query.
      - Single series: “sales”
      - Multiple series: [“sales”, “revenue”]
      - None: all series

  • include_components (bool) – Include component time series values for the date range

  • include_metadata (bool) – Include metadata like noise levels, scales, etc.

  • include_event_dag (bool) – Include Event DAG cluster and family metadata

  • include_event_members (bool) – Include raw Event DAG member events

  • return_json (bool) – Return JSON string instead of dict

Returns:

Compact feature data including anomalies, changepoints,

level shifts, holidays, and optionally components

Return type:

dict or str

Examples

>>> # Get all features for one series
>>> detector.query_features(series="sales")
>>> # Get features occurring in a date range
>>> detector.query_features(
...     dates=slice("2024-01-01", "2024-01-31"),
...     series=["sales", "revenue"]
... )
>>> # Get components for specific dates
>>> detector.query_features(
...     dates=["2024-01-15", "2024-01-16"],
...     series="sales",
...     include_components=True
... )
classmethod render_template(template, return_components=False)

Render a feature detection template back into time series data.

summary()
tune_with_synthetic(real_df, n_synthetic_series=16, n_tune_iterations=25, n_detector_iterations=30, tune_seed=42, loss_params=None, loss_weights=None, synthetic_starting_params=None, starting_params=None, verbose=True)

Tune synthetic data to a real dataset, optimize detector params, and fit self.

After completion, this instance is fitted on real_df with the optimized detector parameters and stores optimization artifacts on the instance.

autots.evaluator.feature_detector.event_dag module

Event DAG utilities for TimeSeriesFeatureDetector.

autots.evaluator.feature_detector.event_dag.build_event_dag_from_detector(detector)

Build an Event DAG from detector public event outputs.

autots.evaluator.feature_detector.event_dag.empty_event_dag(params=None, detection_mode='multivariate', construction_mode='full', series_names=None)

Return a valid empty Event DAG container.

autots.evaluator.feature_detector.event_dag.resolve_event_dag_params(params=None)

Return normalized Event DAG params.

autots.evaluator.feature_detector.event_dag_view module

Event DAG filtering and plotting helpers.

autots.evaluator.feature_detector.event_dag_view.filter_event_dag(event_dag, series=None, start_date=None, end_date=None, include_members=True)

Filter an Event DAG view by series and date range.

autots.evaluator.feature_detector.event_dag_view.plot_event_dag_timeline(event_dag, series=None, start_date=None, end_date=None, show_members=False, figsize=(14, 6), save_path=None, show=True)

Render a timeline-first Event DAG view.

autots.evaluator.feature_detector.extended_anomaly module

ExtendedAnomalyDetector - Two-pass anomaly detector for multi-day patterns.

Pass 1: Point anomaly proposals via AnomalyRemoval. Pass 2: Extended/multi-day pattern detection:

  • CUSUM (sustained mean shift for noisy_burst / transient_change)

  • Cumulative-sum template matching (slope_reversion onset + hold + reversion)

  • Decay extension of pass-1 points (impulse_decay / linear_decay)

  • Segmented sliding-window mean shift (noisy_burst / transient_change)

Can be used standalone or embedded in TimeSeriesFeatureDetector via ExtendedAnomalyMixin.
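The decay-extension idea can be illustrated with a log-linear fit of an exponential decay tail and an R² acceptance check. This is a toy sketch of the principle under a noise-free assumption, not the detector's implementation:

```python
import numpy as np

# Toy decay tail after a point anomaly: y = A * exp(-t / tau)
t = np.arange(14)          # mirrors decay_lookahead=14
y = 5.0 * np.exp(-t / 3.0)

# Log-linear fit: log(y) = log(A) - t / tau (valid for positive y)
coef = np.polyfit(t, np.log(y), 1)
pred = np.exp(np.polyval(coef, t))

# R² of the decay template fit
ss_res = np.sum((y - pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
# A decay extension would only be accepted if r2 >= decay_fit_min_r2
```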

class autots.evaluator.feature_detector.extended_anomaly.ExtendedAnomalyDetector(point_anomaly_params=None, sustained_window=7, sustained_baseline=60, sustained_threshold=2.5, cusum_k=0.5, cusum_h=5.0, slope_reversion_min_hold=5, slope_reversion_min_reversion=7, slope_reversion_cumsum_threshold=3.0, slope_reversion_max_duration=84, decay_lookahead=14, decay_fit_min_r2=0.5, min_segment_run=2, sustained_hysteresis=0.7, segment_max_gap=1, merge_distance_days=3, max_anomalies_per_series=25)

Bases: object

Two-pass anomaly detector combining point detection with multi-day pattern detection.

Pass 1 produces point-level proposals via AnomalyRemoval (optional if pass1_records are provided externally).

Pass 2 runs four independent detection methods on a clean residual:

  • CUSUM: cumulative-sum alarm for sustained mean shifts.

  • Slope-reversion template: cumulative-sum peak/trough analysis for slow onset → hold → reversion patterns.

  • Decay extension: extends pass-1 point detections with exponential or linear decay tails.

  • Segmented mean shift: sliding-window mean comparison against a rolling baseline.

All per-series events are then merged and de-duplicated into a final list that preserves start_date, end_date, duration, type, magnitude, and score.

Parameters:
  • point_anomaly_params (dict, optional) – Keyword arguments forwarded to AnomalyRemoval for pass-1 point detection. If None a conservative rolling-zscore detector is used.

  • sustained_window (int) – Short rolling window (days) used by the segmented-shift and CUSUM detectors to compute local means.

  • sustained_baseline (int) – Longer window used to estimate the baseline mean and standard deviation.

  • sustained_threshold (float) – Standardized deviation threshold (in units of baseline σ) above which a window is considered anomalous.

  • cusum_k (float) – CUSUM allowance parameter (slack / half-width) in standardized units. Smaller values are more sensitive.

  • cusum_h (float) – CUSUM decision threshold. An alarm fires when the accumulated statistic exceeds this value.

  • slope_reversion_min_hold (int) – Minimum number of days that the cumulative sum must stay elevated before a slope-reversion event is flagged.

  • slope_reversion_min_reversion (int) – Minimum number of days the reversion phase must last.

  • slope_reversion_cumsum_threshold (float) – Minimum peak z-score of the cumulative sum (relative to its rolling σ) required to trigger a slope-reversion candidate.

  • slope_reversion_max_duration (int) – Maximum allowed duration (days) for slope-reversion events. Longer candidates are treated as structural drift and ignored.

  • decay_lookahead (int) – Number of days to inspect after a pass-1 peak for a decay tail.

  • decay_fit_min_r2 (float) – Minimum R² required for a decay template fit to extend a point event.

  • min_segment_run (int) – Minimum number of consecutive elevated windows required by the segmented-shift detector to form an event.

  • sustained_hysteresis (float) – Fraction of sustained_threshold used to keep an in-progress segmented run active after it starts. Values in (0, 1] reduce fragmentation from brief dips.

  • segment_max_gap (int) – Maximum number of non-flagged days allowed between two segmented runs before stitching them into one event.

  • merge_distance_days (int) – Events within this many days of each other (or overlapping) are merged.

  • max_anomalies_per_series (int) – Cap on the number of events returned per series.
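The roles of cusum_k and cusum_h can be shown with a one-sided upper CUSUM on a standardized residual. A deterministic toy sketch (the real detector standardizes against rolling sustained_baseline statistics; here a fixed pre-shift segment stands in):

```python
import numpy as np

# Deterministic residual: zero-mean oscillation, then a sustained +1.5 shift at t=60
x = np.tile([1.0, -1.0], 60).astype(float)  # 120 points, mean 0, std 1
x[60:] += 1.5

# Standardize against the pre-shift segment (stand-in for the rolling baseline)
z = (x - x[:60].mean()) / x[:60].std()

# One-sided upper CUSUM with allowance k and decision threshold h
k, h = 0.5, 5.0  # mirror the cusum_k / cusum_h defaults
s = 0.0
alarm_at = None
for t, zt in enumerate(z):
    s = max(0.0, s + zt - k)   # accumulate excess above the allowance
    if s > h and alarm_at is None:
        alarm_at = t           # first index where the alarm fires
```

Before the shift the statistic resets toward zero on every down-swing; after the shift it accumulates roughly one unit per step and crosses h a few days in, so smaller k or h fires earlier (more sensitivity).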

fit(residual_df, pass1_records=None)

Detect extended anomalies in residual_df.

Parameters:
  • residual_df (pd.DataFrame) – Clean residual DataFrame (original minus all structured components). Each column is treated as an independent series.

  • pass1_records (dict, optional) – Pre-computed point anomaly records keyed by series name ({series: [{‘date’: …, ‘magnitude’: …, ‘type’: …, ‘score’: …}]}). When provided the internal pass-1 AnomalyRemoval run is skipped.

Return type:

self

get_events(series_name=None)

Return detected events.

Parameters:

series_name (str, optional) – If given, return events for that series only (as a list). Otherwise return the full dict.

static get_new_params(method='random')

Sample random parameters for optimizer search.

class autots.evaluator.feature_detector.extended_anomaly.ExtendedAnomalyMixin

Bases: object

Mixin for TimeSeriesFeatureDetector that adds a second extended anomaly detection pass after the main decomposition is complete.

The extended pass operates on the cleanest available residual: noise_component + anomaly_component (= original minus all structured components), so that structured effects do not contaminate the extended anomaly detection.

Requires the host class to expose:

  • self.extended_anomaly_params (dict, or falsy to disable)

  • self._anomaly_records_temp (populated by pass 1 in DecompositionMixin)

autots.evaluator.feature_detector.optimizer module

FeatureDetectionOptimizer - Hyperparameter optimization using synthetic data.

class autots.evaluator.feature_detector.optimizer.FeatureDetectionOptimizer(synthetic_generator, loss_calculator=None, n_iterations=50, random_seed=42, starting_params=None, search_strategy='random', selection_strategy='recovery_lexicographic', stage_budget=None)

Bases: object

Optimize TimeSeriesFeatureDetector parameters using synthetic labeled data.

Defaults to a broad random/genetic search with recovery-first selection.

fine_tune_changepoints(starting_params, n_per_stage=200, curriculum_sigmas=None, tversky_alpha=0.3, tversky_beta=0.7, tversky_gamma=2.0, level_shift_weight=0.35, exclude_changepoint_methods=None, over_prediction_penalty=0.1, location_weight=0.35, count_weight=0.25, slope_match_weight=0.15)

Focused fine-tuning pass that freezes every parameter group except changepoint_params and level_shift_params.

All other parameters (seasonality, anomaly, holiday, etc.) are held fixed so the optimizer can zero in on changepoint quality without interference.

The loss function is a statistical translation of techniques designed for neural changepoint training:

Gaussian Label Smoothing

Instead of a hard ±tolerance binary label, each true changepoint is represented as a Gaussian probability distribution centred on its date with standard deviation sigma. This converts the step-function loss landscape into smooth, convex basins and ensures that detections that are “close but not exact” receive a constructive gradient signal.

Focal Tversky Loss (statistical translation)

The metric used for scoring is the Focal Tversky index with alpha < beta (default 0.3 / 0.7), which heavily penalises false negatives over false positives, directly preventing the zero-prediction collapse that plagues changepoint tuning. The focal exponent gamma=2.0 concentrates the gradient on partially-matched changepoints rather than already-correct ones.

Curriculum Learning (sigma annealing)

Three stages with decreasing sigma drive the search from coarse to fine sensitivity:

  • Stage 1: sigma=14 days – wide window builds initial recall

  • Stage 2: sigma=7 days – medium window matches the ±7-day tolerance

  • Stage 3: sigma=3.5 days – tight window polishes placement precision
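Gaussian label smoothing and the Focal Tversky score can be combined in a few lines. A minimal numeric sketch, not the optimizer's actual scoring code (the changepoint at day 50 and the detection at day 54 are invented):

```python
import numpy as np

# Timeline of n days with one true changepoint (hypothetical)
n, sigma = 100, 7.0
days = np.arange(n)
true_cp = [50]

# Gaussian label smoothing: soft target peaking at each true changepoint
target = np.zeros(n)
for cp in true_cp:
    target = np.maximum(target, np.exp(-0.5 * ((days - cp) / sigma) ** 2))

# Detections as a binary indicator; one "close but not exact" hit at day 54
pred = np.zeros(n)
pred[54] = 1.0

# Soft Tversky index with alpha < beta (FN-heavy), then the focal exponent
alpha, beta, gamma = 0.3, 0.7, 2.0
tp = np.sum(pred * target)
fp = np.sum(pred * (1 - target))
fn = np.sum((1 - pred) * target)
tversky = tp / (tp + alpha * fp + beta * fn)
focal_tversky_loss = (1.0 - tversky) ** gamma
```

Because the target is Gaussian-smoothed, the day-54 detection still earns substantial true-positive credit; with a hard ±0-day label it would count entirely as a false positive.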

Parameters:
  • starting_params (dict) – Full detector parameter dict to use as the frozen baseline. All keys except changepoint_params and level_shift_params are immutably frozen throughout the run.

  • n_per_stage (int) – Number of candidate configurations evaluated per curriculum stage.

  • curriculum_sigmas (list of float, optional) – Sigma values (in days) for each curriculum stage. Defaults to [14.0, 7.0, 3.5].

  • tversky_alpha (float) – FP weight in Tversky denominator (keep < tversky_beta).

  • tversky_beta (float) – FN weight in Tversky denominator (keep > tversky_alpha).

  • tversky_gamma (float) – Focal exponent applied to (1 - Tversky_index).

  • level_shift_weight (float) – Blend weight for level-shift Tversky loss in the final score (trend changepoints get 1 - weight). Defaults below 0.5 so the fine-tune remains changepoint-first while still rewarding cleaner level-shift separation.

  • exclude_changepoint_methods (list of str, optional) – Changepoint method names to exclude from the search. Defaults to ['basic'], which prevents the evenly-spaced pseudo-detector from being selected (it cannot be used for analytic purposes). Pass an empty list [] to allow all methods including ‘basic’.

  • over_prediction_penalty (float) – Scales how quickly the count penalty ramps once detections exceed the slight-over buffer. Higher values curb severe over-segmentation without removing the mild recall bias near the target count.

  • location_weight (float) – Weight on an explicit symmetric distance penalty. This makes a count-correct but badly misplaced solution score worse than a nearby over-detected one, which is the balance needed for downstream trend fitting.

  • count_weight (float) – Weight on count calibration. Slight over-detection is tolerated more than under-detection, but the penalty ramps quickly once excess changepoints move beyond the preferred buffer.

  • slope_match_weight (float) – Weight on slope-change alignment for nearby trend changepoints. This favors candidates that place changepoints where the underlying trend change is directionally and numerically similar to ground truth.

Returns:

Best full parameter dict found, with only changepoint/level-shift params potentially changed from starting_params.

Return type:

dict

get_optimization_summary()

Return summary of optimization results.

optimize(starting_params=None)

Run genetic-style optimization to find best detector parameters.

Parameters:

starting_params (dict, optional) – Optional seed parameter configuration. Overrides constructor value when provided.

Returns:

Best parameters found

Return type:

dict

Module contents

Time Series Feature Detection and Optimization package.

class autots.evaluator.feature_detector.ExtendedAnomalyDetector(point_anomaly_params=None, sustained_window=7, sustained_baseline=60, sustained_threshold=2.5, cusum_k=0.5, cusum_h=5.0, slope_reversion_min_hold=5, slope_reversion_min_reversion=7, slope_reversion_cumsum_threshold=3.0, slope_reversion_max_duration=84, decay_lookahead=14, decay_fit_min_r2=0.5, min_segment_run=2, sustained_hysteresis=0.7, segment_max_gap=1, merge_distance_days=3, max_anomalies_per_series=25)

Bases: object

Two-pass anomaly detector combining point detection with multi-day pattern detection.

Pass 1 produces point-level proposals via AnomalyRemoval (optional if pass1_records are provided externally).

Pass 2 runs four independent detection methods on a clean residual:

  • CUSUM: cumulative-sum alarm for sustained mean shifts.

  • Slope-reversion template: cumulative-sum peak/trough analysis for slow onset → hold → reversion patterns.

  • Decay extension: extends pass-1 point detections with exponential or linear decay tails.

  • Segmented mean shift: sliding-window mean comparison against a rolling baseline.

All per-series events are then merged and de-duplicated into a final list that preserves start_date, end_date, duration, type, magnitude, and score.

Parameters:
  • point_anomaly_params (dict, optional) – Keyword arguments forwarded to AnomalyRemoval for pass-1 point detection. If None a conservative rolling-zscore detector is used.

  • sustained_window (int) – Short rolling window (days) used by the segmented-shift and CUSUM detectors to compute local means.

  • sustained_baseline (int) – Longer window used to estimate the baseline mean and standard deviation.

  • sustained_threshold (float) – Standardized deviation threshold (in units of baseline σ) above which a window is considered anomalous.

  • cusum_k (float) – CUSUM allowance parameter (slack / half-width) in standardized units. Smaller values are more sensitive.

  • cusum_h (float) – CUSUM decision threshold. An alarm fires when the accumulated statistic exceeds this value.

  • slope_reversion_min_hold (int) – Minimum number of days that the cumulative sum must stay elevated before a slope-reversion event is flagged.

  • slope_reversion_min_reversion (int) – Minimum number of days the reversion phase must last.

  • slope_reversion_cumsum_threshold (float) – Minimum peak z-score of the cumulative sum (relative to its rolling σ) required to trigger a slope-reversion candidate.

  • slope_reversion_max_duration (int) – Maximum allowed duration (days) for slope-reversion events. Longer candidates are treated as structural drift and ignored.

  • decay_lookahead (int) – Number of days to inspect after a pass-1 peak for a decay tail.

  • decay_fit_min_r2 (float) – Minimum R² required for a decay template fit to extend a point event.

  • min_segment_run (int) – Minimum number of consecutive elevated windows required by the segmented-shift detector to form an event.

  • sustained_hysteresis (float) – Fraction of sustained_threshold used to keep an in-progress segmented run active after it starts. Values in (0, 1] reduce fragmentation from brief dips.

  • segment_max_gap (int) – Maximum number of non-flagged days allowed between two segmented runs before stitching them into one event.

  • merge_distance_days (int) – Events within this many days of each other (or overlapping) are merged.

  • max_anomalies_per_series (int) – Cap on the number of events returned per series.

fit(residual_df, pass1_records=None)

Detect extended anomalies in residual_df.

Parameters:
  • residual_df (pd.DataFrame) – Clean residual DataFrame (original minus all structured components). Each column is treated as an independent series.

  • pass1_records (dict, optional) – Pre-computed point anomaly records keyed by series name ({series: [{‘date’: …, ‘magnitude’: …, ‘type’: …, ‘score’: …}]}). When provided the internal pass-1 AnomalyRemoval run is skipped.

Return type:

self

get_events(series_name=None)

Return detected events.

Parameters:

series_name (str, optional) – If given, return events for that series only (as a list). Otherwise return the full dict.

static get_new_params(method='random')

Sample random parameters for optimizer search.

class autots.evaluator.feature_detector.FeatureDetectionLoss(changepoint_tolerance_days=7, level_shift_tolerance_days=7, anomaly_tolerance_days=1, holiday_tolerance_days=1, seasonality_window=14, weights=None, holiday_over_anomaly_bonus=0.4, trend_component_penalty='component', trend_complexity_window=7, trend_complexity_weight=0.0, focus_component_weights=False, validation_strictness=1.0, invalid_loss_mode='penalty', invalid_loss_penalty=1000000.0)

Bases: LossMetricsMixin, LossEvaluatorsMixin

Comprehensive loss calculator for feature detection optimization.

Each synthetic label family contributes to the total loss:

  • Trend changepoints and slopes

  • Level shifts

  • Anomalies (including shared events and post patterns)

  • Holiday timing, direct impacts, and splash/bridge days

  • Seasonality strength, patterns, and changepoints

  • Noise regimes and noise-to-signal characteristics

  • Low-frequency noise structure consistency (drift/shift leakage)

  • Series-level metadata consistency (scale, type)

  • Regressor impacts when present

DEFAULT_WEIGHTS = {'anomaly_loss': 1.3, 'holiday_event_loss': 1.2, 'holiday_impact_loss': 0.9, 'holiday_recall_loss': 0.9, 'holiday_splash_loss': 0.03, 'level_shift_loss': 1.3, 'metadata_loss': 0.05, 'noise_level_loss': 0.5, 'noise_regime_loss': 0.4, 'noise_structure_loss': 0.2, 'regressor_loss': 0.3, 'seasonality_changepoint_loss': 0.01, 'seasonality_pattern_loss': 2.0, 'seasonality_strength_loss': 2.0, 'trend_loss': 1.0}
INVALID_LOSS_PENALTY = 1000000.0
calculate_loss(detected_features, true_labels, series_name=None, true_components=None, date_index=None)

Calculate overall loss comparing detected features to true labels.

Parameters:
  • detected_features (dict) – Output from TimeSeriesFeatureDetector.get_detected_features(…)

  • true_labels (dict) – Labels from SyntheticDailyGenerator.get_all_labels(…)

  • series_name (str, optional) – If provided, only evaluate the named series.

  • true_components (dict, optional) – Mapping of series -> component arrays from SyntheticDailyGenerator.get_components()

  • date_index (pd.DatetimeIndex, optional) – Index used for the time series. Required for seasonality changepoint evaluation.

Returns:

Loss breakdown with per-component metrics and total weighted loss.

Return type:

dict
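A minimal sketch of how a per-component breakdown combines into the total weighted loss, in the style of DEFAULT_WEIGHTS. The component loss values below are made up for illustration:

```python
# Subset of DEFAULT_WEIGHTS-style weights (values from the class attribute)
weights = {"trend_loss": 1.0, "anomaly_loss": 1.3, "level_shift_loss": 1.3}

# Hypothetical per-component losses as returned in a breakdown dict
component_losses = {"trend_loss": 0.2, "anomaly_loss": 0.1, "level_shift_loss": 0.05}

# Total weighted loss: weighted sum over the evaluated components
total = sum(weights[k] * component_losses[k] for k in component_losses)
```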

class autots.evaluator.feature_detector.FeatureDetectionOptimizer(synthetic_generator, loss_calculator=None, n_iterations=50, random_seed=42, starting_params=None, search_strategy='random', selection_strategy='recovery_lexicographic', stage_budget=None)

Bases: object

Optimize TimeSeriesFeatureDetector parameters using synthetic labeled data.

Defaults to a broad random/genetic search with recovery-first selection.

fine_tune_changepoints(starting_params, n_per_stage=200, curriculum_sigmas=None, tversky_alpha=0.3, tversky_beta=0.7, tversky_gamma=2.0, level_shift_weight=0.35, exclude_changepoint_methods=None, over_prediction_penalty=0.1, location_weight=0.35, count_weight=0.25, slope_match_weight=0.15)

Focused fine-tuning pass that freezes every parameter group except changepoint_params and level_shift_params.

All other parameters (seasonality, anomaly, holiday, etc.) are held fixed so the optimizer can zero in on changepoint quality without interference.

The loss function is a statistical translation of techniques designed for neural changepoint training:

Gaussian Label Smoothing

Instead of a hard ±tolerance binary label, each true changepoint is represented as a Gaussian probability distribution centred on its date with standard deviation sigma. This converts the step-function loss landscape into smooth, convex basins and ensures that detections that are “close but not exact” receive a constructive gradient signal.

Focal Tversky Loss (statistical translation)

The metric used for scoring is the Focal Tversky index with alpha < beta (default 0.3 / 0.7), which heavily penalises false negatives over false positives, directly preventing the zero-prediction collapse that plagues changepoint tuning. The focal exponent gamma=2.0 concentrates the gradient on partially-matched changepoints rather than already-correct ones.

Curriculum Learning (sigma annealing)

Three stages with decreasing sigma drive the search from coarse to fine sensitivity:

  • Stage 1: sigma=14 days – wide window builds initial recall

  • Stage 2: sigma=7 days – medium window matches the ±7-day tolerance

  • Stage 3: sigma=3.5 days – tight window polishes placement precision

Parameters:
  • starting_params (dict) – Full detector parameter dict to use as the frozen baseline. All keys except changepoint_params and level_shift_params are immutably frozen throughout the run.

  • n_per_stage (int) – Number of candidate configurations evaluated per curriculum stage.

  • curriculum_sigmas (list of float, optional) – Sigma values (in days) for each curriculum stage. Defaults to [14.0, 7.0, 3.5].

  • tversky_alpha (float) – FP weight in Tversky denominator (keep < tversky_beta).

  • tversky_beta (float) – FN weight in Tversky denominator (keep > tversky_alpha).

  • tversky_gamma (float) – Focal exponent applied to (1 - Tversky_index).

  • level_shift_weight (float) – Blend weight for level-shift Tversky loss in the final score (trend changepoints get 1 - weight). Defaults below 0.5 so the fine-tune remains changepoint-first while still rewarding cleaner level-shift separation.

  • exclude_changepoint_methods (list of str, optional) – Changepoint method names to exclude from the search. Defaults to ['basic'], which prevents the evenly-spaced pseudo-detector from being selected (it cannot be used for analytic purposes). Pass an empty list [] to allow all methods including ‘basic’.

  • over_prediction_penalty (float) – Scales how quickly the count penalty ramps once detections exceed the slight-over buffer. Higher values curb severe over-segmentation without removing the mild recall bias near the target count.

  • location_weight (float) – Weight on an explicit symmetric distance penalty. This makes a count-correct but badly misplaced solution score worse than a nearby over-detected one, which is the balance needed for downstream trend fitting.

  • count_weight (float) – Weight on count calibration. Slight over-detection is tolerated more than under-detection, but the penalty ramps quickly once excess changepoints move beyond the preferred buffer.

  • slope_match_weight (float) – Weight on slope-change alignment for nearby trend changepoints. This favors candidates that place changepoints where the underlying trend change is directionally and numerically similar to ground truth.

Returns:

Best full parameter dict found, with only changepoint/level-shift params potentially changed from starting_params.

Return type:

dict

get_optimization_summary()

Return summary of optimization results.

optimize(starting_params=None)

Run genetic-style optimization to find best detector parameters.

Parameters:

starting_params (dict, optional) – Optional seed parameter configuration. Overrides constructor value when provided.

Returns:

Best parameters found

Return type:

dict

class autots.evaluator.feature_detector.ReconstructionLoss(trend_complexity_window=7, trend_complexity_weight=1.0, metric_weights=None, trend_dominance_target=0.65, trend_min_other_variance=0.0001, seasonality_lags=(7, 365), seasonality_min_autocorr=0.1, seasonality_improvement_target=0.35, anomaly_improvement_target=0.25, anomaly_min_pre_std=0.001)

Bases: FeatureDetectionLoss

Loss function tailored for real-world datasets lacking component-level labels.

Focuses on reconstruction quality while discouraging overly complex trend fits and encouraging variance to be attributed to seasonality, holidays, anomalies, and level shifts.

DEFAULT_METRIC_WEIGHTS = {'anomaly_capture_loss': 0.7, 'noise_whiteness_loss': 0.5, 'reconstruction_loss': 0.5, 'seasonality_capture_loss': 0.8, 'seasonality_shape_loss': 0.6, 'structural_loss': 1.0, 'trend_dominance_loss': 0.9, 'trend_smoothness_loss': 1.2}
calculate_loss(observed_df, detected_features, components=None, series_name=None)

Calculate reconstruction-oriented loss for unlabeled datasets.

Parameters:
  • observed_df (pd.DataFrame) – Original time series data used for detection.

  • detected_features (dict) – Output from TimeSeriesFeatureDetector.get_detected_features(…, include_components=True).

  • components (dict, optional) – Explicit component container matching get_detected_features()[‘components’].

  • series_name (str, optional) – Restrict evaluation to a single series.

Returns:

Loss metrics per series and aggregate total weighted loss.

Return type:

dict
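The weighted aggregation implied by DEFAULT_METRIC_WEIGHTS can be sketched as follows (a simplified illustration of the return shape, not the class's internal code):

```python
DEFAULT_METRIC_WEIGHTS = {
    'anomaly_capture_loss': 0.7, 'noise_whiteness_loss': 0.5,
    'reconstruction_loss': 0.5, 'seasonality_capture_loss': 0.8,
    'seasonality_shape_loss': 0.6, 'structural_loss': 1.0,
    'trend_dominance_loss': 0.9, 'trend_smoothness_loss': 1.2,
}

def total_weighted_loss(per_series_metrics, weights=DEFAULT_METRIC_WEIGHTS):
    """Aggregate per-series metric dicts into one scalar loss.

    per_series_metrics: {series_name: {metric_name: value, ...}, ...}
    Each metric is multiplied by its weight; series totals are averaged.
    """
    series_totals = {
        name: sum(weights.get(metric, 0.0) * value for metric, value in metrics.items())
        for name, metrics in per_series_metrics.items()
    }
    total = sum(series_totals.values()) / max(len(series_totals), 1)
    return {'per_series': series_totals, 'total_weighted_loss': total}
```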

class autots.evaluator.feature_detector.TimeSeriesFeatureDetector(seasonality_params=None, rough_seasonality_params=None, holiday_params=None, anomaly_params=None, changepoint_params=None, level_shift_params=None, level_shift_validation=None, general_transformer_params=None, smoothing_window=None, standardize=True, detection_mode='multivariate', global_holiday_anomaly_suppression=True, extended_anomaly_params=None, event_dag_params=None, holiday_country=None, holiday_countries=None)

Bases: DecompositionMixin, SeasonalityMixin, TrendMixin, HolidayMixin, AnomalyMixin, ExtendedAnomalyMixin, RescalingMixin, FormattingMixin

Comprehensive feature detection pipeline for time series.

TODO:
  • Upstream more of this code into the component classes (e.g., HolidayDetector, AnomalyRemoval, ChangepointDetector)
  • Handle multiplicative seasonality
  • Handle time-varying seasonality using fast_kalman
  • Improve holiday “splash” effect and weekend interactions
  • Support identifying regressor impacts and Granger lag impacts
  • Build upon the JSON template so that it can be converted to a fixed-size embedding (probably a 2D embedding). The fixed size may vary by parameters, but for a given parameter set it should always be the same size. The embedding does not need to be capable of fully reconstructing the time series, just representing it.
  • Support modeling the trend with a fast Kalman state-space approach, ideally aligned with changepoints in some way if possible.
  • Consider also having “deviation from group” type anomaly detection for multivariate series
  • Improve anomaly typing in univariate mode (currently defaults to point_outlier) and incorporate detector scores into type confidence.
  • Detect and expose non-holiday regressor impacts (not just holiday coefficients), and persist them in template/features output.

Parameters:
  • rough_seasonality_params (dict, optional) – Parameters for DatepartRegressionTransformer used in the initial rough seasonality decomposition (to improve holiday and anomaly detection).

  • holiday_params (dict, optional) – Parameters for HolidayDetector.

  • anomaly_params (dict, optional) – Parameters for AnomalyRemoval.

  • changepoint_params (dict, optional) – Parameters for ChangepointDetector.

  • level_shift_params (dict, optional) – Parameters for LevelShiftMagic.

  • level_shift_validation (dict, optional) – Validation parameters for level shifts.

  • general_transformer_params (dict, optional) – Parameters for GeneralTransformer applied before trend detection.

  • smoothing_window (int, optional) – Window size for smoothing before trend detection.

  • standardize (bool, default=True) – Whether to standardize series before processing.

  • detection_mode (str, default=’multivariate’) – Controls whether detections are unique per series or shared across all series.
    - ‘multivariate’: each series gets unique anomalies, holidays, changepoints, and level shifts.
    - ‘univariate’: all series share common anomalies, holidays, changepoints, and level shifts (level shifts are detected on an aggregated signal and scaled appropriately per series).

  • global_holiday_anomaly_suppression (bool, default=True) – If True, anomaly detection suppresses holiday-proximate flags using a merged holiday date set from all series. Set False to disable this suppression.
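The suppression mechanism can be pictured as filtering out anomaly dates that fall near any date in the merged holiday set (a hypothetical sketch; the function name and window_days parameter are assumptions, not AutoTS API):

```python
from datetime import date, timedelta

def suppress_holiday_proximate(anomaly_dates, holiday_dates, window_days=1):
    """Drop anomaly dates within window_days of any merged holiday date."""
    merged = set(holiday_dates)
    kept = []
    for d in anomaly_dates:
        # suppress flags that sit within the proximity window of a holiday
        if any(abs((d - h).days) <= window_days for h in merged):
            continue
        kept.append(d)
    return kept
```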

TEMPLATE_VERSION = '1.2'
fit(df)

Fit the feature detector to time series data.

Decomposition follows this sequential removal strategy:

  1. INITIAL DECOMPOSITION (for detection only):
     - Remove rough seasonality → rough_residual
     - Detect holidays on rough_residual
     - Detect anomalies on rough_residual

  2. FINAL SEASONALITY FIT:
     - Fit on: original - anomalies
     - Holidays fitted simultaneously as regressors
     - Output: final_residual (has seasonality + holidays removed)

  3. LEVEL SHIFT DETECTION:
     - Detect on: original - anomalies - seasonality - holidays (this is final_residual)

  4. TREND DETECTION:
     - Detect on: original - anomalies - seasonality - holidays - level_shifts

  5. NOISE & ANOMALY COMPONENTS:
     - Noise: original - trend - level_shifts - seasonality - holidays - anomalies
     - Anomalies: difference between the original and de-anomalied versions
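The sequential removal strategy above amounts to a chain of residuals. A schematic sketch with plain lists (illustrative of the subtraction order only, not the detector's actual implementation):

```python
def decompose(original, rough_seasonality, anomalies, seasonality, holidays,
              level_shifts, trend):
    """Schematic residual chain mirroring the fit() steps.

    Each argument is a list of per-timestep values already estimated by the
    corresponding detection stage; this only shows the subtraction order.
    """
    def sub(a, b):
        return [x - y for x, y in zip(a, b)]

    # 1. rough residual, used only for holiday/anomaly detection
    rough_residual = sub(original, rough_seasonality)

    # 2. final seasonality fit on de-anomalied data; holidays removed too
    de_anomalied = sub(original, anomalies)
    final_residual = sub(sub(de_anomalied, seasonality), holidays)

    # 3-4. level shifts detected on final_residual; trend fit on what remains
    trend_input = sub(final_residual, level_shifts)

    # 5. noise is whatever the fitted trend does not explain
    noise = sub(trend_input, trend)
    return {'rough_residual': rough_residual,
            'final_residual': final_residual,
            'noise': noise}
```

With exact components, the chain recovers the noise term, since every other component has been subtracted in sequence.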

forecast(forecast_length, frequency=None)

Generate a simple forward projection similar to BasicLinearModel. This detector is not optimized for forecasting; dedicated forecasting models may provide better results.
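A “simple forward projection” in the spirit of BasicLinearModel can be sketched as an ordinary least-squares line extended past the last observation (illustrative only; the detector's actual forecast also reapplies detected seasonality and holiday effects):

```python
def linear_projection(values, forecast_length):
    """Fit y = a + b*t by least squares and project forecast_length steps ahead."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    ss_tt = sum((t - t_mean) ** 2 for t in range(n))
    b = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values)) / ss_tt
    a = y_mean - b * t_mean
    # project beyond the last in-sample index (n-1)
    return [a + b * (n + h) for h in range(forecast_length)]
```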

get_cleaned_data(series_name=None)

Return cleaned time series data with anomalies, noise, and level shifts removed.

The cleaned data consists of:
  • Trend (with mean included)
  • Seasonality
  • Holiday effects

Level shifts are corrected by removing the cumulative shift effect, returning the data to its baseline level. Anomalies and noise are excluded entirely.

Parameters:

series_name (str, optional) – If provided, return cleaned data for only this series. If None, return cleaned data for all series.

Returns:

Cleaned time series data with the same index as the original data. If series_name is specified, returns a DataFrame with a single column.

Return type:

pd.DataFrame

Raises:
  • RuntimeError – If fit() has not been called yet.

  • ValueError – If series_name is provided but not found in the original data.

Examples

>>> detector = TimeSeriesFeatureDetector()
>>> detector.fit(df)
>>> cleaned = detector.get_cleaned_data()
>>> cleaned_single = detector.get_cleaned_data('series_1')
get_detected_features(series_name=None, include_components=False, include_metadata=True)
get_event_dag(deep=True)

Return Event DAG metadata derived from detector outputs.

static get_new_params(method='random')

Sample random parameters for detector optimization.

get_template(deep=True)
plot(series_name=None, figsize=(16, 14), save_path=None, show=True, separate_noise_anomaly_panels=True, dual_axis_seasonality_holidays=True, dual_axis_trend_level_shift=True)
plot_event_dag(series=None, start_date=None, end_date=None, show_members=False, figsize=(14, 6), save_path=None, show=True)

Plot Event DAG macro-events on a timeline-first layout.

query_features(dates=None, series=None, include_components=False, include_metadata=False, include_event_dag=False, include_event_members=False, return_json=False)

Query a specific slice of detected features with minimal token usage.

Designed for LLM-friendly output with compact representation.

Parameters:
  • dates (str, datetime, list, slice) – Date(s) to query for features.
    - Single date: “2024-01-15” or datetime object
    - Date range: slice(“2024-01-01”, “2024-01-31”)
    - List of dates: [“2024-01-15”, “2024-01-20”]
    - None: return all features (not filtered by date)

  • series (str, list) – Series name(s) to query.
    - Single series: “sales”
    - Multiple series: [“sales”, “revenue”]
    - None: all series

  • include_components (bool) – Include component time series values for the date range

  • include_metadata (bool) – Include metadata like noise levels, scales, etc.

  • include_event_dag (bool) – Include Event DAG cluster and family metadata

  • include_event_members (bool) – Include raw Event DAG member events

  • return_json (bool) – Return JSON string instead of dict

Returns:

Compact feature data including anomalies, changepoints, level shifts, holidays, and optionally components

Return type:

dict or str

Examples

>>> # Get all features for one series
>>> detector.query_features(series="sales")
>>> # Get features occurring in a date range
>>> detector.query_features(
...     dates=slice("2024-01-01", "2024-01-31"),
...     series=["sales", "revenue"]
... )
>>> # Get components for specific dates
>>> detector.query_features(
...     dates=["2024-01-15", "2024-01-16"],
...     series="sales",
...     include_components=True
... )
classmethod render_template(template, return_components=False)

Render a feature detection template back into time series data.

summary()
tune_with_synthetic(real_df, n_synthetic_series=16, n_tune_iterations=25, n_detector_iterations=30, tune_seed=42, loss_params=None, loss_weights=None, synthetic_starting_params=None, starting_params=None, verbose=True)

Tune synthetic data to a real dataset, optimize detector params, and fit self.

After completion, this instance is fitted on real_df with the optimized detector parameters and stores optimization artifacts on the instance.
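The “genetic-style optimization” referenced throughout can be pictured as a sample-evaluate-mutate loop over parameter dicts (a schematic sketch only; genetic_search, sample_params, and mutate are hypothetical names, not AutoTS API):

```python
import random

def genetic_search(evaluate, sample_params, mutate, n_iterations=25, seed=42):
    """Schematic search loop: keep the best-scoring parameter dict seen so far.

    evaluate(params) -> loss (lower is better); sample_params() draws a fresh
    random configuration; mutate(params) perturbs a copy of the current best.
    """
    rng = random.Random(seed)
    best_params = sample_params()
    best_loss = evaluate(best_params)
    for _ in range(n_iterations):
        # alternate between fresh random draws and mutations of the incumbent
        candidate = mutate(dict(best_params)) if rng.random() < 0.5 else sample_params()
        loss = evaluate(candidate)
        if loss < best_loss:
            best_params, best_loss = candidate, loss
    return best_params, best_loss
```

In tune_with_synthetic, n_tune_iterations and n_detector_iterations would bound loops of roughly this shape, with the loss supplied by a FeatureDetectionLoss subclass such as ReconstructionLoss.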