Time Series Segmentation

Aeon provides algorithms to partition time series into regions with distinct characteristics, identifying change points and boundaries.

Segmentation Algorithms

Binary Segmentation

BinSegmenter - Recursive binary segmentation
- Iteratively splits series at most significant change points
- Parameters: n_cps (number of change points to find), model (cost function, e.g. "l2")
- Use when: Known number of segments, hierarchical structure
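To make the recursive splitting concrete, here is a minimal from-scratch sketch of binary segmentation with a squared-error cost. It is not aeon's implementation; the helper names (`l2_cost`, `best_split`, `binary_segmentation`) and the `min_size` default are assumptions made for illustration.

```python
import numpy as np


def l2_cost(segment):
    """Sum of squared deviations from the segment mean."""
    return float(np.sum((segment - segment.mean()) ** 2))


def best_split(y, start, end, min_size):
    """Return (gain, index) of the split in [start, end) that most reduces cost."""
    base = l2_cost(y[start:end])
    best_gain, best_idx = 0.0, None
    for idx in range(start + min_size, end - min_size + 1):
        gain = base - l2_cost(y[start:idx]) - l2_cost(y[idx:end])
        if gain > best_gain:
            best_gain, best_idx = gain, idx
    return best_gain, best_idx


def binary_segmentation(y, n_cps, min_size=2):
    """Greedily add the single most cost-reducing split until n_cps are found."""
    y = np.asarray(y, dtype=float)
    bounds = [0, len(y)]
    for _ in range(n_cps):
        candidates = [
            best_split(y, a, b, min_size) for a, b in zip(bounds[:-1], bounds[1:])
        ]
        gain, idx = max(candidates, key=lambda c: c[0])
        if idx is None:  # no remaining split reduces the cost
            break
        bounds = sorted(bounds + [idx])
    return bounds[1:-1]


# Piecewise-constant signal with level shifts at indices 50 and 120
y = np.concatenate([np.zeros(50), np.full(70, 5.0), np.full(40, -2.0)])
change_points = binary_segmentation(y, n_cps=2)
```

Each pass re-evaluates every current segment and keeps only the best split, which is why the method suits a known change point count: the loop simply stops after `n_cps` splits.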

Classification-Based

ClaSPSegmenter - Classification Score Profile
- Scores each candidate split by how well a classifier separates subsequences to its left from those to its right
- Places boundaries where that classification score peaks
- Use when: Segments have different temporal patterns

Fast Pattern-Based

FLUSSSegmenter - Fast Low-cost Unipotent Semantic Segmentation
- Counts how many nearest-neighbor arcs cross each position; few crossings suggest a regime boundary
- Based on matrix profile
- Use when: Large time series, need speed and pattern discovery
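The arc-crossing idea can be sketched in a few lines. The example below assumes the nearest-neighbor index of every subsequence is already known (in practice it comes from a matrix profile); `arc_curve` and the toy `nn_index` are hypothetical names and data, and the sketch omits the correction against the expected arc count that FLUSS applies to compensate for low counts near the series edges.

```python
def arc_curve(nn_index):
    """Count, for each position, how many nearest-neighbor arcs pass over it."""
    n = len(nn_index)
    diff = [0] * (n + 1)
    for i, j in enumerate(nn_index):
        if i == j:  # a self-match spans nothing
            continue
        lo, hi = min(i, j), max(i, j)
        # The arc from lo to hi passes over every position strictly between them
        diff[lo + 1] += 1
        diff[hi] -= 1
    counts, running = [], 0
    for d in diff[:n]:
        running += d
        counts.append(running)
    return counts


# Toy index: positions 0-4 only match each other, as do positions 5-9
nn_index = [2, 3, 0, 1, 2, 7, 8, 5, 6, 7]
counts = arc_curve(nn_index)
```

No arc spans the gap between positions 4 and 5, so the count drops to zero there, which is the signature of a regime boundary. The counts are also zero at both ends of the series, which is why the raw curve needs correcting before its minima are read as boundaries.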

Information Theory

InformationGainSegmenter - Information gain maximization
- Finds boundaries maximizing information gain
- Use when: Statistical differences between segments

Gaussian Modeling

GreedyGaussianSegmenter - Greedy Gaussian Segmentation (GGS)
- Models segments as Gaussian distributions
- Incrementally adds change points
- Use when: Segments follow Gaussian distributions

Hierarchical Agglomerative

EAggloSegmenter - Bottom-up merging approach
- Estimates change points via agglomeration
- Use when: Want hierarchical segmentation structure

Hidden Markov Models

HMMSegmenter - HMM with Viterbi decoding
- Probabilistic state-based segmentation
- Use when: Segments represent hidden states
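As a rough illustration of Viterbi decoding, the sketch below finds the most likely state sequence for a two-state Gaussian HMM in log space. It is a standalone NumPy version, not aeon's code; the function names, state means, and transition probabilities are all assumptions chosen for the example.

```python
import numpy as np


def viterbi(log_emission, log_transition, log_initial):
    """Most likely state path given per-step log-likelihoods of shape (T, K)."""
    n_steps, n_states = log_emission.shape
    score = log_initial + log_emission[0]
    backpointer = np.zeros((n_steps, n_states), dtype=int)
    for t in range(1, n_steps):
        # candidate[i, j]: best score ending in state i at t-1, then moving to j
        candidate = score[:, None] + log_transition
        backpointer[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0) + log_emission[t]
    # Trace the best final state back to the start
    path = np.empty(n_steps, dtype=int)
    path[-1] = score.argmax()
    for t in range(n_steps - 1, 0, -1):
        path[t - 1] = backpointer[t, path[t]]
    return path


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution, broadcast over x, mean, and std."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2 * np.pi))


# Two hypothetical states with means 0 and 5; the series switches at index 30
y = np.concatenate([np.zeros(30), np.full(20, 5.0)])
means, stds = np.array([0.0, 5.0]), np.array([1.0, 1.0])
log_emission = gaussian_logpdf(y[:, None], means, stds)  # shape (50, 2)
log_transition = np.log(np.array([[0.95, 0.05], [0.05, 0.95]]))
log_initial = np.log(np.array([0.5, 0.5]))

states = viterbi(log_emission, log_transition, log_initial)
```

The high self-transition probabilities make the decoded path "sticky", so isolated noisy points are less likely to trigger a spurious state change.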

Dimensionality-Based

HidalgoSegmenter - Heterogeneous Intrinsic Dimensionality Algorithm
- Detects changes in local dimensionality
- Use when: Dimensionality shifts between segments

Baseline

RandomSegmenter - Random change point generation
- Use when: Need null hypothesis baseline
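A null baseline of this kind is easy to reproduce by hand. The helper below is a hypothetical stand-in written with the standard library, not aeon's `RandomSegmenter`; it draws a fixed number of distinct boundaries uniformly at random.

```python
import random


def random_change_points(series_length, n_cps, seed=None):
    """Draw n_cps distinct interior indices uniformly at random, sorted."""
    rng = random.Random(seed)
    # Index 0 is excluded because a boundary there would create an empty segment
    return sorted(rng.sample(range(1, series_length), n_cps))


baseline = random_change_points(series_length=300, n_cps=2, seed=0)
```

Scoring such a baseline with the same metric as a real segmenter, over many seeds, gives a reference distribution that a useful method should clearly beat.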

Quick Start

python

from aeon.segmentation import ClaSPSegmenter
import numpy as np

# Create time series with regime changes
y = np.concatenate([
    np.sin(np.linspace(0, 10, 100)),      # Segment 1
    np.cos(np.linspace(0, 10, 100)),      # Segment 2
    np.sin(2 * np.linspace(0, 10, 100))   # Segment 3
])

# Segment the series; n_cps defaults to 1, so request both boundaries explicitly
segmenter = ClaSPSegmenter(period_length=10, n_cps=2)
change_points = segmenter.fit_predict(y)

print(f"Detected change points: {change_points}")

Output Format

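Segmenters such as ClaSPSegmenter return change point indices (state-based ones such as HMMSegmenter return one label per time point instead):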

python

# change_points = [100, 200]  # Boundaries between segments
# This divides series into: [0:100], [100:200], [200:end]
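Downstream code often needs explicit segment ranges or a label per time point rather than bare boundaries. The following sketch shows both conversions in plain Python; `to_segments` and `to_labels` are hypothetical helper names, not aeon functions.

```python
def to_segments(change_points, series_length):
    """Turn change point indices into (start, end) half-open index pairs."""
    bounds = [0, *sorted(change_points), series_length]
    return list(zip(bounds[:-1], bounds[1:]))


def to_labels(change_points, series_length):
    """Label each time point with the number of change points at or before it."""
    boundaries = set(change_points)
    labels, segment_id = [], 0
    for t in range(series_length):
        if t in boundaries:
            segment_id += 1
        labels.append(segment_id)
    return labels


segments = to_segments([100, 200], series_length=300)
labels = to_labels([100, 200], series_length=300)
```

Half-open ranges match Python slicing, so `y[start:end]` extracts each segment without overlap or gaps.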

Algorithm Selection

Speed priority: FLUSSSegmenter, BinSegmenter
Accuracy priority: ClaSPSegmenter, HMMSegmenter
Known segment count: BinSegmenter with the n_cps parameter (change points = segments - 1)
Unknown segment count: ClaSPSegmenter, InformationGainSegmenter
Pattern changes: FLUSSSegmenter, ClaSPSegmenter
Statistical changes: InformationGainSegmenter, GreedyGaussianSegmenter
State transitions: HMMSegmenter

Common Use Cases

Regime Change Detection

Identify when time series behavior fundamentally changes:

python

from aeon.segmentation import InformationGainSegmenter

segmenter = InformationGainSegmenter(k_max=3)  # Up to 3 change points
change_points = segmenter.fit_predict(stock_prices)

Activity Segmentation

Segment sensor data into activities:

python

from aeon.segmentation import ClaSPSegmenter

segmenter = ClaSPSegmenter()
boundaries = segmenter.fit_predict(accelerometer_data)

Seasonal Boundary Detection

Find season transitions in time series:

python

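import numpy as np
from scipy.stats import norm
from aeon.segmentation import HMMSegmenter

# Illustrative values: one emission PDF per season, "sticky" transitions
emission_funcs = [(norm.pdf, {"loc": m, "scale": 5.0}) for m in (0, 10, 25, 12)]
transition = np.full((4, 4), 0.01) + 0.96 * np.eye(4)  # each row sums to 1
segments = HMMSegmenter(emission_funcs, transition).fit_predict(temperature_data)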
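If the segmenter returns one state label per time point rather than boundary indices (an assumption worth checking against the installed aeon version), the transitions can be recovered with a short NumPy step. The `state_labels` array below is made-up example data.

```python
import numpy as np

# Hypothetical per-point state labels, one per observation
state_labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 0, 0])

# A change point is the first index of each new run of identical labels
change_points = np.flatnonzero(np.diff(state_labels)) + 1

# Pair each segment's start index with the state active in it
starts = np.concatenate([[0], change_points])
segments = [(int(s), int(state_labels[s])) for s in starts]
```

Keeping the state alongside each start index preserves recurring regimes: the first and last segments here share state 0, so the series has returned to an earlier state.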

Evaluation Metrics

Compare predicted change points against ground truth with segmentation quality metrics:

python

from aeon.benchmarking.metrics.segmentation import (
    count_error,
    hausdorff_error
)

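# y_true and y_pred are arrays of change point indices, not per-point labels
# Count error: difference in the number of change points
count_err = count_error(y_true, y_pred)

# Hausdorff error: largest distance from a change point in one set
# to its nearest change point in the other set
hausdorff_err = hausdorff_error(y_true, y_pred)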
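To show what these two metrics measure, here is a plain-Python sketch of both. It is written from the definitions in the comments above rather than copied from aeon, so the `_sketch` functions are illustrative, and their results may differ from the library's in edge cases such as empty inputs.

```python
def count_error_sketch(true_cps, pred_cps):
    """Absolute difference between the numbers of true and predicted change points."""
    return abs(len(true_cps) - len(pred_cps))


def hausdorff_error_sketch(true_cps, pred_cps):
    """Largest distance from any change point to its nearest counterpart.

    Assumes both lists are non-empty.
    """
    def directed(source, target):
        return max(min(abs(s - t) for t in target) for s in source)

    return max(directed(true_cps, pred_cps), directed(pred_cps, true_cps))


true_cps = [100, 200]
pred_cps = [96, 205, 250]  # two close hits plus one spurious detection

count_err = count_error_sketch(true_cps, pred_cps)
hausdorff_err = hausdorff_error_sketch(true_cps, pred_cps)
```

The spurious detection at 250 dominates the Hausdorff value even though the other two predictions are within 5 samples of the truth, so the metric is best read as a worst-case measure.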

Best Practices

Normalize data: Ensures change detection is not dominated by scale
Choose appropriate metric: Different algorithms optimize different criteria
Validate segments: Visualize to verify meaningful boundaries
Handle noise: Consider smoothing before segmentation
Domain knowledge: Use expected segment count if known
Parameter tuning: Adjust sensitivity parameters (thresholds, penalties)
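The normalization and smoothing steps can be sketched with NumPy alone. The helper names, the window length of 5, and the synthetic noisy step signal are assumptions for illustration; suitable settings depend on the data.

```python
import numpy as np


def z_normalize(y):
    """Rescale to zero mean and unit variance; constant series map to zeros."""
    y = np.asarray(y, dtype=float)
    std = y.std()
    return (y - y.mean()) / std if std > 0 else np.zeros_like(y)


def moving_average(y, window):
    """Moving average over full windows; output is shorter by window - 1 points.

    output[k] averages y[k : k + window], so add (window - 1) // 2 to an
    output index to map it back to the center of its window.
    """
    kernel = np.ones(window) / window
    return np.convolve(y, kernel, mode="valid")


# Synthetic step signal with Gaussian noise (seeded for reproducibility)
rng = np.random.default_rng(0)
noisy = np.concatenate([np.zeros(100), np.full(100, 3.0)]) + rng.normal(0, 0.5, 200)
prepared = moving_average(z_normalize(noisy), window=5)
```

Smoothing trades noise for blur: a wider window suppresses more noise but also spreads each change over more samples, so detected boundaries become less precise.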

Visualization

python

import matplotlib.pyplot as plt

plt.figure(figsize=(12, 4))
plt.plot(y, label='Time Series')
for i, cp in enumerate(change_points):
    # Label only the first line so the legend shows a single entry
    plt.axvline(cp, color='r', linestyle='--', label='Change Point' if i == 0 else None)
plt.legend()
plt.show()