scientific-skills/aeon/references/clustering.md
Aeon provides clustering algorithms adapted for temporal data with specialized distance metrics and averaging methods.
Standard k-means/k-medoids adapted for time series:
- TimeSeriesKMeans - K-means with temporal distance metrics (DTW, Euclidean, etc.)
- TimeSeriesKMedoids - Uses actual time series as cluster centers
- TimeSeriesKShape - Shape-based clustering algorithm
- TimeSeriesKernelKMeans - Kernel-based variant for nonlinear patterns

Use when: Known number of clusters, spherical cluster shapes expected.
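The difference between a k-means centre and a k-medoids centre can be sketched in plain NumPy. This is only an illustration of the idea, not aeon's implementation: Euclidean distance is assumed for brevity, and `medoid_index` is a hypothetical helper introduced here.

```python
import numpy as np


def medoid_index(X):
    """Return the index of the series with the smallest summed distance to all others."""
    # X has shape (n_series, n_timepoints); build the pairwise Euclidean matrix.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    return int(dist.sum(axis=1).argmin())


X = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.2, 0.8, 1.1, 0.9],
    [5.0, 5.0, 5.0, 5.0],
])

mean_center = X.mean(axis=0)   # synthetic average, usually not a member of X
medoid = X[medoid_index(X)]    # always an actual member of X
```

Because the medoid is a real observation, it is less affected by the outlying fourth series than the mean is, which is one reason to prefer a medoid-based clusterer when centres should be interpretable as actual series.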
Efficient clustering for large collections:
- TimeSeriesCLARA - Clustering Large Applications with sampling
- TimeSeriesCLARANS - Randomized search variant of CLARA

Use when: Dataset too large for standard k-medoids, need scalability.
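The sampling idea behind CLARA can be sketched as follows: run k-medoids on several small random samples, score each set of medoids against the full collection, and keep the best. This is a rough NumPy illustration, not aeon's code. Euclidean distance, the simple alternating k-medoids routine, and every function and parameter name (`k_medoids`, `clara`, `n_samples`, `sample_size`) are assumptions made for the example.

```python
import numpy as np


def pairwise_euclidean(A, B):
    """Distances between every row of A and every row of B."""
    return np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1))


def k_medoids(X, k, rng, n_iter=20):
    """Simple alternating k-medoids; returns indices of the medoids in X."""
    dist = pairwise_euclidean(X, X)
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = dist[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if len(members):
                # New medoid: the member with the smallest total distance to its cluster.
                within = dist[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids


def clara(X, k, n_samples=5, sample_size=40, seed=0):
    """CLARA-style search: cluster small samples, score them on the full data."""
    rng = np.random.default_rng(seed)
    best_cost, best_medoids = np.inf, None
    for _ in range(n_samples):
        idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
        medoids = idx[k_medoids(X[idx], k, rng)]  # map back to full-data indices
        # The cost is measured against every series, not only the sample.
        cost = pairwise_euclidean(X, X[medoids]).min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    labels = pairwise_euclidean(X, X[best_medoids]).argmin(axis=1)
    return best_medoids, labels


# Synthetic collection: 400 series of length 16 drawn around two levels.
data_rng = np.random.default_rng(1)
X = np.vstack([
    data_rng.normal(0.0, 0.1, size=(200, 16)),
    data_rng.normal(3.0, 0.1, size=(200, 16)),
])
medoids, labels = clara(X, k=2)
```

Only `sample_size` by `sample_size` distance matrices are built inside the loop, which is where the saving over full k-medoids comes from; the price is that the result depends on how representative the samples are.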
Specialized for alignment-based similarity:
- KASBA - K-means with shift-invariant elastic averaging
- ElasticSOM - Self-organizing map using elastic distances

Use when: Time series have temporal shifts or warping.
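Why elastic distances help with shifted series can be seen in a small, self-contained sketch. The function below is a textbook dynamic-programming DTW with squared pointwise cost and no window constraint, written in NumPy for illustration; `dtw_sketch` is a hypothetical name, and aeon's own distance functions should be preferred in practice.

```python
import numpy as np


def dtw_sketch(x, y):
    """Classic O(n*m) dynamic time warping cost with squared pointwise differences."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


t = np.linspace(0.0, 1.0, 60)
bump = np.exp(-((t - 0.4) ** 2) / 0.005)
shifted = np.exp(-((t - 0.6) ** 2) / 0.005)  # same shape, later in time

squared_euclidean = float(((bump - shifted) ** 2).sum())
elastic = dtw_sketch(bump, shifted)
```

The point-by-point comparison treats the two bumps as very different, while the warped alignment matches peak to peak and yields a much smaller cost. That is the behaviour an alignment-based clusterer relies on.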
Graph-based clustering:
- KSpectralCentroid - Spectral clustering with centroid computation

Use when: Non-convex cluster shapes, need graph-based approach.
Neural network-based clustering with auto-encoders:
- AEFCNClusterer - Fully convolutional auto-encoder
- AEResNetClusterer - Residual network auto-encoder
- AEDCNNClusterer - Dilated CNN auto-encoder
- AEDRNNClusterer - Dilated RNN auto-encoder
- AEBiGRUClusterer - Bidirectional GRU auto-encoder
- AEAttentionBiGRUClusterer - Attention-enhanced BiGRU auto-encoder

Use when: Large datasets, need learned representations, or complex patterns.
Transform to feature space before clustering:
- Catch22Clusterer - Clusters on 22 canonical features
- SummaryClusterer - Uses summary statistics
- TSFreshClusterer - Automated tsfresh features

Use when: Raw time series not informative, need interpretable features.
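The transform-then-cluster idea can be illustrated without aeon: reduce each series to a few summary statistics, then hand the feature table to an ordinary tabular clusterer. The sketch below assumes scikit-learn's `KMeans` as the clusterer and a hypothetical `summary_features` helper; the statistics aeon's clusterers actually compute may differ.

```python
import numpy as np
from sklearn.cluster import KMeans


def summary_features(X):
    """Map each series (row of X) to [mean, std, min, max]."""
    return np.column_stack(
        [X.mean(axis=1), X.std(axis=1), X.min(axis=1), X.max(axis=1)]
    )


rng = np.random.default_rng(0)
quiet = rng.normal(0.0, 0.2, size=(30, 100))  # low-variance noise series
noisy = rng.normal(0.0, 3.0, size=(30, 100))  # high-variance noise series
X = np.vstack([quiet, noisy])

features = summary_features(X)  # shape (60, 4), one interpretable row per series
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
```

Both groups are pure noise around zero, so their raw values carry no shared shape to align; the spread-related features are what separate them, which matches the "raw time series not informative" case.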
Build custom clustering pipelines:
- ClustererPipeline - Chain transformers with clusterers

Compute cluster centers for time series:
- mean_average - Arithmetic mean
- ba_average - Barycentric averaging with DTW
- kasba_average - Shift-invariant averaging
- shift_invariant_average - General shift-invariant method

Use when: Need representative cluster centers for visualization or initialization.
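The choice of averaging method matters because a plain arithmetic mean blurs features that occur at different times. The NumPy sketch below contrasts it with a deliberately simple alignment step: each series is circularly shifted to best match a reference before averaging. This is an illustration only and is not the algorithm behind any of the aeon functions listed above; `best_circular_shift` and `bump` are hypothetical helpers.

```python
import numpy as np


def best_circular_shift(reference, series):
    """Shift (in samples) that maximizes the dot product with the reference."""
    scores = [np.dot(reference, np.roll(series, s)) for s in range(len(series))]
    return int(np.argmax(scores))


def bump(center, length=50):
    """Gaussian-shaped peak centred at the given sample index."""
    t = np.arange(length)
    return np.exp(-((t - center) ** 2) / 8.0)


# Three copies of the same shape, occurring at different times.
X = np.stack([bump(15), bump(25), bump(35)])

plain_mean = X.mean(axis=0)  # three low peaks instead of one

# Align every series to the first one, then average.
aligned = np.stack([np.roll(x, best_circular_shift(X[0], x)) for x in X])
aligned_mean = aligned.mean(axis=0)  # a single peak with the original height
```

The plain mean peaks at roughly a third of the original height, while the aligned mean keeps the shape of the individual series, which makes it a more representative centre for plotting or for seeding a clusterer.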
```python
from aeon.clustering import TimeSeriesKMeans
from aeon.datasets import load_classification

# Load data (using classification data for clustering)
X_train, _ = load_classification("GunPoint", split="train")

# Cluster time series
clusterer = TimeSeriesKMeans(
    n_clusters=3,
    distance="dtw",  # Use DTW distance
    averaging_method="ba",  # Barycentric averaging
)
labels = clusterer.fit_predict(X_train)
centers = clusterer.cluster_centers_
```
Compatible distance metrics include:
Use clustering metrics from sklearn or aeon benchmarking:
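As a minimal sketch of the scikit-learn route, the example below scores a labelling against known classes (adjusted Rand index, normalized mutual information) and without them (silhouette on a precomputed distance matrix). The labels and data are made up for illustration, and the distance matrix is Euclidean here; a DTW matrix could be passed in the same way.

```python
import numpy as np
from sklearn.metrics import (
    adjusted_rand_score,
    normalized_mutual_info_score,
    silhouette_score,
)

# Hypothetical clusterer output alongside known classes.
true_labels = np.array([0, 0, 0, 1, 1, 1])
pred_labels = np.array([1, 1, 1, 0, 0, 0])  # same partition, cluster ids swapped

# External metrics compare partitions, so swapped ids do not matter.
ari = adjusted_rand_score(true_labels, pred_labels)
nmi = normalized_mutual_info_score(true_labels, pred_labels)

# Internal metric: needs the data (or pairwise distances), not the true classes.
X = np.array([
    [0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.0, 0.0, 0.1],
    [5.0, 5.1, 5.0], [5.1, 5.0, 5.1], [5.0, 5.0, 5.1],
])
dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
sil = silhouette_score(dist, pred_labels, metric="precomputed")
```

Passing a precomputed matrix keeps the silhouette consistent with whatever distance the clusterer itself used, which matters when that distance is elastic rather than Euclidean.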