scientific-skills/aeon/references/transformations.md
Aeon provides extensive transformation capabilities for preprocessing, feature extraction, and representation learning from time series data.
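Aeon distinguishes between collection transformers, which operate on collections of time series (`aeon.transformations.collection`), and series transformers, which operate on individual time series (`aeon.transformations.series`).
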
Fast, scalable feature generation using random kernels:
- Rocket - Random convolutional kernels
- MiniRocket - Simplified ROCKET for speed
- MultiRocket - Enhanced ROCKET variant
- HydraTransformer - Multi-resolution dilated convolutions
- MultiRocketHydraTransformer - Combines ROCKET and Hydra
- ROCKETGPU - GPU-accelerated variant

Use when: Need fast, scalable features for any ML algorithm, or a strong baseline.

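
To make the idea behind random convolutional kernels concrete, here is a minimal pure-Python sketch. It is not aeon's implementation: the helper names (`random_kernel`, `kernel_features`), the sampling ranges, and the choice of two summary features per kernel (maximum response and proportion of positive values) are assumptions made for illustration.

```python
import math
import random

def random_kernel(rng, length=9, max_dilation=4):
    """Draw one random kernel: mean-centred weights, a bias, and a dilation."""
    weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
    mean_w = sum(weights) / length
    weights = [w - mean_w for w in weights]
    bias = rng.uniform(-1.0, 1.0)
    dilation = rng.randint(1, max_dilation)  # inclusive on both ends
    return weights, bias, dilation

def kernel_features(series, kernel):
    """Slide the dilated kernel over the series (no padding).

    Returns (max response, proportion of positive responses).
    """
    weights, bias, dilation = kernel
    span = (len(weights) - 1) * dilation
    responses = []
    for start in range(len(series) - span):
        total = bias
        for j, w in enumerate(weights):
            total += w * series[start + j * dilation]
        responses.append(total)
    if not responses:
        raise ValueError("series is too short for this kernel and dilation")
    ppv = sum(1 for r in responses if r > 0) / len(responses)
    return max(responses), ppv

rng = random.Random(0)
series = [math.sin(2 * math.pi * t / 25) for t in range(100)]

# Ten kernels, two features each: a 20-dimensional feature vector
kernels = [random_kernel(rng) for _ in range(10)]
features = []
for kernel in kernels:
    features.extend(kernel_features(series, kernel))
```

The resulting fixed-length vector can be passed to any tabular learner, which is why this family of transforms pairs well with simple linear classifiers.
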
Domain-agnostic features based on time series characteristics:
- Catch22 - 22 canonical time-series characteristics
- TSFresh - Comprehensive automated feature extraction (100+ features)
- TSFreshRelevant - Feature extraction with relevance filtering
- SevenNumberSummary - Descriptive statistics (mean, std, quantiles)

Use when: Need interpretable features, a domain-agnostic approach, or inputs for traditional ML models.

Symbolic approximations for discrete representations:
- SAX - Symbolic Aggregate approXimation
- PAA - Piecewise Aggregate Approximation
- SFA - Symbolic Fourier Approximation
- SFAFast - Optimized SFA
- SFAWhole - SFA on entire series (no windowing)
- BORF - Bag-of-Receptive-Fields

Use when: Need discrete/symbolic representation, dimensionality reduction, interpretability.

Discriminative subsequence extraction:
- RandomShapeletTransform - Random discriminative shapelets
- RandomDilatedShapeletTransform - Dilated shapelets for multi-scale
- SAST - Scalable And Accurate Subsequence Transform
- RSAST - Randomized SAST

Use when: Need interpretable discriminative patterns, phase-invariant features.

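
The phase invariance mentioned above comes from how a shapelet feature is computed: the minimum distance between the shapelet and every window of the series. The following is a minimal sketch of that distance under simplifying assumptions (plain Euclidean distance, no subsequence normalization, hypothetical function name); real shapelet transforms typically add normalization and a search over candidate shapelets.

```python
import math

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between the shapelet and any same-length window."""
    m = len(shapelet)
    if m > len(series):
        raise ValueError("shapelet is longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        d = math.sqrt(
            sum((series[start + j] - shapelet[j]) ** 2 for j in range(m))
        )
        best = min(best, d)
    return best

shapelet = [0.0, 1.0, 0.0]              # a single spike
early = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
late = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
flat = [0.0] * 7

print(shapelet_distance(early, shapelet))  # 0.0
print(shapelet_distance(late, shapelet))   # 0.0
print(shapelet_distance(flat, shapelet))   # 1.0
```

The spike yields the same feature value wherever it occurs, while a series without the pattern gets a larger distance.
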
Statistical summaries from time intervals:
- RandomIntervals - Features from random intervals
- SupervisedIntervals - Supervised interval selection
- QUANTTransformer - Quantile-based interval features

Use when: Predictive patterns are localized to specific windows.

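
A small sketch of the interval idea: pick random windows, then summarize each one with a few statistics. The helper names and the choice of mean, standard deviation, and slope as the per-interval summaries are assumptions for illustration, not a description of what any particular aeon class computes.

```python
import random
from statistics import mean, pstdev

def slope(values):
    """Least-squares slope of values against their index (needs >= 2 points)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def random_intervals(series_length, n_intervals, min_length, rng):
    """Sample (start, end) pairs with end exclusive."""
    intervals = []
    for _ in range(n_intervals):
        length = rng.randint(min_length, series_length)
        start = rng.randint(0, series_length - length)
        intervals.append((start, start + length))
    return intervals

def interval_features(series, intervals):
    """Concatenate mean, std, and slope for every interval."""
    feats = []
    for start, end in intervals:
        window = series[start:end]
        feats.extend([mean(window), pstdev(window), slope(window)])
    return feats

rng = random.Random(42)
series = [float(t) for t in range(20)]  # a straight line with slope 1
intervals = random_intervals(len(series), n_intervals=5, min_length=3, rng=rng)
features = interval_features(series, intervals)
```

The same intervals must be reused for every series in a collection so that feature columns stay comparable.
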
Data preparation and normalization:
- MinMaxScaler - Scale to [0, 1] range
- Normalizer - Z-normalization (zero mean, unit variance)
- Centerer - Center to zero mean
- SimpleImputer - Fill missing values
- DownsampleTransformer - Reduce temporal resolution
- Tabularizer - Convert time series to tabular format

Use when: Need standardization, missing value handling, format conversion.

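
The three rescaling operations in this list differ only in which statistics they remove. The sketch below shows them for a single series in plain Python; the function names are hypothetical, and using the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def min_max_scale(series):
    """Map the series linearly onto [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]  # constant series: avoid dividing by zero
    return [(x - lo) / (hi - lo) for x in series]

def z_normalize(series):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]

def center(series):
    """Shift to zero mean without changing the scale."""
    mu = mean(series)
    return [x - mu for x in series]

series = [2.0, 4.0, 6.0, 8.0]
scaled = min_max_scale(series)
normalized = z_normalize(series)
centered = center(series)
print(centered)  # [-3.0, -1.0, 1.0, 3.0]
```
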
Advanced analysis methods:
- MatrixProfile - Computes distance profiles for pattern discovery
- DWTTransformer - Discrete Wavelet Transform
- AutocorrelationFunctionTransformer - ACF computation
- Dobin - Distance-based Outlier BasIs using Neighbors
- SignatureTransformer - Path signature methods
- PLATransformer - Piecewise Linear Approximation

Oversampling for imbalanced collections:
- ADASYN - Adaptive Synthetic Sampling
- SMOTE - Synthetic Minority Over-sampling
- OHIT - Over-sampling with Highly Imbalanced Time series

Use when: Classification with imbalanced classes.

CollectionTransformerPipeline chains multiple collection transformers into a single transformer.

Series transformers transform individual time series (e.g., for preprocessing in forecasting):
- AutoCorrelationSeriesTransformer - Autocorrelation
- StatsModelsACF - ACF using statsmodels
- StatsModelsPACF - Partial autocorrelation
- ExponentialSmoothing - Exponentially weighted moving average
- MovingAverage - Simple or weighted moving average
- SavitzkyGolayFilter - Polynomial smoothing
- GaussianFilter - Gaussian kernel smoothing
- BKFilter - Baxter-King bandpass filter
- DiscreteFourierApproximation - Fourier-based filtering

Use when: Need noise reduction, trend extraction, or frequency filtering.

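
For intuition about the two simplest smoothers in this list, here is a plain-Python sketch of a simple moving average and of exponential smoothing. The function names and edge handling (full windows only, first value used as the initial state) are assumptions; library implementations may treat boundaries differently.

```python
def moving_average(series, window_size):
    """Simple moving average over full windows.

    The output has len(series) - window_size + 1 points.
    """
    if not 1 <= window_size <= len(series):
        raise ValueError("window_size must be between 1 and len(series)")
    window_sum = sum(series[:window_size])
    out = [window_sum / window_size]
    for i in range(window_size, len(series)):
        # Slide the window: add the new point, drop the oldest one
        window_sum += series[i] - series[i - window_size]
        out.append(window_sum / window_size)
    return out

def exponential_smoothing(series, alpha):
    """s_0 = x_0, then s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    out = [float(series[0])]
    for x in series[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

print(moving_average([1, 2, 3, 4, 5], 3))        # [2.0, 3.0, 4.0]
print(exponential_smoothing([1, 2, 3], 0.5))     # [1.0, 1.5, 2.25]
```

A larger window or a smaller `alpha` smooths more aggressively at the cost of lagging behind real changes in the series.
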
- PCASeriesTransformer - Principal component analysis
- PLASeriesTransformer - Piecewise Linear Approximation
- BoxCoxTransformer - Variance stabilization
- LogTransformer - Logarithmic scaling
- ClaSPTransformer - Classification Score Profile
- SeriesTransformerPipeline - Chain series transformers

```python
from aeon.classification.sklearn import RotationForestClassifier
from aeon.datasets import load_classification
from aeon.transformations.collection.convolution_based import Rocket

# Load data
X_train, y_train = load_classification("GunPoint", split="train")
X_test, y_test = load_classification("GunPoint", split="test")

# Extract ROCKET features (fit on the training split only)
rocket = Rocket()
X_train_features = rocket.fit_transform(X_train)
X_test_features = rocket.transform(X_test)

# Use with any sklearn-compatible classifier
clf = RotationForestClassifier()
clf.fit(X_train_features, y_train)
accuracy = clf.score(X_test_features, y_test)
```

```python
from aeon.transformations.collection import MinMaxScaler, SimpleImputer
from aeon.transformations.collection.compose import CollectionTransformerPipeline

# Build preprocessing pipeline: impute missing values, then scale
pipeline = CollectionTransformerPipeline([
    ('imputer', SimpleImputer(strategy='mean')),
    ('scaler', MinMaxScaler())
])
X_transformed = pipeline.fit_transform(X_train)
```

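```python
import numpy as np
from aeon.transformations.series import MovingAverage

# Example univariate series (a random walk) to smooth
y = np.cumsum(np.random.default_rng(0).normal(size=100))

# Smooth an individual time series
smoother = MovingAverage(window_size=5)
y_smoothed = smoother.fit_transform(y)
```
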
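Fit on training data only: this avoids leaking test-set information into the features.

```python
# Learn transformation parameters from the training split only
transformer.fit(X_train)
X_train_tf = transformer.transform(X_train)
X_test_tf = transformer.transform(X_test)
```
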
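
The fit/transform split matters for any transformer that learns parameters from data. The sketch below uses a hypothetical `GlobalMinMax` class (not part of aeon) that learns one minimum and maximum across the whole training collection, to show what "fit on train, apply to test" looks like in practice.

```python
class GlobalMinMax:
    """Hypothetical scaler that learns a single min and max from a collection."""

    def fit(self, X):
        values = [v for series in X for v in series]
        self.lo_, self.hi_ = min(values), max(values)
        return self

    def transform(self, X):
        scale = (self.hi_ - self.lo_) or 1.0  # guard against a constant collection
        return [[(v - self.lo_) / scale for v in series] for series in X]

X_train = [[0.0, 5.0, 10.0], [2.0, 4.0, 6.0]]
X_test = [[5.0, 15.0, 20.0]]

# Parameters come from the training split only
scaler = GlobalMinMax().fit(X_train)
X_train_tf = scaler.transform(X_train)
X_test_tf = scaler.transform(X_test)

print(X_test_tf)  # [[0.5, 1.5, 2.0]]
```

Test values landing outside [0, 1] are expected here: the scaler never saw the test range. Refitting on the test split would hide that shift and make evaluation look better than it is.
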
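Pipeline composition: Chain transformers for complex workflows

```python
from aeon.transformations.collection import Normalizer, SimpleImputer
from aeon.transformations.collection.compose import CollectionTransformerPipeline
from aeon.transformations.collection.convolution_based import Rocket

pipeline = CollectionTransformerPipeline([
    ('imputer', SimpleImputer()),
    ('scaler', Normalizer()),
    ('features', Rocket())
])
```
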
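Feature selection: TSFresh can generate many features; consider selecting a subset before modelling

```python
from sklearn.feature_selection import SelectKBest

# X_features: tabular output of a feature transformer; y: class labels
selector = SelectKBest(k=100)  # keep the 100 highest-scoring features
X_selected = selector.fit_transform(X_features, y)
```
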
Memory considerations: Some transformers are memory-intensive on large datasets

Domain knowledge: Choose transformations matching domain: