Time Series Shapelets

Setting:

A pool of time series, possibly of different lengths

Each series is labeled as belonging to class A or class B

Goal:

Find a sequence of shapelets that optimally splits the set by class

Idea:

Supervised learning (explicitly labeled data), with separate training and testing sets

Decision tree (find an optimal shapelet at each node)

Euclidean distance (computed only between subsequences of the same length)

Modifications to the brute-force method reduce its time complexity and storage
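Because the Euclidean distance is only defined here for equal-length subsequences, a shapelet is usually compared with a longer series by sliding it along the series and keeping the smallest distance. The sketch below assumes that convention; the names euclidean and subsequence_dist are illustrative and do not come from these notes.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two sequences of the same length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subsequence_dist(series, shapelet):
    """Smallest distance between `shapelet` and any same-length window of `series`."""
    m = len(shapelet)
    if m > len(series):
        raise ValueError("shapelet is longer than the series")
    return min(
        euclidean(series[i:i + m], shapelet)
        for i in range(len(series) - m + 1)
    )
```

A series that contains the shapelet exactly has distance 0 to it, regardless of where the match occurs.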

Definition:

Brute force algorithm:
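A minimal sketch of a brute-force search, under stated assumptions: every subsequence of every training series with length between min_len and max_len is a candidate, each candidate is scored by the best information gain over all distance thresholds, and the highest-scoring candidate wins. The function names, the midpoint threshold, and the cap of max_len at the shortest series length are choices made for this illustration, not details taken from these notes.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))


def subsequence_dist(series, shapelet):
    """Smallest Euclidean distance from `shapelet` to any window of `series`."""
    m = len(shapelet)
    return min(
        math.sqrt(sum((series[i + j] - shapelet[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )


def best_split(dists, labels):
    """Return (gain, threshold) of the best distance threshold, or (0.0, None)."""
    order = sorted(zip(dists, labels))
    n = len(order)
    base = entropy(labels)
    best_gain, best_thr = 0.0, None
    for i in range(1, n):
        if order[i - 1][0] == order[i][0]:
            continue  # no threshold can separate equal distances
        left = [lab for _, lab in order[:i]]
        right = [lab for _, lab in order[i:]]
        gain = base - len(left) / n * entropy(left) - len(right) / n * entropy(right)
        if gain > best_gain:
            best_gain = gain
            best_thr = (order[i - 1][0] + order[i][0]) / 2  # midpoint (assumption)
    return best_gain, best_thr


def brute_force_shapelet(dataset, labels, min_len, max_len):
    """Try every candidate subsequence; return (gain, shapelet, threshold)."""
    # Assumption: candidates never exceed the shortest series, so every
    # distance is well defined even when series lengths differ.
    max_len = min(max_len, min(len(s) for s in dataset))
    best = (0.0, None, None)
    for series in dataset:
        for m in range(min_len, max_len + 1):
            for start in range(len(series) - m + 1):
                candidate = series[start:start + m]
                dists = [subsequence_dist(s, candidate) for s in dataset]
                gain, thr = best_split(dists, labels)
                if gain > best[0]:
                    best = (gain, candidate, thr)
    return best
```

The triple loop over series, lengths, and start positions, combined with a full distance computation per candidate, is what makes this approach expensive.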

Modification:

1. Early abandon (faster)

2. Admissible entropy pruning (faster)
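The two speedups above can be sketched as follows. This is an illustration under assumptions, not a definitive implementation: early abandoning stops summing squared differences for a window once the partial sum can no longer beat the best window found so far, and entropy pruning computes an optimistic upper bound on a candidate's information gain by placing the series not yet measured at the extreme ends of the distance line, one class per end. The bound is written for the two-class setting of these notes, and all function names are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))


def best_gain(dists, labels):
    """Best information gain over all distance thresholds."""
    order = sorted(zip(dists, labels))
    n = len(order)
    base = entropy(labels)
    best = 0.0
    for i in range(1, n):
        if order[i - 1][0] == order[i][0]:
            continue  # equal distances cannot be separated
        left = [lab for _, lab in order[:i]]
        right = [lab for _, lab in order[i:]]
        gain = base - len(left) / n * entropy(left) - len(right) / n * entropy(right)
        best = max(best, gain)
    return best


def subsequence_dist_early_abandon(series, shapelet):
    """Minimum window distance, abandoning windows that cannot improve it."""
    m = len(shapelet)
    if m > len(series):
        raise ValueError("shapelet is longer than the series")
    best_sq = math.inf
    for i in range(len(series) - m + 1):
        total = 0.0
        for j in range(m):
            total += (series[i + j] - shapelet[j]) ** 2
            if total >= best_sq:
                break  # early abandon: this window is already too far
        else:
            best_sq = total  # loop finished, so total < best_sq
    return math.sqrt(best_sq)


def gain_upper_bound(seen_dists, seen_labels, remaining_labels):
    """Optimistic gain if the unmeasured series fall in the best possible places.

    Assumes two classes: one class is placed nearer than every measured
    series, the other farther, and both assignments are tried.
    """
    all_labels = list(seen_labels) + list(remaining_labels)
    near = -1.0                                # below any real distance
    far = max(seen_dists, default=0.0) + 1.0   # above any measured distance
    bound = 0.0
    for near_class in set(all_labels):
        guessed = [near if lab == near_class else far for lab in remaining_labels]
        bound = max(bound, best_gain(list(seen_dists) + guessed, all_labels))
    return bound
```

In a search loop, a candidate can be discarded as soon as gain_upper_bound(...) is no larger than the best gain found so far, which avoids computing its remaining distances.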

Classification by decision tree:
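A minimal sketch of how a trained shapelet tree might classify a new series: each internal node holds a shapelet and a distance threshold, and the series goes down one branch or the other depending on its distance to that shapelet. The Node layout, the field names, and the "less than or equal goes to the near branch" rule are assumptions made for this illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    label: Optional[str] = None                 # set on leaves only
    shapelet: Optional[Sequence[float]] = None  # set on internal nodes
    threshold: float = 0.0
    near: Optional["Node"] = None               # taken when distance <= threshold
    far: Optional["Node"] = None                # taken when distance > threshold


def subsequence_dist(series, shapelet):
    """Smallest Euclidean distance from `shapelet` to any window of `series`."""
    m = len(shapelet)
    return min(
        math.sqrt(sum((series[i + j] - shapelet[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )


def classify(node, series):
    """Walk from the root to a leaf and return the leaf's class label."""
    while node.label is None:
        d = subsequence_dist(series, node.shapelet)
        node = node.near if d <= node.threshold else node.far
    return node.label


# Hypothetical one-node tree: series containing a jump from 0 to 5 are class A.
tree = Node(shapelet=[0, 5], threshold=2.0,
            near=Node(label="A"), far=Node(label="B"))
```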

Impurity measures: entropy, Gini, classification error
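The three impurity measures can be written in terms of the class proportions p_i at a node: entropy is -sum(p_i * log2(p_i)), the Gini index is 1 - sum(p_i^2), and the classification error is 1 - max(p_i). The sketch below is illustrative; the function names are not from these notes.

```python
import math
from collections import Counter


def class_probs(labels):
    """Proportion of each class among `labels`."""
    if not labels:
        raise ValueError("labels must not be empty")
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits: 0 for a pure node, 1 for an even two-class mix."""
    return -sum(p * math.log2(p) for p in class_probs(labels))


def gini(labels):
    """Gini index: chance that two random draws have different classes."""
    return 1.0 - sum(p * p for p in class_probs(labels))


def classification_error(labels):
    """Fraction misclassified when predicting the majority class."""
    return 1.0 - max(class_probs(labels))
```

All three are 0 for a pure node and largest for an even class mix, so any of them can score a candidate split by how much it reduces impurity.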