How to get distance matrix using dynamic time warping?

Question:

I have 6 time series, as follows.

import numpy as np
series = np.array([
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1],
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1]])

I want to compute a dynamic time warping (DTW) distance matrix so that I can cluster these series. I used the dtaidistance library for that, as follows.

from dtaidistance import dtw
ds = dtw.distance_matrix_fast(series)

The output I got was as follows.

array([[       inf, 1.41421356, 2.23606798, 0.        , 1.41421356, 2.23606798],
       [       inf,        inf, 1.73205081, 1.41421356, 0.        , 1.73205081],
       [       inf,        inf,        inf, 2.23606798, 1.73205081, 0.        ],
       [       inf,        inf,        inf,        inf, 1.41421356, 2.23606798],
       [       inf,        inf,        inf,        inf,        inf, 1.73205081],
       [       inf,        inf,        inf,        inf,        inf,        inf]])

It seems to me that the output I get is wrong. For instance, as I understand it, the diagonal values of the output should be 0 (since they are ideal matches).

I want to know where I am going wrong and how to fix it. I am also happy to get answers that use other Python libraries.

I am happy to provide more details if needed.

Asked By: EmJ

||

Answers:

Everything is correct. As per the docs:

The result is stored in a matrix representation. Since only the upper triangular matrix is required this representation uses more memory then necessary.

All diagonal elements are 0, and the lower triangular matrix is the same as the upper triangular matrix mirrored at the diagonal. As all these values can be deduced from the upper triangular matrix, they are not filled in and appear as inf in the output.
You can even use the compact=True argument to get only the values from the upper triangular matrix, concatenated into a 1D array.

You can convert the result to a full matrix like this:

ds[ds == np.inf] = 0  # replace the unfilled (inf) cells with 0
ds += ds.T            # mirror the upper triangle into the lower triangle
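
For reference, here is a self-contained sketch of the same conversion that runs without dtaidistance. It starts from the matrix printed in the question and builds a new array instead of modifying ds in place. It assumes that every inf in the matrix is an unfilled cell; a genuinely infinite distance would also be turned into 0 by this approach.

```python
import numpy as np

inf = np.inf
# Upper-triangular output copied from the question (inf marks unfilled cells).
ds = np.array([
    [inf, 1.41421356, 2.23606798, 0.0, 1.41421356, 2.23606798],
    [inf, inf, 1.73205081, 1.41421356, 0.0, 1.73205081],
    [inf, inf, inf, 2.23606798, 1.73205081, 0.0],
    [inf, inf, inf, inf, 1.41421356, 2.23606798],
    [inf, inf, inf, inf, inf, 1.73205081],
    [inf, inf, inf, inf, inf, inf],
])

# Replace the unfilled cells with 0 without touching the original array.
upper = np.where(np.isinf(ds), 0.0, ds)

# Adding the transpose mirrors the upper triangle into the lower triangle;
# the diagonal stays 0 because it is 0 in both terms.
full = upper + upper.T
```

The resulting full array is symmetric with a zero diagonal, which is the shape most clustering routines expect.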
Answered By: Stef

In dtw.py, the default value for the elements of the distance matrix is specified to be np.inf. As the function only computes the pairwise distances between different sequences, the diagonal and the lower triangle are never filled in and keep their np.inf values.
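
To illustrate that behaviour, here is a minimal sketch in plain NumPy. It is not dtaidistance's implementation: dtw_distance and upper_distance_matrix are hypothetical helper names, and the DTW variant (squared differences, no window, square root at the end) is an assumption chosen because it reproduces the values shown in the question.

```python
import numpy as np

def dtw_distance(a, b):
    """Plain DTW: squared point differences, square root of the total cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))

def upper_distance_matrix(series):
    """Start from an all-inf matrix and fill only the cells above the diagonal."""
    k = len(series)
    dists = np.full((k, k), np.inf)
    for r in range(k):
        for c in range(r + 1, k):
            dists[r, c] = dtw_distance(series[r], series[c])
    return dists

# First three series from the question.
series = np.array([
    [0., 0, 1, 2, 1, 0, 1, 0, 0],
    [0., 1, 2, 0, 0, 0, 0, 0, 0],
    [1., 2, 0, 0, 0, 0, 0, 1, 1]])
ds = upper_distance_matrix(series)
```

For the first two series this gives about 1.414, matching the value in the question, while the diagonal and the lower triangle remain inf simply because the loops never visit them.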

Try running dtw.distance_matrix_fast(series, compact=True) to avoid seeing these filler values.
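
If you need a square matrix again later (for clustering, say), the compact 1D array can be expanded. The sketch below assumes the compact array lists the upper triangle row by row, i.e. d(0,1), d(0,2), ..., d(1,2), ..., which is the same ordering as SciPy's condensed form; please check this against the version you have installed. condensed_to_square is a hypothetical helper, not part of dtaidistance.

```python
import numpy as np

def condensed_to_square(condensed, n):
    """Expand a row-by-row upper-triangle vector into a symmetric n x n matrix."""
    condensed = np.asarray(condensed, dtype=float)
    expected = n * (n - 1) // 2
    if condensed.shape != (expected,):
        raise ValueError(f"expected {expected} values for n={n}, got {condensed.shape}")
    full = np.zeros((n, n))
    rows, cols = np.triu_indices(n, k=1)  # (0,1), (0,2), ..., (1,2), ...
    full[rows, cols] = condensed
    full[cols, rows] = condensed          # mirror into the lower triangle
    return full

# Distances between the first three series in the question: d(0,1), d(0,2), d(1,2).
compact = [1.41421356, 2.23606798, 1.73205081]
square = condensed_to_square(compact, 3)
```

Under the same ordering assumption, scipy.spatial.distance.squareform(compact) should produce the same matrix.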

Answered By: Arno C