Selecting only a certain number of top features using tsfresh

Question:

How can I select the top n features of a time series using tsfresh? Can I decide how many top features to extract?

Asked By: Chaitra

Answers:

This answer builds on the comment above from @Chaitra and on this answer.

You can choose the number of top features by using the tsfresh relevance table described in the documentation. Sort the table by p-value in ascending order (a smaller p-value means a more significant feature) and take the first n rows.

Example code that prints the top 11 features:

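from tsfresh import extract_features
from tsfresh.feature_selection.relevance import calculate_relevance_table
from tsfresh.utilities.dataframe_functions import impute

# X is the long-format time series DataFrame, y the target indexed by id
extracted_features = extract_features(
    X,
    column_id="id",
    column_kind="kind",
    column_value="value",
)
# Replace NaN/inf values in place before running the significance tests
impute(extracted_features)

relevance_table = calculate_relevance_table(extracted_features, y)
# Keep only the features marked as relevant, most significant first
relevance_table = relevance_table[relevance_table.relevant].sort_values("p_value")
print(relevance_table["feature"].head(11))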
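
If you need the reduced feature matrix rather than just the feature names, the same idea can be wrapped in a small helper. This is only a sketch: top_n_features, toy_relevance and toy_features are made-up names, and the two DataFrames are hand-built stand-ins for the outputs of calculate_relevance_table and extract_features so that the snippet runs with pandas alone. The column names feature, p_value and relevant are the ones used in the code above.

```python
import pandas as pd


def top_n_features(relevance_table, n, only_relevant=True):
    """Return the names of the n features with the smallest p-values.

    Hypothetical helper, not part of tsfresh.
    """
    table = relevance_table
    if only_relevant:
        # Drop features that were not marked as relevant
        table = table[table["relevant"]]
    # sort_values returns a new frame, so the input is left untouched
    return table.sort_values("p_value")["feature"].head(n).tolist()


# Hand-made stand-in for the output of calculate_relevance_table
toy_relevance = pd.DataFrame(
    {
        "feature": ["value__mean", "value__maximum", "value__minimum", "value__median"],
        "p_value": [0.03, 0.001, 0.40, 0.01],
        "relevant": [True, True, False, True],
    }
)

# Hand-made stand-in for the output of extract_features (one row per id)
toy_features = pd.DataFrame(
    {
        "value__mean": [1.0, 2.0, 3.0],
        "value__maximum": [4.0, 5.0, 6.0],
        "value__minimum": [0.1, 0.2, 0.3],
        "value__median": [1.5, 2.5, 3.5],
    },
    index=[1, 2, 3],
)

top_names = top_n_features(toy_relevance, n=2)
# Keep only the selected columns of the feature matrix
X_top = toy_features[top_names]
print(top_names)
```

With real data you would pass relevance_table and extracted_features instead of the toy frames. Setting only_relevant=False ranks all features by p-value, which may be useful when fewer than n features are marked as relevant.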
Answered By: flyingdutchman