feature-engineering

Create new column by selecting dataframe subset based on current row value and sum another column

Question: Let's say we have a sample dataframe that looks like this: # Create a sample dataframe df = pd.DataFrame({'num_posts': [4, 6, 3, 9, 1, 14, 2, 5, 7, 2, 12], 'date': ['2020-03-01', '2020-01-02', '2020-01-03', '2020-01-04', '2019-01-05', '2019-01-06', '2020-01-07', …

Total answers: 1

How to assign new column based on the list of string values in pandas

Question: I have a dataframe in which one column contains string values, and I want to assign a new column based on whether that column's values are in a list I specify. my_list = ['AA', 'TR', 'NZ'] For example: My dataframe : df …

Total answers: 2
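
One possible approach to the question above, as a minimal sketch: `Series.isin` tests each value for membership in a list. The dataframe contents and the column names `code` and `in_list` are hypothetical, since the question's own dataframe is truncated in the excerpt.

```python
import pandas as pd

my_list = ["AA", "TR", "NZ"]  # the list given in the question

# Hypothetical data: the question's actual dataframe is not shown in full
df = pd.DataFrame({"code": ["AA", "US", "NZ", "DE"]})

# True where the value appears in my_list, False otherwise
df["in_list"] = df["code"].isin(my_list)

print(df)
```

If a label is wanted rather than a boolean, the boolean column can be passed to `numpy.where` or `Series.map` to pick the values to assign.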

How to find the difference between time and feed the difference in a new column?

Question: I have a dataframe trades_df which looks like this:

Open Time         Open Price  Close Time
19-08-2020 12:19  1.19459     19-08-2020 12:48
28-08-2020 03:09  0.90157     08-09-2020 12:20

It has columns open_time and close_time in the format 19-08-2020 12:19. I want …

Total answers: 2
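
A minimal sketch of one way to compute such a difference, assuming the columns are named `open_time` and `close_time` as the question states and hold day-first strings. The output column names `duration` and `duration_minutes` are hypothetical, because the excerpt is cut off before it says what the new column should contain.

```python
import pandas as pd

# Sample rows taken from the question's excerpt
trades_df = pd.DataFrame({
    "open_time": ["19-08-2020 12:19", "28-08-2020 03:09"],
    "close_time": ["19-08-2020 12:48", "08-09-2020 12:20"],
})

# Parse the day-first strings explicitly so 08-09-2020 is read as 8 September
fmt = "%d-%m-%Y %H:%M"
opened = pd.to_datetime(trades_df["open_time"], format=fmt)
closed = pd.to_datetime(trades_df["close_time"], format=fmt)

# Subtracting two datetime columns yields a timedelta column
trades_df["duration"] = closed - opened

# Optional: express the difference as a plain number of minutes
trades_df["duration_minutes"] = trades_df["duration"].dt.total_seconds() / 60

print(trades_df)
```

Passing an explicit `format` avoids pandas guessing month-first for ambiguous dates.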

How do I add external features to my pipeline?

Question: A similar question was asked here on SO many years ago, but it never received an answer, and I have the same question. I would like to add new column(s) of data, in my case 3 columns of dummy variables, to a sparse matrix (from …

Total answers: 1
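
One way this could be approached, sketched under assumptions: `scipy.sparse.hstack` appends columns to a sparse matrix without converting it to a dense array. The matrices below are hypothetical stand-ins, since the excerpt is truncated before it names the source of the sparse matrix, and the sketch covers only the column-stacking step, not the pipeline integration.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for an existing sparse feature matrix (3 samples, 3 features)
X_sparse = sparse.csr_matrix(np.array([[1, 0, 2],
                                       [0, 0, 3],
                                       [4, 0, 0]]))

# Hypothetical dummy variables: 3 extra columns, one row per sample
dummies = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1]])

# Stack column-wise; the result stays sparse (CSR format requested explicitly)
X_combined = sparse.hstack([X_sparse, sparse.csr_matrix(dummies)], format="csr")

print(X_combined.shape)
```

The row counts of both matrices must match, and the rows must be in the same sample order.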

Is correlation an important factor in unsupervised learning (clustering)?

Question: I am working with a dataset of size (500, 33). In particular, the dataset contains 9 features, say [X_High, X_medium, X_low, Y_High, Y_medium, Y_low, Z_High, Z_medium, Z_low]. Both visually and from the correlation matrix, I observed that [X_High, Y_High, Z_High] & [X_medium, Y_medium, Z_medium …

Total answers: 2