Pandas Dataframes: is the value of a column in a list nested in another column, same row?

Question:

I am working with a Pandas DataFrame similar to this:

     My_date      Something_else    My_list
0    25/10/2019   ...               [25/10/2019, 26/10/2019]
1    03/07/2019   ...               [28/11/2017, 12/12/2017, 26/12/2017]
2    09/04/2019   ...               [11/06/2015]

I would like to check whether the value in the column "My_date" is in the list on the same row, in the column "My_list". For instance, here I would like to get the following output, in a vectorized or otherwise very efficient way:

     Result
0    true
1    false
2    false

I could do this using a 'for' loop; various methods for that are described elsewhere. However, I am aware that iterating is rarely the best solution, all the more so since my table has more than 1 million rows and many of the lists have 365 values. (But as shown above, these lists are not always date ranges.)

I know that there are many ways to do vectorized calculations on DataFrames, using .loc or .eval for instance. The problem is that in my case, nothing works as expected because of these nested lists. Thus, I would like to find a vectorized solution. If it matters, all my "dates" are of type pandas.Timestamp.

There are probably other questions about similar issues, but I haven't found a suitable question or answer when searching in my own words.

Asked By: scūriolus

||

Answers:

Try:

df['Result'] = df.apply(lambda x: x['My_date'] in x['My_list'], axis=1)

import pandas as pd

df = pd.DataFrame({
    'My_date': ['25/10/2019', '03/07/2019', '09/04/2019'],
    'My_list': [['25/10/2019', '26/10/2019'],
                ['28/11/2017', '12/12/2017', '26/12/2017'],
                ['11/06/2015']],
})
# Row-wise membership test: is My_date in the list stored on the same row?
df['Result'] = df.apply(lambda x: x['My_date'] in x['My_list'], axis=1)

Outputs:

      My_date                               My_list  Result
0  25/10/2019              [25/10/2019, 26/10/2019]   True
1  03/07/2019  [28/11/2017, 12/12/2017, 26/12/2017]  False
2  09/04/2019                          [11/06/2015]  False
Answered By: luigigi
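
Note that df.apply with axis=1 still calls the lambda once per row in Python, so it is not vectorized in the strict sense. As a rough sketch of two alternatives that may be faster on a large table (not benchmarked here, and not part of the answer above): a plain list comprehension over the two columns, and an explode-based variant that compares all list elements in a single array operation. The names result_zip, exploded, dates, hits and result_explode are made up for illustration, the sample data is the string data from the answer rather than pandas.Timestamp values, and the second variant assumes the DataFrame index is unique.

```python
import pandas as pd

df = pd.DataFrame({
    'My_date': ['25/10/2019', '03/07/2019', '09/04/2019'],
    'My_list': [['25/10/2019', '26/10/2019'],
                ['28/11/2017', '12/12/2017', '26/12/2017'],
                ['11/06/2015']],
})

# Option 1: list comprehension over the two columns.
# Still a Python-level loop, but without the per-row overhead of df.apply.
result_zip = [d in lst for d, lst in zip(df['My_date'], df['My_list'])]

# Option 2: one row per list element, keeping the original row label.
exploded = df['My_list'].explode()
# Repeat each My_date so it lines up with the elements of its own list.
dates = df['My_date'].loc[exploded.index]
# Element-wise comparison on the underlying arrays, re-labelled by original row.
hits = pd.Series(exploded.to_numpy() == dates.to_numpy(), index=exploded.index)
# A row matches if any element of its list equals its My_date.
result_explode = hits.groupby(level=0).any()

df['Result'] = result_zip
```

Which variant is faster depends on the data; with about a million rows and lists of up to 365 values, the exploded Series can hold several hundred million elements, so memory use is worth checking before relying on the second option.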