Pandas filter dataframe based on another
Question:
I have two dataframes, one with all flagged instruments and other with all blacklisted instruments (Both have a column stating the date each entry were added to the list). id like to select the data from flagged instrument dataframe where their added date is later than the blacklisted dates. My dataframes:
>>> flagged[['Isin', 'date_added']]
Isin date_added
9854 CNE100001WY7 2023-01-20
18193 CNE100002CY7 2023-01-23
52791 CNE100001WY7 2023-02-05
55151 CNE100002CY7 2023-03-07
56219 CNE100003Q57 2023-03-10
>>> blacklist[['isin', 'Blacklist Date']]
isin Blacklist Date
0 CNE000001N13 2023-02-09
1 CNE100001WY7 2023-02-09
2 CNE100002CY7 2023-02-09
3 CNE100003Q57 2023-02-09
I was wondering if anyone knows how i can get the following output from the flagged dataframe (id like to keep any duplicated values in the output if possible):
Isin date_added
55151 CNE100002CY7 2023-03-07
56219 CNE100003Q57 2023-03-10
Any help would be greatly appreciated!
Answers:
You can merge the two dataframes on the Isin and isin columns, and then filter for the rows where the date_added is later than the Blacklist Date. Here’s an example code:
merged = flagged.merge(blacklist, left_on='Isin', right_on='isin')
result = merged[merged['date_added'] > merged['Blacklist Date']][['Isin', 'date_added']]
This code first merges the two dataframes on the Isin and isin columns, creating a new dataframe with all the information from both dataframes. Then it filters for the rows where the date_added is later than the Blacklist Date, and selects only the Isin and date_added columns. The resulting dataframe result should contain the desired output. Note that in this case, the CNE100001WY7 instrument is not in the result because its date_added is the same as its Blacklist Date.
I have two dataframes, one with all flagged instruments and other with all blacklisted instruments (Both have a column stating the date each entry were added to the list). id like to select the data from flagged instrument dataframe where their added date is later than the blacklisted dates. My dataframes:
>>> flagged[['Isin', 'date_added']]
Isin date_added
9854 CNE100001WY7 2023-01-20
18193 CNE100002CY7 2023-01-23
52791 CNE100001WY7 2023-02-05
55151 CNE100002CY7 2023-03-07
56219 CNE100003Q57 2023-03-10
>>> blacklist[['isin', 'Blacklist Date']]
isin Blacklist Date
0 CNE000001N13 2023-02-09
1 CNE100001WY7 2023-02-09
2 CNE100002CY7 2023-02-09
3 CNE100003Q57 2023-02-09
I was wondering if anyone knows how i can get the following output from the flagged dataframe (id like to keep any duplicated values in the output if possible):
Isin date_added
55151 CNE100002CY7 2023-03-07
56219 CNE100003Q57 2023-03-10
Any help would be greatly appreciated!
You can merge the two dataframes on the Isin and isin columns, and then filter for the rows where the date_added is later than the Blacklist Date. Here’s an example code:
merged = flagged.merge(blacklist, left_on='Isin', right_on='isin')
result = merged[merged['date_added'] > merged['Blacklist Date']][['Isin', 'date_added']]
This code first merges the two dataframes on the Isin and isin columns, creating a new dataframe with all the information from both dataframes. Then it filters for the rows where the date_added is later than the Blacklist Date, and selects only the Isin and date_added columns. The resulting dataframe result should contain the desired output. Note that in this case, the CNE100001WY7 instrument is not in the result because its date_added is the same as its Blacklist Date.