How to filter a dataframe by rows of another one?
Question:
I have two dataframes: df1:
id type
"a" "alpha"
"a" "alpha"
"a" "beta"
"a" "gamma"
and df2:
id type
"a" "alpha"
"a" "alpha"
"a" "alpha"
"a" "alpha"
"a" "beta"
"a" "beta"
For each row in df1, I want to remove a single row from df2 if both have the same "id" and "type". So the desired result is:
id type
"a" "alpha"
"a" "alpha"
"a" "beta"
How could I do that?
Answers:
You can use collections.Counter to treat the rows as multisets: subtract the (id, type) counts of df1 from those of df2, then rebuild a DataFrame from the remaining counts:
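from collections import Counter
import pandas as pd
# Count each (id, type) pair; subtraction drops pairs whose count falls to zero or below
diff = Counter(zip(df2['id'], df2['type'])) - Counter(zip(df1['id'], df1['type']))
# Repeat each remaining pair as many times as its count
pd.DataFrame([k for k, v in diff.items() for _ in range(v)], columns=["id", "type"])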
Output:
id type
0 "a" "alpha"
1 "a" "alpha"
2 "a" "beta"
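For reference, here is a minimal self-contained sketch of the same Counter approach, built on the example data from the question. The helper name `subtract_rows` and its `cols` parameter are only illustrative and do not come from pandas or from the answer above.

```python
from collections import Counter

import pandas as pd

# Example data from the question
df1 = pd.DataFrame({"id": ["a"] * 4, "type": ["alpha", "alpha", "beta", "gamma"]})
df2 = pd.DataFrame({"id": ["a"] * 6, "type": ["alpha"] * 4 + ["beta"] * 2})


def subtract_rows(left, right, cols):
    """Hypothetical helper: remove one row of `left` per matching row of `right`."""
    left_counts = Counter(zip(*(left[c] for c in cols)))
    right_counts = Counter(zip(*(right[c] for c in cols)))
    # Counter subtraction keeps only keys with a positive remaining count,
    # so the unmatched ("a", "gamma") row in df1 is simply ignored.
    diff = left_counts - right_counts
    # elements() repeats each key as many times as its count
    return pd.DataFrame(list(diff.elements()), columns=cols)


result = subtract_rows(df2, df1, ["id", "type"])
print(result)
```

Note that this rebuilds the frame from the key columns only: any other columns in df2 and its original index are lost, and the rows come back grouped by key rather than in their original order.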