How to compare two data frames
Question:
df1
index Count
Duliajan Area 2
HAPJAN 2
KATHALGURI 2
df2
Location Category
0 NAGAJAN 0
1 JORAJAN 0
2 KATHALGURI 0
3 HEBEDA 0
4 MAKUM 0
5 BAREKURI 0
6 BAGHJAN 0
7 Duliajan Area 0
8 LANGKASHI 0
9 HAPJAN 0
I need this output:
0 NAGAJAN 0
1 JORAJAN 0
2 KATHALGURI 2
3 HEBEDA 0
4 MAKUM 0
5 BAREKURI 0
6 BAGHJAN 0
7 Duliajan Area 2
8 LANGKASHI 0
9 HAPJAN 2
Answers:
You can use the pandas merge function, for example:
df2 = df2.rename(columns={"Location": "index"})
result = pd.merge(df1, df2, on="index")
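Note that pd.merge defaults to an inner join, so the call above keeps only the three locations present in both frames. A minimal sketch of a left merge that keeps every row of df2 follows. The frames are rebuilt here from the question's data, and the names merged and result are illustrative, not from the original answer.

```python
import pandas as pd

# Rebuild the frames from the question (assumed to be plain columns).
df1 = pd.DataFrame({
    "index": ["Duliajan Area", "HAPJAN", "KATHALGURI"],
    "Count": [2, 2, 2],
})
df2 = pd.DataFrame({
    "Location": ["NAGAJAN", "JORAJAN", "KATHALGURI", "HEBEDA", "MAKUM",
                 "BAREKURI", "BAGHJAN", "Duliajan Area", "LANGKASHI", "HAPJAN"],
    "Category": [0] * 10,
})

# A left merge keeps all rows of df2, in df2's original order.
merged = df2.merge(df1, left_on="Location", right_on="index", how="left")

# Take Count where a match was found, otherwise keep the existing Category.
merged["Category"] = merged["Count"].fillna(merged["Category"]).astype(int)

# Keep only the two columns of the requested output.
result = merged[["Location", "Category"]]
print(result)
```

The left merge is used because the requested output has one row per location in df2, matched or not.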
You can create a dict from the two columns of df1, then use map on df2.
d = dict(zip(df1['index'], df1['Count']))
df2['Category'] = df2['Location'].map(d).fillna(df2['Category']).astype(int)
print(df2)
Output:
Location Category
0 NAGAJAN 0
1 JORAJAN 0
2 KATHALGURI 2
3 HEBEDA 0
4 MAKUM 0
5 BAREKURI 0
6 BAGHJAN 0
7 Duliajan Area 2
8 LANGKASHI 0
9 HAPJAN 2
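To see why the fillna step is needed, it may help to look at the intermediate result of map. The sketch below rebuilds the frames from the question's data; the variable name mapped is illustrative only.

```python
import pandas as pd

# Rebuild the frames from the question (assumed to be plain columns).
df1 = pd.DataFrame({
    "index": ["Duliajan Area", "HAPJAN", "KATHALGURI"],
    "Count": [2, 2, 2],
})
df2 = pd.DataFrame({
    "Location": ["NAGAJAN", "JORAJAN", "KATHALGURI", "HEBEDA", "MAKUM",
                 "BAREKURI", "BAGHJAN", "Duliajan Area", "LANGKASHI", "HAPJAN"],
    "Category": [0] * 10,
})

# Lookup table: location name -> count.
d = dict(zip(df1["index"], df1["Count"]))

# map returns NaN for every location that is not a key of d,
# which also turns the column into floats.
mapped = df2["Location"].map(d)
print(mapped)

# Fall back to the existing Category for the NaN rows, then restore ints.
df2["Category"] = mapped.fillna(df2["Category"]).astype(int)
print(df2)
```

Because unmatched locations come back as NaN, the final astype(int) is what restores the integer column.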
Concatenate your dataframes, then drop duplicates (note that the updated rows move to the end):
mapping = {'index': 'Location', 'Count': 'Category'}
out = (pd.concat([df2, df1.rename(columns=mapping)])
         .drop_duplicates('Location', keep='last')
         .reset_index(drop=True))
print(out)
# Output
Location Category
0 NAGAJAN 0
1 JORAJAN 0
2 HEBEDA 0
3 MAKUM 0
4 BAREKURI 0
5 BAGHJAN 0
6 LANGKASHI 0
7 Duliajan Area 2
8 HAPJAN 2
9 KATHALGURI 2
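The concat output above lists the updated locations last, which differs from the order requested in the question. If the original row order matters, one possible variation is to reorder by df2's Location values afterwards. This is a sketch, not part of the original answer, and the frames are rebuilt from the question's data.

```python
import pandas as pd

# Rebuild the frames from the question (assumed to be plain columns).
df1 = pd.DataFrame({
    "index": ["Duliajan Area", "HAPJAN", "KATHALGURI"],
    "Count": [2, 2, 2],
})
df2 = pd.DataFrame({
    "Location": ["NAGAJAN", "JORAJAN", "KATHALGURI", "HEBEDA", "MAKUM",
                 "BAREKURI", "BAGHJAN", "Duliajan Area", "LANGKASHI", "HAPJAN"],
    "Category": [0] * 10,
})

mapping = {"index": "Location", "Count": "Category"}

out = (pd.concat([df2, df1.rename(columns=mapping)])
         .drop_duplicates("Location", keep="last")  # df1's rows win
         .set_index("Location")
         .loc[df2["Location"].tolist()]             # restore df2's row order
         .reset_index())
print(out)
```

This relies on the Location values being unique after drop_duplicates, which holds for the data in the question.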
Another option is an outer merge:
import pandas as pd
# merge the two dataframes, matching df1's index column against df2's Location column
df_merged = pd.merge(df1, df2, left_on='index', right_on='Location', how='outer')
# drop the redundant key column and fill missing values with 0
df_merged = df_merged.drop(columns='index').fillna(0)
# rename the value columns
df_merged = df_merged.rename(columns={'Count': 'Count1', 'Category': 'Category1'})
# print the merged dataframe
print(df_merged)