Merge two pandas DataFrames whose values in the merge column differ slightly

Question:

How can I merge two data frames when the values in the merge column of one are slightly offset from the values in the column I am merging to?

df1 =

col1 col2
1 a
2 b
3 c

df2 =

col1 col3
1.01 d
2 e
2.95 f

The merged result should look like this, even though the values in col1 are slightly different:

df_merge =

col1 col2 col3
1 a d
2 b e
3 c f

I have seen scenarios like this where "col1" is a string, but I'm wondering whether it's possible to do this with something like pandas.merge() when there is a slight numerical offset (e.g. +/- 0.05).

Asked By: Candice

||

Answers:

Let's use merge_asof with the tolerance parameter:

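import pandas as pd

pd.merge_asof(
    # merge_asof needs matching key dtypes, so cast df1's integer col1 to float
    df1.astype({'col1': 'float'}).sort_values('col1'),
    # both frames must be sorted on the key column
    df2.sort_values('col1'),
    on='col1',
    direction='nearest',  # match the closest key, whether above or below
    tolerance=.05         # keep a match only if the keys are within 0.05
)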

   col1 col2 col3
0   1.0    a    d
1   2.0    b    e
2   3.0    c    f

PS: If the dataframes are already sorted on col1, there is no need to sort them again.
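
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds df1 and df2 from the tables in the question and applies the same merge_asof call. The second call, with a much smaller tolerance of 0.005, is an illustrative assumption that is not part of the original answer. It is there only to show what happens when a key falls outside the tolerance. The names merged and strict are likewise made up for this example.

```python
import pandas as pd

# Frames rebuilt from the tables in the question
df1 = pd.DataFrame({"col1": [1, 2, 3], "col2": ["a", "b", "c"]})
df2 = pd.DataFrame({"col1": [1.01, 2, 2.95], "col3": ["d", "e", "f"]})

# Cast the integer key to float so both key columns share a dtype,
# and sort both frames on the key, as merge_asof requires
left = df1.astype({"col1": "float"}).sort_values("col1")
right = df2.sort_values("col1")

# Nearest-key match, accepted only when the keys are within 0.05
merged = pd.merge_asof(left, right, on="col1", direction="nearest", tolerance=0.05)
print(merged)

# Hypothetical stricter tolerance: only the exact key 2 still matches,
# and the other rows get NaN in col3
strict = pd.merge_asof(left, right, on="col1", direction="nearest", tolerance=0.005)
print(strict)
```

Note that merge_asof behaves like a left join: every row of the left frame is kept, and rows that find no key within the tolerance get NaN in the columns coming from the right frame.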

Answered By: Shubham Sharma