keep both merging keys after pandas.merge_asof

Question:

I have found this nice function pandas.merge_asof.
From the documentation

pandas.merge_asof(left, right, on=None, left_on=None, right_on=None)

Parameters: 

left : DataFrame
right : DataFrame
on : label

Field name to join on. Must be found in both DataFrames.
The data MUST be ordered. 
Furthermore this must be a numeric column,such as datetimelike, integer, or float. 
On or left_on/right_on must be given.

and it works as expected.

However, my merged dataframe keeps as columns on only the one that originally was in left. I would need to keep them both, so to have

   mydf=pandas.merge_asof(left, right, on='Time')

and mydf to contain both Time from left and right

Example data:

a=pd.DataFrame(data=pd.date_range('20100201', periods=100, freq='6h3min'),columns=['Time'])
b=pd.DataFrame(data=
                  pd.date_range('20100201', periods=24, freq='1h'),columns=['Time'])
b['val']=range(b.shape[0])
out=pd.merge_asof(a,b,on='Time',direction='forward',tolerance=pd.Timedelta('30min'))
Asked By: 00__00__00

||

Answers:

I think one possible solution is rename columns:

out = pd.merge_asof(a.rename(columns={'Time':'Time1'}), 
                    b.rename(columns={'Time':'Time2'}), 
                    left_on='Time1',
                    right_on='Time2',
                    direction='forward',
                    tolerance=pd.Timedelta('30min'))

print (out.head())
                Time1      Time2  val
0 2010-02-01 00:00:00 2010-02-01  0.0
1 2010-02-01 06:03:00        NaT  NaN
2 2010-02-01 12:06:00        NaT  NaN
3 2010-02-01 18:09:00        NaT  NaN
4 2010-02-02 00:12:00        NaT  NaN
Answered By: jezrael

One trick that I found while working on time-series data is this function.
To articulate what I have done is perform an outer join on both dataframes. Suppose the entry against some "Date and Time" is not in the other dataframe. In that case, we copied the value it had previously, as I assumed that the value in the other dataframe stayed whatever it was before that time.

def merger(df1,df2):
    
    df1['Date and Time'] = pd.to_datetime(df1['Date and Time'], errors='coerce')
    df2['Date and Time'] = pd.to_datetime(df2['Date and Time'], errors='coerce') 
   

    df = pd.merge(df1,df2,on = 'Date and Time', how='outer')
    df.fillna(method = 'ffill',inplace = True)
    
    return df

Hope it Helps.

Answered By: Mohit Lunia
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.