Merging pandas dataframes based on lists of paired indices
Question:
I have two dataframes, df1 and df2, and a fairly complicated set of logical statements that I have to run as a separate function to merge them. For each row in df1, that function returns the indices of the matching rows in df2, which right now looks like
matches = [[1,2,7,14], [1,2,7,14], [3,8]]
or something similar, so that matches[idx] holds the list of indices in df2 to merge with the row df1.loc[idx]. Here, rows 0 and 1 in df1 would merge with rows 1, 2, 7, and 14 in df2, and so on.
How would I merge df1 with df2 on these lists? Running the matching logic through pandas is prohibitively slow, so I have to start from these lists of matches between the dataframes.
Answers:
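Per @MYousefi, this was the solution:
Try
pd.concat([df1, pd.Series(matches, name='match')], axis=1).explode('match').merge(df2, left_on='match', right_index=True)
This should work when df1 has the default integer index (0, 1, 2, ...), because pd.concat aligns the Series of matches with df1 by index, and when the values in matches are index labels of df2.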
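To see what each step of that one-liner does, here is a minimal sketch on toy data. The frames and the column names "a" and "b" are made up for illustration and are not from the question; only matches is taken from it. It assumes both frames use the default integer index.

```python
import pandas as pd

# Hypothetical toy frames; column names "a" and "b" are illustrative only.
df1 = pd.DataFrame({"a": ["x", "y", "z"]})    # default index 0..2
df2 = pd.DataFrame({"b": range(100, 115)})    # default index 0..14

# matches[i] lists the df2 index labels that pair with row i of df1.
matches = [[1, 2, 7, 14], [1, 2, 7, 14], [3, 8]]

# Step 1: attach the lists as a new column, aligned with df1 on the index.
with_matches = pd.concat([df1, pd.Series(matches, name="match")], axis=1)

# Step 2: explode() gives each list element its own row,
# repeating the df1 values for that row.
exploded = with_matches.explode("match")

# Step 3: join each exploded row to the df2 row whose index label
# equals the value in "match".
merged = exploded.merge(df2, left_on="match", right_index=True)

print(len(merged))  # 4 + 4 + 2 = 10 rows
print(merged)
```

Each (df1 row, df2 row) pair listed in matches becomes one row of the result, so the output has as many rows as there are entries across all the lists.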