Python pandas: left join by key and value in list of values:

Question:

Code:

df1 = pd.DataFrame({'key': ['A', 'B', 'C', 'D'],
                   'value': [1, 2, 3, 4]})

df2 = pd.DataFrame({'key': ['B', 'D', 'D', 'F'],
                   'list_values': [[2, 4, 6], [4, 8], [1, 3, 5], [7, 9]]})

I need to make a left join by :

  1. df1['key'] = df2['key']
  2. df1['value'] in df2['list_values']

The output needs to be:

   key  value  list_values
0   A     1       Nan
1   B     2     [2, 4, 6]
2   C     3      Nan
3   D     4     [4, 8]

I can merge by key, but how I can add a second conditional?

merged_df = df1.merge(df2, left_on='key', right_on='key', how='left')
............ ??
Asked By: Galat

||

Answers:

You could add an extra column in df2 and explode it to act as a secondary key for the merge:

out = df1.merge(df2.assign(value=df2['list_values']).explode('value'),
                on=['key', 'value'], how='left')

Alternatively, merge and filter:

out = (df1.merge(df2, on='key', how='left')
          .loc[lambda d: [not isinstance(l, list) or v in l
                          for v, l in zip(d['value'], d['list_values'])]]
       )

Output:

  key value list_values
0   A     1         NaN
1   B     2   [2, 4, 6]
2   C     3         NaN
3   D     4      [4, 8]
Answered By: mozway

You can even just tried to insert column by checking list_values for each key’s using lambda.

Code :

df1['list_values'] = df1.apply(lambda r: sum([l for l in list(df2.loc[df2['key']==r.key].list_values) if r.value in l] ,[]), axis=1)

Output:

key value   list_values
0   A   1   []
1   B   2   [2, 4, 6]
2   C   3   []
3   D   4   [4, 8]
Answered By: R. Baraiya
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.