Compare 2 list columns in a pandas dataframe. Remove value from one list if present in another


Say I have 2 list columns like below:

import pandas as pd

group1 = [['John', 'Mark'], ['Ben', 'Johnny'], ['Sarah', 'Daniel']]
group2 = [['Aya', 'Boa'], ['Mab', 'Johnny'], ['Sarah', 'Peter']]

df = pd.DataFrame({'group1': group1, 'group2': group2})

I want to compare the two list columns row by row and remove an element from the group1 list if it is also present in the group2 list of the same row. The expected result for the data above:

group1              group2
['John', 'Mark']    ['Aya', 'Boa']
['Ben']             ['Mab', 'Johnny']
['Daniel']          ['Sarah', 'Peter']

How can I do this? I have tried this:

df['group1'] = [[name for name in df['group1'] if name not in df['group2']]]

But I got this error:

TypeError: unhashable type: 'list'

Please help.

Asked By: amnesic



You need to zip the two Series. I’m using a set here for efficiency (this is not critical if you have only a few items per list):

df['group1'] = [[x for x in a if x not in S]
                for a, S in zip(df['group1'], df['group2'].apply(set))]


         group1          group2
0  [John, Mark]      [Aya, Boa]
1         [Ben]   [Mab, Johnny]
2      [Daniel]  [Sarah, Peter]
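
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same zip-plus-set idea, using the data from the question. The helper name remove_present is my own invention for illustration, not a pandas function.

```python
import pandas as pd

group1 = [['John', 'Mark'], ['Ben', 'Johnny'], ['Sarah', 'Daniel']]
group2 = [['Aya', 'Boa'], ['Mab', 'Johnny'], ['Sarah', 'Peter']]
df = pd.DataFrame({'group1': group1, 'group2': group2})


def remove_present(left, right):
    """Hypothetical helper: per row, keep the items of `left` that are not in `right`."""
    result = []
    for a, b in zip(left, right):
        seen = set(b)  # built once per row, so each membership test is cheap
        result.append([x for x in a if x not in seen])
    return result


df['group1'] = remove_present(df['group1'], df['group2'])
print(df)
```

Because the filtering is a list comprehension over the original list, the order of the remaining names and any duplicates among them are kept.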
Answered By: mozway

You can use a list comprehension inside a lambda function and apply it row by row:

df['group1'] = df[['group1', 'group2']].apply(lambda x: [i for i in x['group1'] if i not in x['group2']], axis=1)

         group1          group2
0  [John, Mark]      [Aya, Boa]
1         [Ben]   [Mab, Johnny]
2      [Daniel]  [Sarah, Peter]
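
As a rough sanity check, the row-wise apply should give the same lists as zipping the two columns. The sketch below assumes the example data from the question; drop_shared is a hypothetical name used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'group1': [['John', 'Mark'], ['Ben', 'Johnny'], ['Sarah', 'Daniel']],
    'group2': [['Aya', 'Boa'], ['Mab', 'Johnny'], ['Sarah', 'Peter']],
})


def drop_shared(row):
    # Hypothetical helper: keep names from group1 that are absent from group2
    return [name for name in row['group1'] if name not in row['group2']]


# Row-wise apply: the function receives one row (a Series) at a time
via_apply = df.apply(drop_shared, axis=1)

# Equivalent plain-Python version that zips the two columns
via_zip = [[name for name in a if name not in b]
           for a, b in zip(df['group1'], df['group2'])]

print(via_apply.tolist() == via_zip)
```

The named function does the same work as the lambda; it is just easier to read and to reuse.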
Answered By: Clegane

You can use set difference:

df.apply(lambda x: set(x['group1']).difference(x['group2']), axis=1)


0    {John, Mark}
1           {Ben}
2        {Daniel}
dtype: object

To get lists you can add .apply(list) at the end.
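
One caveat worth checking before choosing this approach: a set drops duplicates and does not guarantee element order. The sketch below uses made-up data with a repeated name to show the difference; keep_order is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Illustrative data (not from the question): 'Mark' appears twice in the first row
df = pd.DataFrame({
    'group1': [['Mark', 'John', 'Mark'], ['Ben', 'Johnny']],
    'group2': [['Aya', 'Boa'], ['Mab', 'Johnny']],
})

# Set difference: the repeated 'Mark' collapses into one element
as_sets = df.apply(lambda x: set(x['group1']).difference(x['group2']), axis=1)


def keep_order(row):
    # Hypothetical helper: use a set only for the membership test,
    # so the original order and duplicates of group1 survive
    exclude = set(row['group2'])
    return [name for name in row['group1'] if name not in exclude]


as_lists = df.apply(keep_order, axis=1)

print(as_sets.tolist())
print(as_lists.tolist())
```

If neither order nor duplicates matter for your data, the set difference is the shorter option.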

Answered By: Mykola Zotko

Zip the two lists and filter each pair with a list comprehension.

Here is an example that works on the original group1 and group2 lists from the question:

group1 = [[name for name in i if name not in j]
          for i, j in zip(group1, group2)]


   Group1 :  [['John', 'Mark'], ['Ben'], ['Daniel']]
   Group2 :  [['Aya', 'Boa'], ['Mab', 'Johnny'], ['Sarah', 'Peter']]
Answered By: AI ML Team