Look for 100% matches,partial matches and dismatches between python list and list column in Pandas

Question:

I know how to compare lists but working with unflatten data in Pandas is beyond my skills..anyway..
I have:

list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

and

df = pd.DataFrame({
"col1":["a", "b", "c", "d"], 
"col2":[[1, 2, 3, 4, 5], [20, 30, 100], [1, 3, 20], [20,30]]
})

I want to compare the list with the df and print rows that contain 100% matches, partial matches and dismatches. Desired outcome is then this:

"100% matches are in rows: a"
"Partial matches are in rows: c"
"Dismatches are in rows: b, d"

Python 3 solution would be much appreciated.Thanks!

Asked By: Václav Pech

||

Answers:

Code:

###INPUT
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
df = pd.DataFrame({
"col1": ['a', 'b', 'c', 'd'], 
"col2":[[1, 2, 3, 4, 5], [20, 30, 100], [1, 3, 20], [20,30]]})
 

#Creating another list to define categorize
cat = ['100% matches are in rows:', 'Partial matches are in rows:', 'Dismatches are in rows:']


#Using lambda fun reading DF row by row and compareing row list with actaul list
#If difference is 0 pick first category from cat list
#if difference is 1 pick second category from cat list
# if difference more than one pick the last category from cat list

df['col1'].groupby(df['col2'].apply(lambda row: cat[-1] if len(set(row).difference(lst)) > 1 else cat[len(set(row).difference(lst))])).agg(','.join)

Output:

col2
100% matches are in rows         a
Dismatches are in rows         b,d
Partial matches are in rows      c
Name: col1, dtype: object

Update: Suppose you have col1 with integer values

.agg(lambda col: ','.join(map(str, col.tolist())))
Answered By: R. Baraiya

You can use something like:

lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

s = df['col2'].explode().isin(lst).groupby(level=0).mean()

import numpy as np
group = np.select([s.eq(1), s.eq(0)],
                  ['100% matches', 'Dismatches'],
                  'Partial matches')

df['col1'].groupby(group).agg(', '.join)

output:

100% matches          a
Dismatches         b, d
Partial matches       c
Name: col1, dtype: object
Answered By: mozway
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.