Find the unique rows in a file with Python
Question:
I am trying to keep only the rows that are unique. Here, unique means a row's cells should not share any letter with another row's cells.
I have an Excel file like this with thousands of rows:
id | letters |
---|---|
1 | A,B,G |
2 | B,G |
21 | C,D |
30 | E |
35 | K,M |
40 | E,F |
The values in the letters column should not appear in any other letters cell.
The output should look like this, because the letters C, D, K and M don't appear in any other cell:
id | letters |
---|---|
21 | C,D |
35 | K,M |
Answers:
You can split the values by `,`, `explode` so each letter gets its own row, remove every letter that appears more than once with `drop_duplicates(keep=False)`, join back per group to rebuild the original comma-separated strings, and finally keep the rows whose rebuilt value matches the original:
# Split into one letter per row, drop every letter that occurs more than once
# across the whole column, then rebuild the comma-separated string per row
s = (df['letters'].str.split(',')
       .explode()
       .drop_duplicates(keep=False)
       .groupby(level=0)
       .agg(','.join))
# Keep only rows whose letters all survived the deduplication
df = df[df['letters'].eq(s)]
print(df)
id letters
2 21 C,D
4 35 K,M
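For illustration, the same idea can be sketched end to end with only the standard library, counting each letter across all rows and keeping the rows whose letters each occur exactly once overall (the sample data below is copied from the question; the `rows` dict is just a stand-in for the Excel data):

```python
from collections import Counter

# Sample data from the question: id -> comma-separated letters
rows = {1: "A,B,G", 2: "B,G", 21: "C,D", 30: "E", 35: "K,M", 40: "E,F"}

# Count how often each letter appears across all rows
counts = Counter(letter for cell in rows.values() for letter in cell.split(","))

# Keep rows whose letters all occur exactly once overall
unique_rows = {i: cell for i, cell in rows.items()
               if all(counts[letter] == 1 for letter in cell.split(","))}

print(unique_rows)  # {21: 'C,D', 35: 'K,M'}
```

This mirrors the pandas chain: `Counter` plays the role of `explode` plus `drop_duplicates(keep=False)`, and the dict comprehension plays the role of the final `eq` filter.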