How to remove substrings from column rows if substrings aren't part of dictionary's values?

Question:

I have this simplified DataFrame:

A    B
foo  A, B, C, D

I create a dictionary d:

d = {'foo': 'A, B, C'}

The dictionary keys appear in column A, and column B holds comma-separated items. How can I remove any items from column B that aren’t part of the dictionary value for that row’s key?

Desired DataFrame:

A    B
foo  A, B, C
Asked By: Luka Banfi


Answers:

If you need to compare the values after splitting them by ', ', use:

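import pandas as pd

df = pd.DataFrame({'A': ['foo'], 'B': ['A, B, C, D']})
d = {'foo': 'A, B, C'}
# keep only the items of B that also occur in the dictionary value mapped to A
f = lambda x: ', '.join(y for y in x.B.split(', ') if y in x.A.split(', '))
# temporarily replace A with its dictionary value, then filter B row by row
df['B'] = df.assign(A=df['A'].map(d)).apply(f, axis=1)
print(df)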
     A        B
0  foo  A, B, C
Answered By: jezrael
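
For comparison, the same filtering can be sketched without `apply`, as a plain loop over the two columns. This is not part of the answer above, only an illustrative variant. The helper name `keep_allowed`, its parameters, and the extra `bar` row are hypothetical. It also assumes that rows whose key is missing from the dictionary should be left unchanged, which the question does not specify.

```python
import pandas as pd


def keep_allowed(df, allowed, key_col="A", val_col="B", sep=", "):
    """Return a copy of df where val_col keeps only items listed in allowed[key]."""
    out = df.copy()
    cleaned = []
    for key, cell in zip(out[key_col], out[val_col]):
        if key not in allowed:
            # Assumption: rows without a dictionary entry stay untouched.
            cleaned.append(cell)
            continue
        permitted = set(allowed[key].split(sep))
        # Preserve the original order of the items in the cell.
        cleaned.append(sep.join(tok for tok in cell.split(sep) if tok in permitted))
    out[val_col] = cleaned
    return out


# 'bar' is a made-up row with no entry in d, added to show the fallback.
df = pd.DataFrame({"A": ["foo", "bar"], "B": ["A, B, C, D", "X, Y"]})
d = {"foo": "A, B, C"}

result = keep_allowed(df, d)
print(result)
```

Building a set of permitted items per row keeps the membership test cheap, and the explicit check for missing keys avoids calling `split` on the NaN that `map` would produce for a key that is not in the dictionary.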