How to change column values (lists) using another data frame in Python


I have two data frames. I need to replace the values in a column of the first data frame, where each value is a list, using the second data frame.

First data frame
import pandas as pd
df1 = pd.DataFrame({'title':['The Godfather','Fight Club','The Empire'], 'genre_ids':[[18, 80],[18],[12, 28, 878]]})

    title           genre_ids
0   The Godfather   [18, 80]
1   Fight Club      [18]
2   The Empire      [12, 28, 878]

Second data frame
df2 = pd.DataFrame({'id':[18,80,12,28,878,99],'name':['Action','Adventure','Adventure','Animation','Comedy','Documentary']})

    id  name
0   18  Action
1   80  Horror
2   12  Adventure
3   28  Animation
4   878 Comedy
5   99  Documentary

How can I replace the ids in genre_ids with the matching names from df2, like this?

    title           genre_ids
0   The Godfather   [Action, Horror]
1   Fight Club      [Action]
2   The Empire      [Adventure, Animation, Comedy]
Asked By: Manish Patel



You can build a dictionary from df2 and look up each list element in it (note: in the df2 code you posted, id 80 is Adventure, not Horror):

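# Build an id -> name lookup from df2
m = dict(zip(df2["id"], df2["name"]))
# Replace each id in every list with its name (ids missing from df2 become None)
df1["genre_ids"] = df1["genre_ids"].apply(lambda ids: [m.get(v) for v in ids])

Result:
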
           title                       genre_ids
0  The Godfather             [Action, Adventure]
1     Fight Club                        [Action]
2     The Empire  [Adventure, Animation, Comedy]
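
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same dictionary approach. It uses the data exactly as posted in the question (so id 80 maps to Adventure). The helper name `ids_to_names` and the choice to keep unknown ids unchanged instead of turning them into None are my own assumptions, not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({
    "title": ["The Godfather", "Fight Club", "The Empire"],
    "genre_ids": [[18, 80], [18], [12, 28, 878]],
})
df2 = pd.DataFrame({
    "id": [18, 80, 12, 28, 878, 99],
    "name": ["Action", "Adventure", "Adventure", "Animation", "Comedy", "Documentary"],
})

# Build the id -> name lookup once.
id_to_name = dict(zip(df2["id"], df2["name"]))


def ids_to_names(ids, lookup=id_to_name):
    # Hypothetical helper: an id with no entry in df2 is kept as-is
    # (the original one-liner would produce None for it instead).
    return [lookup.get(i, i) for i in ids]


# assign() returns a new frame and leaves df1 untouched.
result = df1.assign(genre_ids=df1["genre_ids"].apply(ids_to_names))
print(result)
```

Keeping unknown ids visible can make gaps in df2 easier to spot; switch back to `lookup.get(i)` if None is preferred.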
Answered By: Andrej Kesely

You can explode genre_ids and merge the two data frames.

merged = df1.explode('genre_ids').merge(df2,left_on='genre_ids',right_on='id')[['title','name']]

Then use groupby to collect the name column back into one list per title.
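
The groupby step is not spelled out, so here is one possible way to finish it; treat it as a sketch rather than the answerer's exact code. It assumes titles are unique (rows are matched back to df1 by title), and it differs from the merge line above in two ways that are my own additions: it casts the exploded column to int64, and it uses how="left" so the exploded row order is preserved.

```python
import pandas as pd

df1 = pd.DataFrame({
    "title": ["The Godfather", "Fight Club", "The Empire"],
    "genre_ids": [[18, 80], [18], [12, 28, 878]],
})
df2 = pd.DataFrame({
    "id": [18, 80, 12, 28, 878, 99],
    "name": ["Action", "Adventure", "Adventure", "Animation", "Comedy", "Documentary"],
})

# One row per (title, genre id). explode() leaves an object column,
# so cast it to match the integer dtype of df2["id"].
exploded = df1.explode("genre_ids")
exploded["genre_ids"] = exploded["genre_ids"].astype("int64")

# Left merge keeps the exploded row order; an id missing from df2
# would show up with NaN in the name column.
merged = exploded.merge(df2, how="left", left_on="genre_ids", right_on="id")[["title", "name"]]

# Collect the names back into one list per title (assumes unique titles).
names = merged.groupby("title", sort=False)["name"].agg(list)

# Map the lists back onto df1 by title.
result = df1.assign(genre_ids=df1["title"].map(names))
print(result)
```

If titles can repeat, grouping on the original row index instead of the title would avoid mixing rows together.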

Answered By: Mohsen_Fatemi