Change Values in Dataframe with Values in Some Other Columns in Other Dataframe

Question

I want to change values in my dataframe:

import pandas as pd

student = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                        'homeground': ['TOKYO','SOUTH KOREA','RIYADH','JAPAN','TOKYO','OSAKA','SAUDI ARABIA','SEOUL','','BUSAN']})

This is the master homeground table:

hg = pd.DataFrame({'id_country':[1,2,2,3,3,3,3],
                   'country': ['SAUDI ARABIA','SOUTH KOREA','SOUTH KOREA','JAPAN','JAPAN','JAPAN','JAPAN'],
                   'id_city':[1,2,3,4,5,6,7],
                   'city': ['RIYADH','SEOUL','BUSAN','TOKYO','TOKYO','OSAKA','OSAKA']})

I want to replace the homeground values in student with the matching IDs from hg, so that the result looks like this:

id homeground
1  4
2  2
3  1
4  3
5  4
6  6
7  1
8  2
9  0
10 3

Asked By: Arthur


Answer 1

Use Series.map to look up each homeground by city first, then by country (matched case-insensitively by lowercasing both sides), and finally replace anything still unmatched with 0. drop_duplicates keeps only the first row for each city or country, so each name maps to the first ID listed for it:

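# City name -> first id_city listed for that city
s1 = student.homeground.map(hg.drop_duplicates(['city']).set_index('city')['id_city'])
# Country name (lowercased) -> first id_country listed for that country
s = hg.drop_duplicates(['country']).set_index('country')['id_country'].rename(str.lower)
s2 = student.homeground.str.lower().map(s)

# Prefer the city match, fall back to the country match, then to 0
student['homeground'] = s1.fillna(s2).fillna(0).astype(int)
print(student)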
   id  homeground
0   1           4
1   2           2
2   3           1
3   4           3
4   5           4
5   6           6
6   7           1
7   8           2
8   9           0
9  10           3
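
The same city-then-country fallback can be written as a self-contained sketch. The helper name first_id_lookup and the variable names are hypothetical, chosen only for illustration. It assumes the names in student already match the case used in hg, so the lowercasing step is left out. Each lookup Series is deduplicated on its key before being passed to map, so every name has exactly one ID.

```python
import pandas as pd

student = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'homeground': ['TOKYO', 'SOUTH KOREA', 'RIYADH', 'JAPAN', 'TOKYO',
                   'OSAKA', 'SAUDI ARABIA', 'SEOUL', '', 'BUSAN'],
})
hg = pd.DataFrame({
    'id_country': [1, 2, 2, 3, 3, 3, 3],
    'country': ['SAUDI ARABIA', 'SOUTH KOREA', 'SOUTH KOREA',
                'JAPAN', 'JAPAN', 'JAPAN', 'JAPAN'],
    'id_city': [1, 2, 3, 4, 5, 6, 7],
    'city': ['RIYADH', 'SEOUL', 'BUSAN', 'TOKYO', 'TOKYO', 'OSAKA', 'OSAKA'],
})


def first_id_lookup(frame, key, value):
    """Hypothetical helper: Series mapping each distinct key to its first value."""
    return frame.drop_duplicates(key).set_index(key)[value]


city_ids = first_id_lookup(hg, 'city', 'id_city')
country_ids = first_id_lookup(hg, 'country', 'id_country')

# Try the city lookup first; rows with no city match come back as NaN.
by_city = student['homeground'].map(city_ids)
# Fall back to the country lookup for those rows.
by_country = student['homeground'].map(country_ids)
# Anything still unmatched (here the empty string) becomes 0.
result = by_city.fillna(by_country).fillna(0).astype(int)

print(result.tolist())
```

Keeping the lookups as separate Series makes it easy to inspect which rows were resolved by city and which only by country before combining them.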

EDIT: If you need to keep every matching ID instead of only the first one, collect the unique IDs for each name into a list:

s11 = hg.drop_duplicates(['city','id_city']).groupby('city')['id_city'].agg(list)
s1 = student.homeground.map(s11)

s22 = (hg.drop_duplicates(['country','id_country'])
         .groupby('country')['id_country'].agg(list).rename(str.lower))
s2 = student.homeground.str.lower().map(s22)

# Matched rows hold a list of IDs; unmatched rows hold the scalar 0
student['homeground'] = s1.fillna(s2).fillna(0)

print(student)
   id homeground
0   1     [4, 5]
1   2        [2]
2   3        [1]
3   4        [3]
4   5     [4, 5]
5   6     [6, 7]
6   7        [1]
7   8        [2]
8   9          0
9  10        [3]

Or join the IDs into a comma-separated string:

s11 = (hg.drop_duplicates(['city', 'id_city'])
         .assign(id_city=lambda x: x['id_city'].astype(str))
         .groupby('city')['id_city'].agg(','.join))
s1 = student.homeground.map(s11)

s22 = (hg.drop_duplicates(['country', 'id_country'])
         .assign(id_country=lambda x: x['id_country'].astype(str))
         .groupby('country')['id_country']
         .agg(','.join).rename(str.lower))
s2 = student.homeground.str.lower().map(s22)

student['homeground'] = s1.fillna(s2).fillna('0')

print(student)
   id homeground
0   1        4,5
1   2          2
2   3          1
3   4          3
4   5        4,5
5   6        6,7
6   7          1
7   8          2
8   9          0
9  10          3
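
With the list variant, a cell can hold either a list of IDs or the scalar 0, so downstream code has to handle both. A minimal sketch of flagging the names that resolve to more than one ID follows. The sample homeground Series and the names city_ids, mapped and ambiguous are assumptions made for illustration, and only the city lookup is shown.

```python
import pandas as pd

hg = pd.DataFrame({
    'id_city': [1, 2, 3, 4, 5, 6, 7],
    'city': ['RIYADH', 'SEOUL', 'BUSAN', 'TOKYO', 'TOKYO', 'OSAKA', 'OSAKA'],
})

# City name -> list of every distinct id_city recorded for it
city_ids = (hg.drop_duplicates(['city', 'id_city'])
              .groupby('city')['id_city'].agg(list))

# Hypothetical sample input: an ambiguous city, a unique city, and no match
homeground = pd.Series(['TOKYO', 'RIYADH', ''])
mapped = homeground.map(city_ids).fillna(0)

# True only where the lookup produced a list with more than one ID
ambiguous = mapped.map(lambda v: isinstance(v, list) and len(v) > 1)

print(mapped.tolist())
print(ambiguous.tolist())
```

The isinstance check is what keeps the unmatched scalar 0 from being treated as a list.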

Answered By: jezrael
