python – Joining two columns pandas – returning NA if any value is NA, however need to return real join

Question:

I have a DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'student_id': [71, 63, 23],
                   'student_name': [np.nan, 'Peter Andrews', 'Amy Powers'],
                   })

I am creating a new column that joins the id and the name using:

df['student_id_name'] = df['student_id'].astype(str) + ' ' + df['student_name']

Needed output:

{student_id_name : [71, 63 Peter Andrews, 23 Amy Powers]}

What I get:

{student_id_name : [nan, 63 Peter Andrews, 23 Amy Powers]}

Could you help me get the expected output?

Answers:

Use Series.str.cat with the na_rep parameter, which substitutes the given string for missing values before joining, then remove any trailing space with Series.str.strip:

df['student_id_name'] = (df['student_id'].astype(str)
                         .str.cat(df['student_name'], sep=' ', na_rep='')
                         .str.strip())
print(df)
   student_id   student_name   student_id_name
0          71            NaN                71
1          63  Peter Andrews  63 Peter Andrews
2          23     Amy Powers     23 Amy Powers
Answered By: jezrael
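
For comparison, here is a minimal self-contained sketch that puts the question's `+` expression next to the str.cat approach on the same data. The variable names `plain` and `joined` are only illustrative and do not come from the original posts. It assumes the missing name is stored as NaN, as in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "student_id": [71, 63, 23],
    "student_name": [np.nan, "Peter Andrews", "Amy Powers"],
})

# Plain `+` propagates missing values: a NaN operand makes that row's result NaN.
plain = df["student_id"].astype(str) + " " + df["student_name"]

# str.cat with na_rep="" treats NaN as an empty string while joining, which
# leaves a trailing separator ("71 ") that str.strip then removes.
joined = (
    df["student_id"].astype(str)
    .str.cat(df["student_name"], sep=" ", na_rep="")
    .str.strip()
)

print(plain.tolist())
print(joined.tolist())
```

Only the first row differs between the two results, because it is the only row with a missing name.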

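You can use fillna() to replace the missing values in the DataFrame first; your original expression will then work. Note that this permanently replaces every NaN in the DataFrame with the fill value, and that the row with the missing name ends up with a trailing space ('71 '):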

import math
import pandas as pd

df = pd.DataFrame({'student_id': [71, 63, 23],
                   'student_name': [math.nan, 'Peter Andrews', 'Amy Powers'],
                   })
# Replace every NaN in the DataFrame with an empty string
df = df.fillna('')
df['student_id_name'] = df['student_id'].astype(str) + ' ' + df['student_name']
print(df)

[Out]:
   student_id   student_name   student_id_name
0          71                              71 
1          63  Peter Andrews  63 Peter Andrews
2          23     Amy Powers     23 Amy Powers
Answered By: Azhar Khan
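
A possible variation on the fillna approach, sketched here on the assumption that the original student_name column should keep its NaN: fill only a temporary copy of that column, then strip the leftover space. The helper name `names` is hypothetical and does not appear in the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "student_id": [71, 63, 23],
    "student_name": [np.nan, "Peter Andrews", "Amy Powers"],
})

# fillna returns a new Series, so df["student_name"] itself is not modified
# and still records the missing name as NaN.
names = df["student_name"].fillna("")

# Strip the trailing space that remains when the name is empty.
df["student_id_name"] = (df["student_id"].astype(str) + " " + names).str.strip()

print(df)
```

This keeps the `+` expression from the question while leaving the source column untouched.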