Create a DataFrame from df1 and df2, taking a column value from df2 when it is empty in df1
Question:
import pandas as pd

df1 = pd.DataFrame({'call_sign': ['ASD','BSD','CDSF','GFDFD','FHHH'],'frn':['123','124','','656','']})
df2 = pd.DataFrame({'call_sign': ['ASD','CDSF','BSD','GFDFD','FHHH'],'frn':['1234','','124','','765']})
I need to get a new DataFrame like this:
df2 = pd.DataFrame({'call_sign': ['ASD','BSD','CDSF','GFDFD','FHHH'],'frn':['123','','124','656','765']})
I need to take frn from df2 when it is missing in df1 and build a new DataFrame from the result.
Answers:
Replace empty strings with missing values, then use DataFrame.set_index with DataFrame.fillna. Because the result needs the same ordering as df2.call_sign, add DataFrame.reindex:
import numpy as np

df = (df1.set_index('call_sign').replace('', np.nan)
         .fillna(df2.set_index('call_sign').replace('', np.nan))
         .reindex(df2['call_sign']).reset_index())
print(df)
  call_sign  frn
0       ASD  123
1      CDSF  NaN
2       BSD  124
3     GFDFD  656
4      FHHH  765
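The chained expression above can be easier to follow when split into its three steps. The following is a minimal sketch using the data from the question; the intermediate names `left`, `right`, `filled` and `result` are only illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'call_sign': ['ASD', 'BSD', 'CDSF', 'GFDFD', 'FHHH'],
                    'frn': ['123', '124', '', '656', '']})
df2 = pd.DataFrame({'call_sign': ['ASD', 'CDSF', 'BSD', 'GFDFD', 'FHHH'],
                    'frn': ['1234', '', '124', '', '765']})

# Step 1: index both frames by call_sign and turn '' into NaN,
# so that fillna can recognise the gaps
left = df1.set_index('call_sign').replace('', np.nan)
right = df2.set_index('call_sign').replace('', np.nan)

# Step 2: keep df1's value where present, otherwise take df2's value
# (fillna aligns the two frames on the call_sign index)
filled = left.fillna(right)

# Step 3: restore df2's row order and turn call_sign back into a column
result = filled.reindex(df2['call_sign']).reset_index()
print(result)
```

CDSF stays NaN because frn is empty in both frames. If an empty string is preferred in the output, appending `.fillna('')` to `result` should restore it.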
If you want to update df2 instead, you can use boolean indexing:
# is frn empty string?
m = df2['frn'].eq('')
# update those rows from the value in df1
df2.loc[m, 'frn'] = df2.loc[m, 'call_sign'].map(df1.set_index('call_sign')['frn'])
Updated df2:
  call_sign   frn
0       ASD  1234
1      CDSF
2       BSD   124
3     GFDFD   656
4      FHHH   765
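Note that this approach gives df2 precedence: only the rows where df2 is empty are filled from df1, which is why ASD keeps 1234 rather than 123. Here is a small sketch of the same idea applied to a copy, so that df2 itself is left untouched. The names `updated` and `lookup` are hypothetical, and the sketch assumes call_sign is unique in df1, which `map` with a Series needs.

```python
import pandas as pd

df1 = pd.DataFrame({'call_sign': ['ASD', 'BSD', 'CDSF', 'GFDFD', 'FHHH'],
                    'frn': ['123', '124', '', '656', '']})
df2 = pd.DataFrame({'call_sign': ['ASD', 'CDSF', 'BSD', 'GFDFD', 'FHHH'],
                    'frn': ['1234', '', '124', '', '765']})

# Work on a copy so the original df2 is not modified
updated = df2.copy()

# Lookup table: call_sign -> frn as recorded in df1
lookup = df1.set_index('call_sign')['frn']

# Rows whose frn is an empty string in df2
m = updated['frn'].eq('')

# Fill only those rows; rows that already have a value keep df2's value
updated.loc[m, 'frn'] = updated.loc[m, 'call_sign'].map(lookup)
print(updated)
```

A call_sign that is present in df2 but absent from df1 would come back from `map` as NaN rather than an empty string.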
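# Left-merge on call_sign; the overlapping frn columns become frn_x (df1) and frn_y (df2)
temp = df1.merge(df2, how='left', on='call_sign')
# Keep df1's frn unless it is an empty string, otherwise fall back to df2's frn
df1['frn'] = temp['frn_x'].where(temp['frn_x'] != '', temp['frn_y'])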
  call_sign  frn
0       ASD  123
1       BSD  124
2      CDSF
3     GFDFD  656
4      FHHH  765
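A variation on the merge approach, sketched here under the assumption that a new DataFrame is wanted rather than an overwritten df1. The `suffixes` argument labels the two frn columns explicitly instead of relying on the default `_x` / `_y`; the names `temp` and `merged` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'call_sign': ['ASD', 'BSD', 'CDSF', 'GFDFD', 'FHHH'],
                    'frn': ['123', '124', '', '656', '']})
df2 = pd.DataFrame({'call_sign': ['ASD', 'CDSF', 'BSD', 'GFDFD', 'FHHH'],
                    'frn': ['1234', '', '124', '', '765']})

# Left-merge keeps df1's row order; the two frn columns become frn_df1 and frn_df2
temp = df1.merge(df2, how='left', on='call_sign', suffixes=('_df1', '_df2'))

# Keep df1's value unless it is '', in which case fall back to df2's value
merged = temp[['call_sign']].assign(
    frn=temp['frn_df1'].where(temp['frn_df1'].ne(''), temp['frn_df2'])
)
print(merged)
```

Because empty strings are compared directly, no conversion to NaN is needed here, and rows that are empty in both frames (CDSF) stay as empty strings.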