How to sum dataframes in Pandas without getting NaN values?

Question:

I have several dataframes that I need to sum, but some of them are missing columns. As a result, the sum contains NaN values in every column that is absent from at least one of the input dataframes.

How can I keep the original values for those columns?

Here is a small example:

#!/usr/bin/env ipython
import pandas as pd
import numpy as np

N = 10
years = list(range(2010, 2010 + N))

# Generate data: both frames share column A; B exists only in dfa, C only in dfb.
data_a = {'years': years, 'A': np.random.random(N), 'B': np.random.random(N)}
data_b = {'years': years, 'A': np.random.random(N), 'C': np.random.random(N)}

dfa = pd.DataFrame.from_dict(data_a).set_index('years')
dfb = pd.DataFrame.from_dict(data_b).set_index('years')

# Plain + aligns on column labels; B and C each lack a counterpart, so they become NaN.
dfc = dfa + dfb

Instead of having dfc as:

              A   B   C
years                  
2010   0.830207 NaN NaN
2011   1.237387 NaN NaN
2012   1.386908 NaN NaN
2013   0.949136 NaN NaN
2014   0.897436 NaN NaN
2015   0.375644 NaN NaN
2016   1.134836 NaN NaN
2017   1.125501 NaN NaN
2018   1.140183 NaN NaN
2019   0.522178 NaN NaN

I would like to keep the original values from dfa for column B and from dfb for column C.

As the actual tables are large, some automatic solution is preferred.

Asked By: msi_gerva


Answers:

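Use DataFrame.add with fill_value=0, which treats a value that is missing from only one of the two dataframes as 0 before adding: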

dfc = dfa.add(dfb, fill_value=0)

print(dfc)
              A         B         C
years                              
2010   0.986393  0.020584  0.607545
2011   1.090208  0.969910  0.170524
2012   1.024139  0.832443  0.065052
2013   0.965020  0.212339  0.948886
2014   0.612089  0.181825  0.965632
2015   0.941170  0.183405  0.808397
2016   0.257757  0.304242  0.304614
2017   1.380411  0.524756  0.097672
2018   1.193530  0.431945  0.684233
2019   0.754523  0.291229  0.440152
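
Since the question mentions several dataframes and asks for an automatic solution, the same call can be folded over a list. The following is only a sketch under assumptions: the small fixed-value frames stand in for the random data above, and the third frame dfd, the list frames, and the name total are hypothetical and not part of the original question.

```python
from functools import reduce

import pandas as pd

# Small deterministic stand-ins for the frames in the question.
index = pd.Index([2010, 2011, 2012], name="years")
dfa = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]}, index=index)
dfb = pd.DataFrame({"A": [0.5, 0.5, 0.5], "C": [7.0, 8.0, 9.0]}, index=index)
# Hypothetical third frame that has no column A at all.
dfd = pd.DataFrame({"B": [1.0, 1.0, 1.0], "C": [1.0, 1.0, 1.0]}, index=index)

frames = [dfa, dfb, dfd]

# Fold the list pairwise; fill_value=0 keeps columns that one operand lacks.
total = reduce(lambda left, right: left.add(right, fill_value=0), frames)

print(total)
```

Each column of total is the sum over whichever frames contain it, so A comes from dfa and dfb only, while B and C each combine two frames.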
Answered By: jezrael

It seems that dfc = dfa.add(dfb, fill_value=0) solves that issue.
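
One caveat worth checking on real data: fill_value only substitutes when a value is missing on exactly one side. A position that is missing in both dataframes stays NaN, and a NaN already present in one input is treated as 0. The sketch below illustrates this with made-up frames; the names left, right, and result are assumptions for the example only.

```python
import numpy as np
import pandas as pd

# Illustrative frames with NaN values already present in the inputs.
left = pd.DataFrame({"A": [1.0, np.nan], "B": [np.nan, 4.0]}, index=[2010, 2011])
right = pd.DataFrame({"A": [10.0, np.nan], "C": [5.0, 6.0]}, index=[2010, 2011])

result = left.add(right, fill_value=0)

# A at 2010: both present, so 1.0 + 10.0.
# A at 2011: NaN on both sides, so the result stays NaN.
# B at 2010: NaN in left and no column B in right, so it stays NaN.
# B at 2011: present only in left, so 4.0 + 0.
# C: present only in right, so the values pass through unchanged.
print(result)
```

If the inputs may contain their own NaN values, it can help to decide beforehand whether those should count as 0 or remain missing.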

Answered By: thevoiddancer