Merging two dataframe in pandas

Question:

I have following dataframes

df1

ind stay res
AB10 yes yes
AB11 no yes

df2

ind date c
AB10 22-01 1
AB10 24-01 2
AB11 20-01 1
AB11 21-01 2
AB11 22-01 3

I want following dataframe as a final one

ind date c stay res
AB10 yes yes
AB10 22-01 1
AB10 24-01 2
AB11 no yes
AB11 20-01 1
AB11 21-01 2
AB11 22-01 3

How to get this using pandas?

Asked By: learner

||

Answers:

I believe what you are looking for is a pd.concat() for your dataframes

df_con = pd.concat([df1, df2])
df_con.sort_values('ind').fillna('')
Answered By: ArchAngelPwn

This will do what your question asks:

df1 = pd.concat([
    df1.reindex(columns=list(df2.columns) + list(df1.columns)[1:]),
    df2.reindex(columns=list(df2.columns) + list(df1.columns)[1:])]).sort_values('ind')

Output:

    ind   date    c stay  res
0  AB10    NaN  NaN  yes  yes
0  AB10  22-01  1.0  NaN  NaN
1  AB10  24-01  2.0  NaN  NaN
1  AB11    NaN  NaN   no  yes
2  AB11  20-01  1.0  NaN  NaN
3  AB11  21-01  2.0  NaN  NaN
4  AB11  22-01  3.0  NaN  NaN

If you want empty strings instead of NaN values, append .fillna(''):

    ind   date    c stay  res
0  AB10              yes  yes
0  AB10  22-01  1.0
1  AB10  24-01  2.0
1  AB11               no  yes
2  AB11  20-01  1.0
3  AB11  21-01  2.0
4  AB11  22-01  3.0

A slightly shorter alternative would be to use concat() first and then order the columns:

df1 = pd.concat([df1, df2])[list(df2.columns) + list(df1.columns)[1:]].sort_values('ind').fillna('')
Answered By: constantstranger
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.