Concatenate rows of two dataframes in pandas

Question:

I need to concatenate two dataframes df_a and df_b that have equal number of rows (nRow) horizontally without any consideration of keys. This function is similar to cbind in the R programming language. The number of columns in each dataframe may be different.

The resultant dataframe will have the same number of rows nRow and number of columns equal to the sum of number of columns in both the dataframes. In other words, this is a blind columnar concatenation of two dataframes.

import pandas as pd
dict_data = {'Treatment': ['C', 'C', 'C'], 'Biorep': ['A', 'A', 'A'], 'Techrep': [1, 1, 1], 'AAseq': ['ELVISLIVES', 'ELVISLIVES', 'ELVISLIVES'], 'mz':[500.0, 500.5, 501.0]}
df_a = pd.DataFrame(dict_data)
dict_data = {'Treatment1': ['C', 'C', 'C'], 'Biorep1': ['A', 'A', 'A'], 'Techrep1': [1, 1, 1], 'AAseq1': ['ELVISLIVES', 'ELVISLIVES', 'ELVISLIVES'], 'inte1':[1100.0, 1050.0, 1010.0]}
df_b = pd.DataFrame(dict_data)
Asked By: user1140126

||

Answers:

call concat and pass param axis=1 to concatenate column-wise:

In [5]:

pd.concat([df_a,df_b], axis=1)
Out[5]:
        AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  
0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1   
1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1   
2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1   

  Treatment1  inte1  
0          C   1100  
1          C   1050  
2          C   1010  

There is a useful guide to the various methods of merging, joining and concatenating online.

For example, as you have no clashing columns you can merge and use the indices as they have the same number of rows:

In [6]:

df_a.merge(df_b, left_index=True, right_index=True)
Out[6]:
        AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  
0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1   
1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1   
2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1   

  Treatment1  inte1  
0          C   1100  
1          C   1050  
2          C   1010  

And for the same reasons as above a simple join works too:

In [7]:

df_a.join(df_b)
Out[7]:
        AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  
0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1   
1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1   
2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1   

  Treatment1  inte1  
0          C   1100  
1          C   1050  
2          C   1010  
Answered By: EdChum

Thanks to @EdChum
I was struggling with same problem especially when indexes do not match. Unfortunatly in pandas guide this case is missed (when you for example delete some rows)

import pandas as pd
t=pd.DataFrame()
t['a']=[1,2,3,4]
t=t.loc[t['a']>1] #now index starts from 1

u=pd.DataFrame()
u['b']=[1,2,3] #index starts from 0

#option 1
#keep index of t
u.index = t.index 

#option 2
#index of t starts from 0
t.reset_index(drop=True, inplace=True)

#now concat will keep number of rows 
r=pd.concat([t,u], axis=1)
Answered By: Yury Wallet

If the index labels are different (e.g., if df_a.index == [0, 1, 2] and df_b.index == [10, 20, 30] are True), a straightforward join (or concat or merge) may produce NaN rows. A useful method in that case is set_axis() that coerces the indices to be the same.

concatenated_df = df_a.join(df_b.set_axis(df_a.index))
# or 
concatenated_df = pd.concat([df_a, df_b.set_axis(df_a.index)], axis=1)

If the length of the frames are the same, then you can also assign df_b to df_a. Unlike concat (or join or merge), this alters df_a and doesn’t create a new dataframe.

df_a[df_b.columns] = df_b

# if index labels are different
df_a[df_b.columns] = df_b.set_axis(df_a.index)
Answered By: cottontail
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.