Pandas column bind (cbind) two data frames
Question:
I’ve got a dataframe df_a with id information:
    unique_id lacet_number
15    5570613  TLA-0138365
24    5025490  EMP-0138757
36    4354431  DXN-0025343
and another dataframe df_b, with the same number of rows, that I know correspond to the rows in df_a:
     latitude  longitude
0  -93.193560  31.217029
1  -93.948082  35.360874
2 -103.131508  37.787609
What I want to do is simply concatenate the two horizontally (similar to cbind in R) and get:
   unique_id lacet_number    latitude  longitude
0    5570613  TLA-0138365  -93.193560  31.217029
1    5025490  EMP-0138757  -93.948082  35.360874
2    4354431  DXN-0025343 -103.131508  37.787609
What I have tried:
df_c = pd.concat([df_a, df_b], axis=1)
which gives me an outer join.
    unique_id lacet_number    latitude  longitude
0         NaN          NaN  -93.193560  31.217029
1         NaN          NaN  -93.948082  35.360874
2         NaN          NaN -103.131508  37.787609
15    5570613  TLA-0138365         NaN        NaN
24    5025490  EMP-0138757         NaN        NaN
36    4354431  DXN-0025343         NaN        NaN
The problem is that the indices of the two dataframes do not match. I read the documentation for pandas.concat and saw that there is an option ignore_index. But that only applies to the concatenation axis (in my case, the columns), so it is not the right choice for me. So my question is: is there a simple way to achieve this?
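For reference, the behaviour described in the question can be reproduced with a short sketch. The two frames are rebuilt by hand from the tables above, and the names outer and relabelled are only illustrative. It suggests that, with axis=1, ignore_index=True relabels the columns and leaves the mismatched row labels untouched.

```python
import pandas as pd

# Frames rebuilt by hand from the tables in the question.
df_a = pd.DataFrame(
    {
        "unique_id": [5570613, 5025490, 4354431],
        "lacet_number": ["TLA-0138365", "EMP-0138757", "DXN-0025343"],
    },
    index=[15, 24, 36],
)
df_b = pd.DataFrame(
    {
        "latitude": [-93.193560, -93.948082, -103.131508],
        "longitude": [31.217029, 35.360874, 37.787609],
    }
)

# Rows are aligned by label, so every distinct label gets its own row.
outer = pd.concat([df_a, df_b], axis=1)
print(outer.shape)

# ignore_index acts on the concatenation axis, which is the columns here:
# the column names become 0..3, while the rows are still aligned by label.
relabelled = pd.concat([df_a, df_b], axis=1, ignore_index=True)
print(list(relabelled.columns))
print(relabelled.shape)
```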
Answers:
If you’re sure the rows of the two frames correspond by position, you can sidestep index alignment by calling reset_index(drop=True), which replaces the index with the default one starting from 0:
df_c = pd.concat([df_a.reset_index(drop=True), df_b], axis=1)
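A self-contained version of this, as a minimal sketch: the frames are rebuilt by hand from the question, and resetting df_b as well is a defensive variation (not part of the answer above) for the case where neither frame has a default index.

```python
import pandas as pd

# Frames rebuilt by hand from the tables in the question.
df_a = pd.DataFrame(
    {
        "unique_id": [5570613, 5025490, 4354431],
        "lacet_number": ["TLA-0138365", "EMP-0138757", "DXN-0025343"],
    },
    index=[15, 24, 36],
)
df_b = pd.DataFrame(
    {
        "latitude": [-93.193560, -93.948082, -103.131508],
        "longitude": [31.217029, 35.360874, 37.787609],
    }
)

# drop=True discards the old labels instead of keeping them as a new column.
# Both frames end up with the default 0..n-1 index, so concat pairs the rows
# by position.
df_c = pd.concat(
    [df_a.reset_index(drop=True), df_b.reset_index(drop=True)],
    axis=1,
)
print(df_c)
```

Note that the original labels of df_a (15, 24, 36) are lost in df_c; the inputs themselves are not modified.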
DataFrame.join
While concat is fine, it’s simpler to join:
C = A.join(B)
This still assumes aligned indexes, so reset_index as needed. In OP’s example, B’s index is already the default, so we only need to reset A:
C = A.reset_index(drop=True).join(B)
#    unique_id lacet_number    latitude  longitude
# 0    5570613  TLA-0138365  -93.193560  31.217029
# 1    5025490  EMP-0138757  -93.948082  35.360874
# 2    4354431  DXN-0025343 -103.131508  37.787609
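Because join is a left join by default, a right-hand frame with fewer rows would be padded with NaN rather than rejected. One way to guard against that is a small wrapper; the name cbind and the length check are assumptions made for illustration here, not something pandas provides.

```python
import pandas as pd


def cbind(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: bind columns by row position, like R's cbind."""
    if len(left) != len(right):
        raise ValueError(
            f"row counts differ: {len(left)} vs {len(right)}"
        )
    # Reset both indexes so the join pairs rows purely by position.
    return left.reset_index(drop=True).join(right.reset_index(drop=True))


# Frames rebuilt by hand from the tables in the question.
A = pd.DataFrame(
    {
        "unique_id": [5570613, 5025490, 4354431],
        "lacet_number": ["TLA-0138365", "EMP-0138757", "DXN-0025343"],
    },
    index=[15, 24, 36],
)
B = pd.DataFrame(
    {
        "latitude": [-93.193560, -93.948082, -103.131508],
        "longitude": [31.217029, 35.360874, 37.787609],
    }
)

C = cbind(A, B)
print(C)
```

Keep in mind that join raises a ValueError when the two frames share a column name, unless lsuffix or rsuffix is passed.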
You can use set_axis to give one frame the same index labels as the other, then concatenate horizontally or join. Unlike reset_index, this method preserves the index labels of one of the dataframes.
joined_df = pd.concat([df_a.set_axis(df_b.index), df_b], axis=1)
# or using `join`
joined_df = df_a.set_axis(df_b.index).join(df_b)
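The same idea works in the other direction if the labels worth keeping are those of df_a. A minimal sketch, with the frames rebuilt by hand from the question and the name kept_labels chosen only for illustration:

```python
import pandas as pd

# Frames rebuilt by hand from the tables in the question.
df_a = pd.DataFrame(
    {
        "unique_id": [5570613, 5025490, 4354431],
        "lacet_number": ["TLA-0138365", "EMP-0138757", "DXN-0025343"],
    },
    index=[15, 24, 36],
)
df_b = pd.DataFrame(
    {
        "latitude": [-93.193560, -93.948082, -103.131508],
        "longitude": [31.217029, 35.360874, 37.787609],
    }
)

# Give df_b the row labels of df_a (axis=0 is the row axis), then join.
# set_axis raises a ValueError if the number of labels does not match
# the number of rows, which doubles as a sanity check.
kept_labels = df_a.join(df_b.set_axis(df_a.index, axis=0))
print(kept_labels)
```

Here the result keeps 15, 24 and 36 as its row labels instead of 0, 1 and 2.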