Can I avoid that the join column of the right data frame in a pandas merge appears in the output?

Question:

I am merging two data frames with pandas. I would like to avoid that, when joining, the output includes the join column of the right table.

Example:

import pandas as pd

age = [['tom', 10], ['nick', 15], ['juli', 14]]   
df1 = pd.DataFrame(age, columns = ['Name', 'Age']) 

toy = [['tom', 'GIJoe'], ['nick', 'car']]   
df2 = pd.DataFrame(toy, columns = ['Name_child', 'Toy'])

df = pd.merge(df1,df2,left_on='Name',right_on='Name_child',how='left')

df.columns will give the output Index(['Name', 'Age', 'Name_child', 'Toy'], dtype='object'). Is there an easy way to obtain Index(['Name', 'Age', 'Toy'], dtype='object') instead? I can drop the column afterwards of course like this del df['Name_child'], but I’d like my code to be as short as possible.

Asked By: koteletje

||

Answers:

Set the index of the second dataframe to "Name_child". If you do this in the merge statement the columns in df2 remain unchanged.

df = pd.merge(df1,df2.set_index('Name_child'),left_on='Name',right_index=True,how='left')

This ouputs the correct columns:

df

Name    Age Toy
0   tom 10  GIJoe
1   nick    15  car
2   juli    14  NaN
df.columns

df.columns
Index(['Name', 'Age', 'Toy'], dtype='object')
Answered By: forgetso

Based on @mgc comments, you don’t have to rename the columns of df2. Just you pass df2 to merge function with renamed columns. df2 column names will remain as it is.

df = pd.merge(df1,df2.rename(columns={'Name_child': 'Name'}),on='Name', how='left')

df
    Name    Age Toy
0   tom     10  GIJoe
1   nick    15  car
2   juli    14  NaN

df.columns
Index(['Name', 'Age', 'Toy'], dtype='object')

df2.columns
Index(['Name_child', 'Toy'], dtype='object')
Answered By: ggaurav

Seems to be even simpler to drop the column right after.

df = (pd.merge(df1,df2,left_on='Name',right_on='Name_child',how='left')
      .drop('Name_child', axis=1))
#----------------
import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.    
Answered By: gregV
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.