How to specify columns to be merged using pd.merge
Question:
I am trying to use pd.merge, but I don't understand how to specify which column(s) should be added from another DataFrame.
Is it possible?
Below is what I would like to do:
DF1
index | ID | Style |
---|---|---|
0 | 1 | a |
1 | 21 | z |
2 | 7 | e |
DF2
index | ID | Name | Date |
---|---|---|---|
0 | 1 | ART | 09-13-2022 |
1 | 21 | DRAW | 09-13-2022 |
1.1 | 7 | GAME | 09-13-2022 |
… | 5 | GAME | 09-13-2022 |
115 | 8 | GAME | 09-13-2022 |
The output would be:
index | ID | Style | Name |
---|---|---|---|
0 | 1 | a | ART |
1 | 21 | z | DRAW |
2 | 7 | e | GAME |
I have tried:
Output = DF1.merge(DF2, on='ID', how='left')
This works, but it merges all columns of DF2, not only the Name column.
How can I specify one or several columns using merge? In this example, only the Name column should be added to DF1.
Answers:
Slice DF2 to only keep the key(s) and the column(s) to add:
Output = DF1.merge(DF2[['ID', 'Name']], on='ID', how='left')
output:
   index  ID Style  Name
0      0   1     a   ART
1      1  21     z  DRAW
2      2   7     e  GAME
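For anyone who wants to run this, here is a minimal sketch that rebuilds reduced versions of the two frames from the question and applies the sliced merge. It assumes `index` is an ordinary column of DF1 (as the output above suggests) and leaves it out of DF2, since the slice drops it anyway; only the rows whose ID, Name and Date are fully visible in the question are used.

```python
import pandas as pd

# Reduced reconstruction of the question's data (assumption: "index" is a
# regular column in DF1; DF2's own index column is omitted here because the
# slice below would discard it anyway).
DF1 = pd.DataFrame(
    {"index": [0, 1, 2], "ID": [1, 21, 7], "Style": ["a", "z", "e"]}
)
DF2 = pd.DataFrame(
    {
        "ID": [1, 21, 7, 5, 8],
        "Name": ["ART", "DRAW", "GAME", "GAME", "GAME"],
        "Date": ["09-13-2022"] * 5,
    }
)

# Keep only the join key and the column to add, then left-merge onto DF1.
Output = DF1.merge(DF2[["ID", "Name"]], on="ID", how="left")
print(Output)
```

Because DF2 is sliced before the merge, `Date` never reaches the result, and `how='left'` keeps exactly the rows of DF1.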
You can select which columns of a DataFrame take part in the merge within the merge call itself, for example:
Output = DF1.merge(DF2[['ID','Name']], on='ID', how='left')
This passes only the two columns you want into the merge: ‘ID’ is the join key, and because ‘Name’ is the only additional column, it is the only one joined onto DF1.
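The same pattern should extend to several columns: list the key plus every column you want to bring over. The sketch below wraps this in `add_columns`, a hypothetical helper name used only for illustration (it is not part of pandas); the sample frames are reduced versions of those in the question, without the `index` column.

```python
import pandas as pd


def add_columns(left, right, key, columns):
    """Left-merge only `columns` from `right` onto `left`, matching on `key`.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Always include the key, and avoid listing it twice.
    wanted = [key] + [c for c in columns if c != key]
    return left.merge(right[wanted], on=key, how="left")


# Reduced sample data based on the question.
DF1 = pd.DataFrame({"ID": [1, 21, 7], "Style": ["a", "z", "e"]})
DF2 = pd.DataFrame(
    {
        "ID": [1, 21, 7],
        "Name": ["ART", "DRAW", "GAME"],
        "Date": ["09-13-2022"] * 3,
    }
)

one = add_columns(DF1, DF2, "ID", ["Name"])              # adds Name only
several = add_columns(DF1, DF2, "ID", ["Name", "Date"])  # adds Name and Date
```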