Copy column value from a dataframe to another if values are equal
Question:
I have two dataframes like these (a simplified example, because my real dataframes are more complex):
import pandas as pd
lst_p = [['2', 0], ['3', 1], ['4', 0], ['5', 0]]
df_p = pd.DataFrame(lst_p, columns=['id', 'redness'])
lst_c = [['apple', 2], ['orange', 2], ['banana', 3], ['kiwi', 4], ['cherry', 5]]
df_c = pd.DataFrame(lst_c, columns=['name', 'id'])
My two dataframes don't have the same length.
As you can see, some 'id' values appear twice in df_c (id=2).
I would like to create a new column in df_c that copies the 'redness' value from df_p wherever 'id' in df_c equals 'id' in df_p.
I don't know if that's very clear...
Thanks a lot!
Answers:
You can convert the two columns of df_p to a dictionary, then use a lambda to look up the redness for each id and create the new column.
Code:
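# Build the id -> redness lookup once instead of rebuilding it for every row
redness_by_id = pd.Series(df_p.redness.values, index=df_p.id).to_dict()
# df_p ids are strings, so cast each df_c id to str before the lookup
df_c['redness'] = df_c['id'].apply(lambda x: redness_by_id[str(x)])
df_c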
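The same lookup idea can also be written with Series.map instead of apply and a lambda. This is only a sketch on the example data from the question; the name redness_by_id is made up for illustration, and it assumes the ids in df_p are unique and can all be cast to int.

```python
import pandas as pd

lst_p = [['2', 0], ['3', 1], ['4', 0], ['5', 0]]
df_p = pd.DataFrame(lst_p, columns=['id', 'redness'])
lst_c = [['apple', 2], ['orange', 2], ['banana', 3], ['kiwi', 4], ['cherry', 5]]
df_c = pd.DataFrame(lst_c, columns=['name', 'id'])

# Lookup Series: index = id cast to int (to match df_c), values = redness.
# redness_by_id is a hypothetical name, not something from the question.
redness_by_id = pd.Series(df_p['redness'].values, index=df_p['id'].astype(int))

# map() looks up every id in the Series; an id missing from df_p
# would give NaN instead of raising a KeyError.
df_c['redness'] = df_c['id'].map(redness_by_id)
print(df_c)
```

Unlike the dictionary lookup, this does not fail on an id that is missing from df_p, which may or may not be what you want.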
A simple merge will do the trick:
One issue is that in one dataframe your id is of type string, and in the other dataframe the id is of type int.
The easiest way to resolve this is to convert the string to int before the merge, and convert back if so desired.
Code:
df_p.id = df_p.id.astype(int)
df_c = pd.merge(df_c, df_p, on=['id'], how='left')
print(df_c)
Output:
     name  id  redness
0   apple   2        0
1  orange   2        0
2  banana   3        1
3    kiwi   4        0
4  cherry   5        0
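For anyone who wants to run the merge approach end to end, here is a self-contained sketch on the example data. It differs from the code above in two ways that are my own assumptions, not part of the answer: it converts the ids on a copy (df_p_int, a made-up name) so df_p keeps its string ids, and it passes validate='many_to_one' so pandas raises an error if df_p ever contains the same id twice.

```python
import pandas as pd

lst_p = [['2', 0], ['3', 1], ['4', 0], ['5', 0]]
df_p = pd.DataFrame(lst_p, columns=['id', 'redness'])
lst_c = [['apple', 2], ['orange', 2], ['banana', 3], ['kiwi', 4], ['cherry', 5]]
df_c = pd.DataFrame(lst_c, columns=['name', 'id'])

# Convert the ids on a copy; df_p itself keeps its string ids.
# df_p_int is a hypothetical name used only in this sketch.
df_p_int = df_p.assign(id=df_p['id'].astype(int))

# how='left' keeps every row of df_c, including the repeated id=2.
# validate='many_to_one' checks that each id appears at most once in df_p_int.
merged = df_c.merge(df_p_int, on='id', how='left', validate='many_to_one')
print(merged)
```

With how='left', rows of df_c whose id is not in df_p would still be kept, with NaN in the redness column.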