How to unify string column values based on a unique ID in Python
Question:
How can I unify DataFrame column values based on a unique ID?
Input:
import pandas as pd
d = {'code': pd.Series(['VC_1', 'VC_1', 'BN_3', 'BN_4']),
     'value': pd.Series(['LTC Limited', 'LTC LTD', 'AMZ ENT', 'BBM CROP'])}
df = pd.DataFrame(d)
print(df)
code value
0 VC_1 LTC Limited
1 VC_1 LTC LTD
2 BN_3 AMZ ENT
3 BN_4 BBM CROP
Expected output:
code value
0 VC_1 LTC Limited
1 VC_1 LTC Limited
2 BN_3 AMZ ENT
3 BN_4 BBM CROP
In the value column, I want every row that shares the same code to carry a single name: either the first occurrence for that code or any one of its names.
Answers:
Try this:
df['value'] = df.groupby(['code'])['value'].transform('first')
code value
0 VC_1 LTC Limited
1 VC_1 LTC Limited
2 BN_3 AMZ ENT
3 BN_4 BBM CROP
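To make it clearer why transform is used here rather than a plain aggregation, the following is a minimal sketch on the question's data. The variable names per_code and unified are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'code': ['VC_1', 'VC_1', 'BN_3', 'BN_4'],
    'value': ['LTC Limited', 'LTC LTD', 'AMZ ENT', 'BBM CROP'],
})

# A plain aggregation returns one row per code
# (sort=False keeps the codes in order of appearance).
per_code = df.groupby('code', sort=False)['value'].first()

# transform broadcasts the per-group result back onto every original row,
# so the result has the same length and index as df and can be assigned directly.
unified = df.groupby('code')['value'].transform('first')

print(per_code)
print(unified)
```

Because unified is aligned with the original index, assigning it to df['value'] overwrites each row with the first name seen for its code.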
What if the value I want is not the first one? What if I need it to always choose the 'value' that is in a specific format?
For example:
code value
0 VC_1 LTC| Limited
1 VC_1 limited
2 BN_3 ent **
3 BN_3 AMZ| ENT
4 BN_4 BBM CROP
Desired output:
code value
0 VC_1 LTC| Limited
1 VC_1 LTC| Limited
2 BN_3 AMZ| ENT
3 BN_3 AMZ| ENT
4 BN_4 BBM CROP
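One possible way to handle this follow-up, sketched under an assumption: the "specific format" is taken to mean a value that contains a pipe character ('|'), since that is what the preferred rows in the example have in common. Groups with no such value (like BN_4) fall back to their first value. The names preferred, canonical and fallback are hypothetical; the format test would need adjusting to the real rule.

```python
import pandas as pd

df = pd.DataFrame({
    'code': ['VC_1', 'VC_1', 'BN_3', 'BN_3', 'BN_4'],
    'value': ['LTC| Limited', 'limited', 'ent **', 'AMZ| ENT', 'BBM CROP'],
})

# Assumption: a value is "in the specific format" when it contains a pipe.
# Values that do not match are masked out as NaN.
preferred = df['value'].where(df['value'].str.contains('|', regex=False))

# 'first' skips missing values, so each row gets the first preferred value
# of its code (or a missing value if the group has none).
canonical = preferred.groupby(df['code']).transform('first')

# Fallback for groups without any preferred value: the first value in the group.
fallback = df.groupby('code')['value'].transform('first')

df['value'] = canonical.fillna(fallback)
print(df)
```

The same pattern works with any other boolean test (for example a regular expression passed to str.contains or str.match) in place of the pipe check.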