How to unify string column values based on a unique ID in Python

Question:

How can I unify DataFrame column values based on a unique ID?

Input:

import pandas as pd

d = {'code': pd.Series(['VC_1', 'VC_1', 'BN_3', 'BN_4']),
     'value': pd.Series(['LTC Limited', 'LTC LTD', 'AMZ ENT', 'BBM CROP'])}

df = pd.DataFrame(d)
print(df)


   code        value
0  VC_1  LTC Limited
1  VC_1      LTC LTD
2  BN_3      AMZ ENT
3  BN_4     BBM CROP

Expected output:

   code        value
0  VC_1  LTC Limited
1  VC_1  LTC Limited
2  BN_3      AMZ ENT
3  BN_4     BBM CROP

In the value column, I want every row that shares the same code to carry a single name: either the first occurrence of the name or any one of the names for that code.

Asked By: naveen kumar


Answers:

Try this:

df['value'] = df.groupby(['code'])['value'].transform('first')

   code        value
0  VC_1  LTC Limited
1  VC_1  LTC Limited
2  BN_3      AMZ ENT
3  BN_4     BBM CROP
Answered By: sushanth
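
For reference, the one-liner above can be run end to end as in the sketch below. It rebuilds the DataFrame from the question and writes the result to a separate column, `value_unified`, which is a hypothetical name chosen here only so the original values stay visible for comparison.

```python
import pandas as pd

df = pd.DataFrame({
    'code': ['VC_1', 'VC_1', 'BN_3', 'BN_4'],
    'value': ['LTC Limited', 'LTC LTD', 'AMZ ENT', 'BBM CROP'],
})

# transform('first') returns a Series aligned with df's index, in which
# every row holds the first value found in that row's 'code' group.
unified = df.groupby('code')['value'].transform('first')

# 'value_unified' is a hypothetical column name (not from the question);
# assign to df['value'] instead to overwrite in place, as in the answer.
df['value_unified'] = unified
print(df)
```

Because `transform` keeps the original index and row order, the result can be assigned straight back to the DataFrame without a merge.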

What if the value to keep is not the first one? What if I need to always choose the ‘value’ that is in a specific format?

For example:

   code         value
0  VC_1  LTC| Limited
1  VC_1       limited
2  BN_3        ent **
3  BN_3      AMZ| ENT
4  BN_4      BBM CROP

How I need it:

   code         value
0  VC_1  LTC| Limited
1  VC_1  LTC| Limited
2  BN_3      AMZ| ENT
3  BN_3      AMZ| ENT
4  BN_4      BBM CROP

Answered By: Stephania V
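
One possible approach to the follow-up above is to pass a custom function to `transform` instead of `'first'`. The post does not define the "specific format", so the sketch below assumes it means "the value contains a `|` character" and falls back to the first value when no row in the group matches. The helper name `pick_preferred` and that matching rule are assumptions made for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'code': ['VC_1', 'VC_1', 'BN_3', 'BN_3', 'BN_4'],
    'value': ['LTC| Limited', 'limited', 'ent **', 'AMZ| ENT', 'BBM CROP'],
})

def pick_preferred(values: pd.Series) -> str:
    """Hypothetical helper: choose one representative value per group.

    Assumption: the preferred format is "contains a '|' character".
    If no value in the group matches, fall back to the group's first value.
    """
    preferred = values[values.str.contains('|', regex=False)]
    return preferred.iloc[0] if not preferred.empty else values.iloc[0]

# transform broadcasts the single value returned for each group
# back onto every row of that group.
df['value'] = df.groupby('code')['value'].transform(pick_preferred)
print(df)
```

To use a different rule, only the condition inside `pick_preferred` needs to change, for example a regular expression passed to `str.contains` or a length check.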