How to combine DataFrame rows and collect their string column into a list?
Question:
Say I have a pandas DataFrame:
index  name  A
0      one   a
1      two   a
2      one   b
3      two   a
How can I merge rows with an identical 'name' so that the new column A is a list of the distinct A values associated with each 'name'? The output would be:
index  name  A
0      one   [a, b]
1      two   [a]
Answers:
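Updated answer:
This example uses a frame with two string columns, A and B. It joins them into one string per row, then collects the distinct merged strings for each name.
import pandas as pd

df = pd.DataFrame({
    'name': ['one', 'two', 'one', 'two'],
    'A': ['a', 'a', 'a', 'a'],
    'B': ['a', 'b', 'a', 'a']
})
# Join A and B into a single space-separated string per row.
df['COLUMN_MERGE'] = df['A'].astype(str) + ' ' + df['B'].astype(str)
# Collect distinct merged strings per name; set() does not guarantee list order.
df = df.groupby(['name'])['COLUMN_MERGE'].apply(set).apply(list).reset_index(name='Distinct_Merge')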
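For the single-column frame shown in the question, a shorter route may be enough. The following is a minimal sketch, not part of the answer above. It assumes the goal is the distinct A values per name in first-seen order, which is what the expected output suggests. The variable name `result` is only illustrative.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({
    'name': ['one', 'two', 'one', 'two'],
    'A': ['a', 'a', 'b', 'a'],
})

# Drop repeated (name, A) pairs first so each list holds distinct values,
# then gather the remaining A values of every group into a Python list.
result = (
    df.drop_duplicates(['name', 'A'])
      .groupby('name', sort=False)['A']
      .agg(list)
      .reset_index()
)
print(result)
```

Unlike the `set` approach, dropping duplicates before grouping keeps the values in the order they first appear. If repeated values should be kept, omitting the `drop_duplicates` call would give `[a, a]` for 'two'.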