pandas.DataFrame: How to merge rows with a common column value in the same pandas.DataFrame

Question

I have a pandas.DataFrame that looks like this:

index	projectid	question	answer
0	1	‘q1’	‘str1’
1	1	‘q2’	‘str2’
2	1	‘q3’	‘str3’
3	2	‘q1’	‘str4’
4	2	‘q3’	‘str6’

And I would like to reshape it like this:

index	projectid	question1	answer1	question2	answer2	question3	answer3
0	1	‘q1’	‘str1’	‘q2’	‘str2’	‘q3’	‘str3’
1	2	‘q1’	‘str4’	None	None	‘q3’	‘str6’

Not every project has the same number of questions, but the questions are shared across projects. So when a specific question isn't in a project, I would like the cells to be filled with None values.

I didn't find a way to do it with join or concat, though I don't really know how to use them properly.

I would like to improve my pandas skills, so my question is:
Is there a way to do this with pandas operations, or is processing my DataFrame manually with iterrows the only way?

Thank you!

Asked By: AxRab

Answer 1

You can use cumcount before pivoting to get your suffixes:

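# Number the rows within each project (1, 2, 3, ...) to use as column suffixes
df['idx'] = df.groupby('projectid').cumcount() + 1
# Spread each project's rows across columns, one column set per suffix
df = df.pivot(index='projectid', columns='idx')[['question', 'answer']]
# Flatten the (name, suffix) MultiIndex columns into 'question1', 'answer1', ...
df.columns = [''.join(map(str, col)) for col in df.columns]
print(df)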

Output:

          question1 question2 question3 answer1 answer2 answer3
projectid
1              'q1'      'q2'      'q3'  'str1'  'str2'  'str3'
2              'q1'      'q3'       NaN  'str4'  'str6'     NaN
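
To see where the suffixes come from, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (values copied from the question's table, without the quote characters) and prints the per-project counter that cumcount produces. The variable name `idx` is only for illustration.

```python
import pandas as pd

# Sample data as shown in the question
df = pd.DataFrame({
    "projectid": [1, 1, 1, 2, 2],
    "question": ["q1", "q2", "q3", "q1", "q3"],
    "answer": ["str1", "str2", "str3", "str4", "str6"],
})

# cumcount numbers the rows inside each projectid group starting at 0,
# so adding 1 gives the 1-based suffix used in the column names.
idx = df.groupby("projectid").cumcount() + 1
print(idx.tolist())  # [1, 2, 3, 1, 2]
```

Because the counter restarts for every project, the suffix reflects row position within the project rather than the question label. That is why 'q3' of project 2 ends up under question2 in the output.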

Answered By: Tranbi
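
Note that in the output of Answer 1, 'q3' of project 2 lands in question2, whereas the question asks for it to stay under question3 with the question2 cells left empty. If that alignment matters, one possible variation (a sketch, not part of the original answer) is to derive the suffix from the question label instead of the row position. The sketch below assumes that sorting the distinct labels gives the desired column order and that each question appears at most once per project; `wide`, `codes` and `labels` are illustrative names.

```python
import pandas as pd

# Sample data as shown in the question
df = pd.DataFrame({
    "projectid": [1, 1, 1, 2, 2],
    "question": ["q1", "q2", "q3", "q1", "q3"],
    "answer": ["str1", "str2", "str3", "str4", "str6"],
})

# Assumption: the suffix should follow the question label, not the row position.
# factorize with sort=True maps the sorted distinct labels to 0, 1, 2, ...
codes, labels = pd.factorize(df["question"], sort=True)
df["idx"] = codes + 1

# Same pivot and column flattening as in Answer 1
wide = df.pivot(index="projectid", columns="idx")[["question", "answer"]]
wide.columns = ["".join(map(str, col)) for col in wide.columns]
print(wide)
```

With this variation, project 2 keeps 'q3' and 'str6' under question3 and answer3, and the question2 and answer2 cells are left as missing values (pandas shows them as NaN rather than None). pivot raises an error if a project contains the same question twice, so duplicated questions would need to be handled first.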
