Reshape dataframe from long to wide

Question:

My df:

import pandas as pd

d = {'project_id': [19, 20, 19, 20, 19, 20],
     'task_id': [11, 22, 11, 22, 11, 22],
     'task': ['task_1', 'task_1', 'task_1', 'task_1', 'task_1', 'task_1'],
     'username': ['tom', 'jery', 'tom', 'jery', 'tom', 'jery'],
     'image_id': [101, 202, 303, 404, 505, 606],
     'frame': [0, 0, 9, 8, 11, 11],
     'label': ['foo', 'foo', 'bar', 'xyz', 'bar', 'bar']}
df = pd.DataFrame(data=d)

My df is in long format. Some columns contain duplicate values; only image_id is unique.
I am trying to reshape it to wide format by username, using pd.pivot and pd.merge.
My code:

pd.pivot(df, index=['task','frame','image_id'], columns = 'username', values='label')

My output:
[image: actual output]

I expected (or want to reach):
[image: expected output]

As you can see, I don’t really need image_id in my output, just a summary of which tag each user used per frame.

Asked By: TeoK

||

Answers:

You can add a groupby.first after the pivot:

(pd.pivot(df, index=['task','frame','image_id'],
          columns='username', values='label')
   .groupby(level=['task','frame']).first()
)
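
To see why the extra groupby is needed, it can help to look at the intermediate result. The sketch below rebuilds the sample data from the question (only the columns the reshape uses); the names step1 and step2 are illustrative, not part of the original answer. Because image_id is unique, the pivot alone keeps one row per image with a label in exactly one user column. groupby(...).first() then takes the first non-null label per user within each (task, frame) pair:

```python
import pandas as pd

# Sample data from the question, reduced to the columns used in the reshape.
df = pd.DataFrame({
    "task": ["task_1"] * 6,
    "username": ["tom", "jery", "tom", "jery", "tom", "jery"],
    "image_id": [101, 202, 303, 404, 505, 606],
    "frame": [0, 0, 9, 8, 11, 11],
    "label": ["foo", "foo", "bar", "xyz", "bar", "bar"],
})

# Step 1: pivot with image_id in the index -> one row per image,
# so each row holds a label for only one user.
step1 = df.pivot(index=["task", "frame", "image_id"],
                 columns="username", values="label")

# Step 2: collapse rows that share (task, frame); first() skips missing values.
step2 = step1.groupby(level=["task", "frame"]).first()

print(step1)
print(step2)
```

Passing a list as index to pivot assumes pandas 1.1 or later.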

Or use pivot_table with aggfunc='first':

pd.pivot_table(df, index=['task','frame'],
               columns='username', values='label',
               aggfunc='first')   

Output:

username      jery   tom
task   frame            
task_1 0       foo   foo
       8       xyz  None
       9      None   bar
       11      bar   bar
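
A self-contained way to check the pivot_table variant against the output above, as a sketch. The name wide is hypothetical, and the checks use pd.isna because missing cells may display as NaN or None depending on the pandas version:

```python
import pandas as pd

# Sample data from the question, reduced to the columns used in the reshape.
df = pd.DataFrame({
    "task": ["task_1"] * 6,
    "username": ["tom", "jery", "tom", "jery", "tom", "jery"],
    "image_id": [101, 202, 303, 404, 505, 606],
    "frame": [0, 0, 9, 8, 11, 11],
    "label": ["foo", "foo", "bar", "xyz", "bar", "bar"],
})

# One row per (task, frame), one column per username.
# aggfunc='first' keeps the first label found for each cell.
wide = pd.pivot_table(df, index=["task", "frame"],
                      columns="username", values="label",
                      aggfunc="first")
print(wide)

# Spot-check a filled cell and an empty cell.
print(wide.loc[("task_1", 8), "jery"])          # label jery used on frame 8
print(pd.isna(wide.loc[("task_1", 8), "tom"]))  # tom has no label on frame 8
```

Note that 'first' keeps only one label per user and frame; if a user could tag the same frame more than once, the remaining labels would be dropped silently.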
Answered By: mozway