How to pivot and stack a dataframe in pandas without group by while having duplicate values in the pivot?

Question:

I have a dataframe df that looks like this:

ref      text   id     
a        zz     12eia
a        yy     radf02
b        aa     a8adf
b        bb     2022a

I am trying to rotate this dataframe so that it looks like the one below: the values in column ref become the column names, and the values in text become the values under those columns. I don't need the 'id' column:

a     b     
zz    aa
yy    bb 

I tried this line, but I cannot get the result I want without involving the id column:

df_rotated = df.pivot_table(index='ref', values='text', columns='id', aggfunc='first')

The data collapses and the result is not what I want. What am I doing wrong?

Asked By: RustyShackleford


Answers:

You need to create an appropriate index, which you can do using groupby and .cumcount.
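To see why the extra index is needed, here is a minimal sketch (not part of the original answer, and assuming pandas 1.1 or later, where `index` may be omitted from `pivot`). Pivoting on the default row index leaves each value on its own row, while `cumcount` gives the rows of each `ref` group a shared position to line up on. The names `staggered` and `position` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'ref': ['a', 'a', 'b', 'b'],
                   'text': ['zz', 'yy', 'aa', 'bb']})

# With no shared row label, pivot keeps the original index 0..3,
# so every value lands on its own row and the gaps become NaN.
staggered = df.pivot(columns='ref', values='text')
print(staggered.shape)        # (4, 2)

# cumcount numbers the rows inside each 'ref' group, starting at 0,
# which gives 'a' and 'b' matching row positions.
position = df.groupby('ref').cumcount()
print(position.tolist())      # [0, 1, 0, 1]
```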

Here I create the required index:

df['ind'] = df.groupby(['ref']).cumcount()

Which looks like this:

  ref text      id  ind
0   a   zz   12eia    0
1   a   yy  radf02    1
2   b   aa   a8adf    0
3   b   bb   2022a    1

You can then pivot the dataframe on that index with df.pivot, as in the following code:

Code:

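import pandas as pd

df = pd.DataFrame({'ref': ['a', 'a', 'b', 'b'],
                   'text': ['zz', 'yy', 'aa', 'bb'],
                   'id': ['12eia', 'radf02', 'a8adf', '2022a']})

# Number the rows within each 'ref' group (0, 1, ...) to get a unique pivot index
df['ind'] = df.groupby('ref').cumcount()

# Pivot on the new index, then drop it so only the 'a' and 'b' columns remain
df_rotated = df.pivot(index='ind', columns='ref', values='text').reset_index(drop=True)
print(df_rotated)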

Output:

ref   a   b
0    zz  aa
1    yy  bb
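
A possible variation, offered as a sketch rather than as part of the original answer: the same reshaping can be done with `set_index` and `unstack`, which avoids adding a helper column to `df`. The names `pos` and `wide` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'ref': ['a', 'a', 'b', 'b'],
                   'text': ['zz', 'yy', 'aa', 'bb'],
                   'id': ['12eia', 'radf02', 'a8adf', '2022a']})

# Position of each row within its 'ref' group: 0, 1, ... per group
pos = df.groupby('ref').cumcount()

# Index by (position, ref), keep only 'text', then move 'ref' into the columns
wide = df.set_index([pos, 'ref'])['text'].unstack('ref')

# Optional: drop the 'ref' label that pandas prints above the row index
wide.columns.name = None
print(wide)
```

If the groups have different sizes, the shorter columns are padded with NaN.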
Answered By: ScottC