Reformat/pivot pandas dataframe
Question:
For each row, I want that row's index appended to the column names as a suffix, and the whole frame reshaped from x rows * y columns to 1 row * x*y columns.
import pandas as pd
df = pd.DataFrame(data=[['Jon', 21, 1.77,160],['Jane',44,1.6,130]],columns=['name','age', 'height','weight'])
want = pd.DataFrame(data=[['Jon', 21, 1.77,160,'Jane',44,1.6,130]],columns=['name_0','age_0', 'height_0','weight_0','name_1','age_1', 'height_1','weight_1'])
# original df
   name  age  height  weight
0   Jon   21    1.77     160
1  Jane   44    1.60     130
# desired df - want
   name_0  age_0  height_0  weight_0  name_1  age_1  height_1  weight_1
0     Jon     21      1.77       160    Jane     44       1.6       130
I tried df.unstack().to_frame().T, and while it reduces the rows to one, it creates a MultiIndex on the columns, so it is not ideal:
   name        age      height       weight
      0     1    0   1       0    1       0    1
0   Jon  Jane   21  44    1.77  1.6     160  130
I don’t think pivot table will work here.
Answers:
Using stack and flattening the MultiIndex:
out = df.stack().swaplevel().to_frame().T
out.columns = out.columns.map(lambda x: f'{x[0]}_{x[1]}')
Output:
   name_0  age_0  height_0  weight_0  name_1  age_1  height_1  weight_1
0     Jon     21      1.77       160    Jane     44       1.6       130
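One caveat with this approach: stack() funnels strings and numbers into a single Series, so every column of the one-row result comes back with object dtype. Below is a minimal sketch of the same idea wrapped in a helper, with infer_objects() used to recover the numeric dtypes. The helper name flatten_rows is made up for illustration and is not part of the answer above.

```python
import pandas as pd


def flatten_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: reshape x rows * y columns into 1 row * x*y columns."""
    # stack() yields a Series indexed by (row label, column name);
    # swaplevel() turns that into (column name, row label).
    out = df.stack().swaplevel().to_frame().T
    # Flatten the two-level column index into names like 'age_0'.
    out.columns = [f"{col}_{row}" for col, row in out.columns]
    # Stacking mixed types produces object columns; let pandas re-infer them.
    return out.infer_objects()


df = pd.DataFrame(
    data=[["Jon", 21, 1.77, 160], ["Jane", 44, 1.6, 130]],
    columns=["name", "age", "height", "weight"],
)
out = flatten_rows(df)
print(out.dtypes)
```

Whether the dtype recovery matters depends on what happens next; for display only, the original two-liner is enough.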
If you want a more manual method using a new dataframe, here is an alternate solution:
df = pd.DataFrame(data=[['Jon', 21, 1.77, 160], ['Jane', 44, 1.6, 130]], columns=['name', 'age', 'height', 'weight'])
new_df = pd.DataFrame()
for i in range(len(df)):
    # Build a one-row frame for row i, suffixing each column name with i
    row = pd.DataFrame([[df['name'].iloc[i], df['age'].iloc[i], df['height'].iloc[i], df['weight'].iloc[i]]],
                       columns=['name_' + str(i), 'age_' + str(i), 'height_' + str(i), 'weight_' + str(i)])
    # Append it to the right of the columns collected so far
    new_df = pd.concat([new_df, row], axis=1)
Output:
   name_0  age_0  height_0  weight_0  name_1  age_1  height_1  weight_1
0     Jon     21      1.77       160    Jane     44       1.6       130
Hope that helps 🙂
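The loop above hard-codes the four column names. Here is a sketch of a variant that should work for any set of columns, assuming the positional row number is an acceptable suffix. Using add_suffix and a single pd.concat is my substitution here, not something the answer above proposes.

```python
import pandas as pd

df = pd.DataFrame(
    data=[["Jon", 21, 1.77, 160], ["Jane", 44, 1.6, 130]],
    columns=["name", "age", "height", "weight"],
)

# One single-row frame per original row, each with its position as a suffix.
# reset_index(drop=True) gives every piece the index [0] so they line up.
pieces = [
    df.iloc[[i]].reset_index(drop=True).add_suffix(f"_{i}")
    for i in range(len(df))
]
# A single concat along the columns replaces the repeated concat in a loop.
new_df = pd.concat(pieces, axis=1)
print(new_df)
```

Because each piece is sliced from the original frame rather than rebuilt from scalars, the per-column dtypes are preserved.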
Mostly like @mozway’s answer:
s = df.stack()
pd.DataFrame(data=s.values.reshape(1, -1),
             columns=s.index.map(lambda x: f"{x[1]}_{x[0]}"))
Output:
   name_0  age_0  height_0  weight_0  name_1  age_1  height_1  weight_1
0     Jon     21      1.77       160    Jane     44       1.6       130
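Since only the flattened values and the flattened names are needed, the stack() step could arguably be skipped. A rough sketch under that assumption, reshaping the underlying array directly and building the names with a comprehension:

```python
import pandas as pd

df = pd.DataFrame(
    data=[["Jon", 21, 1.77, 160], ["Jane", 44, 1.6, 130]],
    columns=["name", "age", "height", "weight"],
)

# Row-major flattening keeps each original row's values together.
values = df.to_numpy().reshape(1, -1)
# Build the names in the same row-major order: every column for the
# first row label, then every column for the next one, and so on.
columns = [f"{col}_{idx}" for idx in df.index for col in df.columns]
out = pd.DataFrame(values, columns=columns)
print(out)
```

As with the stack() versions, the mixed-type array is object dtype, so the resulting columns are object too.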