How to make a Rubik's cube-like DataFrame transformation
Question:
I’m trying to make a Rubik’s cube-like transformation (see the input and output below):
# INPUT :
COLX COLY COLZ
0 A C NaN
1 C B A
2 C NaN B
# OUTPUT :
COLX COLY COLZ
0 A Missing C
1 A B C
2 Missing B C
Basically, I need to sort the values of each row alphabetically, putting Missing in the slot of any value the row lacks.
Here is the DataFrame I’m using:
import pandas as pd
import numpy as np
df = pd.DataFrame({'COLX': ['A', 'C', 'C'], 'COLY': ['C', 'B', np.nan], 'COLZ': [np.nan, 'A', 'B']})
Are there any suggestions?
Should I use pandas.DataFrame.shift? If so, how should I proceed?
Answers:
Let’s stack, then pivot:
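# Long format, one row per non-null cell (dropna guards against stack() keeping NaN)
out = (df.stack().dropna().reset_index(name='value')
         .assign(col=lambda x: x['value'])   # copy of the value, used as the pivot column key
         .pivot(index='level_0', columns='col', values='value')
         .fillna('Missing')                  # values absent from a row
      )
# pivot sorts the keys (A, B, C); put the original column names back
out.columns = df.columns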
Output:
COLX COLY COLZ
level_0
0 A Missing C
1 A B C
2 Missing B C
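To see why this works, it can help to look at the intermediate frames. The following is a minimal sketch of the same stack-then-pivot idea split into steps; the names long_df, wide_df, row and source_col are illustrative only and not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'COLX': ['A', 'C', 'C'],
    'COLY': ['C', 'B', np.nan],
    'COLZ': [np.nan, 'A', 'B'],
})

# Step 1: long format, one row per non-null cell of df.
long_df = (df.stack().dropna()
             .rename_axis(['row', 'source_col'])   # illustrative level names
             .reset_index(name='value'))

# Step 2: duplicate the value so it can serve as the pivot column key.
long_df['col'] = long_df['value']

# Step 3: pivot on the value itself. Every 'A' lands in column 'A',
# every 'B' in column 'B', and so on; absent combinations become NaN.
wide_df = long_df.pivot(index='row', columns='col', values='value')

# Step 4: fill the gaps and restore the original column names.
out = wide_df.fillna('Missing')
out.columns = df.columns
print(out)
```

Note that the final renaming only lines up because the data contains exactly as many distinct values (A, B, C) as df has columns.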
Update per request (label each placeholder with the value that is missing):
out = (df.stack().dropna().reset_index(name='value')
         .assign(col=lambda x: x['value'])
         .pivot(index='level_0', columns='col', values='value')
      )
# Fill each gap with 'Missing_' + the pivot column key (A, B or C), broadcast across rows
out[:] = np.where(out.isna(), ['Missing_' + out.columns], out)
out.columns = df.columns
Output:
COLX COLY COLZ
level_0
0 A Missing_B C
1 A B C
2 Missing_A B C
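If you want both variants behind one call, here is a rough sketch that skips the pivot and builds each row directly. The function name align_by_value and its parameters fill and tag_missing are made up for this example, and it assumes, like the answers above, that the frame holds exactly as many distinct values as it has columns.

```python
import numpy as np
import pandas as pd

def align_by_value(df, fill='Missing', tag_missing=False):
    """Hypothetical helper: put each value in the slot given by its sorted rank."""
    # Sorted distinct non-null values across the whole frame, e.g. ['A', 'B', 'C'].
    values = sorted(df.stack().dropna().unique())
    if len(values) != df.shape[1]:
        raise ValueError('expected as many distinct values as columns')
    rows = []
    for _, row in df.iterrows():
        present = set(row.dropna())
        # Keep the value if the row has it, otherwise emit the placeholder.
        rows.append([
            v if v in present else (f'{fill}_{v}' if tag_missing else fill)
            for v in values
        ])
    return pd.DataFrame(rows, index=df.index, columns=df.columns)

df = pd.DataFrame({
    'COLX': ['A', 'C', 'C'],
    'COLY': ['C', 'B', np.nan],
    'COLZ': [np.nan, 'A', 'B'],
})

plain = align_by_value(df)                     # placeholders are 'Missing'
tagged = align_by_value(df, tag_missing=True)  # placeholders are 'Missing_<value>'
print(plain)
print(tagged)
```

This loops over rows in Python, so it will likely be slower than stack/pivot on large frames; the upside is that it keeps the original index and fails loudly when the one-value-per-column assumption does not hold.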