Shifting rows based on current column values in pandas
Question:
I’m trying to shift all rows that have an E in colB to be blank in colA and then the values of colA and colB shifted over 1.
Everything I search on shift or adjusting has to do with moving columns up or down. I haven’t found this solution yet.
Edit: I gave a poor example here. There needs to be 2 unchanged columns in the front of colA. That was the shifting I was having and issue with. Not a standard shift. Again my apologies for the bad example.
df =
col0 col1 colA colB colC
0 Z A 1 2 A
1 Z A 3 E
2 Z A 8 7 B
3 Z A 3 E
4 Z A 7 C
# Desired Output:
df_shifted =
col0 col1 colA colB colC
0 Z A 1 2 A
1 Z A 3 E
2 Z A 8 7 B
3 Z A 3 E
4 Z A 7 C
I’ve tried using Shift commands or even splitting my base data differently. I can’t seem to find a better solution than what I’m looking for here.
This is what I tried but to no avail
df = pd.DataFrame(
{
'colZ' :["z", "z", "z", "z", "z"],
'colX' :["A", "A", "A", "A", "A"],
'colA' :(1,3,8,3,7),
'colB' :(2, "E", 7, "E", "C"),
'colC' :("A", "", "B", "", "")
}
)
df2 = pd.DataFrame()
df2["colZ"] = df["colZ"]
df2["colX"] = df["colX"]
df[df['colB'] == 'E'] = df[df['colB'] == 'E'].shift(1, axis=1).fillna('')
df_shifted = pd.DataFrame()
df_shifted["colZ"] = df2["colZ"]
df_shifted["colX"] = df2["colX"]
df_shifted["colA"] = df["colA"]
df_shifted["colB"] = df["colB"]
df_shifted["colC"] = df["colC"]
# Result from this code
df_shifted =
colZ colX colA colB colC
0 z A 1 2 A
1 z A A 3 E
2 z A 8 7 B
3 z A A 3 E
4 z A 7 C
Also this seems super long for what I’m trying to do.
Answers:
Building on this answer, you can use where
:
df[['colA','colB','colC']] = df[['colC','colA','colB']].where(df['colB'] == "E", df[['colA','colB','colC']].values)
prints:
colA colB colC
0 1 2 A
1 NaN 3 E
2 8 7 B
3 NaN 3 E
4 6 7 C
The conditional shift
operation along the column axis and subsequent filling of NaNs is still (after editing the original question) the most efficient method
df.loc[df['colB'] == 'E', ['colA', 'colB', 'colC']] = df.loc[df['colB']=='E', ['colA', 'colB', 'colC']].shift(1, axis=1).fillna('')
which yields
>df
col0 col1 colA colB colC
0 Z A 1 2 A
1 Z A 3 E
2 Z A 8 7 B
3 Z A 3 E
4 Z A 7 C
I’m trying to shift all rows that have an E in colB to be blank in colA and then the values of colA and colB shifted over 1.
Everything I search on shift or adjusting has to do with moving columns up or down. I haven’t found this solution yet.
Edit: I gave a poor example here. There needs to be 2 unchanged columns in the front of colA. That was the shifting I was having and issue with. Not a standard shift. Again my apologies for the bad example.
df =
col0 col1 colA colB colC
0 Z A 1 2 A
1 Z A 3 E
2 Z A 8 7 B
3 Z A 3 E
4 Z A 7 C
# Desired Output:
df_shifted =
col0 col1 colA colB colC
0 Z A 1 2 A
1 Z A 3 E
2 Z A 8 7 B
3 Z A 3 E
4 Z A 7 C
I’ve tried using Shift commands or even splitting my base data differently. I can’t seem to find a better solution than what I’m looking for here.
This is what I tried but to no avail
df = pd.DataFrame(
{
'colZ' :["z", "z", "z", "z", "z"],
'colX' :["A", "A", "A", "A", "A"],
'colA' :(1,3,8,3,7),
'colB' :(2, "E", 7, "E", "C"),
'colC' :("A", "", "B", "", "")
}
)
df2 = pd.DataFrame()
df2["colZ"] = df["colZ"]
df2["colX"] = df["colX"]
df[df['colB'] == 'E'] = df[df['colB'] == 'E'].shift(1, axis=1).fillna('')
df_shifted = pd.DataFrame()
df_shifted["colZ"] = df2["colZ"]
df_shifted["colX"] = df2["colX"]
df_shifted["colA"] = df["colA"]
df_shifted["colB"] = df["colB"]
df_shifted["colC"] = df["colC"]
# Result from this code
df_shifted =
colZ colX colA colB colC
0 z A 1 2 A
1 z A A 3 E
2 z A 8 7 B
3 z A A 3 E
4 z A 7 C
Also this seems super long for what I’m trying to do.
Building on this answer, you can use where
:
df[['colA','colB','colC']] = df[['colC','colA','colB']].where(df['colB'] == "E", df[['colA','colB','colC']].values)
prints:
colA colB colC
0 1 2 A
1 NaN 3 E
2 8 7 B
3 NaN 3 E
4 6 7 C
The conditional shift
operation along the column axis and subsequent filling of NaNs is still (after editing the original question) the most efficient method
df.loc[df['colB'] == 'E', ['colA', 'colB', 'colC']] = df.loc[df['colB']=='E', ['colA', 'colB', 'colC']].shift(1, axis=1).fillna('')
which yields
>df
col0 col1 colA colB colC
0 Z A 1 2 A
1 Z A 3 E
2 Z A 8 7 B
3 Z A 3 E
4 Z A 7 C