Fill sparse row of a dataframe with the existing values in the column
Question:
I have a dataframe of the following form:
import numpy as np
import pandas as pd
CurrentDf = pd.DataFrame(np.array([[5, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0], [7, 0, 0, 0, 1, 0, 2, 0, 0, 0, 0], [8, 0, 0, 0, 1, 0, 3, 0, 0, 0, 0], [7, 0, 1, 0, 4, 0, 0, 0, 0, 0, 0], [5, 0, 1, 0, 5, 0, 0, 0, 0, 0, 0], [5, 1, 0, 0, 3, 0, 0, 0, 0, 0, 0]]),
columns=['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10'])
and I would like to transform it into this one:
DesiredDf = pd.DataFrame(np.array([[5, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0], [7, 0, 0, 0, 1, 1, 2, 2, 0, 0, 0], [8, 0, 0, 0, 1, 1, 3, 3, 3, 0, 0], [7, 0, 1, 1, 4, 4, 4, 4, 0, 0, 0], [5, 0, 1, 1, 5, 5, 0, 0, 0, 0, 0], [5, 1, 1, 1, 3, 3, 0, 0, 0, 0, 0]]),
columns=['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10'])
For each row, a column whose value is zero should take the value of the previous column, up to and including the column whose number is given in column 0.
Answers:
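NaNs are easier to work with than 0s:
# Mark zeros as missing so that ffill can propagate the last non-zero value
df = df.replace(0, np.nan)
# Forward-fill along each row, then turn the remaining NaNs back into 0
df = df.ffill(axis=1).fillna(0).astype(int)
print(df)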
Output:
   1  2  3  4  5  6
0  0  0  0  0  1  1
1  0  1  1  1  1  1
2  1  1  2  2  2  2
3  0  0  1  1  2  2
4  2  1  1  3  3  3
5  0  0  2  2  5  5
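The snippet above forward-fills each row all the way to its last column. Applied directly to CurrentDf, it would also carry the value in column '0' into the columns after it. The question asks for the fill to stop at the column number stored in column '0', so here is a minimal sketch of one way to add that limit. It assumes the column labels are the strings '0' to '10', as in the question. The helper name fill_up_to_limit is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

CurrentDf = pd.DataFrame(
    np.array([
        [5, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
        [7, 0, 0, 0, 1, 0, 2, 0, 0, 0, 0],
        [8, 0, 0, 0, 1, 0, 3, 0, 0, 0, 0],
        [7, 0, 1, 0, 4, 0, 0, 0, 0, 0, 0],
        [5, 0, 1, 0, 5, 0, 0, 0, 0, 0, 0],
        [5, 1, 0, 0, 3, 0, 0, 0, 0, 0, 0],
    ]),
    columns=[str(i) for i in range(11)],
)


def fill_up_to_limit(df, limit_col="0"):
    """Forward-fill zeros along each row, but only up to the column
    number stored in `limit_col` (hypothetical helper, not a pandas API)."""
    limits = df[limit_col].to_numpy()
    # Exclude the limit column so its value is not propagated
    values = df.drop(columns=limit_col)
    # Same idea as the answer: zeros -> NaN, ffill along rows, NaN -> 0
    filled = values.replace(0, np.nan).ffill(axis=1).fillna(0).astype(int)
    # Assumption: the remaining column labels are numeric strings ('1', '2', ...)
    positions = values.columns.astype(int).to_numpy()
    # keep[i, j] is True when column j is at or before row i's limit
    keep = positions[None, :] <= limits[:, None]
    filled = filled.where(keep, 0)
    # Put the limit column back in front
    return pd.concat([df[[limit_col]], filled], axis=1)


result = fill_up_to_limit(CurrentDf)
print(result)
```

On the question's data this yields the same values as DesiredDf. If the column labels were not numeric strings, the position array would have to be built differently, for example from np.arange.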