Pandas DataFrame: replace all non-NaN values with the value of a specific column
Question:
I would like to transform a DataFrame so that every value that is not NaN is replaced with the value of the 'id' column in the same row.
Example:
From
df = pd.DataFrame({'id': ['X', 'Y', 'Z'],
                   'A': [1, np.nan, 0],
                   'B': [0, 0, np.nan],
                   'C': [np.nan, 1, 1]})
to
df = pd.DataFrame({'id': ['X', 'Y', 'Z'],
                   'A': ['X', np.nan, 'Z'],
                   'B': ['X', 'Y', np.nan],
                   'C': [np.nan, 'Y', 'Z']})
Looping over row and column indices would probably be very slow on large DataFrames, so I would prefer a solution that uses pandas functions.
Answers:
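You can build a boolean mask and multiply it by the 'id' strings (True * 'X' gives 'X', False * 'X' gives an empty string), then use where to turn the masked-out cells back into NaN: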
m = df.notna()
out = m.mul(df['id'], axis=0).where(m)
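To see why the multiplication works, here is a minimal plain-Python sketch of what happens to a single column. The names `ids`, `not_nan` and `products` are illustrative and not part of the answer. Python's bool is a subclass of int, so multiplying a string by True keeps it and multiplying by False yields an empty string:

```python
# Values of the 'id' column and the not-NaN flags of column 'A' from the example
ids = ['X', 'Y', 'Z']
not_nan = [True, False, True]

# bool * str: True behaves like 1, False like 0
products = [flag * value for flag, value in zip(not_nan, ids)]
print(products)  # ['X', '', 'Z']
```

The empty strings are the reason the answer ends with `.where(m)`, which puts NaN back wherever the mask is False.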
Or with numpy:
import numpy as np
m = df.notna()
out = pd.DataFrame(np.where(m,
                            np.repeat(df['id'].to_numpy()[:, None],
                                      df.shape[1], axis=1),
                            df),
                   index=df.index, columns=df.columns)
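A possibly simpler variant, sketched under the assumption that relying on NumPy broadcasting is acceptable here: a column vector of ids with shape (rows, 1) broadcasts against the (rows, columns) mask, so the explicit `np.repeat` should not be needed. The names `mask`, `ids` and `out_broadcast` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': ['X', 'Y', 'Z'],
                   'A': [1, np.nan, 0],
                   'B': [0, 0, np.nan],
                   'C': [np.nan, 1, 1]})

# Boolean array, True where the original value is not NaN
mask = df.notna().to_numpy()

# Column vector of ids, shape (n_rows, 1); it broadcasts across all columns
ids = df['id'].to_numpy(dtype=object)[:, None]

# Take the row's id where the mask is True, NaN elsewhere
out_broadcast = pd.DataFrame(np.where(mask, ids, np.nan),
                             index=df.index, columns=df.columns)
print(out_broadcast)
```

Here the fill value is `np.nan` rather than `df`, which is equivalent in this case because the cells where the mask is False are NaN in `df` anyway.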
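Another idea, using reindexing (this relies on 'id' being the first column, since ffill(axis=1) propagates values from left to right):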
out = df[['id']].reindex(columns=df.columns).ffill(axis=1).where(df.notna())
Output:
  id    A    B    C
0  X    X    X  NaN
1  Y  NaN    Y    Y
2  Z    Z  NaN    Z
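For comparison, a column-by-column sketch that is not one of the answers above: for each column it takes the 'id' Series and keeps a value only where that column is not NaN. It does not depend on the position of 'id' among the columns. The name `out_cols` is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': ['X', 'Y', 'Z'],
                   'A': [1, np.nan, 0],
                   'B': [0, 0, np.nan],
                   'C': [np.nan, 1, 1]})

# For every column, keep the row's id where that column has a value,
# and leave NaN where the column is NaN
out_cols = pd.DataFrame({col: df['id'].where(df[col].notna())
                         for col in df.columns})
print(out_cols)
```

This loops over columns only, not over rows, so the work for each column is still vectorized.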