Make a row-wise Conditional Column
Question:
I have this DataFrame:
import numpy as np
import pandas as pd

Df = pd.DataFrame({'TIPOIDPRESTADOR': ['CC', 'NI', 'CE', 'RS'],
                   'Levels': [0, 1, np.nan, np.nan]
                   })
| TIPOIDPRESTADOR | Levels |
| -------- | -------- |
| CC | 0 |
| NI | 1 |
| CE | NaN |
| RS | NaN |
I want to write a loop that takes the maximum value of the 'Levels' column (in this case 1) and, whenever the next row is NaN, fills it with the current maximum plus 1, and so on.
The desired output should be something like this:
Desired_Output = pd.DataFrame({'TIPOIDPRESTADOR': ['CC', 'NI', 'CE', 'RS'],
'Levels': [0, 1, 2, 3]
})
| TIPOIDPRESTADOR | Levels |
| -------- | -------- |
| CC | 0 |
| NI | 1 |
| CE | 2 |
| RS | 3 |
I was trying to use iterrows like this:
for row in Df.iterrows():
    Max_value = float(max(Df["TIPOIDPRESTADOR"]))
    Df['TIPOIDPRESTADOR'] = np.where(Df["TIPOIDPRESTADOR"].isna()==True, Max_value+1, Df["TIPOIDPRESTADOR"])
    Max_value = Max_value+1
but I'm getting something like this:
| TIPOIDPRESTADOR | Levels |
| -------- | -------- |
| CC | 0 |
| NI | 1 |
| CE | 2 |
| RS | 2 |
I know it's a simple task, but I'm really struggling with it.
I would greatly appreciate your help.
Answers:
You were operating on the TIPOIDPRESTADOR column rather than on Levels (I assume those were typos, otherwise you wouldn't have got your result). Also, np.where() works on the whole column at once, so inside the loop it probably filled every NaN value in the first iteration, leaving nothing to update afterwards.
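To see why every NaN ended up with the same value, here is a minimal sketch of a single np.where() pass over the Levels values from the question. The names levels, max_value and filled are only illustrative and are not taken from the original code.

```python
import numpy as np
import pandas as pd

# The Levels column from the question, as a standalone Series
levels = pd.Series([0, 1, np.nan, np.nan])

# Series.max() skips NaN, so this is 1.0
max_value = levels.max()

# np.where is applied to the whole column in one pass:
# every NaN position receives the same replacement value
filled = np.where(levels.isna(), max_value + 1, levels)
print(filled)  # [0. 1. 2. 2.]
```

After this single pass there are no NaN values left, so later iterations of the loop have nothing to change.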
Try this:
for i, row in Df.iterrows():
    # Fill each missing level with the current column maximum plus one
    if pd.isna(row['Levels']):
        Df.loc[i, 'Levels'] = Df['Levels'].max() + 1
Df
Output:
  TIPOIDPRESTADOR  Levels
0              CC     0.0
1              NI     1.0
2              CE     2.0
3              RS     3.0
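If you would rather avoid the explicit loop, a vectorized alternative is possible. This is only a sketch, not part of the approach above: it assumes the column has at least one non-NaN value and that the missing rows should be numbered in row order. The names missing and start are illustrative.

```python
import numpy as np
import pandas as pd

Df = pd.DataFrame({'TIPOIDPRESTADOR': ['CC', 'NI', 'CE', 'RS'],
                   'Levels': [0, 1, np.nan, np.nan]})

# Boolean mask of the rows that need a value
missing = Df['Levels'].isna()

# Current maximum of the column (NaN is skipped)
start = Df['Levels'].max()

# Running count of missing rows: 1 for the first NaN, 2 for the second, ...
# Adding it to the maximum gives max+1, max+2, ... in row order
Df.loc[missing, 'Levels'] = start + missing.cumsum()[missing]

# Optional: with no NaN left, the column can be converted back to integers
Df['Levels'] = Df['Levels'].astype(int)
print(Df)
```

For this data it gives the same levels as the loop (0, 1, 2, 3), because each value filled in by the loop becomes the new maximum for the next missing row.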