If the value is a number between x and y, change it to z (pandas DataFrame)
Question:
Consider:
data.loc[data.cig_years (10 < a < 20), 'cig_years'] = 1
This is the code I have tried, but it’s not working. In pseudocode I want:
In the df 'data'
In the column 'cig_years'
If the value is a number between 10 and 20, change it to 1
Is there a Pythonic way of doing this? Preferably without for loops.
Answers:
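I got you. You can filter pandas dataframes with a boolean mask inside square brackets []:
data.loc[(data['cig_years'] > 10) & (data['cig_years'] < 20), 'cig_years'] = 1
This basically says:
In data, the rows where the column 'cig_years' is more than 10 and less than 20 have 'cig_years' set equal to 1. Use & (and) rather than | (or) here: with |, every number matches, because any number is either more than 10 or less than 20. Assigning through .loc in a single step also avoids chained indexing such as data['cig_years'][mask] = 1, which may modify a temporary copy and trigger SettingWithCopyWarning.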
This is super useful in pandas dataframes because you can filter for specific columns, or filter by conditions on other columns, and then set those filtered values.
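As a minimal, self-contained sketch of the mask-based assignment described above: the sample values are made up for illustration, and only the names data and cig_years come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'cig_years' comes from the question.
data = pd.DataFrame({"cig_years": [5, 10, 12, 15, 20, 25]})

# Each comparison yields a boolean Series; & combines them element-wise.
# The parentheses are required because & binds more tightly than > and <.
between_10_and_20 = (data["cig_years"] > 10) & (data["cig_years"] < 20)

# .loc selects the rows by mask and the column by label, then assigns in one step.
data.loc[between_10_and_20, "cig_years"] = 1

print(data["cig_years"].tolist())  # [5, 10, 1, 1, 20, 25]
```

Both bounds are strict here, so 10 and 20 themselves are left unchanged; switch to >= and <= if the endpoints should be replaced too.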
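You need to refer to the column through your dataframe name "data" and change it using .loc. Python's chained comparison (10 < x < 20) raises a ValueError on a pandas Series, so use Series.between with both endpoints excluded instead (the inclusive='neither' option needs pandas 1.3 or later):
data.loc[data['cig_years'].between(10, 20, inclusive='neither'), 'cig_years'] = 1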
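You could also use an np.where statement.
This statement assumes you are going to leave the cig years alone if it is not between 10 and 20. Note that np.where returns a new array rather than modifying the dataframe, so assign the result back to the column:
data['cig_years'] = np.where((data['cig_years'] > 10) & (data['cig_years'] < 20), 1, data['cig_years'])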
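For comparison, here is a small runnable sketch of the np.where approach. The sample values are invented, and the variable name in_range is only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name 'cig_years' comes from the question.
data = pd.DataFrame({"cig_years": [5, 10, 12, 15, 20, 25]})

# Element-wise condition: strictly greater than 10 and strictly less than 20.
in_range = (data["cig_years"] > 10) & (data["cig_years"] < 20)

# np.where(condition, value_if_true, value_if_false) builds a new array;
# it does not change 'data' until the result is assigned back to the column.
data["cig_years"] = np.where(in_range, 1, data["cig_years"])

print(data["cig_years"].tolist())  # [5, 10, 1, 1, 20, 25]
```

np.where is probably most useful when the replacement for non-matching rows should also differ from the original value; for a plain "replace where true" the .loc assignment is usually enough.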