Multiply a column by a random value within a range that depends on the month of the date column
Question:
I have a DataFrame with a "Date" column and another column that holds the "eta".
What I want to do is multiply the eta by a random number whose range depends on the month.
I almost have a solution, but the issue is that it multiplies every row in one of those months by the same number (it draws just one random number, which is applied to all matching rows). What I want is to draw a new random number for each row in the chosen months.
I would post the DataFrame as a table, but I don't know how to do it and pictures are not allowed.
My code:
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
df.loc[df['Date'].dt.month.isin((1,2,3)), 'eta'] *= random.uniform(1,2)
Answers:
Use a mask for boolean indexing and numpy.random.uniform with a size equal to the number of True values in the mask (counted using sum):
m = df['Date'].dt.month.isin((1,2,3))
df.loc[m, 'eta'] *= np.random.uniform(1, 2, m.sum())
Example:
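import numpy as np
import pandas as pd

# 'eta' is created as float so the scaled values can be assigned without a dtype change
df = pd.DataFrame({'Date': pd.date_range('2023-01-01', '2023-12-31', freq='MS'),
                   'eta': np.arange(1, 13, dtype=float)})
m = df['Date'].dt.month.isin((1,2,3))  # True for rows in January to March
df.loc[m, 'eta'] *= np.random.uniform(1, 2, m.sum())  # one draw per selected row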
Output:
         Date        eta
0  2023-01-01   1.704245  # values multiplied
1  2023-02-01   2.051441  # each by a different
2  2023-03-01   5.712714  # random factor
3  2023-04-01   4.000000
4  2023-05-01   5.000000
5  2023-06-01   6.000000
6  2023-07-01   7.000000
7  2023-08-01   8.000000
8  2023-09-01   9.000000
9  2023-10-01  10.000000
10 2023-11-01  11.000000
11 2023-12-01  12.000000
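The question also mentions that the range should depend on the month. One possible way to extend the mask approach is sketched below. It assumes a hypothetical `ranges` dictionary mapping month numbers to `(low, high)` bounds (the specific bounds are made up for illustration) and relies on the fact that NumPy's `uniform` accepts arrays for `low` and `high`, drawing one value per element. Months that are not in the dictionary are left unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical per-month (low, high) bounds; not taken from the question.
ranges = {1: (1.0, 2.0), 2: (1.0, 2.0), 3: (1.0, 2.0), 6: (0.5, 1.0)}

df = pd.DataFrame({'Date': pd.date_range('2023-01-01', '2023-12-31', freq='MS'),
                   'eta': np.arange(1, 13, dtype=float)})

month = df['Date'].dt.month
m = month.isin(list(ranges))  # rows whose month has a configured range

# Per-row bounds for the selected rows only.
low = month[m].map(lambda k: ranges[k][0]).to_numpy(dtype=float)
high = month[m].map(lambda k: ranges[k][1]).to_numpy(dtype=float)

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
df.loc[m, 'eta'] *= rng.uniform(low, high)  # one independent draw per row
```

Because `low` and `high` have one entry per selected row, each row gets its own factor drawn from its own month's range, so no explicit `size` argument is needed.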