Multiply a column depending on the value of another column

Question

I have a DataFrame with a "Climate" column and another column that holds the "eta".

What I want to do is basically multiply the eta time by a random number, and the range of that number depends on the climate.

The pseudocode looks like this:

if Climate == 'Sunny' then eta = eta * Random(0.8, 1.0)
else if Climate == 'Rainy' then eta = eta * Random(1.0, 1.2)
else if Climate == 'Cloudy' then eta = eta * Random(0.9, 1.1)

I don't know how to achieve this with a pandas DataFrame. My best attempt was the following, but it didn't work:

df.loc[df['Climate'] == 'Rain', 'eta' * random.uniform(1.0, 1.2)]

I expected it to multiply the eta column by a random value between 1.0 and 1.2 wherever the 'Climate' column was 'Rain'.

Asked By: Alberto

Answer 1

You might want to use:

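import numpy as np

# (low, high) bounds of the random factor per climate; Cloudy is taken from the question
min_max = {'Sunny': (0.8, 1), 'Rainy': (1, 1.2), 'Cloudy': (0.9, 1.1)}

# group_keys=False keeps the original row index, so the result aligns with df
df['eta'] = (df.groupby('Climate', group_keys=False)['eta']
               .apply(lambda x: x*np.random.uniform(*min_max[x.name], size=len(x)))
             )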

Example (as new column for clarity):

  Climate       eta   new_eta
0   Sunny  3.258367  3.026513
1   Sunny  5.615873  4.962923
2   Sunny  4.046182  3.761648
3   Sunny  0.367640  0.296795
4   Sunny  2.875452  2.677827
5   Rainy  3.576453  3.856957
6   Rainy  5.674834  5.895780
7   Rainy  7.876974  8.576879
8   Rainy  8.098803  9.473710
9   Rainy  0.750729  0.841462
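
If you would rather stay close to the attempt in the question, a boolean mask with df.loc should also work. This is only a sketch: the sample DataFrame, the seed, and the use of np.random.default_rng are my assumptions, and the ranges come from the question's pseudocode.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "Climate": ["Sunny", "Rainy", "Cloudy", "Rainy"],
    "eta": [10.0, 10.0, 10.0, 10.0],
})

# Ranges taken from the question's pseudocode.
min_max = {"Sunny": (0.8, 1.0), "Rainy": (1.0, 1.2), "Cloudy": (0.9, 1.1)}

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

for climate, (low, high) in min_max.items():
    mask = df["Climate"] == climate
    # Draw one independent factor per matching row, not one shared factor.
    df.loc[mask, "eta"] *= rng.uniform(low, high, size=mask.sum())
```

In the question's attempt the multiplication sits inside the indexer, so Python tries to multiply the string 'eta' by a float and nothing is assigned back. Here .loc only selects the rows and the column, and the in-place multiplication happens outside it.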

For a vectorized approach, using NumPy:

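import numpy as np
import pandas as pd

# (low, high) bounds of the random factor per climate; Cloudy is taken from the question
min_max = {'Sunny': (0.8, 1), 'Rainy': (1, 1.2), 'Cloudy': (0.9, 1.1)}

# one column of bounds per row of df, unpacked into two arrays of length len(df)
low, up = (pd.DataFrame(min_max, index=['min', 'max'])
             .reindex(columns=df['Climate']).to_numpy()
           )

# uniform samples in [0, 1), rescaled row by row to [low, up)
a = np.random.random(size=len(df))

df['eta'] *= a*(up-low)+low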
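
A possible variant of the same idea maps each row's climate to its bounds with Series.map and passes the two arrays straight to the uniform sampler, which broadcasts array-valued low and high. Treat it as a sketch: the sample data, the seed, and the new_eta column name are assumptions, and it assumes every value in Climate has an entry in min_max.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "Climate": ["Sunny", "Sunny", "Rainy", "Cloudy"],
    "eta": [10.0, 20.0, 10.0, 10.0],
})

# Ranges taken from the question's pseudocode.
min_max = {"Sunny": (0.8, 1.0), "Rainy": (1.0, 1.2), "Cloudy": (0.9, 1.1)}

# Per-row lower and upper bounds, looked up from the Climate column.
low = df["Climate"].map({k: v[0] for k, v in min_max.items()}).to_numpy()
high = df["Climate"].map({k: v[1] for k, v in min_max.items()}).to_numpy()

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

# With array bounds, one factor is drawn per row from that row's range.
factor = rng.uniform(low, high)

# Written to a new column so the original eta stays available for comparison.
df["new_eta"] = df["eta"] * factor
```

This avoids building the intermediate DataFrame of bounds, at the cost of two map calls.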

Answered By: mozway
