Null Behavior in Pandas DataFrame Multiplication

Question:

I have a DataFrame of question values, vals, and a DataFrame of the weights for those questions, multiply_vals. Each record in the vals DataFrame corresponds to a single user.

import pandas as pd
vals = pd.DataFrame({'A1':[0,1], 'A2':[1,2], 'A3':[3,3],'A4':[4,2],'B1':[2,1]})
multiply_vals = pd.DataFrame({'Weights':[.5,.25,.75,1,.33]}, index=['A1','A2','A3','A4','B1'])

#vals
   A1  A2  A3  A4  B1
0   0   1   3   4   2
1   1   2   3   2   1

#Multiply Vals
    Weights
A1     0.50
A2     0.25
A3     0.75
A4     1.00
B1     0.33

I want to multiply each row in vals by the corresponding weights in multiply_vals, but I get unexpected results with nulls.

Expected result:

    A1    A2    A3  A4    B1
0  0.0  0.25  2.25   4  0.66
1  0.5  0.50  2.25   2  0.33

What I tried:
I tried using mul/multiply, also in combination with transpose/T, but it returns nulls.

vals.mul(multiply_vals.T, axis=1)

         A1  A2  A3  A4  B1
0       NaN NaN NaN NaN NaN
1       NaN NaN NaN NaN NaN
Weights NaN NaN NaN NaN NaN
 

Unexpected Behavior:
If I run exactly the same call but use .values, it works.

vals.mul(multiply_vals.T.values, axis=1)

    A1    A2    A3   A4    B1
0  0.0  0.25  2.25  4.0  0.66
1  0.5  0.50  2.25  2.0  0.33

Why does .values work?
Using pandas version '0.25.0'

Asked By: MattR

Answers:

Define the weights as a Series, since they are only one column, then multiply. Transposing a Series is a no-op, so the .T below is optional:

import pandas as pd
vals = pd.DataFrame({'A1':[0,1], 'A2':[1,2], 'A3':[3,3],'A4':[4,2],'B1':[2,1]})
multiply_vals = pd.Series([.5,.25,.75,1,.33], index=['A1','A2','A3','A4','B1'])
vals*multiply_vals.T
    A1    A2    A3   A4    B1
0  0.0  0.25  2.25  4.0  0.66
1  0.5  0.50  2.25  2.0  0.33
Answered By: anishtain4

You just need the values from multiply_vals:

vals * multiply_vals.values.T

    A1    A2    A3   A4    B1
0  0.0  0.25  2.25  4.0  0.66
1  0.5  0.50  2.25  2.0  0.33
Answered By: Kenan

Try this code:

import pandas as pd
vals = pd.DataFrame({'A1':[0,1], 'A2':[1,2], 'A3':[3,3],'A4':[4,2],'B1':[2,1]})
multiply_vals = pd.DataFrame({'Weights':[.5,.25,.75,1,.33]}, index=['A1','A2','A3','A4','B1'])
vals2 = vals.transpose()
vals2.columns =['0', '1']
df_join = pd.merge(vals2, multiply_vals, left_index=True, right_index=True)
df_join['0 weighted'] = df_join['0']*df_join['Weights']
df_join['1 weighted'] = df_join['1']*df_join['Weights']
df_final = df_join[['0 weighted', '1 weighted']]
df_final = df_final.transpose()
df_final.head()
Answered By: Gon E

The reason DataFrame.mul and DataFrame.multiply don’t work as expected is that they align the two DataFrames on their row and column labels before doing the elementwise operation. Any cell whose labels do not appear in both frames comes out as NaN. This label alignment is very useful for other purposes.
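
A minimal sketch of that alignment, using the question's data. The transposed weights have a single row labeled 'Weights', while vals has rows labeled 0 and 1, so no row labels match. Stripping the labels with .values leaves a plain NumPy array of shape (1, 5), which is applied by position instead. The names weights_row, aligned and positional are only illustrative.

```python
import pandas as pd

vals = pd.DataFrame({'A1': [0, 1], 'A2': [1, 2], 'A3': [3, 3], 'A4': [4, 2], 'B1': [2, 1]})
multiply_vals = pd.DataFrame({'Weights': [.5, .25, .75, 1, .33]},
                             index=['A1', 'A2', 'A3', 'A4', 'B1'])

weights_row = multiply_vals.T        # one row, labeled 'Weights'
print(list(vals.index))              # row labels of vals
print(list(weights_row.index))       # row labels of the transposed weights

# DataFrame-by-DataFrame arithmetic aligns on row AND column labels first.
# The column labels match, but no row label is shared, so every cell is NaN.
aligned = vals.mul(weights_row)
print(aligned.isna().all().all())

# .values drops the labels; the (1, 5) array is broadcast across rows by position.
positional = vals.mul(weights_row.values)
print(positional)
```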

Dropping the labels by converting to a NumPy array, with vals.mul(multiply_vals.T.values, axis=1)

or vals * multiply_vals.values.T, solves the original problem.
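
If the labels should still be respected, one option is to pass the Weights column as a Series. This is a sketch rather than part of the approaches above, and the names weights, weighted, shuffled and reordered are only illustrative. With axis=1, DataFrame.mul matches the Series index against the column names of vals, so the result does not depend on the order of the weights.

```python
import pandas as pd

vals = pd.DataFrame({'A1': [0, 1], 'A2': [1, 2], 'A3': [3, 3], 'A4': [4, 2], 'B1': [2, 1]})
multiply_vals = pd.DataFrame({'Weights': [.5, .25, .75, 1, .33]},
                             index=['A1', 'A2', 'A3', 'A4', 'B1'])

# Selecting the single column gives a Series indexed by 'A1' .. 'B1'.
weights = multiply_vals['Weights']

# axis=1 (same as axis='columns') matches the Series index to vals' columns.
weighted = vals.mul(weights, axis=1)
print(weighted)

# The match is by label, so shuffling the weights gives the same values.
shuffled = weights.sample(frac=1, random_state=0)
reordered = vals.mul(shuffled, axis=1)[vals.columns]
print(reordered.equals(weighted))
```

Unlike the .values versions, this keeps working if the weights are stored in a different order from the columns of vals.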

However, if you want DataFrame.mul to work with two DataFrames, you could do this:

Starting with the same DataFrames…

vals = pd.DataFrame({'A1':[0,1], 'A2':[1,2], 'A3':[3,3],'A4':[4,2],'B1':[2,1]})
multiply_vals = pd.DataFrame({'Weights':[.5,.25,.75,1,.33]}, index=['A1','A2','A3','A4','B1'])

We need to reshape multiply_vals to match the expected shape.

# copying the rows, in a somewhat silly exercise
multiply_vals_reshaped = pd.concat([multiply_vals.T, multiply_vals.T], axis=0)

# matching the index of vals
multiply_vals_reshaped.reset_index(drop=True, inplace=True)

#multiply_vals_reshaped

    A1    A2    A3   A4    B1
0  0.5  0.25  0.75  1.0  0.33
1  0.5  0.25  0.75  1.0  0.33

vals.mul(multiply_vals_reshaped) now behaves as expected:

    A1    A2    A3   A4    B1
0  0.0  0.25  2.25  4.0  0.66
1  0.5  0.50  2.25  2.0  0.33
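
The reshape above hardcodes two copies of the weights row. As a hypothetical generalization (not part of the original steps), the row could be repeated once per row of vals and given the index of vals, so that the two frames align whatever the number of users. The names weights_frame and result are only illustrative.

```python
import pandas as pd

vals = pd.DataFrame({'A1': [0, 1], 'A2': [1, 2], 'A3': [3, 3], 'A4': [4, 2], 'B1': [2, 1]})
multiply_vals = pd.DataFrame({'Weights': [.5, .25, .75, 1, .33]},
                             index=['A1', 'A2', 'A3', 'A4', 'B1'])

# Repeat the single weights row once for every row of vals.
weights_frame = pd.concat([multiply_vals.T] * len(vals), ignore_index=True)

# Reuse vals' own index so the row labels line up during alignment.
weights_frame.index = vals.index

result = vals.mul(weights_frame)
print(result)
```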
Answered By: nycdatawrangler