Pandas division by zero, errors despite np.where condition

Question:

I am using Jupyter notebooks and have a function that uses this equation:

data['woe2'] = np.log(data['B']/data['MonthSales'])

The issue I'm having is that when 'B' equals 0, Python complains about division by zero. This happens even though I tried using np.where to make an exception. Does anyone have any ideas?

import numpy as np
import pandas as pd
data = pd.DataFrame({"A" : ["John","Deep","Julia","Kate","Sandy"],
                     "MonthSales" : [25,30,35,40,45], "B" : [10,0,0,20,40]})
data['woe2'] = np.where((data['B'] != 0),
                        np.log(data['B']/data['MonthSales']), 0)
Asked By: Mobix

Answers:

It isn't complaining about division BY zero. Zero divided by a non-zero denominator is simply 0; the warning comes from taking the log of that 0, which produces -inf.

Here is a slightly cleaner way to do it, since pandas boolean tests can be used directly as row selectors.

data_bool = data['B'] != 0

data['woe2'] = np.log(data[data_bool]['B']/data[data_bool]['MonthSales'])
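
Note that the masked assignment above leaves NaN in the rows where B is 0, because only the selected index labels receive a value. If 0 is the desired fallback, as in the question's np.where call, one possible variant is sketched below. The name nonzero and the choice to pre-fill the column with 0.0 are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"A": ["John", "Deep", "Julia", "Kate", "Sandy"],
                     "MonthSales": [25, 30, 35, 40, 45],
                     "B": [10, 0, 0, 20, 40]})

# Boolean mask: True only where the log is well defined.
nonzero = data["B"] != 0

# Start from the fallback value, then overwrite only the rows where B != 0.
# The log is never evaluated on a zero, so no warning is raised.
data["woe2"] = 0.0
data.loc[nonzero, "woe2"] = np.log(
    data.loc[nonzero, "B"] / data.loc[nonzero, "MonthSales"]
)
```

Because the right-hand side is computed only on the selected rows, this avoids both the warning and the -inf values.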

Answered By: gdahlm

As explained in the recent question

"RuntimeWarning: divide by zero encountered in log" in numpy.log even though small values were filtered out

np.where is a conditional selector, not a conditional evaluator: its arguments are evaluated in full before it chooses between them.
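
The eager evaluation can be seen in a small sketch. The arrays b and sales are made-up stand-ins for the question's columns, and the warnings module is used here only to record that the warning fires before np.where ever runs.

```python
import warnings
import numpy as np

# Hypothetical stand-ins for data['B'] and data['MonthSales'].
b = np.array([10.0, 0.0, 20.0])
sales = np.array([25.0, 30.0, 40.0])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Python evaluates this argument first, for every element, including b == 0.
    full = np.log(b / sales)
    # Only afterwards does np.where select between the precomputed values and 0.
    result = np.where(b != 0, full, 0)

# True if a RuntimeWarning was recorded while computing the log.
warned = any(issubclass(w.category, RuntimeWarning) for w in caught)
```

The selected result contains no -inf, yet the warning has already been issued while building the second argument.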

The Series division:

In [72]: data['B']/data['MonthSales']                                                                                   
Out[72]:                                                                                                                
0    0.400000                                                                                                           
1    0.000000                                                                                                           
2    0.000000                                                                                                           
3    0.500000                                                                                                           
4    0.888889                                                                                                           
 dtype: float64                                                                                                          

Taking the log raises the warning. Note that it is issued from pandas.core.arraylike:

In [73]: np.log(data['B']/data['MonthSales'])                                                                           
C:\Users\paul\miniconda3\lib\site-packages\pandas\core\arraylike.py:402: RuntimeWarning: divide by zero encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)
Out[73]:                                                                                                                
0   -0.916291                                                                                                           
1        -inf                                                                                                           
2        -inf                                                                                                           
3   -0.693147                                                                                                           
4   -0.117783                                                                                                            
 dtype: float64                                                                                                          

If instead we take the log of the equivalent array, using the where/out parameters to make it conditional, we avoid the warning:

In [74]: np.log((data['B']/data['MonthSales']).values, where=data['B']>0, 
out=np.zeros(data.shape[0]))                  
Out[74]: array([-0.91629073,  0.        ,  0.        , -0.69314718, -0.11778304]) 
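
To store that result back in the DataFrame, a minimal sketch could look like the following. The intermediate names ratio and mask are illustrative, and to_numpy() is used in place of .values on the assumption that a reasonably recent pandas is installed.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"A": ["John", "Deep", "Julia", "Kate", "Sandy"],
                     "MonthSales": [25, 30, 35, 40, 45],
                     "B": [10, 0, 0, 20, 40]})

# Plain NumPy arrays, so the ufunc's where/out parameters apply directly.
ratio = (data["B"] / data["MonthSales"]).to_numpy()
mask = (data["B"] > 0).to_numpy()

# The log is computed only where mask is True; elsewhere the
# zeros supplied through `out` are left untouched.
data["woe2"] = np.log(ratio, where=mask, out=np.zeros(len(data)))
```

Supplying out matters here: without it, the positions skipped by where would hold uninitialized values.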
Answered By: hpaulj

I think this warning is simply not significant here. The NumPy documentation itself includes an example with 0 in the argument array, and running that example produces the same warning.
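
If the warning is judged harmless, one way to silence it locally is np.errstate, sketched below. The ratio values are copied from the Series division shown in the earlier answer, and replacing -inf with 0 is an assumption based on the fallback used in the question.

```python
import numpy as np

# Same ratios as data['B'] / data['MonthSales'] in the question.
ratio = np.array([0.4, 0.0, 0.0, 0.5, 40 / 45])

# Inside this block, log(0) still returns -inf but no warning is emitted.
with np.errstate(divide="ignore"):
    logs = np.log(ratio)

# Replace the -inf entries with the fallback value used in the question.
logs[np.isneginf(logs)] = 0.0
```

The context manager restores the previous error settings on exit, so warnings elsewhere in the notebook are unaffected.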

Answered By: FrankAyra