python: getting around division by zero

Question:

I have a big data set of floating point numbers. I iterate through them and evaluate np.log(x) for each of them.
I get

RuntimeWarning: divide by zero encountered in log

I would like to get around this and return 0 if this error occurs.

I am thinking of defining a new function:

def safe_ln(x):
    #returns: ln(x) but replaces -inf with 0
    l = np.log(x)
    #if l = -inf:
    l = 0
    return l

Basically, I need a way of testing that the output is -inf, but I don’t know how to proceed.
Thank you for your help!

Asked By: Julia

Answers:

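You could do this, turning the warning into an exception so that it can be caught:

import warnings

def safe_ln(x):
    #returns: ln(x) but replaces -inf with 0
    try:
        with warnings.catch_warnings():
            # np.log only warns by default, so promote the warning to an error
            warnings.simplefilter("error", RuntimeWarning)
            l = np.log(x)
    except RuntimeWarning:
        l = 0
    return l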
Answered By: Cameron Sparr

Since the log for x=0 is minus infinity, I’d simply check if the input value is zero and return whatever you want there:

import math

def safe_ln(x):
    if x <= 0:
        return 0
    return math.log(x)

EDIT: small edit: you should check for all values smaller than or equal to 0.

EDIT 2: np.log is of course meant for calculating on a numpy array; for single values you should use math.log. This is how the above function looks with numpy:

def safe_ln(x, minval=0.0000000001):
    return np.log(x.clip(min=minval))
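
A quick check of what the clipped version returns, as a sketch: zeros and negative values are mapped to np.log(minval), which is about -23 for the default, not to 0. So it only fits if a large negative number is an acceptable stand-in. The sample array is made up for illustration.

```python
import numpy as np

def safe_ln(x, minval=1e-10):
    # Clip everything below minval up to minval before taking the log
    return np.log(x.clip(min=minval))

sample = np.array([0.0, -3.0, 1.0, np.e])  # hypothetical input
result = safe_ln(sample)
print(result)  # zeros and negatives both become log(1e-10)
```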
Answered By: Constantinius

Use exception handling:

In [27]: def safe_ln(x):
   ....:     try:
   ....:         return math.log(x)
   ....:     except ValueError:       # np.log(x) might raise some other error though
   ....:         return float("-inf")
   ....:

In [28]: safe_ln(0)
Out[28]: -inf

In [29]: safe_ln(1)
Out[29]: 0.0

In [30]: safe_ln(-100)
Out[30]: -inf
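
To make the difference between the two log functions concrete, here is a small sketch (describe_log is a hypothetical helper, not part of either library): math.log raises ValueError for 0 and for negative input, while np.log by default only warns and returns -inf or nan.

```python
import math
import numpy as np

def describe_log(value):
    """Return (math_result, numpy_result); the string 'ValueError' marks a raise."""
    try:
        math_result = math.log(value)
    except ValueError:
        math_result = "ValueError"
    # Silence the numpy warnings so only the returned values are shown
    with np.errstate(divide="ignore", invalid="ignore"):
        numpy_result = float(np.log(value))
    return math_result, numpy_result

for value in (0.0, -100.0, 1.0):
    print(value, describe_log(value))
```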
Answered By: Ashwini Chaudhary

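You can do this, asking numpy to raise on division by zero instead of warning:

def safe_ln(x):
   try:
      with np.errstate(divide='raise'):  # log(0) now raises FloatingPointError
         l = np.log(x)
   except FloatingPointError:
      l = 0
   return l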
Answered By: Jeff

You are using an np function, so I guess that you are working on a numpy array.
In that case the most efficient way to do this is to use the where function instead of a for loop:

myarray = np.random.randint(10, size=10)
result = np.where(myarray > 0, np.log(myarray), 0)

Otherwise you can simply use the log function and then patch the hole:

myarray = np.random.randint(10, size=10)
result = np.log(myarray)
result[result == -np.inf] = 0

The np.log function correctly returns -inf for a value of 0, so are you sure that you want to return 0 instead? If you ever have to convert back to the original values, you are going to run into a problem: the zeros turn into ones, since exp(0) = 1.
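
To test for -inf directly, as the question asks, np.isneginf is one option, and np.errstate can silence the warning for just that block. A minimal sketch with a made-up array; it also shows the round-trip issue, since np.exp of the patched result gives 1 wherever the input was 0:

```python
import numpy as np

myarray = np.array([0.0, 1.0, 2.0, 0.0])  # hypothetical data with zeros

# Suppress the divide-by-zero warning only inside this block
with np.errstate(divide="ignore"):
    result = np.log(myarray)

# Replace the -inf entries produced by log(0) with 0
result[np.isneginf(result)] = 0

# Inverting the patched result no longer recovers the zeros
roundtrip = np.exp(result)
print(result)
print(roundtrip)
```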

Answered By: EnricoGiampieri

The answer given by Enrico is nice, but both solutions result in a warning:

RuntimeWarning: divide by zero encountered in log

As an alternative, we can still use the where function but only execute the main computation where it is appropriate:

myarray = np.random.randint(10, size=10)

# alternative implementation -- a bit more typing but avoids warnings.
loc = np.where(myarray > 0)
result2 = np.zeros_like(myarray, dtype=float)
result2[loc] = np.log(myarray[loc])

# answer from Enrico (still warns when myarray contains a 0)...
result = np.where(myarray > 0, np.log(myarray), 0)

# check it is giving right solution:
print(np.allclose(result, result2))
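
Another way to compute only where it is appropriate, sketched here as an option rather than as part of the code above, is the where= and out= arguments that numpy ufuncs such as np.log accept. Positions where the mask is False keep whatever out already holds, so out is pre-filled with zeros. The array values are made up.

```python
import numpy as np

myarray = np.array([0, 3, 7, 0, 1])  # hypothetical integer data

mask = myarray > 0
# out must be a float array; entries where mask is False stay at 0.0
result3 = np.log(myarray, out=np.zeros(myarray.shape, dtype=float), where=mask)
print(result3)
```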

My use case was for division, but the principle is clearly the same:

x = np.random.randint(10, size=10)
divisor = np.ones(10,)
divisor[3] = 0 # make one divisor invalid

y = np.zeros_like(divisor, dtype=float)
loc = np.where(divisor>0) # (or !=0 if your data could have -ve values)
y[loc] = x[loc] / divisor[loc]
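
The same division can also be sketched with np.divide and its where= and out= arguments, which skip the invalid positions without building an index first. The numbers are made up for illustration.

```python
import numpy as np

x = np.array([4.0, 9.0, 6.0, 8.0])         # hypothetical numerators
divisor = np.array([2.0, 0.0, 3.0, -4.0])  # one invalid divisor

# Divide only where the divisor is non-zero; other entries keep the 0.0 from out
y = np.divide(x, divisor, out=np.zeros_like(x), where=divisor != 0)
print(y)
```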
Answered By: Bonlenfum

I like to use sys.float_info.min as follows:

>>> import numpy as np
>>> import sys
>>> arr = np.linspace(0.0, 1.0, 3)
>>> print(arr)
[0.  0.5 1. ]
>>> arr[arr < sys.float_info.min] = sys.float_info.min
>>> print(arr)
[2.22507386e-308 5.00000000e-001 1.00000000e+000]
>>> np.log10(arr)
array([-3.07652656e+02, -3.01029996e-01,  0.00000000e+00])

Other answers have also introduced small positive values, but I prefer to use the smallest positive normalized float when I am approximating the result for an input that is too small to be processed.
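
As a side note, sys.float_info.min is the smallest positive normalized float; even smaller subnormal values exist below it. A small sketch, assuming a platform with ordinary IEEE 754 doubles, that compares the two and checks that np.log stays finite for both:

```python
import sys
import numpy as np

normal_min = sys.float_info.min         # smallest positive normalized float
subnormal_min = np.nextafter(0.0, 1.0)  # smallest positive subnormal float

print(normal_min == np.finfo(float).tiny)
print(0.0 < subnormal_min < normal_min)
print(np.log(normal_min), np.log(subnormal_min))  # both are finite
```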

Answered By: Chandran Goodchild

For those looking for an np.log solution that takes an np.ndarray and nudges up only the zero values:

import sys
import numpy as np

def smarter_nextafter(x: np.ndarray) -> np.ndarray:
    safe_x = np.where(x != 0, x, np.nextafter(x, 1))
    return np.log(safe_x)

def clip_usage(x: np.ndarray, safe_min: float | None = None) -> np.ndarray:
    # Inspiration: https://stackoverflow.com/a/13497931/
    # finfo(...).tiny is the smallest positive normal value
    # (finfo(...).min is the most negative float, which log cannot handle)
    clipped_x = x.clip(min=safe_min or np.finfo(x.dtype).tiny)
    return np.log(clipped_x)

def inplace_usage(x: np.ndarray, safe_min: float | None = None) -> np.ndarray:
    # Inspiration: https://stackoverflow.com/a/62292638/
    # Note: this overwrites the zeros in the caller's array
    x[x == 0] = safe_min or np.finfo(x.dtype).tiny
    return np.log(x)
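
When picking a floor for safe_min, it helps to check which np.finfo attribute is meant. A short sketch for float64, with a made-up input array: tiny is the smallest positive normal value, while min is the most negative finite value and would not work as a floor for a logarithm.

```python
import numpy as np

info = np.finfo(np.float64)
print(info.tiny)  # smallest positive normal value
print(info.min)   # most negative finite value, equal to -info.max

x = np.array([0.0, 0.5, 1.0])  # hypothetical input
clipped = x.clip(min=info.tiny)
print(np.log(clipped))  # finite everywhere
```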

Or if you don’t mind nudging all values and like bad big-O runtimes:

def brute_nextafter(x: np.ndarray) -> np.ndarray:
    # Just for reference, don't use this
    while not x.all():
        x = np.nextafter(x, 1)
    return np.log(x)