# Unexpected behaviour of log1p numpy

## Question:

I am using `numpy.log1p` to calculate log(1 + x) for very small complex numbers, and I am getting unexpected results.

I would expect the outputs to be practically equal to the inputs, but that does not seem to be the case in the simple examples below.

```
np.log1p(1e-14 * (1 + 1j))
Out[75]: (9.992007221626358e-15+9.9999999999999e-15j)
np.log1p(1e-15 * (1 + 1j))
Out[76]: (1.110223024625156e-15+9.999999999999989e-16j)
np.log1p(1e-16 * (1 + 1j))
Out[77]: 1e-16j
```

The `log1p` function from `scipy.special` seems to work correctly, but unfortunately I need to use the NumPy function (for Numba).

I am currently using numpy version 1.26.4 on Python 3.10.10

```
np.__version__
Out[78]: '1.26.4'
```

## Answers:

The numpy docs say:

> For real-valued input, log1p is accurate also for x so small that 1 + x == 1 in floating-point accuracy.

Which seems to be an indirect way of saying they "gave up" on complex inputs. Indeed, in plain Python (which doesn’t have `cmath.log1p`), on Windows:

```
>>> import cmath
>>> p = 1e-14 * (1 + 1j)
>>> p
(1e-14+1e-14j)
>>> cmath.log(1 + p)
(9.992007221626409e-15+9.9999999999999e-15j)
```

which is very close to what you got. So it sure looks like `numpy.log1p(c)` for complex `c` is adding 1.0 to `c` before calling `numpy.log()`.
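The loss of digits can be reproduced without NumPy. The following is a minimal sketch using only built-in complex arithmetic; it shows that once `1 + p` has been rounded to double precision, the low-order digits of the real part of `p` are gone, however accurately the logarithm is computed afterwards. The variable names `recovered` and `rel_loss` are illustrative only.

```python
# Minimal sketch: rounding in `1 + p` destroys low-order digits of a tiny p.
p = 1e-14 * (1 + 1j)

# Round trip through 1 + p. In exact arithmetic this would return p unchanged.
recovered = (1 + p) - 1

# Relative change in the real part caused by the round trip.
rel_loss = abs(recovered.real - p.real) / abs(p.real)

print("p         =", p)
print("recovered =", recovered)
print("relative change in real part =", rel_loss)
```

The imaginary part survives because it is added to 0.0 rather than to 1.0, which matches the pattern in the question: the real part is off while the imaginary part is fine.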

The result "should be" close to `p - p**2/2` for `p` so small (just the first two terms of the Taylor expansion of `log(1+p)` around `p=0`):

```
>>> p - p**2/2
(1e-14+9.9999999999999e-15j)
```

which is essentially delivered by the `mpmath` package:

```
>>> import mpmath
>>> p = mpmath.mpc(1, 1) * 1e-14
>>> p
mpc(real='1.0e-14', imag='1.0e-14')
>>> mpmath.log1p(p)
mpc(real='1.0e-14', imag='9.9999999999999006e-15')
```

So this looks like an (inadequately documented) limitation of `numpy`’s current implementation of `log1p()`.

So if you have to use `numpy` for this, you’ll have to supply your own alternative.

I would suggest using either a three or four term Taylor series here. You’re using Numba, so it’s pretty straightforward to get something that’s both accurate and performant.

Here is how you can implement this:

```
import numba as nb
import numpy as np


@nb.vectorize(fastmath=True)
def log1p_acc(x):
    if np.abs(x) >= 1e-3:
        # np.log1p is accurate for large values of x
        return np.log1p(x)
    else:
        # Truncated Taylor series: x - x**2/2 + x**3/3 - x**4/4
        terms = 4
        x_pow = x
        sign = 1
        sum_ = 0
        for i in range(1, terms + 1):
            # Note: use * (1 / i) to allow optimizer to avoid fdiv
            sum_ += sign * x_pow * (1 / i)
            sign *= -1
            x_pow *= x
        return sum_
```

To change the number of terms, you can change the `terms` variable.
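For checking the idea without Numba installed, here is a minimal pure-Python sketch of the same truncated series. Note that it is a variant, not the function above: it evaluates the polynomial with Horner's scheme instead of a running power, and the helper name `log1p_taylor` and its `threshold` parameter are hypothetical.

```python
import cmath


def log1p_taylor(z, terms=4, threshold=1e-3):
    """Approximate log(1 + z) for complex z (illustrative helper, not a NumPy API)."""
    if abs(z) >= threshold:
        # Away from zero, forming 1 + z loses comparatively few digits.
        return cmath.log(1 + z)
    # Horner evaluation of z - z**2/2 + z**3/3 - ... up to z**terms/terms.
    acc = 0j
    for k in range(terms, 0, -1):
        coeff = (1.0 if k % 2 else -1.0) / k  # (-1)**(k+1) / k
        acc = z * (coeff + acc)
    return acc


p = 1e-14 * (1 + 1j)
print(log1p_taylor(p))
print(p - p**2 / 2)  # two-term value quoted earlier, for comparison
```

For inputs this small, the two printed values should agree to essentially full double precision, since the third-order term is far below the rounding level.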

### Accuracy

Let’s measure how accurate this function is.

First, in order to quantify how accurate our replacement is, we need three things: a reference implementation, a test data set, and an error metric.

I chose `mpmath.log1p` as my reference implementation, and set the precision to 1000 digits.^{1}

Next, we need to ask what domain we care about accuracy across. I assumed that you only care about small numbers, so I generated samples from a log-uniform distribution spanning from 1e-30 to 1. This essentially assumes that you care about the range from 1e-30 to 1e-29 as much as you care about the range 0.1 to 1. Here is an example of what a log-uniform distribution looks like when plotted as a histogram, with linear and log scales: https://en.wikipedia.org/wiki/Reciprocal_distribution#/media/File:Reciprocal_Histogram.svg

Using that log-uniform distribution, I sampled from one of four cases an equal number of times:

- Real and imaginary parts are independent log-uniform samples.
- Real and imaginary parts are the same log-uniform sample.
- Real part is log-uniform, and imaginary part is zero.
- Real part is zero, and imaginary part is log-uniform.

This forms my test set for evaluating performance.
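As a rough illustration of the four cases, here is a minimal standard-library sketch. It draws log-uniform values as `10 ** uniform(-30, 0)` rather than using SciPy, and the helper names `log_uniform` and `make_test_set` are hypothetical.

```python
import random


def log_uniform(low_exp=-30.0, high_exp=0.0):
    """Draw one sample whose base-10 logarithm is uniform on [low_exp, high_exp]."""
    return 10.0 ** random.uniform(low_exp, high_exp)


def make_test_set(n_per_case, seed=0):
    """Build n_per_case complex samples for each of the four cases, in order."""
    random.seed(seed)
    samples = []
    for _ in range(n_per_case):
        # Case 1: independent real and imaginary parts
        samples.append(complex(log_uniform(), log_uniform()))
    for _ in range(n_per_case):
        # Case 2: identical real and imaginary parts
        value = log_uniform()
        samples.append(complex(value, value))
    for _ in range(n_per_case):
        # Case 3: purely real
        samples.append(complex(log_uniform(), 0.0))
    for _ in range(n_per_case):
        # Case 4: purely imaginary
        samples.append(complex(0.0, log_uniform()))
    return samples


samples = make_test_set(1000)
print(len(samples), samples[0], samples[-1])
```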

Third, I chose an error metric. By some metrics, your result with `np.log1p` isn’t so bad: it differs from the true result by about 1e-17, which is smaller than the double-precision machine epsilon (about 2.2e-16). In relative terms, it’s bad: it’s about 0.07% off.

For that reason, I think you’re more interested in relative error. I measured this using the formula `error = abs(true - pred)/abs(true)`, where `abs()` is the complex absolute value.
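To make the difference between the two metrics concrete, here is a small sketch applied to the input from the question. It is only an illustration: it uses the two-term Taylor value `p - p**2/2` as a stand-in for the true result instead of `mpmath`, and `cmath.log(1 + p)` as the naive prediction.

```python
import cmath


def relative_error(true, pred):
    """Relative error using the complex absolute value."""
    return abs(true - pred) / abs(true)


p = 1e-14 * (1 + 1j)
reference = p - p**2 / 2      # stand-in for the true value (assumption)
naive = cmath.log(1 + p)      # forms 1 + p first, so digits are lost

absolute_err = abs(reference - naive)
rel_err = relative_error(reference, naive)

print("absolute error:", absolute_err)
print("relative error:", rel_err)
```

The absolute error is tiny, while the relative error is many orders of magnitude above machine epsilon, which is why the comparison below is done in relative terms.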

I then compared my function, SciPy’s log1p, and NumPy’s log1p. The following plot graphs relative error versus absolute value of the input, for three different ways of calculating it.

There are a few notable things about this plot:

- For some inputs, NumPy has a relative error of nearly 100%. For example, the code `np.log1p((1e-16+1e-30j))` returns a value with zero as the real term.
- SciPy is much more accurate than NumPy – most inputs are accurate to about 1e-16.
- From x=1e-30 to 1e-4, the custom method has error of less than 1e-16.
- At x=1e-4, the error of the custom method rises above SciPy. If it did not switch back to NumPy at x=1e-3, the error would keep rising, especially as you move outside the interval of convergence.
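The rise in error near the switch-over point is what one would expect from truncating the series. As a back-of-the-envelope sketch (a leading-order estimate only, ignoring rounding error and assuming four terms): the first omitted term is x^5/5, and log1p(x) is roughly x, so the relative truncation error is roughly |x|^4/5. The helper `truncation_estimate` is hypothetical.

```python
import sys


def truncation_estimate(abs_x, terms=4):
    """Leading-order relative truncation error: |x|**terms / (terms + 1)."""
    return abs_x ** terms / (terms + 1)


eps = sys.float_info.epsilon  # double-precision machine epsilon
for abs_x in (1e-6, 1e-4, 1e-3):
    estimate = truncation_estimate(abs_x)
    print(f"|x| = {abs_x:g}: estimate {estimate:.1e}, below eps: {estimate < eps}")
```

Under this estimate, the truncation error crosses machine epsilon somewhere between |x| = 1e-4 and |x| = 1e-3, and it grows with the fourth power of |x| beyond that.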

Here’s the code used to produce this plot:

```
import numba as nb
import numpy as np
import scipy.special as sc
import scipy.stats
import matplotlib.pyplot as plt
import mpmath

mpmath.mp.dps = 1000
plt.rcParams["figure.figsize"] = (12, 8)


def generate_log_distribution(N):
    """Generate small numbers spanning many orders of magnitude"""
    # return 10 ** np.random.uniform(-3, -30, size=N)
    return scipy.stats.loguniform.rvs(1e-30, 1e-0, size=N)


# Four equally sized cases, concatenated into one test set
test_set_size = 100000 // 4
test_set1 = generate_log_distribution(test_set_size) + 1j * generate_log_distribution(test_set_size)
identical_real_complex = generate_log_distribution(test_set_size)
test_set2 = identical_real_complex + 1j * identical_real_complex
test_set3 = generate_log_distribution(test_set_size) + 0j
test_set4 = 1j * generate_log_distribution(test_set_size) + 0
test_set = np.concatenate([test_set1, test_set2, test_set3, test_set4])
# Reference values computed in arbitrary precision
test_set_ref = [mpmath.log1p(c) for c in test_set]


@nb.vectorize(fastmath=True)
def log1p_acc(x):
    if np.abs(x) >= 1e-3:
        # np.log1p is accurate for large values of x
        return np.log1p(x)
    else:
        terms = 4
        x_pow = x
        sign = 1
        sum_ = 0
        for i in range(1, terms + 1):
            # Note: use (1 / i) to allow optimizer to avoid fdiv
            sum_ += sign * x_pow * (1 / i)
            sign *= -1
            x_pow *= x
        return sum_


functions = [
    ("NumPy log1p", np.log1p),
    ("SciPy log1p", sc.log1p),
    ("Custom log1p", log1p_acc),
]

for func_name, func in functions:
    x = []
    y = []
    for i in range(len(test_set)):
        number = test_set[i]
        logged = func(number)
        # Do error calculation in arbitrary precision
        logged = mpmath.mpc(logged)
        out = logged - test_set_ref[i]
        out_abs = mpmath.fabs(out)
        number_abs = mpmath.fabs(number)
        relative_err = out_abs / mpmath.fabs(test_set_ref[i])
        x.append(number_abs)
        y.append(relative_err)
    print(f"Report for {func_name}")
    print(f"Average relative error: {float(np.mean(y))}")
    # print("Worst", np.max(y), "for", test_set[np.argmax(y)])
    print()
    plt.scatter(x, y, label=func_name, s=1, alpha=0.1)

plt.xscale('log')
plt.yscale('log')
plt.xlabel('Norm of x')
plt.ylabel('Relative error in log1p(x)')
leg = plt.legend()
# Make the legend markers opaque and large enough to read
for lh in leg.legend_handles:
    lh.set_alpha(1)
    lh.set_sizes([20])
```

^{1} Note that setting `dps` to 1000 only means that intermediate calculations are done with 1000 digits of precision. It does not necessarily mean that the final result is accurate to a thousand digits. It might only be accurate to half that. I have not checked how numerically stable `mpmath.log1p()` is.

### Performance

I compared the performance of this function against the SciPy and NumPy versions by taking the log1p of the entire test set. It is roughly 2x faster than NumPy and about 2.6x faster than SciPy, at about 14 nanoseconds per log1p evaluation.

Code:

```
print("NumPy log1p")
%timeit np.log1p(test_set)
print("SciPy log1p")
%timeit sc.log1p(test_set)
print("Custom log1p")
%timeit log1p_acc(test_set)
```

Output:

```
NumPy log1p
2.98 ms ± 63.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
SciPy log1p
3.69 ms ± 5.58 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Custom log1p
1.42 ms ± 2.74 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
```

I applied the following ideas to make this fast:

- I began with the Taylor series, and turned it into a loop. Instead of using exponentiation to calculate each term, I multiplied the previous term by x.
- Floating point division is usually slightly slower than floating point multiplication. For that reason, I rewrote `sign * x_pow / i` as `sign * x_pow * (1 / i)`.
- As `terms` is a constant, Numba is smart enough to fully unroll this loop, which makes this as fast as writing the Taylor series out explicitly.
- I used `fastmath`, which allows Numba to re-arrange the order it does math in.
- I found that having a `sign` variable was slightly faster than selecting either `-=` or `+=` with the loop index.
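For reference, this is roughly what "writing the Taylor series out explicitly" looks like next to the loop form. It is a plain-Python sketch meant only to show that the two forms compute the same polynomial; it says nothing about speed, which depends on what Numba and LLVM generate. The function names are hypothetical.

```python
def taylor_loop(x, terms=4):
    """Loop form: running power and alternating sign, as in the answer."""
    x_pow = x
    sign = 1
    total = 0
    for i in range(1, terms + 1):
        total += sign * x_pow * (1 / i)
        sign *= -1
        x_pow *= x
    return total


def taylor_unrolled(x):
    """Four terms written out by hand: x - x**2/2 + x**3/3 - x**4/4."""
    x2 = x * x
    x3 = x2 * x
    x4 = x3 * x
    return x - x2 * (1 / 2) + x3 * (1 / 3) - x4 * (1 / 4)


x = 1e-4 * (1 + 1j)
loop_value = taylor_loop(x)
unrolled_value = taylor_unrolled(x)
print(loop_value)
print(unrolled_value)
```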