Degenerate root finding problem: get the first value for which f(x)>0

Question:

Suppose a function f(x) is zero for all x less than a critical value c, and non-zero for all x > c.

I want to approximate the critical value c using an optimization method. Because the function f(x) is expensive, I want to compute it as few times as possible. Therefore, computing f(x) for a predefined list of values x is not viable.

Think of a function like the following one, with critical value c = sqrt(2) ≈ 1.4142:

def func(x):
    return max(x**2 - 2, 0) if x > 0 else 0

I could implement this with a custom bisection function. However, I was wondering whether this "degenerate" root finding problem can be solved with existing functions in SciPy or NumPy. I have experimented with scipy.optimize.root_scalar, but it does not seem to support functions like the one above.

from scipy.optimize import root_scalar
root_scalar(func, bracket=[-6, 8])
# ValueError: f(a) and f(b) must have different signs
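
For reference, evaluating the bracket endpoints shows why the default bracketing method fails here: both values are non-negative, so there is no sign change for it to exploit.

print(func(-6), func(8))  # 0 62 -- both >= 0, hence no sign change in [-6, 8]
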
Asked By: normanius


Answers:

Technically speaking, every x < 1.4142 is a root of that function. If you want to use a root finder, you need to turn this into a valid root-finding problem. I suggest subtracting a very small value from the output of your function:

from scipy.optimize import root_scalar
import math
wrapper = lambda func: lambda *args: func(*args) - math.ulp(0)
root_scalar(wrapper(func), bracket=[-6, 8])

This works by subtracting one ULP (unit in the last place, the smallest possible increment) from your function's output. That makes the wrapped function strictly negative for all x < 1.4142 and strictly positive for all x > 1.4142. With 64-bit IEEE-754 floats, math.ulp(0) equals 5e-324, so the subtraction shifts the position of the root only negligibly.
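
A minimal sketch of the trick in action, reusing func and wrapper from above (and assuming IEEE-754 doubles, where math.ulp(0) == 5e-324):

g = wrapper(func)
print(math.ulp(0))  # 5e-324, the smallest positive subnormal double
print(g(0.0))       # -5e-324: strictly negative left of the critical value
print(g(2.0))       # 2.0: the tiny subtraction is absorbed, result stays positive
print(root_scalar(g, bracket=[-6, 8]).root)  # approx. 1.41421356, i.e. sqrt(2)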

Answered By: Nick ODell

This is just to present a "custom" bisect method that works for the presented example. However, I still recommend using the accepted answer.

import math
from itertools import count, islice

def bisect(func, x_min, x_max, x_func=None,
           tol=None, iter_max=None, **kwargs):
    # Requires f(x_min) <= 0 and f(x_max) > 0 for a valid bracket.
    y_min = func(x_min, **kwargs)
    if y_min > 0:
        msg = "Warning: no solution as y_min>0, with x_min=%f."
        print(msg % x_min)
        return x_min
    y_max = func(x_max, **kwargs)
    if y_max <= 0:
        msg = "Warning: no solution as y_max<=0, with x_max=%f."
        print(msg % x_max)
        return x_max
    if tol is None and iter_max is None:
        tol = 1e-7
    if x_func is None:
        # Default split: the arithmetic midpoint of the bracket.
        x_func = lambda x0, x1: (x1 + x0) / 2

    x_last = math.inf  # sentinel for the first tolerance check
    for cnt in islice(count(), iter_max):  # iter_max=None iterates indefinitely
        x_new = x_func(x_min, x_max)
        y_new = func(x_new, **kwargs)
        if y_new <= 0:
            x_min = x_new  # the critical value lies to the right
        else:
            x_max = x_new  # the critical value lies to the left
        if (tol is not None) and abs(x_last - x_new) < tol:
            break
        x_last = x_new
    return x_min if y_new > 0 else x_max

def func(x):
    return max(x**2 - 2, 0) if x > 0 else 0

ret = bisect(func, x_min=-6, x_max=8)
print(ret)
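
As a side note, the x_func hook makes the splitting rule pluggable. For instance (a hypothetical variant, not required for the example above), a lopsided split that probes closer to the upper bound still converges:

ret = bisect(func, x_min=-6, x_max=8,
             x_func=lambda x0, x1: 0.25*x0 + 0.75*x1)  # biased split point
print(ret)  # still approx. 1.4142, just after a different number of iterations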

Interestingly, bisect() outperforms the root_scalar approach for this cheap test function. I expect this advantage to vanish for more expensive functions, where evaluation cost dominates solver overhead.

%timeit bisect(func, x_min=-6, x_max=8, iter_max=None, tol=1e-7)
# 46.3 µs ± 1.96 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit root_scalar(wrapper(func), bracket=[-6, 8], xtol=1e-7)
# 84.6 µs ± 6.29 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

# Test that the results are equivalent:
x1 = bisect(func, x_min=-6, x_max=8, iter_max=None, tol=1e-7)
x2 = root_scalar(wrapper(func), bracket=[-6, 8], xtol=1e-7).root
print(abs(x2 - x1))  # 2.9078300656237843e-08
assert abs(x2 - x1) < 1e-7
Answered By: normanius