Degenerate root-finding problem: find the first value for which f(x) > 0
Question:
I have a function f(x) that is zero for all x less than a critical value c, and non-zero for x > c.
I want to approximate the critical value c using an optimization method. Because f(x) is expensive to evaluate, I want to compute it as few times as possible, so evaluating f(x) on a predefined list of x values is not viable.
Think of a function like the following one, with critical point c = 1.4142 (i.e. sqrt(2)):
def func(x):
    return max(x**2 - 2, 0) if x > 0 else 0
I would have implemented this with a custom bisection function. However, I was wondering whether this "degenerate" root-finding problem can be solved with existing functions in SciPy or NumPy. I have experimented with scipy.optimize.root_scalar, but it does not seem to support functions like the one above:
from scipy.optimize import root_scalar
root_scalar(func, bracket=[-6, 8])
# Yields an error: f(a) and f(b) must have different signs.
Answers:
Technically speaking, every point with x < 1.414 is a root of that function. If you want to use a root finder, you need to turn this into a valid root-finding problem. I suggest subtracting a very small value from the output of your function:
from scipy.optimize import root_scalar
import math
wrapper = lambda func: lambda *args: func(*args) - math.ulp(0)
root_scalar(wrapper(func), bracket=[-6, 8])
This works by subtracting one ULP (Unit in the Last Place, the smallest positive value representable) from your function. This makes it strictly negative for all x < 1.414 and strictly positive for all x > 1.414. On a machine with 64-bit floating-point math, it subtracts 5 × 10⁻³²⁴ from each result, which moves the position of the root only negligibly.
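As a quick sanity check (a sketch assuming standard IEEE-754 64-bit doubles, not part of the original answer), you can verify that the shifted function now has opposite signs at the two bracket endpoints, which is exactly what root_scalar requires:

```python
import math

def func(x):
    return max(x**2 - 2, 0) if x > 0 else 0

eps = math.ulp(0.0)           # smallest positive double: 5e-324 on IEEE-754 systems
shifted = lambda x: func(x) - eps

print(shifted(-6))            # -5e-324: strictly negative left of the root
print(shifted(8))             # 62.0: the tiny subtraction is absorbed by rounding
assert shifted(-6) < 0 < shifted(8)   # a valid bracket for root_scalar
```

Note that for large function values the subtraction is lost to rounding; the trick only matters where f(x) is exactly zero, which is precisely where a sign is needed.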
This is just to present a "custom" bisect method that works for the presented example. However, I still recommend using the accepted answer.
from itertools import count, islice

def bisect(func, x_min, x_max, x_func=None,
           tol=None, iter_max=None, **kwargs):
    # Invariant maintained below: func(x_min) <= 0 < func(x_max).
    y_min = func(x_min, **kwargs)
    if y_min > 0:
        print("Warning: no solution as y_min>0, with x_min=%f." % x_min)
        return x_min
    y_max = func(x_max, **kwargs)
    if y_max <= 0:
        print("Warning: no solution as y_max<=0, with x_max=%f." % x_max)
        return x_max
    if tol is None and iter_max is None:
        tol = 1e-7  # default stopping criterion if none is given
    if x_func is None:
        x_func = lambda x0, x1: (x0 + x1) / 2  # plain midpoint split
    x_last = float("inf")
    for cnt in islice(count(), iter_max):  # unbounded if iter_max is None
        x_new = x_func(x_min, x_max)
        y_new = func(x_new, **kwargs)
        if y_new <= 0:
            x_min = x_new  # still left of the critical point
        else:
            x_max = x_new  # right of the critical point
        if tol is not None and abs(x_last - x_new) < tol:
            break
        x_last = x_new
    # Return the bracket end on the side where the last sign was observed.
    return x_min if y_new > 0 else x_max
def func(x):
    return max(x**2 - 2, 0) if x > 0 else 0

ret = bisect(func, x_min=-6, x_max=8)
print(ret)
Interestingly, bisect() seems to outperform the root_scalar approach for this basic type of function func(x). But I expect this advantage to vanish for more complex functions.
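For context (a back-of-the-envelope sketch, not from the original post): since plain bisection halves the bracket on every evaluation, reaching a tolerance tol from an initial bracket [x_min, x_max] takes roughly log2((x_max - x_min) / tol) function calls, regardless of how expensive f(x) is:

```python
import math

# Assumed values matching the example: bracket [-6, 8], tolerance 1e-7
x_min, x_max, tol = -6, 8, 1e-7

# Each iteration halves the bracket, so the count grows logarithmically
iters = math.ceil(math.log2((x_max - x_min) / tol))
print(iters)  # 28 function evaluations for this bracket and tolerance
```

This is why the two methods land on nearly the same answer in a similar number of steps; the timing gap above mostly reflects per-call overhead, not algorithmic differences.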
%timeit bisect(func, x_min=-6, x_max=8, iter_max=None, tol=1e-7)
# 46.3 µs ± 1.96 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit root_scalar(wrapper(func), bracket=[-6, 8], xtol=1e-7)
# 84.6 µs ± 6.29 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
# Test that the results are equivalent:
x1 = bisect(func, x_min=-6, x_max=8, iter_max=None, tol=1e-7)
x2 = root_scalar(wrapper(func), bracket=[-6, 8], xtol=1e-7).root
assert abs(x2-x1) < 1e-7
# 2.9078300656237843e-08