Write a function to find the maximum value of f(x) = x - 5 + sqrt(4 - x^2) by applying gradient descent: x_{t+1} = x_t - η f'(x_t)
Question:
The return value of x in f_gradient_descent(0, 0.01, 1000) is not -2.17 (max). Would you please have a look? Thank you so much!
import math

def f(x):
    return x-5 + math.sqrt(4-x**2)

def f_derivative(x):
    return 1 + 1/(2*math.sqrt(4-x**2))*(-2*x)

def f_gradient_descent(x0, eta, n_step):
    """
    Parameter:
        x0: Start point
        eta: Learning rate
        n_step: algorithm will stop after `n_step` cycle
    """
    x0 = 1.23
    eta = 0.001
    for _ in range(n_step):
        x0 = x0 - eta * f_derivative(x0)
        if abs(f_derivative(x0)) < 0.00001:
            break
    return f(x0)

assert f_gradient_descent(0, 0.01, 1000) - (-2.17) < 1e-4
Answers:
There are quite a few mistakes in the code.
- To arrive at the peak you have to ascend, not descend. So you have to increase `x0` if the derivative is positive.
- The values for `x0` and `eta` that are passed to the function are ignored because they are overwritten inside. `eta = 0.001` might be too small to arrive at the maximum within 1000 steps.
- In order to evaluate the result, you have to use the absolute value of the difference between the actual and the expected result.
- The actual maximum is -2.1716…, not -2.17, which is more than `1e-4` away.
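Putting those points together, a corrected version might look like the sketch below. The name `f_gradient_ascent`, the `tol` parameter and the `expected` variable are my own choices, not part of the question. The expected value assumes the maximum is at x = sqrt(2), where the derivative 1 - x/sqrt(4 - x^2) is zero, which gives f(sqrt(2)) = 2*sqrt(2) - 5.

```python
import math


def f(x):
    return x - 5 + math.sqrt(4 - x**2)


def f_derivative(x):
    # d/dx [x - 5 + sqrt(4 - x^2)] = 1 - x / sqrt(4 - x^2)
    return 1 - x / math.sqrt(4 - x**2)


def f_gradient_ascent(x0, eta, n_step, tol=1e-5):
    """
    Parameter:
        x0: Start point (used as passed in, not overwritten)
        eta: Learning rate (used as passed in, not overwritten)
        n_step: maximum number of update steps
        tol: stop early once |f'(x)| drops below this (assumed default)
    """
    x = x0
    for _ in range(n_step):
        grad = f_derivative(x)
        if abs(grad) < tol:
            break
        # Ascend: move x in the direction of the derivative.
        x = x + eta * grad
    return f(x)


result = f_gradient_ascent(0, 0.01, 1000)
expected = 2 * math.sqrt(2) - 5  # analytic maximum, about -2.1716
# Compare with the absolute difference, against the more precise value.
assert abs(result - expected) < 1e-4
```

Note that this only behaves well for starting points strictly inside (-2, 2), since `math.sqrt(4 - x**2)` raises a `ValueError` outside that interval and the derivative is undefined at the endpoints.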