parameter b diverges when running gradient descent for linear regression

Question:

I have two arrays x and y:

import numpy as np

x = np.arange(-500, 500, 3)
y = 2.5*x + 5

I want to fit a linear regression to them using gradient descent.

I implemented a gradient descent function:

def gradient_descent(x, y, alpha, epochs):
  w = 0.0
  b = 0.0
  n = len(x)
  for a in range(epochs):
    if a % 100 == 0: print(w,b)
    for i in range(n):
      new_w = (-2*alpha*x[i]*(y[i]-w*x[i]+b))/n
      new_b = (-2*alpha*(y[i]-w*x[i]+b))/n
      w -= new_w
      b -= new_b
  return w,b

With hyperparameters alpha=0.0005 and epochs=10000 it estimates w well, but b diverges to infinity. How can I fix it?

Asked By: Iya Lee


Answers:

You were close, but you are missing a pair of parentheses in both of your update formulas. Not only would b diverge to infinity this way, w probably would as well (just more slowly).

The corrected update lines, with the parentheses added, look like this:

  new_w = (-2*alpha*x[i]*(y[i] - (w*x[i] + b)))/n
  new_b = (-2*alpha*(y[i] - (w*x[i] + b)))/n

The difference from your code is that y[i] - w*x[i] + b is not the same as y[i] - (w*x[i] + b): the minus sign has to apply to the whole prediction w*x[i] + b. Without the parentheses, b is added to the error instead of subtracted, so every update pushes b further in the same direction and it blows up.
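Putting the fix together, a complete version using the same names and hyperparameters as in the question would look roughly like this (the call at the bottom is just an example of running it on your x and y; with these settings w and b should drift toward roughly 2.5 and 5):

import numpy as np

def gradient_descent(x, y, alpha, epochs):
  w = 0.0
  b = 0.0
  n = len(x)
  for a in range(epochs):
    if a % 100 == 0: print(w, b)
    for i in range(n):
      # the minus sign now applies to the whole prediction w*x[i] + b
      new_w = (-2*alpha*x[i]*(y[i] - (w*x[i] + b)))/n
      new_b = (-2*alpha*(y[i] - (w*x[i] + b)))/n
      w -= new_w
      b -= new_b
  return w, b

x = np.arange(-500, 500, 3)
y = 2.5*x + 5
w, b = gradient_descent(x, y, alpha=0.0005, epochs=10000)
print(w, b)  # should end up near 2.5 and 5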

Answered By: tetris programming