How does TensorFlow perform algorithms so fast?

Question:

I wanted to solve a linear regression problem by implementing the Adam optimization algorithm myself and running it on my dataset. My implementation reaches an acceptable loss in around 100 epochs, while TensorFlow reaches a loss roughly ten times lower in just 3 epochs, with the same hyperparameters.

Here is my code:

import math
import numpy as np

# cost(x, y, w, b) (the mean squared error) is defined elsewhere

def grad(x, y, w, b, par):
  # per-sample gradient of the squared error (y - w*x - b)**2
  if par == "w":
    return -2*x*(y-w*x-b)
  if par == "b":
    return -2*(y-w*x-b)

def Adam(x, y, alpha, beta1=0.9, beta2=0.999, epsilon=10e-8, epochs=1000, batch_size=32):
  mw = 0
  vw = 0
  mb = 0
  vb = 0

  mmw = 0
  vvw = 0
  mmb = 0
  vvb = 0

  w = 0
  b = 0

  bestw = 0
  bestb = 0
  bestcost = cost(x, y, w, b)

  n = len(x)
  sw = int(n/batch_size)-1
  counter = 0
  
  for t in range(epochs):
    if t % sw == 0:  # resample the mini-batch every sw epochs
      arr = np.random.randint(0, n, batch_size)
    
    if t % 100 == 0:
      print(cost(x, y, w, b))

    if t % 15 == 0:  # periodically remember the best parameters seen so far
      if cost(x, y, w, b) < bestcost:
        bestw = w
        bestb = b
        bestcost = cost(x, y, w, b)

    for i in range(len(arr)):
      mw = beta1 * mw + (1 - beta1) * grad(x[arr[i]], y[arr[i]], w, b, 'w')
      vw = beta2 * vw + (1 - beta2) * ((grad(x[arr[i]], y[arr[i]], w, b, 'w'))**2)

      mmw = mw/(1 - beta1**(t+1))
      vvw = vw/(1 - beta2**(t+1))

      w = w - (alpha * mmw)/(math.sqrt(vvw)+epsilon)

      mb = beta1 * mb + (1 - beta1) * grad(x[arr[i]], y[arr[i]], w, b, 'b')
      vb = beta2 * vb + (1 - beta2) * ((grad(x[arr[i]], y[arr[i]], w, b, 'b'))**2)

      mmb = mb/(1 - beta1**(t+1))
      vvb = vb/(1 - beta2**(t+1))

      b = b - (alpha * mmb)/(math.sqrt(vvb)+epsilon)

  return bestw, bestb, bestcost # it returns both parameters and loss 

And here’s the problem solved with TensorFlow:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(1)
])
model.compile(loss="mse", optimizer = tf.keras.optimizers.Adam(learning_rate = 0.001))
model.fit(tf.expand_dims(x, axis=-1), y, epochs=100)

How can the TensorFlow implementation solve the problem in so few epochs?
I should note, however, that in the time TensorFlow runs one epoch, my algorithm runs around 130.

Asked By: Iya Lee


Answers:

Some Python modules provide functions whose inner loops run in compiled C code. If you rewrote your Python-level iteration with vectorized NumPy operations, you should see a substantial performance improvement.

https://www.geeksforgeeks.org/vectorized-operations-in-numpy/

NumPy Documentation
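As a sketch of what that could look like: below, the per-sample gradient loop is replaced by NumPy operations that compute the batch gradient in one shot. The function name, hyperparameters, and synthetic data here are illustrative, not the poster's exact code.

```python
import numpy as np

def adam_vectorized(x, y, alpha=0.01, beta1=0.9, beta2=0.999,
                    eps=1e-8, epochs=200, batch_size=32):
    """Adam for y ~ w*x + b, with per-sample loops replaced by array ops."""
    rng = np.random.default_rng(0)
    w = b = 0.0
    mw = vw = mb = vb = 0.0
    t = 0
    n = len(x)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            t += 1
            batch = idx[start:start + batch_size]
            xb, yb = x[batch], y[batch]
            err = yb - (w * xb + b)        # residuals for the whole batch at once
            gw = np.mean(-2 * xb * err)    # dL/dw averaged over the batch
            gb = np.mean(-2 * err)         # dL/db averaged over the batch
            # first and second moment estimates
            mw = beta1 * mw + (1 - beta1) * gw
            vw = beta2 * vw + (1 - beta2) * gw**2
            mb = beta1 * mb + (1 - beta1) * gb
            vb = beta2 * vb + (1 - beta2) * gb**2
            # bias-corrected updates
            w -= alpha * (mw / (1 - beta1**t)) / (np.sqrt(vw / (1 - beta2**t)) + eps)
            b -= alpha * (mb / (1 - beta1**t)) / (np.sqrt(vb / (1 - beta2**t)) + eps)
    return w, b

# synthetic data: y = 3x + 1 plus a little noise
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 512)
y = 3 * x + 1 + rng.normal(0, 0.05, 512)
w, b = adam_vectorized(x, y)
```

Each inner-loop iteration now does a constant number of NumPy calls on the whole batch instead of Python-level work per sample, which is the same kind of batched computation TensorFlow performs internally.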

Answered By: Luke Ruter