gradient-descent

tensorflow GradientDescentOptimizer: Incompatible shapes between op input and calculated input gradient

Question: The model worked well before the optimization step. However, when I tried to optimize my model, this error message showed up: Incompatible shapes between op input and calculated input gradient. Forward operation: softmax_cross_entropy_with_logits_sg_12. Input index: 0. Original input shape: (16, 1). Calculated input gradient …

Total answers: 1
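This error usually means the labels and logits passed to the softmax cross-entropy op do not share the same (batch, num_classes) shape; a (16, 1) input suggests a reshaping mistake that left only one "class" column. A plain-NumPy sketch of that shape contract (shapes and data here are illustrative, not taken from the question):

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # Both arrays must be (batch, num_classes); a (16, 1) labels array
    # against (16, num_classes) logits fails exactly this check.
    assert logits.shape == labels.shape, "logits/labels shapes must match"
    # Numerically stable log-softmax along the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    log_softmax = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -(labels * log_softmax).sum(axis=1)   # one loss per example

batch, num_classes = 16, 4
logits = np.random.randn(batch, num_classes)
# One-hot labels with the same shape as the logits.
labels = np.eye(num_classes)[np.random.randint(0, num_classes, size=batch)]

loss = softmax_cross_entropy(logits, labels)
```

If the labels come in as class indices of shape (16, 1), converting them to one-hot vectors (or using the sparse variant of the loss) resolves the mismatch.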

Why do we need to call zero_grad() in PyTorch?

Question: Why does zero_grad() need to be called during training? The docstring only says: zero_grad(self): Sets gradients of all model parameters to zero. Asked By: user1424739 || Source Answers: In PyTorch, for every mini-batch during the training phase, we typically want to explicitly set the gradients to zero …

Total answers: 6
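The key fact behind the answer is that PyTorch accumulates gradients: each backward pass adds to `.grad` rather than overwriting it, so stale gradients from the previous mini-batch must be cleared. A toy plain-Python sketch of that accumulation behavior (not PyTorch itself, just an illustration of the mechanism):

```python
# Toy illustration: gradients accumulate across backward passes unless
# explicitly reset, which is why optimizer.zero_grad() is called once
# per mini-batch before loss.backward().

class Param:
    def __init__(self, value):
        self.value = value
        self.grad = 0.0

def backward(param, grad_contribution):
    # Mirrors PyTorch's behavior: gradients are *added*, not overwritten.
    param.grad += grad_contribution

w = Param(1.0)
backward(w, 0.5)      # mini-batch 1
backward(w, 0.5)      # mini-batch 2, without zeroing in between
stale = w.grad        # 1.0: the second batch sees leftover gradient

w.grad = 0.0          # the zero_grad() step
backward(w, 0.5)      # now the gradient reflects only this batch
fresh = w.grad        # 0.5
```

Accumulation is a feature, not a bug: it enables tricks like simulating a large batch by calling backward() several times before one optimizer step, but it puts the burden of resetting on the training loop.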

Sklearn SGDClassifier partial fit

Question: I'm trying to use SGD to classify a large dataset. As the data is too large to fit into memory, I'd like to use the partial_fit method to train the classifier. I have selected a sample of the dataset (100,000 rows) that fits into memory to test fit vs. partial_fit: …

Total answers: 1

gradient descent using python and numpy

Question:

    def gradient(X_norm, y, theta, alpha, m, n, num_it):
        temp = np.array(np.zeros_like(theta, float))
        for i in range(0, num_it):
            h = np.dot(X_norm, theta)
            #temp[j] = theta[j] - (alpha/m) * ( np.sum( (h-y)*X_norm[:,j][np.newaxis,:] ) )
            temp[0] = theta[0] - (alpha/m) * (np.sum(h - y))
            temp[1] = theta[1] - (alpha/m) * (np.sum((h - y) * X_norm[:, 1]))
            theta = temp
        return theta

    X_norm, mean, std = featureScale(X)
    # length of X (number of rows)
    m = len(X)
    X_norm = np.array([np.ones(m), X_norm])
    n, m = np.shape(X_norm)
    num_it = 1500
    alpha = 0.01
    theta = np.zeros(n, float)[:, np.newaxis]
    X_norm = X_norm.transpose()
    theta = gradient(X_norm, y, theta, alpha, m, n, num_it)
    print theta

My theta from the above code is 100.2 100.2, but it should be …

Total answers: 5
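The per-component updates in the question's loop are easy to get wrong; the standard fix is to vectorize the whole gradient step so every parameter is updated simultaneously from the same theta. A self-contained sketch with synthetic data (the data, learning rate, and true coefficients here are illustrative, not from the question):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, num_it=1500):
    # Batch gradient descent for linear regression with squared-error cost.
    m, n = X.shape
    theta = np.zeros((n, 1))
    for _ in range(num_it):
        h = X @ theta                                   # predictions, (m, 1)
        theta = theta - (alpha / m) * (X.T @ (h - y))   # simultaneous update
    return theta

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
X = np.hstack([np.ones((100, 1)), x])   # bias column of ones, then feature
y = 3.0 + 2.0 * x                       # noiseless target: intercept 3, slope 2

theta = gradient_descent(X, y)
```

The `X.T @ (h - y)` product computes every component of the gradient at once, which avoids both the index bookkeeping and the aliasing bug of updating theta through a shared temp array.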