
MLS.1.b Gradient Descent in Linear Regression
Gradient Descent is a first-order optimization algorithm for finding the minimum of a function. It finds a (local) minimum by repeatedly moving in the direction of steepest descent (downhill). This lets us update the parameters of the model (weights and bias) more accurately.
To reach the local minimum we can't simply jump straight to that point on the graph. We need to descend in small steps, check for the minimum, and take another step in the direction of descent, repeating until we reach the desired local minimum.
The size of these small steps is called the learning rate. If the learning rate is very small, the result is more precise but training is very time-consuming, while a large learning rate may cause us to miss the minimum entirely (overshooting). A common strategy is to use a higher learning rate while the slope of the curve is steep, and switch to smaller learning rates once it starts to flatten (less time and more precision).
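As a small illustration (not from the original post), here is a minimal sketch of plain gradient descent on the one-dimensional function f(x) = x², whose derivative is 2x. A small learning rate converges steadily toward the minimum, while a too-large one overshoots and diverges:

# Minimal sketch: gradient descent on f(x) = x**2 (illustrative values)
def descend(learn_rate, n_steps=10, x=5.0):
    for _ in range(n_steps):
        grad = 2 * x               # derivative of x**2
        x = x - learn_rate * grad  # gradient descent update
    return x

print(descend(0.1))   # approaches the minimum at x = 0
print(descend(1.1))   # diverges: each step overshoots further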
The cost function lets us evaluate how well our model is predicting. It is a loss function with its own curve and parameters (weights and bias), and the slope of that curve tells us how to update our parameters. The lower the cost, the better the model's predictions.
In the training phase, we compute the predicted value y_hat to measure how much it deviates from the given output. Then, in the second phase, we calculate the cost using the cost error (mean squared error) formula.
y_hat = w * xi + b
cost = (1/N) * ∑(yi − y_hat)²   (i from 1 to N)
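As a quick sanity check of the cost formula (the numbers below are made up for illustration, not from the post):

import numpy as np

# Toy check with two points and made-up parameters w = 2, b = 1
X = np.array([1.0, 2.0])
y = np.array([3.5, 4.0])
y_hat = 2 * X + 1                               # predictions: [3.0, 5.0]
cost = (1 / len(X)) * np.sum((y - y_hat) ** 2)  # (0.5**2 + 1.0**2) / 2
print(cost)                                     # 0.625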
for i in range(n_iters):
    # Training phase
    y_hat = np.dot(X, self.weights) + self.bias

    # Cost error calculating phase
    cost = (1/n_samples) * np.sum((y_hat - y)**2)
    costs.append(cost)
Now we update the weights and bias to decrease the error:
    # Updating the weight and bias derivatives
    Delta_w = (2/n_samples) * np.dot(X.T, (y_hat - y))
    Delta_b = (2/n_samples) * np.sum((y_hat - y))

    # Updating weights
    self.weights = self.weights - learn_rate * Delta_w
    self.bias = self.bias - learn_rate * Delta_b
    # end of loop
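For reference, Delta_w and Delta_b above are the partial derivatives of the cost with respect to the weights and bias. Differentiating cost = (1/N) * ∑(yi − y_hat)² gives:

∂cost/∂w = (2/N) * ∑(y_hat − yi) * xi
∂cost/∂b = (2/N) * ∑(y_hat − yi)

The np.dot(X.T, (y_hat - y)) expression is simply the vectorized form of the first sum over all samples.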
And plotting the cost function against the number of iterations:
Above is the cost function curve plotted against the number of iterations. As the number of iterations (steps) increases, the cost decreases drastically and almost reaches zero, meaning the minimum is nearby. We repeat the above updates until the error becomes negligible or the minimum is reached.
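A minimal sketch of how this cost curve can be reproduced, assuming costs is the list collected in the training loop above (the matplotlib import is not shown in the post):

import matplotlib.pyplot as plt

# Plot the cost recorded at each iteration of gradient descent
plt.plot(range(len(costs)), costs)
plt.xlabel("Iterations")
plt.ylabel("Cost")
plt.show()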
Source code from Scratch
import numpy as np

class LinearModel:
    """ Linear Regression Model Class """

    def __init__(self):
        pass

    def gradient_descent(self, X, y, learn_rate=0.01, n_iters=100):
        """ Trains a linear regression model using gradient descent """
        n_samples, n_features = X.shape
        self.weights = np.zeros(shape=(n_features, 1))
        self.bias = 0
        self.prev_weights = []
        self.prev_bias = []
        self.X = X
        self.y = y
        costs = []

        for i in range(n_iters):
            # Training phase
            y_hat = np.dot(X, self.weights) + self.bias

            # Cost error phase
            cost = (1 / n_samples) * np.sum((y_hat - y) ** 2)
            costs.append(cost)

            # Verbose: description of cost at each iteration
            if i % 200 == 0:
                print("Cost at iteration {0}: {1}".format(i, cost))

            # Updating the derivatives
            Delta_w = (2 / n_samples) * np.dot(X.T, (y_hat - y))
            Delta_b = (2 / n_samples) * np.sum((y_hat - y))

            # Updating weights and bias
            self.weights = self.weights - learn_rate * Delta_w
            self.bias = self.bias - learn_rate * Delta_b

            # Save the weights for visualisation
            self.prev_weights.append(self.weights)
            self.prev_bias.append(self.bias)

        return self.weights, self.bias, costs

    def predict(self, X):
        """ Predicting the values by using Linear Model """
        return np.dot(X, self.weights) + self.bias
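The training call below assumes X_train and y_train already exist; they are not defined in the post. One possible way to create a toy dataset for trying the model out (all values here are illustrative assumptions):

# Hypothetical toy dataset: y = 5x + 3 plus noise (not from the original post)
rng = np.random.RandomState(42)
X = 2 * rng.rand(500, 1)                   # one feature, 500 samples
y = 5 * X + 3 + 0.5 * rng.randn(500, 1)    # column vector, matching the (n_features, 1) weights

# Simple 80/20 train/test split
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]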
# We have created our Linear Model class. Now we need to create and load our model.
model = LinearModel()
w_trained, b_trained, costs = model.gradient_descent(X_train, y_train,
                                                     learn_rate=0.005, n_iters=1000)
# Note: this method belongs to the LinearModel class; it uses the module-level
# fig and ax created below, and assumes "from matplotlib import animation".
def visualize_training(self):
    """ Visualizing the line against the dataset """
    self.prev_weights = np.array(self.prev_weights)
    x = self.X[:, 0]
    line, = ax.plot(x, x, color='red')
    ax.scatter(x, self.y)

    def animate(line_data):
        m, c = line_data
        line.set_ydata(m * x + c)  # update the data
        return line,

    def init():
        return line,

    def get_next_weight_and_bias():
        for i in range(len(self.prev_weights)):
            yield self.prev_weights[i][0], self.prev_bias[i]

    return animation.FuncAnimation(fig, animate, get_next_weight_and_bias,
                                   init_func=init, interval=35, blit=True)
# Visualization of training phase to get the best fit line
fig, ax = plt.subplots()
ani = model.visualize_training()
plt.show()
# Prediction Phase to test our model
n_samples, _ = X_train.shape
n_samples_test, _ = X_test.shape

y_p_train = model.predict(X_train)
y_p_test = model.predict(X_test)

error_train = (1 / n_samples) * np.sum((y_p_train - y_train) ** 2)
error_test = (1 / n_samples_test) * np.sum((y_p_test - y_test) ** 2)

print("Error on training set: {}".format(np.round(error_train, 6)))
print("Error on test set: {}".format(np.round(error_test, 6)))
# Plotting predicted best fit line
fig = plt.figure(figsize=(8, 6))
plt.scatter(X_train, y_train)
plt.scatter(X_test, y_p_test)
plt.xlabel("x")
plt.ylabel("y")
plt.show()
Check out the full source code for Gradient Descent on GitHub
and also check out the other approaches in Linear Regression by ML-Scratch
Contributors
This series is made possible by help from:
- Pranav (@devarakondapranav)
- Ram (@r0mflip)
- Devika (@devikamadupu1)
- Pratyusha (@prathyushakallepu)
- Pranay (@pranay9866)
- Subhasri (@subhasrir)
- Laxman (@lmn)
- Vaishnavi (@vaishnavipulluri)
- Suraj (@suraj47)