Gradient Descent (GD) is an iterative approach for minimizing a given function; in other words, it is a way to find a local minimum of a function. The algorithm starts from an initial estimate of the solution, which we can obtain in several ways: one approach is to sample the parameter values at random. We then evaluate the slope of the function at that point, move the estimate a small step in the negative direction of the gradient, and repeat the process. The algorithm eventually converges at a point where the gradient is zero, which corresponds to a local minimum.
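The following is a minimal sketch of this loop in Python, minimizing a simple one-variable quadratic. The function, starting point, learning rate, and tolerance here are illustrative choices, not values from the text:

```python
# Gradient descent sketch on f(x) = (x - 3)^2 + 2,
# whose gradient is f'(x) = 2 * (x - 3).
# All numeric settings below are illustrative assumptions.

def gradient(x):
    return 2.0 * (x - 3.0)

x = 0.0              # initial estimate (could also be sampled randomly)
learning_rate = 0.1  # step size along the negative gradient
tolerance = 1e-8     # stop when the gradient is effectively zero

for step in range(1000):
    grad = gradient(x)
    if abs(grad) < tolerance:     # gradient ~ 0: a (local) minimum
        break
    x = x - learning_rate * grad  # step in the negative gradient direction

print(f"converged to x = {x:.6f} after {step} steps")  # x approaches 3.0
```

Note that if the learning rate is too large, the updates can overshoot the minimum and diverge; if it is too small, convergence becomes very slow.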
The size of each steepest descent step is typically chosen based on the step taken in the previous iteration, or kept fixed as a learning rate. The gradient is simply the slope of the curve at a given point, as shown in the following figure:
In Chapter 2, Basic Concepts – Simple Linear Regression, we saw that the goal of OLS regression is to find the coefficients that minimize the sum of squared residuals; gradient descent gives us an iterative way to carry out that minimization.
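To connect the two ideas, here is a sketch of gradient descent applied to simple linear regression, minimizing the residual sum of squares over the intercept and slope. The synthetic data, learning rate, and iteration count are assumptions for illustration, not values from the book:

```python
import numpy as np

# Illustrative sketch: fit y ~ b0 + b1 * x by gradient descent on the
# (mean) residual sum of squares: RSS(b0, b1) = sum((y - b0 - b1*x)^2).
# The data and hyperparameters below are assumptions.

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=100)  # true intercept 2, slope 0.5

b0, b1 = 0.0, 0.0     # initial estimates of intercept and slope
learning_rate = 0.01

for _ in range(5000):
    residuals = y - (b0 + b1 * x)
    # Partial derivatives of the mean squared residual w.r.t. b0 and b1
    grad_b0 = -2.0 * residuals.mean()
    grad_b1 = -2.0 * (residuals * x).mean()
    b0 -= learning_rate * grad_b0
    b1 -= learning_rate * grad_b1

print(f"intercept ~ {b0:.3f}, slope ~ {b1:.3f}")  # close to the OLS estimates
```

Because the residual sum of squares is a convex function of the coefficients, gradient descent converges to the same solution that OLS computes in closed form, which makes regression a convenient setting for studying the algorithm.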