Math behind the Gradient Descent Algorithm

Why do we care about the gradient descent algorithm?

As we know, supervised machine learning algorithms are described as learning a target function (f) that best maps input variables (x) to an output variable (Y). In practice, learning that function means choosing model parameters that minimize a loss (cost) function, and gradient descent is the standard algorithm for carrying out that minimization.

Gradient Descent Algorithm: An Explanation

The gradient descent algorithm is an optimization technique that follows the negative gradient of an objective function in order to locate the minimum of that function. It is called a first-order optimization algorithm because it explicitly uses the first-order derivative of the target objective function. The working of gradient descent can be visualized as below:

[Figure: visualization of gradient descent stepping toward the minimum of a loss function. Source: https://ml-cheatsheet.readthedocs.io/en/latest/gradient_descent.html]
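To make the idea concrete, here is a minimal illustrative sketch of gradient descent on the toy objective f(w) = (w − 3)², whose minimum is at w = 3. The objective, learning rate, and number of steps are illustrative choices, not taken from any particular library:

```python
# Toy objective f(w) = (w - 3)**2, minimized at w = 3.
def f(w):
    return (w - 3) ** 2

def grad_f(w):
    # First-order derivative: df/dw = 2 * (w - 3)
    return 2 * (w - 3)

w = 0.0             # initial guess
learning_rate = 0.1 # step-size hyperparameter

for step in range(50):
    # Move along the NEGATIVE gradient, i.e. downhill on f
    w = w - learning_rate * grad_f(w)

print(round(w, 4))  # approaches 3.0, the minimizer of f
```

Each iteration subtracts a small multiple of the current derivative from w, so the estimate slides downhill toward the minimizer.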

Food for thought!

Why does the algorithm HAVE to follow the NEGATIVE gradient of the loss function?

To find a local minimum of a function using gradient descent, we take steps proportional to the negative of the gradient of the function at the current point. In other words, we move in the direction opposite to the gradient. Have you ever wondered why?

Source: Deep Learning IIT Ropar — Mitesh Khapra
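One way to see why, sketched here under the usual assumption that the loss L is differentiable, is the first-order Taylor expansion of L around the current parameter vector w for a small step Δw:

```latex
L(w + \Delta w) \;\approx\; L(w) + \nabla L(w)^{\top} \Delta w
```

For a fixed small step length, the inner product ∇L(w)ᵀΔw is most negative when Δw points exactly opposite to ∇L(w). Choosing Δw = −η∇L(w) with a small η > 0 therefore gives the steepest local decrease in the loss, whereas any step with a positive component along the gradient would increase the loss to first order.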

Gradient Descent Rule and the Parameter Update Equation
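Putting the pieces above together, the standard update applied to every parameter θ at iteration t is

```latex
\theta_{t+1} \;=\; \theta_{t} - \eta \, \nabla L(\theta_{t})
```

where η > 0 is the learning rate that controls the step size. Below is a hedged worked sketch of this rule, assuming a mean-squared-error loss for a one-feature linear model y_hat = w·x + b; the data, the names w, b, eta, and the learning-rate value are illustrative choices:

```python
import numpy as np

# Toy data generated from y = 2x + 1; the model is y_hat = w * x + b.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

w, b = 0.0, 0.0   # initial parameters
eta = 0.05        # learning rate (step size)

for epoch in range(500):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of the MSE loss L = mean((y_hat - y)**2) w.r.t. w and b
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # Parameter update equation: theta <- theta - eta * dL/dtheta
    w -= eta * grad_w
    b -= eta * grad_b

print(round(w, 2), round(b, 2))  # settles near w = 2.0, b = 1.0
```

Every epoch computes the gradient of the loss with respect to each parameter and applies the same update equation to both; after enough iterations w and b settle near the values that generated the data.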
