Gradient Descent

Definition

Gradient descent is a mathematical optimization algorithm used to train AI models. It works by repeatedly adjusting the model's weights (its internal parameters) so the output becomes more accurate over time. At each step, it computes the gradient, which points in the direction of the steepest increase in error, and moves the weights a small amount in the opposite direction to reduce it.
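
To make the update rule concrete, here is a minimal sketch in Python using NumPy and a made-up least-squares loss (the data, loss, and learning rate are illustrative assumptions, not part of any particular model):

```python
import numpy as np

# Toy loss: L(w) = mean((X @ w - y)^2), a simple least-squares objective
def loss_and_grad(w, X, y):
    residual = X @ w - y
    loss = np.mean(residual ** 2)
    grad = 2 * X.T @ residual / len(y)  # gradient: direction of steepest increase in error
    return loss, grad

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # synthetic inputs
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)  # synthetic targets

w = np.zeros(3)          # start from arbitrary weights
learning_rate = 0.1
for step in range(200):
    loss, grad = loss_and_grad(w, X, y)
    w -= learning_rate * grad            # step opposite the gradient to reduce the error

print(w)  # ends up close to true_w
```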

Example

“During training, a neural network uses gradient descent to get better at predicting the next word in a sentence.”

How It’s Used in AI

Gradient descent is the foundation of how neural networks and other machine learning models learn. It powers everything from fine-tuning to the large-scale pretraining of LLMs, nudging the model's weights toward lower error with each batch of data.
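
In code, each batch typically triggers one gradient descent update. Here is a minimal sketch of that loop in PyTorch (the model, data, and hyperparameters below are placeholders chosen for illustration):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                                  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # plain gradient descent
loss_fn = nn.MSELoss()

inputs = torch.randn(32, 10)   # one batch of (fake) data
targets = torch.randn(32, 1)

optimizer.zero_grad()                   # clear gradients from the previous batch
loss = loss_fn(model(inputs), targets)  # measure the current error
loss.backward()                         # compute gradients of the loss w.r.t. the weights
optimizer.step()                        # adjust the weights opposite the gradient
```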

Brief History

First described in the mid-1800s (commonly attributed to Cauchy, 1847) as a numerical optimization method, gradient descent became central to training deep neural networks in the 2010s. Variants like Stochastic Gradient Descent (SGD) and Adam are now standard in AI research.

Key Tools or Models

  • Algorithms: SGD, Adam, RMSprop (see the sketch after this list)

  • Libraries: TensorFlow, PyTorch

  • Used in most modern AI training workflows
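
In PyTorch, for example, the algorithms listed above are one-line swaps around the same training loop (the model and hyperparameter values here are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # any model works; a placeholder here

# Each optimizer implements a different flavor of gradient descent:
opt_sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
opt_adam = torch.optim.Adam(model.parameters(), lr=1e-3)
opt_rmsprop = torch.optim.RMSprop(model.parameters(), lr=1e-3)
```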

Pro Tip

Learning rate matters. If it’s too high, the model overshoots and fails to learn. Too low, and training takes forever.
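
To see the effect concretely, here is a toy sketch that minimizes f(w) = w^2 with three different learning rates (the starting point and step count are arbitrary choices for illustration):

```python
# Minimize f(w) = w**2, whose gradient is f'(w) = 2*w, starting from w = 5.0
def run(learning_rate, steps=20):
    w = 5.0
    for _ in range(steps):
        w -= learning_rate * 2 * w   # gradient descent update
    return w

print(run(1.1))     # too high: each update overshoots and w blows up
print(run(0.0001))  # too low: w barely moves after 20 steps
print(run(0.1))     # reasonable: w converges close to the minimum at 0
```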
