Stochastic Gradient Descent (SGD) is an optimization algorithm used in machine learning to minimize a loss function. Unlike traditional (batch) gradient descent, which computes the gradient over the entire dataset before each update, SGD updates the model parameters using the gradient from only one randomly selected training example per iteration. This makes each update far cheaper to compute, and the noise in the updates can help the model escape shallow local minima, making SGD well suited to large-scale data.
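
To make the update rule concrete, here is a minimal sketch of SGD applied to linear regression with a squared-error loss. The function name, learning rate, and synthetic data are illustrative assumptions, not taken from the text; it simply shows the one-example-per-update pattern described above.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.05, epochs=30, seed=0):
    """Fit y ~ X @ w + b using SGD: one randomly chosen example per update."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n_samples):  # visit examples in random order
            pred = X[i] @ w + b               # prediction for a single example
            error = pred - y[i]               # residual for that example
            w -= lr * error * X[i]            # gradient of 0.5 * error**2 w.r.t. w
            b -= lr * error                   # gradient of 0.5 * error**2 w.r.t. b
    return w, b

# Tiny synthetic check (hypothetical data): should recover w ~ 3, b ~ -1.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 1))
y = 3.0 * X[:, 0] - 1.0 + 0.1 * rng.normal(size=200)
w, b = sgd_linear_regression(X, y)
print(w, b)
```

Each inner-loop step touches a single example, so the cost per update is independent of the dataset size; in practice the same loop is often run over small mini-batches rather than single examples.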