![](https://crypto4nerd.com/wp-content/uploads/2023/06/0Wghi3021x6buuJem-1024x683.jpeg)
1. Batch Gradient Descent
The algorithm begins with an initial set of parameters and calculates the gradient of the cost function with respect to those parameters. The parameters are then updated by taking steps proportional to the negative gradient, iteratively moving towards the minimum.
This method requires the cost function to be differentiable, and it works best when the function is also convex or at least smooth. It is used to train most machine learning models, including regression, classification, and neural networks.
A drawback of batch gradient descent is that the learning rate requires careful selection: too large a value can cause the updates to overshoot or diverge, while too small a value makes convergence painfully slow. In addition, because every update uses the entire training set, each iteration can be computationally expensive for large or high-dimensional data.
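As a concrete illustration, here is a minimal sketch of batch gradient descent fitting a linear regression model with a mean-squared-error cost. The function name, learning rate, and synthetic data are illustrative choices, not from the original text; the key point is that every update is computed from the full dataset.

```python
import numpy as np

def batch_gradient_descent(X, y, lr=0.1, n_iters=1000):
    """Fit linear regression weights by batch gradient descent on the MSE cost."""
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(n_iters):
        # Use the FULL dataset each iteration (this is what makes it "batch")
        y_pred = X @ w + b
        error = y_pred - y
        grad_w = (2 / m) * X.T @ error    # gradient of MSE w.r.t. w
        grad_b = (2 / m) * error.sum()    # gradient of MSE w.r.t. b
        # Step proportional to the NEGATIVE gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic data drawn from y = 3x + 2 (illustrative)
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 1))
y = 3 * X[:, 0] + 2
w, b = batch_gradient_descent(X, y)
```

With a moderate learning rate and enough iterations, `w` and `b` converge close to the true values 3 and 2; raising `lr` well above 1 for this problem makes the iterates diverge, illustrating why the learning rate needs tuning.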