
A birds-eye view of optimization algorithms

Outline

Credits. This material was created by Fabian Pedregosa for an invited lecture at McGill University. Source code can be found here. The template and the visualizations are modified from the Distill article Why Momentum Really Works. Some parts of the "knowing your problem" section are based on the SciPy lecture notes.