April 18, 2024, 11:00–12:15
Toulouse
Room Auditorium 3 - JJ Laffont
MAD-Stat. Seminar
Abstract
It was established in the 1990s by Pemantle, Brandière and Duflo that stochastic gradient descent (SGD) escapes the saddle points of a smooth function. Since the critical points of a typical (generic) smooth function are either local minima or saddle points, this result can be read as generic convergence of SGD to local minima. The purpose of this talk is to extend such a result to the class of non-smooth, weakly convex functions. We will first investigate the generic properties of non-smooth functions that are semi-algebraic (or, more generally, definable in an o-minimal structure). As recently shown by Davis and Drusvyatskiy, active strict saddles are the generic type of critical point in this class. Second, we will show that, on weakly convex functions, SGD avoids active strict saddles with probability one. As a consequence, "generically" on a weakly convex, semi-algebraic function, SGD converges to a local minimum.
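For concreteness, a minimal sketch of the stochastic subgradient recursion studied in this setting; the step sizes, noise model, and use of the Clarke subdifferential are standard assumptions added for illustration and are not specified in the abstract:

\[
  x_{k+1} \;=\; x_k - \gamma_k \bigl( g_k + \xi_{k+1} \bigr),
  \qquad g_k \in \partial f(x_k),
\]

where \(f\) is weakly convex, \(\partial f\) denotes its (Clarke) subdifferential, \((\gamma_k)\) is a vanishing step-size sequence, and \((\xi_k)\) is a zero-mean noise sequence. The avoidance result says that, under such assumptions, the iterates \((x_k)\) converge to an active strict saddle of \(f\) with probability zero.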