Ryan Boustany will publicly defend his PhD thesis in mathematics on Monday, March 31, 2025 at 2:30 pm (Auditorium 6, TSE building)
Thesis title: On deep network training: complexity, robustness of nonsmooth backpropagation, and inertial algorithms
Thesis supervisor: Professor Jérôme BOLTE
To attend the defense, please contact the TSE doctoral school
Jury members:
- Pierre ABLIN – Apple, formerly CNRS – Examiner
- Samir ADLY – XLIM-DMI, University of Limoges – Referee
- Jérôme BOLTE – University of Toulouse 1 Capitole – Supervisor
- Peter OCHS – Saarland University – Referee
- Edouard PAUWELS – Toulouse School of Economics – Co-supervisor
- Audrey REPETTI – Heriot-Watt University – Referee
Abstract:
Learning based on neural networks relies on the combined use of first-order nonconvex optimization techniques, subsampling approximations, and algorithmic differentiation, i.e., the automated numerical application of differential calculus. These methods are fundamental to modern computing libraries such as TensorFlow, PyTorch, and JAX. However, these libraries apply algorithmic differentiation well beyond its primary scope of elementary differentiable operations: models often incorporate non-differentiable activation functions such as ReLU, or generalized derivatives of complex objects (e.g., solutions of inner optimization problems). Consequently, understanding the behavior of algorithmic differentiation and its impact on learning has emerged as a key issue in the machine learning community.
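To make the issue concrete, here is a minimal sketch (assuming PyTorch; the exact conventions at the kink vary across libraries) in which three algebraically identical expressions of ReLU yield three different "derivatives" at x = 0:

    import torch

    def f1(x):
        return torch.relu(x)             # the library sets relu'(0) = 0 by convention
    def f2(x):
        return x - torch.relu(-x)        # algebraically equal to relu(x)
    def f3(x):
        return 0.5 * (x + torch.abs(x))  # also equal to relu(x)

    x = torch.tensor(0.0, requires_grad=True)
    for f in (f1, f2, f3):
        x.grad = None                    # reset any accumulated gradient
        f(x).backward()
        print(f.__name__, x.grad.item()) # 0.0, 1.0 and 0.5 with PyTorch's conventions

The output of nonsmooth backpropagation thus depends on how the program is written, not only on the function it computes, which is one reason a dedicated nonsmooth calculus is needed.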
To address this, a new notion of nonsmooth differentiation, called conservative gradients, has been developed to model nonsmooth algorithmic differentiation in modern learning contexts. This notion also makes it possible to formulate learning guarantees and to establish the stability of algorithms for deep neural networks as they are implemented in practice.
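For reference, the definition assumed here is a sketch following the conservative-gradient literature (details and generality may differ in the thesis): a set-valued map D from R^p to subsets of R^p, with closed graph and nonempty, locally bounded values, is a conservative gradient for a locally Lipschitz function f if, for every absolutely continuous curve \gamma : [0,1] \to R^p,

    \frac{d}{dt} f(\gamma(t)) = \langle v, \dot{\gamma}(t) \rangle \quad \text{for all } v \in D(\gamma(t)) \text{ and almost every } t \in [0,1].

Gradients produced by nonsmooth backpropagation can be modeled as selections of such a map, which is what connects the calculus to networks as they are practically implemented.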
In this context, we propose two extensions of the conservative calculus, with a wide range of applications in machine learning. The first result provides a simple model for estimating the computational cost of the backward and forward modes of algorithmic differentiation for a wide class of nonsmooth programs. The second result addresses the reliability of automatic differentiation for nonsmooth neural networks operating with floating-point numbers. Finally, we focus on building a new optimization algorithm that exploits second-order information while using only noisy, first-order, nonsmooth, nonconvex automatic differentiation. Starting from a dynamical system (an ordinary differential equation), we derive INNAprop, a combination of INNA and RMSprop.
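For orientation, the second-order dynamics with Hessian-driven damping classically associated with INNA (assumed here as the starting ODE; the exact system and its discretization are those developed in the thesis) reads, for damping parameters \alpha, \beta > 0,

    \ddot{\theta}(t) + \alpha\,\dot{\theta}(t) + \beta\,\nabla^2 f(\theta(t))\,\dot{\theta}(t) + \nabla f(\theta(t)) = 0.

A suitable first-order reformulation removes the explicit Hessian, so the system can be discretized using only (noisy, nonsmooth) gradient evaluations; coupling such a discretization with RMSprop-style adaptive step sizes gives the INNA/RMSprop combination behind INNAprop.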