Portrait de Charles Guille-escuret n'est pas disponible

Charles Guille-escuret

Doctorat - Université de Montréal
Superviseur⋅e principal⋅e

Publications

CADet: Fully Self-Supervised Out-Of-Distribution Detection With Contrastive Learning
Charles Guille-Escuret
Pau Rodriguez
David Vazquez
Joao Monteiro
Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection
Charles Guille-Escuret
Pierre-Andre Noel
David Vazquez
Joao Monteiro
Improving the reliability of deployed machine learning systems often involves developing methods to detect out-of-distribution (OOD) inputs.… (voir plus) However, existing research often narrowly focuses on samples from classes that are absent from the training set, neglecting other types of plausible distribution shifts. This limitation reduces the applicability of these methods in real-world scenarios, where systems encounter a wide variety of anomalous inputs. In this study, we categorize five distinct types of distribution shifts and critically evaluate the performance of recent OOD detection methods on each of them. We publicly release our benchmark under the name BROAD (Benchmarking Resilience Over Anomaly Diversity). Our findings reveal that while these methods excel in detecting unknown classes, their performance is inconsistent when encountering other types of distribution shifts. In other words, they only reliably detect unexpected inputs that they have been specifically designed to expect. As a first step toward broad OOD detection, we learn a generative model of existing detection scores with a Gaussian mixture. By doing so, we present an ensemble approach that offers a more consistent and comprehensive solution for broad OOD detection, demonstrating superior performance compared to existing methods. Our code to download BROAD and reproduce our experiments is publicly available.
Towards Out-of-Distribution Adversarial Robustness
Adam Ibrahim
Charles Guille-Escuret
Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fail… (voir plus)s to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different
No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths
Charles Guille-Escuret
Hiroki Naganuma
Kilian FATRAS
Understanding the optimization dynamics of neural networks is necessary for closing the gap between theory and practice. Stochastic first-or… (voir plus)der optimization algorithms are known to efficiently locate favorable minima in deep neural networks. This efficiency, however, contrasts with the non-convex and seemingly complex structure of neural loss landscapes. In this study, we delve into the fundamental geometric properties of sampled gradients along optimization paths. We focus on two key quantities, which appear in the restricted secant inequality and error bound. Both hold high significance for first-order optimization. Our analysis reveals that these quantities exhibit predictable, consistent behavior throughout training, despite the stochasticity induced by sampling minibatches. Our findings suggest that not only do optimization trajectories never encounter significant obstacles, but they also maintain stable dynamics during the majority of training. These observed properties are sufficiently expressive to theoretically guarantee linear convergence and prescribe learning rate schedules mirroring empirical practices. We conduct our experiments on image classification, semantic segmentation and language modeling across different batch sizes, network architectures, datasets, optimizers, and initialization seeds. We discuss the impact of each factor. Our work provides novel insights into the properties of neural network loss functions, and opens the door to theoretical frameworks more relevant to prevalent practice.
CADet: Fully Self-Supervised Anomaly Detection With Contrastive Learning
Charles Guille-Escuret
Pau Rodriguez
David Vazquez
Joao Monteiro
Gradient Descent Is Optimal Under Lower Restricted Secant Inequality And Upper Error Bound
Charles Guille-Escuret
Adam Ibrahim
Baptiste Goujaud
The study of first-order optimization is sensitive to the assumptions made on the objective functions. These assumptions induce complexity c… (voir plus)lasses which play a key role in worst-case analysis, including the fundamental concept of algorithm optimality. Recent work argues that strong convexity and smoothness—popular assumptions in literature—lead to a pathological definition of the condition number. Motivated by this result, we focus on the class of functions satisfying a lower restricted secant inequality and an upper error bound. On top of being robust to the aforementioned pathological behavior and including some non-convex functions, this pair of conditions displays interesting geometrical properties. In particular, the necessary and sufficient conditions to interpolate a set of points and their gradients within the class can be separated into simple conditions on each sampled gradient. This allows the performance estimation problem (PEP) to be solved analytically, leading to a lower bound on the convergence rate that proves gradient descent to be exactly optimal on this class of functions among all first-order algorithms.
A Study of Condition Numbers for First-Order Optimization
Charles Guille-Escuret
Manuela Girotti
Baptiste Goujaud
A Study of Condition Numbers for First-Order Optimization
Charles Guille-Escuret
Baptiste Goujaud
Manuela Girotti
The study of first-order optimization algorithms (FOA) typically starts with assumptions on the objective functions, most commonly smoothnes… (voir plus)s and strong convexity. These metrics are used to tune the hyperparameters of FOA. We introduce a class of perturbations quantified via a new norm, called *-norm. We show that adding a small perturbation to the objective function has an equivalently small impact on the behavior of any FOA, which suggests that it should have a minor impact on the tuning of the algorithm. However, we show that smoothness and strong convexity can be heavily impacted by arbitrarily small perturbations, leading to excessively conservative tunings and convergence issues. In view of these observations, we propose a notion of continuity of the metrics, which is essential for a robust tuning strategy. Since smoothness and strong convexity are not continuous, we propose a comprehensive study of existing alternative metrics which we prove to be continuous. We describe their mutual relations and provide their guaranteed convergence rates for the Gradient Descent algorithm accordingly tuned. Finally we discuss how our work impacts the theoretical understanding of FOA and their performances.