Nicolas Le Roux

Core Industry Member
Canada CIFAR AI Chair
Adjunct Professor, McGill University, School of Computer Science
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Scientist, Microsoft Research
Research Topics
Reinforcement Learning
Deep Learning
Generative Models
Optimization

Biography

I am an academic researcher specializing in machine learning, computer vision, neural networks, deep learning, optimization, large-scale learning, and statistical modeling in general.

Current Students

Research Master's - UdeM
PhD - UdeM
Co-supervisor:
Research Master's - McGill
Co-supervisor:
PhD - McGill
Principal supervisor:

Publications

A Geometric Perspective on Optimal Representations for Reinforcement Learning
Will Dabney
Robert Dadashi
Adrien Ali Taiga
Dale Schuurmans
Tor Lattimore
Clare Lyle
We propose a new perspective on representation learning in reinforcement learning based on geometric properties of the space of value functions. We leverage this perspective to provide formal evidence regarding the usefulness of value functions as auxiliary tasks. Our formulation considers adapting the representation to minimize the (linear) approximation of the value function of all stationary policies for a given environment. We show that this optimization reduces to making accurate predictions regarding a special class of value functions which we call adversarial value functions (AVFs). We demonstrate that using value functions as auxiliary tasks corresponds to an expected-error relaxation of our formulation, with AVFs a natural candidate, and identify a close relationship with proto-value functions (Mahadevan, 2005). We highlight characteristics of AVFs and their usefulness as auxiliary tasks in a series of experiments on the four-room domain.
Reducing the variance in online optimization by transporting past gradients
Sébastien M. R. Arnold
Pierre-Antoine Manzagol
Reza Babanezhad Harikandeh
Most stochastic optimization methods use gradients once before discarding them. While variance reduction methods have shown that reusing past gradients can be beneficial when there is a finite number of datapoints, they do not easily extend to the online setting. One issue is the staleness due to using past gradients. We propose to correct this staleness using the idea of implicit gradient transport (IGT) which transforms gradients computed at previous iterates into gradients evaluated at the current iterate without using the Hessian explicitly. In addition to reducing the variance and bias of our updates over time, IGT can be used as a drop-in replacement for the gradient estimate in a number of well-understood methods such as heavy ball or Adam. We show experimentally that it achieves state-of-the-art results on a wide range of architectures and benchmarks. Additionally, the IGT gradient estimator yields the optimal asymptotic convergence rate for online stochastic optimization in the restricted setting where the Hessians of all component functions are equal.
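The transport idea in this abstract can be illustrated with a minimal sketch, assuming a one-dimensional problem in which every component function is a quadratic with the same curvature `h` (the restricted equal-Hessian setting mentioned above). The function name `igt_quadratic`, the step size, and the toy data are illustrative assumptions and are not taken from the authors' code. The update used here, evaluating each new gradient at the extrapolated point theta_t + t * (theta_t - theta_{t-1}) and averaging it with weight 1 / (t + 1), is the form of the IGT estimator as understood here.

```python
import random


def igt_quadratic(centers, h=2.0, lr=0.1, theta0=5.0):
    """Run an IGT-style update on f_i(theta) = 0.5 * h * (theta - c_i)**2.

    One center c_i is consumed per step (online setting). Returns a list of
    (theta_t, v_t) pairs, where v_t is the gradient estimate used at step t.
    """
    theta_prev = theta = theta0
    v = 0.0
    history = []
    for t, c in enumerate(centers):
        gamma = t / (t + 1)  # weight kept on the running estimate
        # Evaluate the fresh gradient at an extrapolated point so that the
        # running average behaves as if old gradients were taken at theta.
        shifted = theta + t * (theta - theta_prev)
        grad = h * (shifted - c)
        v = gamma * v + (1.0 - gamma) * grad
        history.append((theta, v))
        theta_prev, theta = theta, theta - lr * v
    return history


rng = random.Random(0)  # fixed seed so the toy run is reproducible
centers = [rng.gauss(1.0, 1.0) for _ in range(50)]
history = igt_quadratic(centers)
```

In this toy setting the estimate `v` at step t coincides, up to floating-point error, with the average of all gradients seen so far re-evaluated at the current iterate, which is the staleness correction the abstract describes. No Hessian is formed at any point.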
Understanding the impact of entropy in policy learning
Zafarali Ahmed
Mohammad Norouzi
Dale Schuurmans
Entropy regularization is commonly used to improve policy optimization in reinforcement learning. It is believed to help with exploration by encouraging the selection of more stochastic policies. In this work, we analyze this claim using new visualizations of the optimization landscape based on randomly perturbing the loss function. We first show that even with access to the exact gradient, policy optimization is difficult due to the geometry of the objective function. Then, we qualitatively show that in some environments, a policy with higher entropy can make the optimization landscape smoother, thereby connecting local optima and enabling the use of larger learning rates. This paper presents new tools for understanding the optimization landscape, shows that policy entropy serves as a regularizer, and highlights the challenge of designing general-purpose policy optimization algorithms.
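As a rough illustration of what entropy regularization means, the sketch below computes an entropy-regularized objective for a softmax policy over a handful of actions. It assumes a one-step (bandit-style) problem with known rewards, which is much simpler than the environments studied in the paper; the function names and the coefficient `tau` are hypothetical choices made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability actions contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularized_objective(logits, rewards, tau):
    """Expected reward of the softmax policy plus tau times its entropy."""
    probs = softmax(logits)
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    return expected_reward + tau * entropy(probs)


rewards = [1.0, 0.0, 0.0, 0.0]
uniform_value = regularized_objective([0.0, 0.0, 0.0, 0.0], rewards, tau=0.1)
greedy_value = regularized_objective([20.0, 0.0, 0.0, 0.0], rewards, tau=0.1)
```

With `tau = 0` the objective is the plain expected reward; increasing `tau` rewards more stochastic policies, which is the mechanism whose effect on the optimization landscape the abstract examines.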
Combining adaptive algorithms and hypergradient method: a performance and robustness study
Akram Erraqabi
Online Hyper-Parameter Optimization
Damien Vincent
Sylvain Gelly
Olivier Bousquet
Online variance-reducing optimization
Reza Babanezhad Harikandeh
Pierre-Antoine Manzagol
Negative eigenvalues of the Hessian in deep neural networks
Guillaume Alain
Pierre-Antoine Manzagol
The loss function of deep networks is known to be non-convex but the precise nature of this nonconvexity is still an active area of research. In this work, we study the loss landscape of deep networks through the eigendecompositions of their Hessian matrix. In particular, we examine how important the negative eigenvalues are and the benefits one can observe in handling them appropriately.
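To make the notion of a negative Hessian eigenvalue concrete, here is a small sketch on a toy loss, assuming a "network" with two scalar weights whose product should equal 1. This toy problem, the finite-difference Hessian, and all function names are assumptions chosen for illustration; the paper itself studies real deep networks.

```python
import math


def loss(w1, w2):
    """Squared error of a two-weight linear chain fitting the target 1."""
    return (w1 * w2 - 1.0) ** 2


def hessian_2d(f, x, y, eps=1e-4):
    """Central finite-difference Hessian of f at (x, y): (fxx, fxy, fyy)."""
    fxx = (f(x + eps, y) - 2.0 * f(x, y) + f(x - eps, y)) / eps**2
    fyy = (f(x, y + eps) - 2.0 * f(x, y) + f(x, y - eps)) / eps**2
    fxy = (
        f(x + eps, y + eps) - f(x + eps, y - eps)
        - f(x - eps, y + eps) + f(x - eps, y - eps)
    ) / (4.0 * eps**2)
    return fxx, fxy, fyy


def symmetric_eigenvalues(a, b, d):
    """Eigenvalues (smallest first) of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


# The origin is a stationary point of this loss, but not a minimum.
fxx, fxy, fyy = hessian_2d(loss, 0.0, 0.0)
lowest, highest = symmetric_eigenvalues(fxx, fxy, fyy)
```

At the origin the Hessian is approximately [[0, -2], [-2, 0]], so one eigenvalue is negative and the point is a saddle: the loss decreases along the corresponding eigenvector.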
Bounds lead to improved classifiers
The standard approach to supervised classification involves the minimization of a log-loss as an upper bound to the classification error. While this is a tight bound early on in the optimization, it overemphasizes the influence of incorrectly classified examples far from the decision boundary. Updating the upper bound during the optimization leads to improved classification rates while transforming the learning into a sequence of minimization problems. In addition, in the context where the classifier is part of a larger system, this modification makes it possible to link the performance of the classifier to that of the whole system, allowing the seamless introduction of external constraints.
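The first two sentences of this abstract can be checked numerically with a short sketch, assuming a binary classifier summarized by its signed margin (positive when the example is classified correctly). The base-2 logistic loss used here is one standard way to write the log-loss so that it upper-bounds the 0-1 error; the function names are illustrative, and the sketch does not implement the paper's updated bound.

```python
import math


def log_loss_bits(margin):
    """Base-2 logistic loss log2(1 + exp(-margin)), computed stably."""
    x = -margin
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return softplus / math.log(2.0)


def zero_one_error(margin):
    """Classification error: 1 when the margin is not strictly positive."""
    return 1.0 if margin <= 0.0 else 0.0


margins = [-10.0, -1.0, 0.0, 1.0, 10.0]
table = [(m, zero_one_error(m), log_loss_bits(m)) for m in margins]
```

The loss sits at or above the 0-1 error for every margin and meets it at the decision boundary, but for a badly misclassified example (margin -10) it grows roughly linearly with the distance, while the error itself stays at 1. That gap is the overemphasis the abstract refers to.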