Nicolas Le Roux

Core Industry Member
Canada CIFAR AI Chair
Adjunct Professor, McGill University, School of Computer Science
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Scientist, Microsoft Research
Research Topics
Deep Learning
Generative Models
Optimization
Reinforcement Learning

Biography

I am an academic researcher with expertise in machine learning, computer vision, neural networks, deep learning, optimization, large-scale learning and statistical modelling in general.

Current Students

PhD - Université de Montréal
Principal supervisor:
Master's Research - Université de Montréal
PhD - Université de Montréal
Co-supervisor:
Master's Research - McGill University
Co-supervisor:
Postdoctorate - Université de Montréal
Co-supervisor:

Publications

Negative eigenvalues of the Hessian in deep neural networks
Guillaume Alain
Pierre-Antoine Manzagol
The loss function of deep networks is known to be non-convex, but the precise nature of this non-convexity is still an active area of research. In this work, we study the loss landscape of deep networks through the eigendecompositions of their Hessian matrix. In particular, we examine how important the negative eigenvalues are and the benefits one can observe in handling them appropriately.
Tighter bounds lead to improved classifiers
The standard approach to supervised classification involves the minimization of a log-loss as an upper bound to the classification error. While this is a tight bound early on in the optimization, it overemphasizes the influence of incorrectly classified examples far from the decision boundary. Updating the upper bound during the optimization leads to improved classification rates while transforming the learning into a sequence of minimization problems. In addition, in the context where the classifier is part of a larger system, this modification makes it possible to link the performance of the classifier to that of the whole system, allowing the seamless introduction of external constraints.