Portrait de Mohammad Pezeshki n'est pas disponible

Mohammad Pezeshki

Collaborateur·rice de recherche
Superviseur⋅e principal⋅e

Publications

Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok
Tikeng Notsawo Pascal Junior
Pascal Notsawo
Hattie Zhou
Mohammad Pezeshki
Feedback-guided Data Synthesis for Imbalanced Classification
Reyhane Askari Hemmat
Mohammad Pezeshki
Florian Bordes
Michal Drozdzal
Current status quo in machine learning is to use static datasets of real images for training, which often come from long-tailed distribution… (voir plus)s. With the recent advances in generative models, researchers have started augmenting these static datasets with synthetic data, reporting moderate performance improvements on classification tasks. We hypothesize that these performance gains are limited by the lack of feedback from the classifier to the generative model, which would promote the usefulness of the generated samples to improve the classifier's performance. In this work, we introduce a framework for augmenting static datasets with useful synthetic samples, which leverages one-shot feedback from the classifier to drive the sampling of the generative model. In order for the framework to be effective, we find that the samples must be close to the support of the real data of the task at hand, and be sufficiently diverse. We validate three feedback criteria on a long-tailed dataset (ImageNet-LT) as well as a group-imbalanced dataset (NICO++). On ImageNet-LT, we achieve state-of-the-art results, with over 4 percent improvement on underrepresented classes while being twice efficient in terms of the number of generated synthetic samples. NICO++ also enjoys marked boosts of over 5 percent in worst group accuracy. With these results, our framework paves the path towards effectively leveraging state-of-the-art text-to-image models as data sources that can be queried to improve downstream applications.
Discovering environments with XRM
Mohammad Pezeshki
Diane Bouchacourt
Mark Ibrahim
Nicolas Ballas
David Lopez-Paz
Successful out-of-distribution generalization requires environment annotations. Unfortunately, these are resource-intensive to obtain, and t… (voir plus)heir relevance to model performance is limited by the expectations and perceptual biases of human annotators. Therefore, to enable robust AI systems across applications, we must develop algorithms to automatically discover environments inducing broad generalization. Current proposals, which divide examples based on their training error, suffer from one fundamental problem. These methods add hyper-parameters and early-stopping criteria that are impossible to tune without a validation set with human-annotated environments, the very information subject to discovery. In this paper, we propose Cross-Risk-Minimization (XRM) to address this issue. XRM trains two twin networks, each learning from one random half of the training data, while imitating confident held-out mistakes made by its sibling. XRM provides a recipe for hyper-parameter tuning, does not require early-stopping, and can discover environments for all training and validation data. Domain generalization algorithms built on top of XRM environments achieve oracle worst-group-accuracy, solving a long-standing problem in out-of-distribution generalization.
Negative Momentum for Improved Game Dynamics
Reyhane Askari Hemmat
Mohammad Pezeshki
Gabriel Huang
Rémi LE PRIOL
Games generalize the single-objective optimization paradigm by introducing different objective functions for different players. Differentiab… (voir plus)le games often proceed by simultaneous or alternating gradient updates. In machine learning, games are gaining new importance through formulations like generative adversarial networks (GANs) and actor-critic systems. However, compared to single-objective optimization, game dynamics are more complex and less understood. In this paper, we analyze gradient-based methods with momentum on simple games. We prove that alternating updates are more stable than simultaneous updates. Next, we show both theoretically and empirically that alternating gradient updates with a negative momentum term achieves convergence in a difficult toy adversarial problem, but also on the notoriously difficult to train saturating GANs.
Negative Momentum for Improved Game Dynamics
Reyhane Askari Hemmat
Mohammad Pezeshki
Gabriel Huang
Rémi LE PRIOL
Games generalize the single-objective optimization paradigm by introducing different objective functions for different players. Differentiab… (voir plus)le games often proceed by simultaneous or alternating gradient updates. In machine learning, games are gaining new importance through formulations like generative adversarial networks (GANs) and actor-critic systems. However, compared to single-objective optimization, game dynamics are more complex and less understood. In this paper, we analyze gradient-based methods with momentum on simple games. We prove that alternating updates are more stable than simultaneous updates. Next, we show both theoretically and empirically that alternating gradient updates with a negative momentum term achieves convergence in a difficult toy adversarial problem, but also on the notoriously difficult to train saturating GANs.
Theano: A Python framework for fast computation of mathematical expressions
Rami Al-rfou'
Guillaume Alain
Amjad Almahairi
Christof Angermüller
Nicolas Ballas
Frédéric Bastien
Justin S. Bayer
A. Belikov
A. Belopolsky
Arnaud Bergeron
J. Bergstra
Valentin Bisson
Josh Bleecher Snyder
Nicolas Bouchard
Nicolas Boulanger-Lewandowski
Xavier Bouthillier
Alexandre De Brébisson
Olivier Breuleux … (voir 92 de plus)
Pierre Luc Carrier
Kyunghyun Cho
Jan Chorowski
Paul F. Christiano
Tim Cooijmans
Marc-Alexandre Côté
Myriam Côté
Yann Dauphin
Olivier Delalleau
Julien Demouth
Guillaume Desjardins
Sander Dieleman
Laurent Dinh
M'elanie Ducoffe
Vincent Dumoulin
Dumitru Erhan
Ziye Fan
Orhan Firat
Mathieu Germain
Xavier Glorot
Ian J. Goodfellow
Matthew Graham
Caglar Gulcehre
Philippe Hamel
Iban Harlouchet
Jean-philippe Heng
Balázs Hidasi
Sina Honari
Arjun Jain
S'ebastien Jean
Kai Jia
Mikhail V. Korobov
Vivek Kulkarni
Alex Lamb
Pascal Lamblin
Eric P. Larsen
César Laurent
S. Lee
Simon-mark Lefrancois
Simon Lemieux
Nicholas Léonard
Zhouhan Lin
J. Livezey
Cory R. Lorenz
Jeremiah L. Lowin
Qianli M. Ma
Pierre-Antoine Manzagol
Olivier Mastropietro
R. McGibbon
Roland Memisevic
Bart Van Merrienboer
Vincent Michalski
Mehdi Mirza
Alberto Orlandi
Razvan Pascanu
Mohammad Pezeshki
Colin Raffel
Daniel Renshaw
Matthew David Rocklin
Markus Dr. Roth
Peter Sadowski
John Salvatier
Francois Savard
Jan Schlüter
John D. Schulman
Gabriel Schwartz
Iulian V. Serban
Dmitriy Serdyuk
Samira Shabanian
Etienne Simon
Sigurd Spieckermann
S. Subramanyam
Jakub Sygnowski
Jérémie Tanguay
Gijs van Tulder
Joseph P. Turian
Sebastian Urban
Francesco Visin
Harm de Vries
David Warde-Farley
Dustin J. Webb
M. Willson
Kelvin Xu
Lijun Xue
Li Yao
Saizheng Zhang
Ying Zhang
Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficie… (voir plus)ntly. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, multiple frameworks have been built on top of it and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.
Theano: A Python framework for fast computation of mathematical expressions
Rami Al-rfou'
Guillaume Alain
Amjad Almahairi
Christof Angermüller
Nicolas Ballas
Frédéric Bastien
Justin S. Bayer
A. Belikov
A. Belopolsky
Arnaud Bergeron
J. Bergstra
Valentin Bisson
Josh Bleecher Snyder
Nicolas Bouchard
Nicolas Boulanger-Lewandowski
Xavier Bouthillier
Alexandre De Brébisson
Olivier Breuleux … (voir 92 de plus)
Pierre Luc Carrier
Kyunghyun Cho
Jan Chorowski
Paul F. Christiano
Tim Cooijmans
Marc-Alexandre Côté
Myriam Côté
Yann Dauphin
Olivier Delalleau
Julien Demouth
Guillaume Desjardins
Sander Dieleman
Laurent Dinh
M'elanie Ducoffe
Vincent Dumoulin
Dumitru Erhan
Ziye Fan
Orhan Firat
Mathieu Germain
Xavier Glorot
Ian J. Goodfellow
Matthew Graham
Caglar Gulcehre
Philippe Hamel
Iban Harlouchet
Jean-philippe Heng
Balázs Hidasi
Sina Honari
Arjun Jain
S'ebastien Jean
Kai Jia
Mikhail V. Korobov
Vivek Kulkarni
Alex Lamb
Pascal Lamblin
Eric P. Larsen
César Laurent
S. Lee
Simon-mark Lefrancois
Simon Lemieux
Nicholas Léonard
Zhouhan Lin
J. Livezey
Cory R. Lorenz
Jeremiah L. Lowin
Qianli M. Ma
Pierre-Antoine Manzagol
Olivier Mastropietro
R. McGibbon
Roland Memisevic
Bart Van Merrienboer
Vincent Michalski
Mehdi Mirza
Alberto Orlandi
Razvan Pascanu
Mohammad Pezeshki
Colin Raffel
Daniel Renshaw
Matthew David Rocklin
Markus Dr. Roth
Peter Sadowski
John Salvatier
Francois Savard
Jan Schlüter
John D. Schulman
Gabriel Schwartz
Iulian V. Serban
Dmitriy Serdyuk
Samira Shabanian
Etienne Simon
Sigurd Spieckermann
S. Subramanyam
Jakub Sygnowski
Jérémie Tanguay
Gijs van Tulder
Joseph P. Turian
Sebastian Urban
Francesco Visin
Harm de Vries
David Warde-Farley
Dustin J. Webb
M. Willson
Kelvin Xu
Lijun Xue
Li Yao
Saizheng Zhang
Ying Zhang
Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficie… (voir plus)ntly. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, multiple frameworks have been built on top of it and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.