Aaron Courville

Razvan Ciuca

Maîtrise recherche - Université de Montréal

Alexandre Diz Ganito

Maîtrise recherche - UdeM

Juan Duque

Doctorat - UdeM

Doctorat - UdeM

Doctorat - UdeM

Uday Kapur

Maîtrise professionnelle - UdeM

Amr Khalifa

Doctorat - UdeM

andrei.nicolicioiu@gmail.com

Samuel Lavoie

Doctorat - UdeM

Zhixuan Lin

Doctorat - UdeM

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Doctorat - UdeM

Co-superviseur⋅e :

Rishabh Agarwal

Andrei Nicolicioiu

Doctorat - UdeM

Site web

Google Scholar

Evgenii Nikishin

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Doctorat - UdeM

Co-superviseur⋅e :

Johan Samir Obando Ceron

Doctorat - UdeM

Co-superviseur⋅e :

Maîtrise recherche - UdeM

pichedereck@gmail.com

Site web

Esra'a Saleh

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Maîtrise recherche - UdeM

Superviseur⋅e principal⋅e :

Anna (Cheng-Zhi) Huang

Shawn Tan

Doctorat - UdeM

Doctorat - UdeM

Superviseur⋅e principal⋅e :

(Rex) Devon Hjelm

Google Scholar

Yusong Wu

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Anna (Cheng-Zhi) Huang

Dinghuai Zhang

Doctorat - UdeM

Co-superviseur⋅e :

Doctorat - UdeM

Hattie Zhou

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Hugo Larochelle

Publications

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization

Aviral Kumar

Rishabh Agarwal

Tengyu Ma

George Tucker

Sergey Levine

Despite overparameterization, deep networks trained via supervised learning are surprisingly easy to optimize and exhibit excellent generali… (voir plus)zation. One hypothesis to explain this is that overparameterized deep networks enjoy the benefits of implicit regularization induced by stochastic gradient descent, which favors parsimonious solutions that generalize well on test inputs. It is reasonable to surmise that deep reinforcement learning (RL) methods could also benefit from this effect. In this paper, we discuss how the implicit regularization effect of SGD seen in supervised learning could in fact be harmful in the offline deep RL setting, leading to poor generalization and degenerate feature representations. Our theoretical analysis shows that when existing models of implicit regularization are applied to temporal difference learning, the resulting derived regularizer favors degenerate solutions with excessive aliasing, in stark contrast to the supervised learning case. We back up these findings empirically, showing that feature representations learned by a deep network value function trained via bootstrapping can indeed become degenerate, aliasing the representations for state-action pairs that appear on either side of the Bellman backup. To address this issue, we derive the form of this implicit regularizer and, inspired by this derivation, propose a simple and effective explicit regularizer, called DR3, that counteracts the undesirable effects of this implicit regularizer. When combined with existing offline RL methods, DR3 substantially improves performance and stability, alleviating unlearning in Atari 2600 games, D4RL domains, and robotic manipulation from images.

2022-01-28

ICLR.cc/2022/Conference (spotlight)

Fortuitous Forgetting in Connectionist Networks

Hattie Zhou

Ankit Vani

Hugo Larochelle

Forgetting is often seen as an unwanted characteristic in both human and machine learning. However, we propose that forgetting can in fact b… (voir plus)e favorable to learning. We introduce"forget-and-relearn"as a powerful paradigm for shaping the learning trajectories of artificial neural networks. In this process, the forgetting step selectively removes undesirable information from the model, and the relearning step reinforces features that are consistently useful under different conditions. The forget-and-relearn framework unifies many existing iterative training algorithms in the image classification and language emergence literature, and allows us to understand the success of these algorithms in terms of the disproportionate forgetting of undesirable information. We leverage this understanding to improve upon existing algorithms by designing more targeted forgetting operations. Insights from our analysis provide a coherent view on the dynamics of iterative training in neural networks and offer a clear path towards performance improvements.

2022-01-28

ICLR.cc/2022/Conference (poster)

MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Yusong Wu

Ethan Manilow

Yi Deng

Rigel Swavely

Kyle Kastner

Tim Cooijmans

Anna (Cheng-Zhi) Huang

Jesse Engel

Musical expression requires control of both what notes are played, and how they are performed. Conventional audio synthesizers provide detai… (voir plus)led expressive controls, but at the cost of realism. Black-box neural audio synthesis and concatenative samplers can produce realistic audio, but have few mechanisms for control. In this work, we introduce MIDI-DDSP a hierarchical model of musical instruments that enables both realistic neural audio synthesis and detailed user control. Starting from interpretable Differentiable Digital Signal Processing (DDSP) synthesis parameters, we infer musical notes and high-level properties of their expressive performance (such as timbre, vibrato, dynamics, and articulation). This creates a 3-level hierarchy (notes, performance, synthesis) that affords individuals the option to intervene at each level, or utilize trained priors (performance given notes, synthesis given performance) for creative assistance. Through quantitative experiments and listening tests, we demonstrate that this hierarchy can reconstruct high-fidelity audio, accurately predict performance attributes for a note sequence, independently manipulate the attributes of a given performance, and as a complete system, generate realistic audio from a novel note sequence. By utilizing an interpretable hierarchy, with multiple levels of granularity, MIDI-DDSP opens the door to assistive tools to empower individuals across a diverse range of musical experience.

2022-01-28

ICLR.cc/2022/Conference (présentation orale)

Invariant representation driven neural classifier for anti-QCD jet tagging

Taoli Cheng

2022-01-18

ArXiv (preprint)

doi.org

arxiv.org

Chunked Autoregressive GAN for Conditional Waveform Synthesis

Max Morrison

Rithesh Kumar

Kundan Kumar

Prem Seetharaman

Yoshua Bengio

2022-01-01

International Conference on Learning Representations (publié)

Consistency-CAM: Towards Improved Weakly Supervised Semantic Segmentation.

Sai Rajeswar

Issam Hadj Laradji

Pau Rodriguez

David Vazquez

2022-01-01

BMVC (publié)

dblp.uni-trier.de

Learning to Dequantise with Truncated Flows

Shawn Tan

Chin-Wei Huang

Alessandro Sordoni

Dequantisation is a general technique used for transforming data described by a discrete random variable x into a continuous (latent) random… (voir plus) variable z, for the purpose of it being modeled by likelihood-based density models. Dequantisation was first introduced in the context of ordinal data, such as image pixel values. However, when the data is categorical, the dequantisation scheme is not obvious. We learn such a dequantisation scheme q(z|x), using variational inference with TRUncated FLows (TRUFL) — a novel flow-based model that allows the dequantiser to have a learnable truncated support. Unlike previous work, the TRUFL dequantiser is (i) capable of embedding the data losslessly in certain cases, since the truncation allows the conditional distributions q(z|x) to have non-overlapping bounded supports, while being (ii) trainable with back-propagation. Addtionally, since the support of the marginal q(z) is bounded and the support of prior p(z) is not, we propose to renormalise the prior distribution over the support of q(z). We derive a lower bound for training, and propose a rejection sampling scheme to account for the invalid samples. Experimentally, we benchmark TRUFL on constrained generation tasks, and find that it outperforms prior approaches. In addition, we find that rejection sampling results in higher validity for the constrained problems.

2022-01-01

International Conference on Learning Representations (publié)

Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress

Rishabh Agarwal

Max Schwarzer

Pablo Samuel Castro

Marc Gendron-Bellemare

Riemannian Diffusion Models

Chin-Wei Huang

Milad Aghajohari

Joey Bose

Prakash Panangaden

Diffusion models are recent state-of-the-art methods for image generation and likelihood estimation. In this work, we generalize continuous-… (voir plus)time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood estimation. Computationally, we propose new methods for computing the Riemannian divergence which is needed for likelihood estimation. Moreover, in generalizing the Euclidean case, we prove that maximizing this variational lower-bound is equivalent to Riemannian score matching. Empirically, we demonstrate the expressive power of Riemannian diffusion models on a wide spectrum of smooth manifolds, such as spheres, tori, hyperboloids, and orthogonal groups. Our proposed method achieves new state-of-the-art likelihoods on all benchmarks.

Unifying Likelihood-free Inference with Black-box Optimization and Beyond

Dinghuai Zhang

Jie Fu

Yoshua Bengio

Black-box optimization formulations for biological sequence design have drawn recent attention due to their promising potential impact on th… (voir plus)e pharmaceutical industry. In this work, we propose to unify two seemingly distinct worlds: likelihood-free inference and black-box optimization, under one probabilistic framework. In tandem, we provide a recipe for constructing various sequence design methods based on this framework. We show how previous optimization approaches can be"reinvented"in our framework, and further propose new probabilistic black-box optimization algorithms. Extensive experiments on sequence design application illustrate the benefits of the proposed methodology.

2022-01-01

ICLR (publié)

Unsupervised Dependency Graph Network

Yikang Shen

Shawn Tan

Alessandro Sordoni

Peng Li

Jie Zhou

Recent work has identified properties of pretrained self-attention models that mirror those of dependency parse structures. In particular, s… (voir plus)ome self-attention heads correspond well to individual dependency types. Inspired by these developments, we propose a new competitive mechanism that encourages these attention heads to model different dependency relations. We introduce a new model, the Unsupervised Dependency Graph Network (UDGN), that can induce dependency structures from raw corpora and the masked language modeling task. Experiment results show that UDGN achieves very strong unsupervised dependency parsing performance without gold POS tags and any other external information. The competitive gated heads show a strong correlation with human-annotated dependency types. Furthermore, the UDGN can also achieve competitive performance on masked language modeling and sentence textual similarity tasks.

2022-01-01

Annual Meeting of the Association for Computational Linguistics (published)

doi.org

Multi-label Iterated Learning for Image Classification with Label Ambiguity

Sai Rajeswar

Pau Rodriguez

Soumye Singhal

David Vazquez