Publications

ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods

Victor Schmidt

Alexandra Luccioni

Mélisande Teng

Tianyu Zhang

Alexia Reynaud

Sunand Raghupathi

Gautier Cosne

Adrien Juraver

Vahe Vardanyan

Alex Hernández-García

Yoshua Bengio

Climate change is a major threat to humanity, and the actions required to prevent its catastrophic consequences include changes in both policy-making and individual behaviour. However, taking action requires understanding the effects of climate change, even though they may seem abstract and distant. Projecting the potential consequences of extreme climate events such as flooding in familiar places can help make the abstract impacts of climate change more concrete and encourage action. As part of a larger initiative to build a website that projects extreme climate events onto user-chosen photos, we present our solution to simulate photo-realistic floods on authentic images. To address this complex task in the absence of suitable training data, we propose ClimateGAN, a model that leverages both simulated and real data for unsupervised domain adaptation and conditional image generation. In this paper, we describe the details of our framework, thoroughly evaluate components of our architecture and demonstrate that our model is capable of robustly generating photo-realistic flooding.

2022-01-27

ICLR.cc/2022/Conference (poster)

doi.org

openreview.net

Conditional Image Generation by Conditioning Variational Auto-Encoders

William Harvey

Saeid Naderiparizi

Frank N. Wood

We present a conditional variational auto-encoder (VAE) which, to avoid the substantial cost of training from scratch, uses an architecture and training objective capable of leveraging a foundation model in the form of a pretrained unconditional VAE. To train the conditional VAE, we only need to train an artifact to perform amortized inference over the unconditional VAE's latent variables given a conditioning input. We demonstrate our approach on tasks including image inpainting, for which it outperforms state-of-the-art GAN-based approaches at faithfully representing the inherent uncertainty. We conclude by describing a possible application of our inpainting model, in which it is used to perform Bayesian experimental design for the purpose of guiding a sensor.

2022-01-27

ICLR.cc/2022/Conference (poster)

openreview.net

Constructing a Good Behavior Basis for Transfer Using Generalized Policy Updates

Safa Alver

Doina Precup

We study the problem of learning a good set of policies, so that when combined together, they can solve a wide variety of unseen reinforcement learning tasks with no or very little new data. Specifically, we consider the framework of generalized policy evaluation and improvement, in which the rewards for all tasks of interest are assumed to be expressible as a linear combination of a fixed set of features. We show theoretically that, under certain assumptions, having access to a specific set of diverse policies, which we call a set of independent policies, can allow for instantaneously achieving high-level performance on all possible downstream tasks which are typically more complex than the ones on which the agent was trained. Based on this theoretical analysis, we propose a simple algorithm that iteratively constructs this set of policies. In addition to empirically validating our theoretical results, we compare our approach with recently proposed diverse policy set construction methods and show that, while others fail, our approach is able to build a behavior basis that enables instantaneous transfer to all possible downstream tasks. We also show empirically that having access to a set of independent policies can better bootstrap the learning process on downstream tasks where the new reward function cannot be described as a linear combination of the features. Finally, we demonstrate how this policy set can be useful in a lifelong reinforcement learning setting.

2022-01-27

ICLR.cc/2022/Conference (poster)

doi.org

openreview.net

Continuous-Time Meta-Learning with Forward Mode Differentiation

Drawing inspiration from gradient-based meta-learning methods with infinitely small gradient steps, we introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field. Specifically, representations of the inputs are meta-learned such that a task-specific linear classifier is obtained as a solution of an ordinary differential equation (ODE). Treating the learning process as an ODE offers the notable advantage that the length of the trajectory is now continuous, as opposed to a fixed and discrete number of gradient steps. As a consequence, we can optimize the amount of adaptation necessary to solve a new task using stochastic gradient descent, in addition to learning the initial conditions as is standard practice in gradient-based meta-learning. Importantly, in order to compute the exact meta-gradients required for the outer-loop updates, we devise an efficient algorithm based on forward mode differentiation, whose memory requirements do not scale with the length of the learning trajectory, thus allowing longer adaptation in constant memory. We provide analytical guarantees for the stability of COMLN, we show empirically its efficiency in terms of runtime and memory usage, and we illustrate its effectiveness on a range of few-shot image classification problems.

2022-01-27

ICLR.cc/2022/Conference (spotlight)

doi.org

openreview.net

Coordination Among Neural Modules Through a Shared Global Workspace

Nan Rosemary Ke

Nasim Rahaman

Jonathan Binas

Charles Blundell

Michael Mozer

Yoshua Bengio

Deep learning has seen a movement away from representing examples with a monolithic hidden state towards a richly structured state. For example, Transformers segment by position, and object-centric architectures decompose images into entities. In all these architectures, interactions between different elements are modeled via pairwise interactions: Transformers make use of self-attention to incorporate information from other positions; object-centric architectures make use of graph neural networks to model interactions among entities. However, pairwise interactions may not achieve global coordination or a coherent, integrated representation that can be used for downstream tasks. In cognitive science, a global workspace architecture has been proposed in which functionally specialized components share information through a common, bandwidth-limited communication channel. We explore the use of such a communication channel in the context of deep learning for modeling the structure of complex environments. The proposed method includes a shared workspace through which communication among different specialist modules takes place but due to limits on the communication bandwidth, specialist modules must compete for access. We show that capacity limitations have a rational basis in that (1) they encourage specialization and compositionality and (2) they facilitate the synchronization of otherwise independent specialists.

2022-01-27

ICLR.cc/2022/Conference (oral)

doi.org

openreview.net

COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation

Jongmin Lee

Cosmin Paduraru

Daniel J Mankowitz

Nicolas Heess

Doina Precup

Kee-Eung Kim

Arthur Guez

We consider the offline constrained reinforcement learning (RL) problem, in which the agent aims to compute a policy that maximizes expected return while satisfying given cost constraints, learning only from a pre-collected dataset. This problem setting is appealing in many real-world scenarios, where direct interaction with the environment is costly or risky, and where the resulting policy should comply with safety constraints. However, it is challenging to compute a policy that guarantees satisfying the cost constraints in the offline RL setting, since the off-policy evaluation inherently has an estimation error. In this paper, we present an offline constrained RL algorithm that optimizes the policy in the space of the stationary distribution. Our algorithm, COptiDICE, directly estimates the stationary distribution corrections of the optimal policy with respect to returns, while constraining the cost upper bound, with the goal of yielding a cost-conservative policy for actual constraint satisfaction. Experimental results show that COptiDICE attains better policies in terms of constraint satisfaction and return-maximization, outperforming baseline algorithms.

2022-01-27

ICLR.cc/2022/Conference (spotlight)

doi.org

openreview.net

Deep ReLU Networks Preserve Expected Length

Boris Hanin

Ryan Jeong

David Rolnick

Assessing the complexity of functions computed by a neural network helps us understand how the network will learn and generalize. One natural measure of complexity is how the network distorts length - if the network takes a unit-length curve as input, what is the length of the resulting curve of outputs? It has been widely believed that this length grows exponentially in network depth. We prove that in fact this is not the case: the expected length distortion does not grow with depth, and indeed shrinks slightly, for ReLU networks with standard random initialization. We also generalize this result by proving upper bounds both for higher moments of the length distortion and for the distortion of higher-dimensional volumes. These theoretical results are corroborated by our experiments.

2022-01-27

ICLR.cc/2022/Conference (poster)

openreview.net

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization

Aviral Kumar

Rishabh Agarwal

Tengyu Ma

Aaron Courville

George Tucker

Sergey Levine

Despite overparameterization, deep networks trained via supervised learning are surprisingly easy to optimize and exhibit excellent generalization. One hypothesis to explain this is that overparameterized deep networks enjoy the benefits of implicit regularization induced by stochastic gradient descent, which favors parsimonious solutions that generalize well on test inputs. It is reasonable to surmise that deep reinforcement learning (RL) methods could also benefit from this effect. In this paper, we discuss how the implicit regularization effect of SGD seen in supervised learning could in fact be harmful in the offline deep RL setting, leading to poor generalization and degenerate feature representations. Our theoretical analysis shows that when existing models of implicit regularization are applied to temporal difference learning, the resulting derived regularizer favors degenerate solutions with excessive aliasing, in stark contrast to the supervised learning case. We back up these findings empirically, showing that feature representations learned by a deep network value function trained via bootstrapping can indeed become degenerate, aliasing the representations for state-action pairs that appear on either side of the Bellman backup. To address this issue, we derive the form of this implicit regularizer and, inspired by this derivation, propose a simple and effective explicit regularizer, called DR3, that counteracts the undesirable effects of this implicit regularizer. When combined with existing offline RL methods, DR3 substantially improves performance and stability, alleviating unlearning in Atari 2600 games, D4RL domains, and robotic manipulation from images.

2022-01-27

ICLR.cc/2022/Conference (spotlight)

openreview.net

Fortuitous Forgetting in Connectionist Networks

Forgetting is often seen as an unwanted characteristic in both human and machine learning. However, we propose that forgetting can in fact be favorable to learning. We introduce "forget-and-relearn" as a powerful paradigm for shaping the learning trajectories of artificial neural networks. In this process, the forgetting step selectively removes undesirable information from the model, and the relearning step reinforces features that are consistently useful under different conditions. The forget-and-relearn framework unifies many existing iterative training algorithms in the image classification and language emergence literature, and allows us to understand the success of these algorithms in terms of the disproportionate forgetting of undesirable information. We leverage this understanding to improve upon existing algorithms by designing more targeted forgetting operations. Insights from our analysis provide a coherent view on the dynamics of iterative training in neural networks and offer a clear path towards performance improvements.

2022-01-27

ICLR.cc/2022/Conference (poster)

doi.org

openreview.net

Graph Neural Networks with Learnable Structural and Positional Representations

Vijay Prakash Dwivedi

Anh Tuan Luu

Thomas Laurent

Yoshua Bengio

Xavier Bresson

Graph neural networks (GNNs) have become the standard learning architectures for graphs. GNNs have been applied to numerous domains ranging from quantum chemistry, recommender systems to knowledge graphs and natural language processing. A major issue with arbitrary graphs is the absence of canonical positional information of nodes, which decreases the representation power of GNNs to distinguish e.g. isomorphic nodes and other graph symmetries. An approach to tackle this issue is to introduce Positional Encoding (PE) of nodes, and inject it into the input layer, like in Transformers. Possible graph PE are Laplacian eigenvectors. In this work, we propose to decouple structural and positional representations to make easy for the network to learn these two essential properties. We introduce a novel generic architecture which we call LSPE (Learnable Structural and Positional Encodings). We investigate several sparse and fully-connected (Transformer-like) GNNs, and observe a performance increase for molecular datasets, from 1.79% up to 64.14% when considering learnable PE for both GNN classes.

2022-01-27

ICLR.cc/2022/Conference (poster)

doi.org

openreview.net

Learning to Guide and to Be Guided in the Architect-Builder Problem

Paul Barde

Tristan Karch

Derek Nowrouzezahrai

Clément Moulin-Frier

Christopher Pal

Pierre-Yves Oudeyer

We are interested in interactive agents that learn to coordinate, namely, a …

2022-01-27

ICLR.cc/2022/Conference (poster)

doi.org

openreview.net

Medical Doctors in Health Reforms

Jean-Louis Denis

Sabrina Germain

Catherine Régis

Gianluca Veronesi

Health and legal experts from England and Canada consider the influence of medical doctors on reforms in this comparative study. With reflections on participation since the inception of publicly funded healthcare systems, they show how the status of doctors affects change.

2022-01-27

(published)

doi.org

