Publications

Curriculum in Gradient-Based Meta-Reinforcement Learning

Bhairav Mehta

Christopher Pal

Gradient-based meta-learners such as Model-Agnostic Meta-Learning (MAML) have shown strong few-shot performance in supervised and reinforcement learning settings. However, specifically in the case of meta-reinforcement learning (meta-RL), we can show that gradient-based meta-learners are sensitive to task distributions. With the wrong curriculum, agents suffer the effects of meta-overfitting, shallow adaptation, and adaptation instability. In this work, we begin by highlighting intriguing failure cases of gradient-based meta-RL and show that task distributions can wildly affect algorithmic outputs, stability, and performance. To address this problem, we leverage insights from recent literature on domain randomization and propose meta Active Domain Randomization (meta-ADR), which learns a curriculum of tasks for gradient-based meta-RL in a similar way as ADR does for sim2real transfer. We show that this approach induces more stable policies on a variety of simulated locomotion and navigation tasks. We assess in- and out-of-distribution generalization and find that the learned task distributions, even in an unstructured task space, greatly improve the adaptation performance of MAML. Finally, we motivate the need for better benchmarking in meta-RL that prioritizes generalization over single-task adaptation performance.

2020-02-18

ArXiv (preprint)

arxiv.org

The Geometry of Sign Gradient Descent

Lukas Balles

Fabian Pedregosa

Nicolas Roux

2020-02-18

ArXiv (preprint)

openreview.net

Augmented Normalizing Flows: Bridging the Gap Between Generative Flows and Latent Variable Models

Chin-wei Huang

Laurent Dinh

Aaron Courville

In this work, we propose a new family of generative flows on an augmented data space, with an aim to improve expressivity without drastically increasing the computational cost of sampling and evaluation of a lower bound on the likelihood. Theoretically, we prove the proposed flow can approximate a Hamiltonian ODE as a universal transport map. Empirically, we demonstrate state-of-the-art performance on standard benchmarks of flow-based generative modeling.

2020-02-16

ArXiv (preprint)

arxiv.org

HighRes-net: Recursive Fusion for Multi-Frame Super-Resolution of Satellite Imagery

Michel Deudon

Alfredo Kalaitzis

Israel Goytom

Md Rifat Arefin

Zhichao Lin

Kris Sankaran

Vincent Michalski

S Ebrahimi Kahou

Julien Cornebise

Yoshua Bengio

Generative deep learning has sparked a new wave of Super-Resolution (SR) algorithms that enhance single images with impressive aesthetic results, albeit with imaginary details. Multi-frame Super-Resolution (MFSR) offers a more grounded approach to the ill-posed problem, by conditioning on multiple low-resolution views. This is important for satellite monitoring of human impact on the planet, from deforestation to human rights violations, which depends on reliable imagery. To this end, we present HighRes-net, the first deep learning approach to MFSR that learns its sub-tasks in an end-to-end fashion: (i) co-registration, (ii) fusion, (iii) up-sampling, and (iv) registration-at-the-loss. Co-registration of low-resolution views is learned implicitly through a reference-frame channel, with no explicit registration mechanism. We learn a global fusion operator that is applied recursively on an arbitrary number of low-resolution pairs. We introduce a registered loss, by learning to align the SR output to a ground-truth through ShiftNet. We show that by learning deep representations of multiple views, we can super-resolve low-resolution signals and enhance Earth Observation data at scale. Our approach recently topped the European Space Agency's MFSR competition on real-world satellite imagery.

2020-02-14

ArXiv (preprint)

arxiv.org

Minimax Theorem for Latent Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets

Gauthier Gidel

D. Balduzzi

Wojciech M. Czarnecki

M. Garnelo

Yoram Bachrach

Adversarial training, a special case of multi-objective optimization, is an increasingly useful tool in machine learning. For example, two-player zero-sum games are important for generative modeling (GANs) and for mastering games like Go or Poker via self-play. A classic result in Game Theory states that one must mix strategies, as pure equilibria may not exist. Surprisingly, machine learning practitioners typically train a single pair of agents, instead of a pair of mixtures, going against Nash's principle. Our main contribution is a notion of limited-capacity-equilibrium for which, as capacity grows, optimal agents (not mixtures) can learn increasingly expressive and realistic behaviors. We define latent games, a new class of game where agents are mappings that transform latent distributions. Examples include generators in GANs, which transform Gaussian noise into distributions on images, and StarCraft II agents, which transform sampled build orders into policies. We show that minimax equilibria in latent games can be approximated by a single pair of dense neural networks. Finally, we apply our latent game approach to solve differentiable Blotto, a game with an infinite strategy space.

2020-02-13

ArXiv (preprint)

arxiv.org

Saliency Enhancement using Gradient Domain Edges Merging

Dominique Beaini

Sofiane Wozniak Achiche

Alexandre Duperre

Maxime Raison

In recent years, there has been rapid progress in solving binary problems in computer vision, such as edge detection, which finds the boundaries in an image, and salient object detection, which finds the important object in an image. This progress happened thanks to the rise of deep learning and convolutional neural networks (CNNs), which allow the extraction of complex and abstract features. However, edge detection and saliency are still two different fields that do not interact, although it is intuitive for a human to detect salient objects based on their boundaries. These features are not well merged in a CNN because edges and surfaces do not intersect: one feature represents a region, while the other represents boundaries between different regions. In the current work, the main objective is to develop a method to merge the edges with the saliency maps to improve saliency performance. Hence, we developed gradient-domain merging (GDM), which can be used to quickly combine the image-domain information of salient object detection with the gradient-domain information of edge detection. This leads to our proposed saliency enhancement using edges (SEE), with an average improvement of the F-measure at least 3.4 times higher on the DUT-OMRON dataset and 6.6 times higher on the ECSSD dataset when compared to competing algorithms such as denseCRF and BGOF. The SEE algorithm is split into two parts: SEE-Pre for preprocessing and SEE-Post for postprocessing.

2020-02-10

ArXiv (preprint)

arxiv.org

Modeling Cloud Reflectance Fields using Conditional Generative Adversarial Networks

Mustafa Alghali

We introduce a conditional Generative Adversarial Network (cGAN) approach to generate cloud reflectance fields (CRFs) conditioned on large-scale meteorological variables such as sea surface temperature and relative humidity. We show that our trained model can generate realistic CRFs from the corresponding meteorological observations, which represents a step towards a data-driven framework for stochastic cloud parameterization.

2020-02-09

ArXiv (preprint)

arxiv.org

Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces

Thang Doan

We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy on a reproducing kernel Hilbert space (RKHS). The use of RKHS-based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly embedded in a low-dimensional space while the embedded policy incurs almost no decrease in return.

2020-02-06

(published)

doi.org

arxiv.org

Structural Inductive Biases in Emergent Communication

Agnieszka Słowik

Abhinav Gupta

William L. Hamilton

Mateja Jamnik

Sean B. Holden

Christopher Pal

In order to communicate, humans flatten a complex representation of ideas and their attributes into a single word or a sentence. We investigate the impact of representation learning in artificial agents by developing graph referential games. We empirically show that agents parametrized by graph neural networks develop a more compositional language compared to bag-of-words and sequence models, which allows them to systematically generalize to new combinations of familiar features.

2020-02-03

arXiv.org (preprint)

doi.org

arxiv.org

Cybersanté : les tentatives juridiques pour objectiver un domaine en pleine effervescence

Vincent Gautrais

Catherine Régis

2020-01-31

(published)

www.semanticscholar.org

Resting-state connectivity stratifies premanifest Huntington’s disease by longitudinal cognitive decline rate

Pablo Polosecki

Eduardo Castro

Irina Rish

Dorian Pustina

John H. Warner

Andrew Wood

Cristina Sampaio

Guillermo Cecchi

2020-01-26

Scientific Reports (published)

doi.org

Using Simulated Data to Generate Images of Climate Change

Gautier Cosne

Adrien Juraver

Mélisande Teng

Victor Schmidt

Vahe Vardanyan

Alexandra Luccioni

Yoshua Bengio

Generative adversarial networks (GANs) used in domain adaptation tasks have the ability to generate images that are both realistic and personalized, transforming an input image while maintaining its identifiable characteristics. However, they often require a large quantity of training data to produce high-quality images in a robust way, which limits their usability in cases when access to data is limited. In our paper, we explore the potential of using images from a simulated 3D environment to improve a domain adaptation task carried out by the MUNIT architecture, aiming to use the resulting images to raise awareness of the potential future impacts of climate change.

2020-01-25

ArXiv (preprint)

arxiv.org
