Publications

Emergent Communication under Competition

Michael Noukhovitch

Travis LaCroix

Angeliki Lazaridou

Aaron Courville

2020-12-31

AAMAS (published)

doi.org

arxiv.org

An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming

Minkai Xu

Wujie Wang

Shitong Luo

Chence Shi

Yoshua Bengio

Rafael Gomez-Bombarelli

Jian Tang

Predicting molecular conformations (or 3D structures) from molecular graphs is a fundamental problem in many applications. Most existing approaches are usually divided into two steps by first predicting the distances between atoms and then generating a 3D structure through optimizing a distance geometry problem. However, the distances predicted with such two-stage approaches may not be able to consistently preserve the geometry of local atomic neighborhoods, making the generated structures unsatisfying. In this paper, we propose an end-to-end solution for molecular conformation prediction called ConfVAE based on the conditional variational autoencoder framework. Specifically, the molecular graph is first encoded in a latent space, and then the 3D structures are generated by solving a principled bilevel optimization program. Extensive experiments on several benchmark data sets prove the effectiveness of our proposed approach over existing state-of-the-art approaches. Code is available at https://github.com/MinkaiXu/ConfVAE-ICML21

2020-12-31

ICML (published)

doi.org

proceedings.mlr.press

Enjeux juridiques propres au modèle émergent des patients accompagnateurs dans les milieux de soins au Québec (Legal Issues Arising from the Emerging Model of Accompanying Patients in the Quebec Healthcare System)

Léa Boutrouille

Catherine Régis

Marie-Pascale Pomey

2020-12-31

SSRN Electronic Journal (published)

doi.org

Episodes Meta Sequence S 2 Fast Update Slow Update Fast Update Slow Update

Kanika Madan

Nan Rosemary Ke

Anirudh Goyal

Bernhard Schölkopf

Yoshua Bengio

Decomposing knowledge into interchangeable pieces promises a generalization advantage when there are changes in distribution. A learning agent interacting with its environment is likely to be faced with situations requiring novel combinations of existing pieces of knowledge. We hypothesize that such a decomposition of knowledge is particularly relevant for being able to generalize in a systematic manner to out-of-distribution changes. To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks. An attention mechanism dynamically selects which modules can be adapted to the current task, and the parameters of the selected modules are allowed to change quickly as the learner is confronted with variations in what it experiences, while the parameters of the attention mechanisms act as stable, slowly changing, meta-parameters. We focus on pieces of knowledge captured by an ensemble of modules sparsely communicating with each other via a bottleneck of attention. We find that meta-learning the modular aspects of the proposed system greatly helps in achieving faster adaptation in a reinforcement learning setup involving navigation in a partially observed grid world with image-level input. We also find that reversing the role of parameters and meta-parameters does not work nearly as well, suggesting a particular role for fast adaptation of the dynamically selected modules.

2020-12-31

(published)

www.semanticscholar.org

Explaining by Analogy: Case-based Abductive Natural Language Inference

Ruben Cartuyvels

Graham Spinks

Marie Francine

Peter Clark

Isaac Cowhey

Oren Etzioni

Tushar Khot

Rajarshi Das

Ameya Godbole

Shehzaad Dhuliawala

Manzil Zaheer

Andrew McCallum

Dung Ngoc Thai

Ameya

Ethan Godbole

Jay-Yoon Perez

Lee

Lizhen

Ramón López De Mántaras

David Mcsherry

David Bridge

Barry Leake

Susan Smyth

Craw.

Boi

Maryalice Faltings

Michael T Maher

Ken-552 Cox

Dorottya Demszky

Kelvin Guu

Percy Liang

Jacob Devlin

Ming-Wei Chang

Kenton Lee

Daniel Fried

Peter Jansen

Gus Hahn-Powell

Higher-575

Rebecca Emilie Sharp

M. Surdeanu

Zhengnan Xie

Sebastian Thiem

Jaycie Ryrholm Martin

Elizabeth Wainwright

Steven Marmorstein

Wenhan Xiong

Xiang Lorraine Li

Srini Iyer

Jingfei Du

Vikas Yadav

Steven Bethard

Zhilin Yang

Peng Qi

Saizheng Zhang

Yoshua Bengio

William W Cohen

Russ Salakhutdinov

Existing accounts of explanation emphasise the role of prior experience and analogy in the solution of new problems. However, most of the contemporary models for multi-hop textual inference construct explanations considering each test case in isolation. This paradigm is known to suffer from semantic drift, which causes the construction of spurious explanations leading to wrong predictions. In contrast, we propose an abductive framework for multi-hop inference that adopts the retrieve-reuse-revise paradigm largely studied in case-based reasoning. Specifically, we present ETNA (Explanation by Analogy), a novel model that addresses unseen inference problems by retrieving and adapting prior explanations from similar training examples. We empirically evaluate the case-based abductive framework on downstream commonsense and scientific reasoning tasks. Our experiments demonstrate that ETNA can be effectively integrated with sparse and dense encoding mechanisms or downstream transformers, achieving strong performance when compared to existing explainable approaches. Moreover, we study the impact of the retrieve-reuse-revise paradigm on explainability and semantic drift, showing that it boosts the quality of the constructed explanations, resulting in improved downstream inference performance.

2020-12-31

(published)

www.semanticscholar.org

Exploring the Wasserstein metric for survival analysis

Tristan Sylvain

Margaux Luck

Joseph Paul Cohen

Heloise Cardinal

Andrea Lodi

Yoshua Bengio

2020-12-31

SPACA (published)

proceedings.mlr.press

Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments

Anirudh Goyal

Alex Lamb

Phanideep Gampa

Philippe Beaudoin

Charles Blundell

Sergey Levine

Yoshua Bengio

Michael Mozer

2020-12-31

ICLR (published)

openreview.net

Fast and Slow Learning of Recurrent Independent Mechanisms

Bernhard Schölkopf

Decomposing knowledge into interchangeable pieces promises a generalization advantage when there are changes in distribution. A learning agent interacting with its environment is likely to be faced with situations requiring novel combinations of existing pieces of knowledge. We hypothesize that such a decomposition of knowledge is particularly relevant for being able to generalize in a systematic manner to out-of-distribution changes. To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks. An attention mechanism dynamically selects which modules can be adapted to the current task, and the parameters of the selected modules are allowed to change quickly as the learner is confronted with variations in what it experiences, while the parameters of the attention mechanisms act as stable, slowly changing, meta-parameters. We focus on pieces of knowledge captured by an ensemble of modules sparsely communicating with each other via a bottleneck of attention. We find that meta-learning the modular aspects of the proposed system greatly helps in achieving faster adaptation in a reinforcement learning setup involving navigation in a partially observed grid world with image-level input. We also find that reversing the role of parameters and meta-parameters does not work nearly as well, suggesting a particular role for fast adaptation of the dynamically selected modules.

2020-12-31

ICLR (published)

doi.org

openreview.net

Finite time analysis of temporal difference learning with linear function approximation: the tail averaged case

Gandharv Patil

Prashanth L.A.

Doina Precup

In this paper, we study the finite-time behaviour of temporal difference (TD) learning algorithms when combined with tail-averaging, and present instance-dependent bounds on the parameter error of the tail-averaged TD iterate. Our error bounds hold in expectation as well as with high probability, exhibit a sharper rate of decay for the initial error (bias), and are comparable with existing bounds in the literature.
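As a rough illustration of what tail-averaging means for TD learning, the sketch below runs TD(0) with linear function approximation on a toy two-state Markov reward process and returns the average of the iterates after a chosen tail index. Everything concrete here is an assumption made for illustration and is not taken from the paper: the chain (uniform transitions, rewards 1 and 0), the one-hot features, the constant step size, and the name `tail_averaged_td0`.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def tail_averaged_td0(n_steps=40000, tail_start=20000,
                      alpha=0.05, gamma=0.9, seed=0):
    """TD(0) with linear features; returns the mean of iterates t >= tail_start.

    Toy setup (assumed): two states, uniform random transitions,
    reward 1.0 in state 0 and 0.0 in state 1, one-hot features.
    """
    rng = random.Random(seed)
    rewards = [1.0, 0.0]
    features = [[1.0, 0.0], [0.0, 1.0]]
    theta = [0.0, 0.0]        # parameter vector of the linear value estimate
    tail_sum = [0.0, 0.0]     # running sum of the iterates in the tail
    state = 0
    for t in range(n_steps):
        next_state = rng.randrange(2)
        phi, phi_next = features[state], features[next_state]
        # TD error: r + gamma * V(s') - V(s), with V(s) = theta . phi(s)
        td_error = rewards[state] + gamma * dot(theta, phi_next) - dot(theta, phi)
        theta = [w + alpha * td_error * x for w, x in zip(theta, phi)]
        if t >= tail_start:
            tail_sum = [s + w for s, w in zip(tail_sum, theta)]
        state = next_state
    count = n_steps - tail_start
    return [s / count for s in tail_sum]


if __name__ == "__main__":
    print(tail_averaged_td0())
```

For this toy chain the exact values solve V = r + gamma * P * V, which gives V(0) = 5.5 and V(1) = 4.5, so the tail average can be compared against them. Discarding the early iterates removes most of the dependence on the initial parameters, while averaging the remaining ones damps the noise of the constant step size.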

2020-12-31

(published)

www.semanticscholar.org

Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation

This paper is about the problem of learning a stochastic policy for generating an object (like a molecular graph) from a sequence of actions, such that the probability of generating an object is proportional to a given positive reward for that object. Whereas standard return maximization tends to converge to a single return-maximizing sequence, there are cases where we would like to sample a diverse set of high-return solutions. These arise, for example, in black-box function optimization when few rounds are possible, each with large batches of queries, where the batches should be diverse, e.g., in the design of new molecules. One can also see this as a problem of approximately converting an energy function to a generative distribution. While MCMC methods can achieve that, they are expensive and generally only perform local exploration. Instead, training a generative policy amortizes the cost of search during training and yields fast generation. Using insights from Temporal Difference learning, we propose GFlowNet, based on a view of the generative process as a flow network, making it possible to handle the tricky case where different trajectories can yield the same final state, e.g., there are many ways to sequentially add atoms to generate some molecular graph. We cast the set of trajectories as a flow and convert the flow consistency equations into a learning objective, akin to the casting of the Bellman equations into Temporal Difference methods. We prove that any global minimum of the proposed objectives yields a policy which samples from the desired distribution, and demonstrate the improved performance and diversity of GFlowNet on a simple domain where there are many modes to the reward function, and on a molecule synthesis task.
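The flow view in this abstract can be checked by hand on a very small graph. The sketch below is not the GFlowNet training procedure and involves no learning: it takes a hand-picked set of edge flows on a hypothetical five-state DAG, verifies that inflow equals outflow plus reward at every non-source state, and then computes exactly how often the flow-proportional policy ends in each terminal object. The graph, the flows, the rewards, and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction

# Hypothetical DAG: s0 -> a, s0 -> b, a -> x, a -> y, b -> x.
# Two different trajectories (via a or via b) reach the same object x.
EDGE_FLOWS = {
    ("s0", "a"): 2,
    ("s0", "b"): 2,
    ("a", "x"): 1,
    ("a", "y"): 1,
    ("b", "x"): 2,
}
REWARDS = {"x": 3, "y": 1}  # positive rewards on the terminal objects


def inflow(state, flows):
    return sum(f for (_, dst), f in flows.items() if dst == state)


def outflow(state, flows):
    return sum(f for (src, _), f in flows.items() if src == state)


def consistency_residuals(flows, rewards, source="s0"):
    """Inflow minus (outflow + reward) for every non-source state."""
    states = {s for edge in flows for s in edge} - {source}
    return {s: inflow(s, flows) - outflow(s, flows) - rewards.get(s, 0)
            for s in states}


def visit_probability(state, flows, source="s0"):
    """Exact probability of reaching `state` when each step picks a child
    with probability proportional to the flow on that edge."""
    if state == source:
        return Fraction(1)
    total = Fraction(0)
    for (src, dst), f in flows.items():
        if dst == state:
            step = Fraction(f, outflow(src, flows))
            total += visit_probability(src, flows, source) * step
    return total


if __name__ == "__main__":
    print(consistency_residuals(EDGE_FLOWS, REWARDS))
    for obj in REWARDS:
        print(obj, visit_probability(obj, EDGE_FLOWS))
```

With all residuals at zero, the visit probabilities of `x` and `y` equal their rewards divided by the total reward, even though `x` is reachable along two trajectories. A learned model would have to estimate the edge flows rather than read them from a table, for instance by penalizing the residuals computed here.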

2020-12-31

NeurIPS (published)

doi.org

openreview.net

Geo-Spatiotemporal Features and Shape-Based Prior Knowledge for Fine-grained Imbalanced Data Classification

Charles (A.) Kantor

Marta Skreta

Brice Rauby

Léonard Boussioux

Emmanuel Jehanno

Alexandra Luccioni

David Rolnick

Hugues Talbot

Fine-grained classification aims at distinguishing between items with similar global perception and patterns, but that differ by minute details. Our primary challenges come from both small inter-class variations and large intra-class variations. In this article, we propose to combine several innovations to improve fine-grained classification within the use-case of wildlife, which is of practical interest for experts. We utilize geo-spatiotemporal data to enrich the picture information and further improve the performance. We also investigate state-of-the-art methods for handling the imbalanced data issue.

2020-12-31

arXiv (preprint)

doi.org

arxiv.org

Guest Editorial Explainable AI: Towards Fairness, Accountability, Transparency and Trust in Healthcare

Arash Shaban-Nejad

Martin Michalowski

John S. Brownstein

David L Buckeridge

2020-12-31

IEEE journal of biomedical and health informatics (published)

doi.org
