Publications

Randomized Exploration for Reinforcement Learning with General Value Function Approximation

Haque Ishfaq

Qiwen Cui

Viet Nguyen

Alex Ayoub

Zhuoran Yang

Zhaoran Wang

Doina Precup

Lin F. Yang

2021-06-14

ArXiv (preprint)

proceedings.mlr.press

Variational Causal Networks: Approximate Bayesian Inference over Causal Structures

Yashas Annadani

Jonas Rothfuss

Alexandre Lacoste

Learning the causal structure that underlies data is a crucial step towards robust real-world decision making. The majority of existing work in causal inference focuses on determining a single directed acyclic graph (DAG) or a Markov equivalence class thereof. However, a crucial aspect to acting intelligently upon the knowledge about causal structure which has been inferred from finite data demands reasoning about its uncertainty. For instance, planning interventions to find out more about the causal mechanisms that govern our data requires quantifying epistemic uncertainty over DAGs. While Bayesian causal inference allows to do so, the posterior over DAGs becomes intractable even for a small number of variables. Aiming to overcome this issue, we propose a form of variational inference over the graphs of Structural Causal Models (SCMs). To this end, we introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs. Its number of parameters does not grow exponentially with the number of variables and can be tractably learned by maximising an Evidence Lower Bound (ELBO). In our experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.

2021-06-13

ArXiv (preprint)

arxiv.org

Comparative Study of Learning Outcomes for Online Learning Platforms

Francois St-Hilaire

Nathan J. Burns

Robert Belfer

Muhammad Shayan

Ariella Smofsky

Dung D. Vu

Antoine Frau

Joseph Potochny

Farid Faraji

Vincent Pavero

Neroli Ko

Ansona Onyi Ching

Sabina Elkins

A. Stepanyan

Adela Matajova

Laurent Charlin

Yoshua Bengio

Iulian V. Serban

Ekaterina Kochmar

2021-06-11

Lecture Notes in Computer Science (published)

doi.org

arxiv.org

Understanding Capacity Saturation in Incremental Learning

Shenyang Huang

Vincent François-Lavet

Guillaume Rabusseau

2021-06-07

Canadian Conference on AI (published)

doi.org

Learning Brain Dynamics With Coupled Low-Dimensional Nonlinear Oscillators and Deep Recurrent Networks

Germán Abrevaya

Guillaume Dumas

Aleksandr Y. Aravkin

Peng Zheng

Jean-Christophe Gagnon-Audet

James R. Kozloski

Pablo Polosecki

Guillaume Lajoie

David D. Cox

Silvina Ponce Dawson

Guillermo A. Cecchi

Irina Rish

Many natural systems, especially biological ones, exhibit complex multivariate nonlinear dynamical behaviors that can be hard to capture by linear autoregressive models. On the other hand, generic nonlinear models such as deep recurrent neural networks often require large amounts of training data, not always available in domains such as brain imaging; also, they often lack interpretability. Domain knowledge about the types of dynamics typically observed in such systems, such as a certain type of dynamical systems models, could complement purely data-driven techniques by providing a good prior. In this work, we consider a class of ordinary differential equation (ODE) models known as van der Pol (VDP) oscillators and evaluate their ability to capture a low-dimensional representation of neural activity measured by different brain imaging modalities, such as calcium imaging (CaI) and fMRI, in different living organisms: larval zebrafish, rat, and human. We develop a novel and efficient approach to the nontrivial problem of parameter estimation for a network of coupled dynamical systems from multivariate data and demonstrate that the resulting VDP models are both accurate and interpretable, as VDP's coupling matrix reveals anatomically meaningful excitatory and inhibitory interactions across different brain subsystems. VDP outperforms linear autoregressive models (VAR) in terms of both the data fit accuracy and the quality of insight provided by the coupling matrices and often tends to generalize better to unseen data when predicting future brain activity, being comparable to and sometimes better than the recurrent neural networks (LSTMs). Finally, we demonstrate that our (generative) VDP model can also serve as a data-augmentation tool leading to marked improvements in predictive accuracy of recurrent neural networks. Thus, our work contributes to both basic and applied dimensions of neuroimaging: gaining scientific insights and improving brain-based predictive models, an area of potentially high practical importance in clinical diagnosis and neurotechnology.

2021-06-06

Neural Computation (unknown)

doi.org

CMIM: Cross-Modal Information Maximization For Medical Imaging

Tristan Sylvain

Francis Dutil

Tess Berthier

Lisa Di Jorio

Margaux Luck

R Devon Hjelm

Yoshua Bengio

In hospitals, data are siloed to specific information systems that make the same information available under different modalities such as the different medical imaging exams the patient undergoes (CT scans, MRI, PET, Ultrasound, etc.) and their associated radiology reports. This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time. In this paper, we propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time, using recent advances in mutual information maximization. By maximizing cross-modal information at train time, we are able to outperform several state-of-the-art baselines in two different settings, medical image classification, and segmentation. In particular, our method is shown to have a strong impact on the inference-time performance of weaker modalities.

2021-06-05

IEEE International Conference on Acoustics, Speech, and Signal Processing (published)

doi.org

Double-Linear Thompson Sampling for Context-Attentive Bandits

Djallel Bouneffouf

Raphael Feraud

Sohini Upadhyay

Yasaman Khazaeni

Irina Rish

In this paper, we analyze and extend an online learning framework known as Context-Attentive Bandit, motivated by various practical applications, from medical diagnosis to dialog systems, where due to observation costs only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent has the freedom to choose which variables to observe. We derive a novel algorithm, called Context-Attentive Thompson Sampling (CATS), which builds upon the Linear Thompson Sampling approach, adapting it to the Context-Attentive Bandit setting. We provide a theoretical regret analysis and an extensive empirical evaluation demonstrating advantages of the proposed approach over several baseline methods on a variety of real-life datasets.

2021-06-05

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (published)

doi.org

arxiv.org

Toward Skills Dialog Orchestration with Online Learning

Djallel Bouneffouf

Raphael Feraud

Sohini Upadhyay

Mayank Agarwal

Yasaman Khazaeni

Irina Rish

Building multi-domain AI agents is a challenging task and an open problem in the area of AI. Within the domain of dialog, the ability to orchestrate multiple independently trained dialog agents, or skills, to create a unified system is of particular significance. In this work, we study the task of online posterior dialog orchestration, where we define posterior orchestration as the task of selecting a subset of skills which most appropriately answer a user input using features extracted from both the user input and the individual skills. To account for the various costs associated with extracting skill features, we consider online posterior orchestration under a skill execution budget. We formalize this setting as Context Attentive Bandit with Observations (CABO), a variant of context attentive bandits, and evaluate it on proprietary conversational datasets.

2021-06-05

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (published)

doi.org

Multimodal dynamics modeling for off-road autonomous vehicles

Jean-François Tremblay

Travis Manderson

Aurélio Noca

Gregory Dudek

David Meger

Dynamics modeling in outdoor and unstructured environments is difficult because different elements in the environment interact with the robot in ways that can be hard to predict. Leveraging multiple sensors to perceive maximal information about the robot's environment is thus crucial when building a model to perform predictions about the robot's dynamics with the goal of doing motion planning. We design a model capable of long-horizon motion predictions, leveraging vision, lidar and proprioception, which is robust to arbitrarily missing modalities at test time. We demonstrate in simulation that our model is able to leverage vision to predict traction changes. We then test our model using a real-world challenging dataset of a robot navigating through a forest, performing predictions in trajectories unseen during training. We try different modality combinations at test time and show that, while our model performs best when all modalities are present, it is still able to perform better than the baseline even when receiving only raw vision input and no proprioception, as well as when only receiving proprioception. Overall, our study demonstrates the importance of leveraging multiple sensors when doing dynamics modeling in outdoor conditions.

2021-06-04

2021 IEEE International Conference on Robotics and Automation (ICRA) (published)

doi.org

arxiv.org

Encoder-Decoder Neural Architecture Optimization for Keyword Spotting

Tong Mo

Bang Liu

2021-06-03

ArXiv (preprint)

arxiv.org

Hierarchical Video Generation for Complex Data

Lluis Castrejon

Nicolas Ballas

Aaron Courville

2021-06-03

ArXiv (preprint)

arxiv.org

SAND-mask: An Enhanced Gradient Masking Strategy for the Discovery of Invariances in Domain Generalization

Soroosh Shahtalebi

Jean-Christophe Gagnon-Audet

A major bottleneck in the real-world applications of machine learning models is their failure in generalizing to unseen domains whose data distribution is not i.i.d. to the training domains. This failure often stems from learning non-generalizable features in the training domains that are spuriously correlated with the label of data. To address this shortcoming, there has been a growing surge of interest in learning good explanations that are hard to vary, which is studied under the notion of Out-of-Distribution (OOD) Generalization. The search for good explanations that are invariant across different domains can be seen as finding local (global) minima in the loss landscape that hold true across all of the training domains. In this paper, we propose a masking strategy, which determines a continuous weight based on the agreement of gradients that flow in each edge of the network, in order to control the amount of update received by the edge in each step of optimization. Particularly, our proposed technique, referred to as "Smoothed-AND (SAND)-masking", not only validates the agreement in the direction of gradients but also promotes the agreement among their magnitudes to further ensure the discovery of invariances across training domains. SAND-mask is validated over the DomainBed benchmark for domain generalization and significantly improves the state-of-the-art accuracy on the Colored MNIST dataset while providing competitive results on other domain generalization datasets.

2021-06-03

ArXiv (preprint)

arxiv.org
