Interpolation Consistency Training for Semi-Supervised Learning
Vikas Verma
Kenji Kawaguchi
Alex Lamb
Juho Kannala
David Lopez-Paz
Arno Solin
Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment
Adrien Ali Taiga
William Fedus
Marlos C. Machado
This paper provides an empirical evaluation of recently developed exploration algorithms within the Arcade Learning Environment (ALE). We study the use of different reward bonuses that incentivize exploration in reinforcement learning. We do so by fixing the learning algorithm used and focusing only on the impact of the different exploration bonuses on the agent's performance. We use Rainbow, the state-of-the-art algorithm for value-based agents, and focus on some of the bonuses proposed in the last few years. We consider the impact these algorithms have on performance within the popular game Montezuma's Revenge, which has gathered a lot of interest from the exploration community; across the set of seven games identified by Bellemare et al. (2016) as challenging for exploration; and on easier games where exploration is not an issue. We find that, in our setting, recently developed bonuses do not provide significantly improved performance on Montezuma's Revenge or hard exploration games. We also find that existing bonus-based methods may negatively impact performance on games in which exploration is not an issue and may even perform worse than ε-greedy exploration.
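To make the benchmarked idea concrete, here is a minimal Python sketch (ours, not the paper's code) of a count-based reward bonus in the spirit of Bellemare et al. (2016): the environment reward is augmented by beta / sqrt(N(s)), where N(s) counts visits to a state. The class name CountBonus, the coefficient beta, and the assumption of a hashable, discretized state are all illustrative.

```python
import math
from collections import defaultdict

class CountBonus:
    """Count-based exploration bonus: r + beta / sqrt(N(s))."""

    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = defaultdict(int)   # visit counts N(s)

    def __call__(self, state, env_reward):
        key = tuple(state)               # assumes a hashable, discretized state
        self.counts[key] += 1
        return env_reward + self.beta / math.sqrt(self.counts[key])
```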
A principled approach for generating adversarial images under non-smooth dissimilarity metrics
Aram-Alexandre Pooladian
Chris J. Finlay
Tim Hoheisel
Adam M. Oberman
Deep neural networks perform well on real-world data but are prone to adversarial perturbations: small changes in the input easily lead to misclassification. In this work, we propose an attack methodology not only for cases where the perturbations are measured by ℓp norms, but in fact for any adversarial dissimilarity metric with a closed proximal form.
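To illustrate what a closed proximal form buys in this setting, the sketch below (our own construction; function names and step sizes are not from the paper) performs one proximal-gradient attack step for an ℓ1-measured perturbation: the smooth classification loss is handled by a gradient step and the non-smooth ℓ1 term by its proximal operator, soft thresholding.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (soft thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def proximal_attack_step(delta, grad_loss, step=0.01, lam=0.001):
    """One proximal-gradient step for an l1-measured adversarial perturbation.

    `grad_loss` is the gradient of the classification loss w.r.t. `delta`;
    the gradient step ascends the loss, and the prox handles the non-smooth
    penalty lam * ||delta||_1 in closed form instead of via subgradients.
    """
    z = delta + step * grad_loss
    return soft_threshold(z, step * lam)
```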
An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation
Vincent Michalski
Vikram Voleti
Anthony Ortiz
Batch normalization has been widely used to improve optimization in deep neural networks. While the uncertainty in batch statistics can act as a regularizer, using these dataset statistics specific to the training set impairs generalization in certain tasks. Recently, alternative methods for normalizing feature activations in neural networks have been proposed. Among them, group normalization has been shown to yield similar, and in some domains even superior, performance to batch normalization. All these methods utilize a learned affine transformation after the normalization operation to increase representational power. Methods used in conditional computation define the parameters of these transformations as learnable functions of conditioning information. In this work, we study whether and where the conditional formulation of group normalization can improve generalization compared to conditional batch normalization. We evaluate performance on the tasks of visual question answering, few-shot learning, and conditional image generation.
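For concreteness, this is a minimal PyTorch sketch (our own wiring, following the usual conditional-normalization recipe rather than the paper's exact code) of conditional group normalization: the post-normalization affine parameters are predicted from a conditioning vector, e.g. a question embedding in visual question answering.

```python
import torch.nn as nn

class ConditionalGroupNorm(nn.Module):
    """Group norm whose affine transform is a function of conditioning info."""

    def __init__(self, num_groups, num_channels, cond_dim):
        super().__init__()
        self.norm = nn.GroupNorm(num_groups, num_channels, affine=False)
        self.to_gamma = nn.Linear(cond_dim, num_channels)
        self.to_beta = nn.Linear(cond_dim, num_channels)

    def forward(self, x, cond):
        # x: (batch, channels, height, width); cond: (batch, cond_dim)
        gamma = self.to_gamma(cond).unsqueeze(-1).unsqueeze(-1)
        beta = self.to_beta(cond).unsqueeze(-1).unsqueeze(-1)
        # Predict a delta around the identity scale for stable training.
        return (1.0 + gamma) * self.norm(x) + beta
```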
On the impressive performance of randomly weighted encoders in summarization tasks
Jonathan Pilault
Jaehong Park
In this work, we investigate the performance of untrained randomly initialized encoders in a general class of sequence-to-sequence models and compare their performance with that of fully-trained encoders on the task of abstractive summarization. We hypothesize that random projections of an input text have enough representational power to encode the hierarchical structure of sentences and the semantics of documents. Using a trained decoder to produce abstractive text summaries, we empirically demonstrate that architectures with untrained randomly initialized encoders perform competitively with respect to the equivalent architectures with fully-trained encoders. We further find that the capacity of the encoder not only improves overall model generalization but also closes the performance gap between untrained randomly initialized and fully-trained encoders. To our knowledge, this is the first time that general sequence-to-sequence models with attention have been assessed with both trained and randomly projected representations on abstractive summarization.
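A minimal sketch (ours, assuming a standard recurrent seq2seq setup rather than the paper's exact architecture) of the untrained-encoder condition: the encoder's parameters are frozen at their random initialization, so it acts as a fixed random projection of the input text, and only the attention decoder would be trained on summarization.

```python
import torch.nn as nn

class RandomEncoder(nn.Module):
    """Encoder frozen at random initialization: a fixed random projection."""

    def __init__(self, vocab_size, emb_dim=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True,
                           bidirectional=True)
        for p in self.parameters():      # never updated during training
            p.requires_grad = False

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids
        outputs, _ = self.rnn(self.embed(tokens))
        return outputs                   # consumed by a *trained* decoder
```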
Combined Reinforcement Learning via Abstract Representations
Vincent Francois-Lavet
In the quest for efficient and robust reinforcement learning methods, both model-free and model-based approaches offer advantages. In this paper we propose a new way of explicitly bridging both approaches via a shared low-dimensional learned encoding of the environment, meant to capture summarizing abstractions. We show that the modularity brought by this approach leads to good generalization while being computationally efficient, with planning happening in a smaller latent state space. In addition, this approach recovers a sufficient low-dimensional representation of the environment, which opens up new strategies for interpretable AI, exploration and transfer learning.
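The sketch below (our own wiring, assuming vector observations and discrete actions) shows the shared-encoding idea in miniature: an encoder maps observations to a small abstract state, and a transition model, a reward model, and a Q head all operate in that latent space, so model-based planning and model-free value learning share one representation.

```python
import torch
import torch.nn as nn

class LatentWorldModel(nn.Module):
    """Shared low-dimensional encoding for model-based and model-free RL."""

    def __init__(self, obs_dim, n_actions, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                     nn.Linear(64, latent_dim))
        self.transition = nn.Linear(latent_dim + n_actions, latent_dim)
        self.reward = nn.Linear(latent_dim + n_actions, 1)
        self.q_head = nn.Linear(latent_dim, n_actions)  # model-free branch

    def forward(self, obs, action_onehot):
        z = self.encoder(obs)                       # abstract state
        za = torch.cat([z, action_onehot], dim=-1)
        return self.transition(za), self.reward(za), self.q_head(z)
```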
A Comparative Analysis of Expected and Distributional Reinforcement Learning
Since their introduction a year ago, distributional approaches to reinforcement learning (distributional RL) have produced strong results relative to the standard approach which models expected values (expected RL). However, aside from convergence guarantees, there have been few theoretical results investigating the reasons behind the improvements distributional RL provides. In this paper we begin the investigation into this fundamental question by analyzing the differences in the tabular, linear approximation, and non-linear approximation settings. We prove that in many realizations of the tabular and linear approximation settings, distributional RL behaves exactly the same as expected RL. In cases where the two methods behave differently, distributional RL can in fact hurt performance when it does not induce identical behaviour. We then continue with an empirical analysis comparing distributional and expected RL methods in control settings with non-linear approximators to tease apart where the improvements from distributional RL methods are coming from.
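For readers new to the distinction, the two Bellman operators being compared can be written as follows (standard textbook definitions, not this paper's notation):

```latex
% Expected RL: the Bellman operator acts on expected values.
(T^{\pi} Q)(s, a) = \mathbb{E}[R(s, a)]
  + \gamma \, \mathbb{E}_{s' \sim P(\cdot \mid s, a),\, a' \sim \pi}[Q(s', a')]

% Distributional RL: the operator acts on the full return distribution,
% where Z(s, a) is a random variable and \overset{D}{=} denotes equality
% in distribution.
(T^{\pi} Z)(s, a) \overset{D}{=} R(s, a) + \gamma \, Z(S', A'),
\qquad S' \sim P(\cdot \mid s, a), \quad A' \sim \pi(\cdot \mid S')
```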
Contextualized Non-local Neural Networks for Sequence Learning
Pengfei Liu
Shuaichen Chang
Xuanjing Huang
Recently, a large number of neural mechanisms and models have been proposed for sequence learning, of which self-attention, as exemplified by the Transformer model, and graph neural networks (GNNs) have attracted much attention. In this paper, we propose an approach that combines and draws on the complementary strengths of these two methods. Specifically, we propose contextualized non-local neural networks (CN3), which can both dynamically construct a task-specific structure of a sentence and leverage rich local dependencies within a particular neighbourhood. Experimental results on ten NLP tasks in text classification, semantic matching, and sequence labelling show that our proposed model outperforms competitive baselines and discovers task-specific dependency structures, thus providing better interpretability to users.
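As a rough illustration of the combination (our own minimal version, not the paper's CN3 architecture), one round of the idea can be written as attention that builds a soft, task-specific sentence graph, followed by a GNN-style message pass over it:

```python
import torch
import torch.nn.functional as F

def contextual_nonlocal(h, temperature=1.0):
    """One attention-built message pass over a dynamically constructed graph.

    h: (batch, seq_len, dim) token representations.
    """
    scores = torch.matmul(h, h.transpose(1, 2)) / temperature
    adjacency = F.softmax(scores, dim=-1)    # soft, task-specific structure
    return h + torch.matmul(adjacency, h)    # aggregate neighbour messages
```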
Generating Character Descriptions for Automatic Summarization of Fiction
Weiwei Zhang
J. Oren
Summaries of fictional stories allow readers to quickly decide whether or not a story catches their interest. A major challenge in automatic summarization of fiction is the lack of standardized evaluation methodology or high-quality datasets for experimentation. In this work, we take a bottom-up approach to this problem by assuming that story authors are uniquely qualified to inform such decisions. We collect a dataset of one million fiction stories with accompanying author-written summaries from Wattpad, an online story sharing platform. We identify commonly occurring summary components, of which a description of the main characters is the most frequent, and elicit descriptions of main characters directly from the authors for a sample of the stories. We propose two approaches to generate character descriptions, one based on ranking attributes found in the story text, the other based on classifying into a list of pre-defined attributes. We find that the classification-based approach performs best in predicting character descriptions.
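As a toy illustration of the ranking-based approach (our own simplification; the paper's actual ranker is more involved), candidate attribute words can be scored by how often they occur in the story text:

```python
from collections import Counter

def rank_character_attributes(story_tokens, candidate_attributes, top_k=3):
    """Rank candidate attributes by their frequency in the story text."""
    counts = Counter(story_tokens)
    ranked = sorted(candidate_attributes,
                    key=lambda attr: counts[attr], reverse=True)
    return ranked[:top_k]
```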
Learning Multi-Task Communication with Message Passing for Sequence Learning
Pengfei Liu
Jie Fu
Yue Dong
Xipeng Qiu
We present two architectures for multi-task learning with neural sequence models. Our approach allows the relationships between different tasks to be learned dynamically, rather than using an ad-hoc pre-defined structure as in previous work. We adopt the idea from message-passing graph neural networks, and propose a general graph multi-task learning framework in which different tasks can communicate with each other in an effective and interpretable way. We conduct extensive experiments in text classification and sequence labelling to evaluate our approach on multi-task learning and transfer learning. The empirical results show that our models not only outperform competitive baselines, but also learn interpretable and transferable patterns across tasks.
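A minimal sketch (ours, not the paper's exact formulation) of tasks communicating through message passing: each task keeps a state vector, one communication round mixes the states through a learned, softmax-normalized task-task adjacency, and a GRU cell updates each task's state, so the inter-task structure is learned rather than pre-defined.

```python
import torch
import torch.nn as nn

class TaskMessagePassing(nn.Module):
    """One round of message passing between task nodes with learned edges."""

    def __init__(self, n_tasks, dim):
        super().__init__()
        self.edge_logits = nn.Parameter(torch.zeros(n_tasks, n_tasks))
        self.update = nn.GRUCell(dim, dim)

    def forward(self, task_states):
        # task_states: (n_tasks, dim)
        weights = torch.softmax(self.edge_logits, dim=-1)  # learned structure
        messages = weights @ task_states                   # aggregate
        return self.update(messages, task_states)          # update states
```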
Learning Options with Interest Functions
Learning temporal abstractions that are partial solutions to a task and can be reused for solving other tasks can help agents plan and learn efficiently. In this work, we tackle this problem in the options framework. We aim to autonomously learn options which are specialized in different regions of the state space by proposing a notion of interest functions, which generalizes initiation sets from the options framework for function approximation. We build on the option-critic framework to derive policy gradient theorems for interest functions, leading to a new interest-option-critic architecture.
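In a sketch (our notation, assuming the natural reweighting an interest function induces), an interest value I(s, o) in [0, 1] generalizes a binary initiation set by softly gating the policy over options:

```python
import numpy as np

def option_probs(interest, policy_over_options):
    """Reweight the policy over options by per-option interest, renormalize."""
    weighted = interest * policy_over_options
    return weighted / weighted.sum()

# Hypothetical usage with three options in some state:
interest = np.array([0.9, 0.1, 0.5])      # soft initiation values I(s, o)
pi_omega = np.array([0.3, 0.4, 0.3])      # policy over options
probs = option_probs(interest, pi_omega)  # sample the next option from this
```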
Leveraging Observations in Bandits: Between Risks and Benefits
Andrei-Stefan Lupu
Imitation learning has been widely used to speed up learning in novice agents, by allowing them to leverage existing data from experts. Allowing an agent to be influenced by external observations can benefit the learning process, but it also puts the agent at risk of following sub-optimal behaviours. In this paper, we study this problem in the context of bandits. More specifically, we consider that an agent (learner) is interacting with a bandit-style decision task, but can also observe a target policy interacting with the same environment. The learner observes only the target’s actions, not the rewards obtained. We introduce a new bandit optimism modifier that uses conditional optimism contingent on the actions of the target in order to guide the agent’s exploration. We analyze the effect of this modification on the well-known Upper Confidence Bound algorithm by proving that it preserves a regret upper bound of order O(ln T), even in the presence of a very poor target, and we derive the dependency of the expected regret on the general target policy. We provide empirical results showing both great benefits as well as certain limitations inherent to observational learning in the multi-armed bandit setting. Experiments are conducted using targets satisfying theoretical assumptions with high probability, thus narrowing the gap between theory and application.
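A minimal sketch (our own; the paper's modifier and constants may differ) of conditional optimism layered on UCB: the arm the observed target just played receives an inflated exploration bonus, nudging the learner toward it without trusting it blindly.

```python
import math

def ucb_with_target(counts, means, t, target_action, extra=1.0):
    """UCB arm selection with extra optimism on the target's observed action."""
    best, best_score = None, -float("inf")
    for a, (n, mu) in enumerate(zip(counts, means)):
        if n == 0:
            return a                     # play every arm once first
        bonus = math.sqrt(2.0 * math.log(t) / n)
        if a == target_action:
            bonus *= 1.0 + extra         # conditional optimism toward target
        if mu + bonus > best_score:
            best, best_score = a, mu + bonus
    return best
```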