Publications

Vision-Based Goal-Conditioned Policies for Underwater Navigation in the Presence of Obstacles.

Travis Manderson

Juan Camilo Gamboa Higuera

Stefan Wapnick

Florian Shkurti

We present Nav2Goal, a data-efficient and end-to-end learning method for goal-conditioned visual navigation. Our technique is used to train a navigation policy that enables a robot to navigate close to sparse geographic waypoints provided by a user without any prior map, all while avoiding obstacles and choosing paths that cover user-informed regions of interest. Our approach is based on recent advances in conditional imitation learning. General-purpose, safe and informative actions are demonstrated by a human expert. The learned policy is subsequently extended to be goal-conditioned by training with hindsight relabelling, guided by the robot's relative localization system, which requires no additional manual annotation. We deployed our method on an underwater vehicle in the open ocean to collect scientifically relevant data of coral reefs, which allowed our robot to operate safely and autonomously, even at very close proximity to the coral. Our field deployments have demonstrated over a kilometer of autonomous visual navigation, where the robot reaches on the order of 40 waypoints, while collecting scientifically relevant data. This is done while travelling within 0.5 m altitude from sensitive corals and exhibiting significant learned agility to overcome turbulent ocean conditions and to actively avoid collisions.

2020-07-11

Robotics: Science and Systems XVI (published)

doi.org

arxiv.org
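The hindsight relabelling step mentioned in the Nav2Goal abstract can be illustrated with a minimal sketch. It assumes a demonstration is a list of `(observation, action, (x, y))` tuples, with positions taken from the robot's localization, and that a goal is expressed as a planar offset from the current position. The function name `hindsight_relabel`, the `horizon` parameter, and the tuple layout are hypothetical and are not taken from the paper.

```python
def hindsight_relabel(trajectory, horizon):
    """Turn one demonstration into goal-conditioned training samples.

    Each position actually reached within `horizon` steps is treated, in
    hindsight, as the goal the demonstrated action was heading towards.
    Returns a list of (observation, goal_offset, action) tuples.
    """
    samples = []
    last = len(trajectory) - 1
    for t, (obs, action, (x, y)) in enumerate(trajectory):
        # Future states that were actually visited serve as goals.
        for k in range(t + 1, min(t + horizon, last) + 1):
            gx, gy = trajectory[k][2]
            samples.append((obs, (gx - x, gy - y), action))
    return samples


# Toy demonstration: three steps with made-up observations and positions.
demo = [
    ("img0", "forward", (0.0, 0.0)),
    ("img1", "left", (1.0, 0.0)),
    ("img2", "forward", (1.0, 1.0)),
]
samples = hindsight_relabel(demo, horizon=2)
```

No manual goal annotation is needed in this sketch because every goal is read off the recorded trajectory itself, which matches the property the abstract highlights.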

SVRG for Policy Evaluation with Fewer Gradient Evaluations

Stochastic variance-reduced gradient (SVRG) is an optimization method originally designed for tackling machine learning problems with a finite sum structure. SVRG was later shown to work for policy evaluation, a problem in reinforcement learning in which one aims to estimate the value function of a given policy. SVRG makes use of gradient estimates at two scales. At the slower scale, SVRG computes a full gradient over the whole dataset, which could lead to prohibitive computation costs. In this work, we show that two variants of SVRG for policy evaluation could significantly diminish the number of gradient calculations while preserving a linear convergence speed. More importantly, our theoretical result implies that one does not need to use the entire dataset in every epoch of SVRG when it is applied to policy evaluation with linear function approximation. Our experiments demonstrate large computational savings provided by the proposed methods.

2020-07-10

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (published)

doi.org

arxiv.org
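The two gradient scales described in the SVRG abstract can be sketched on a toy problem. The code below is generic SVRG on a scalar least-squares objective, not the policy-evaluation variants proposed in the paper; the function name, step size, and data are illustrative assumptions. It also counts per-sample gradient evaluations, to show where the full-dataset pass at the slow scale contributes to the cost.

```python
import random


def svrg_least_squares(xs, ys, step=0.02, epochs=50, inner_steps=None, seed=0):
    """Generic SVRG on f(w) = (1/n) * sum_i 0.5 * (w * x_i - y_i)**2, scalar w.

    Returns the final iterate and the number of per-sample gradient evaluations.
    """
    rng = random.Random(seed)
    n = len(xs)
    if inner_steps is None:
        inner_steps = n

    def grad_i(w, i):
        # Gradient of the i-th term 0.5 * (w * x_i - y_i)**2 with respect to w.
        return (w * xs[i] - ys[i]) * xs[i]

    w_snapshot = 0.0
    grad_evals = 0
    for _ in range(epochs):
        # Slow scale: full gradient over the whole dataset at the snapshot.
        full_grad = sum(grad_i(w_snapshot, i) for i in range(n)) / n
        grad_evals += n
        w = w_snapshot
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Fast scale: stochastic gradient corrected by the snapshot terms.
            g = grad_i(w, i) - grad_i(w_snapshot, i) + full_grad
            grad_evals += 2
            w -= step * g
        w_snapshot = w
    return w_snapshot, grad_evals


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x for x in xs]  # noiseless data generated with w = 2
w, evals = svrg_least_squares(xs, ys)
print(round(w, 6), evals)
```

In this sketch every epoch spends `n` evaluations on the snapshot gradient, which is the term the abstract describes as potentially prohibitive for large datasets.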

Material for IEEE Software paper "How Do Open Source Software Contributors Perceive and Address Usability?"

Wenting Wang

Jinghui Cheng

Jin L.C. Guo

2020-07-09

(published)

doi.org

Attenuated Anticipation of Social and Monetary Rewards in Autism Spectrum Disorders

Sarah Baumeister

Carolin Moessnang

Nico Bast

Sarah Hohmann

Julian Tillmann

David Goyard

Tony Charman

Sara Ambrosino

Simon Baron-Cohen

Christian Beckmann

Sven Bölte

Thomas Bourgeron

Annika Rausch

Daisy Crawley

Flavio Dell’Acqua

Guillaume Dumas

Sarah Durston

Christine Ecker

Dorothea L. Floris

Vincent Frouin

Hannah Hayward

Rosemary Holt

Mark Johnson

Emily J. H. Jones

Meng-Chuan Lai

Michael V. Lombardo

Luke Mason

Marianne Oldehinkel

Tony Persico

Antonia San José Cáceres

Thomas Wolfers

Will Spooren

Eva Loth

Declan Murphy

Jan K. Buitelaar

Heike Tost

Andreas Meyer-Lindenberg

Tobias Banaschewski

Daniel Brandeis

Background: Reward processing has been proposed to underpin atypical social behavior, a core feature of autism spectrum disorder (ASD). However, previous neuroimaging studies have yielded inconsistent results regarding the specificity of atypicalities for social rewards in ASD. Utilizing a large sample, we aimed to assess altered reward processing in response to reward type (social, monetary) and reward phase (anticipation, delivery) in ASD. Methods: Functional magnetic resonance imaging during social and monetary reward anticipation and delivery was performed in 212 individuals with ASD (7.6-30.5 years) and 181 typically developing (TD) participants (7.6-30.8 years). Results: Across social and monetary reward anticipation, whole-brain analyses (p < 0.05, family-wise error-corrected) showed hypoactivation of the right ventral striatum (VS) in ASD. Further, region of interest (ROI) analy

2020-07-05

bioRxiv (preprint)

doi.org

Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs

Kian Ahrabian

Daniel Tarlow

Hehuimin Cheng

Jin L.C. Guo

We present a multi-relational temporal Knowledge Graph based on the daily interactions between artifacts in GitHub, one of the largest social coding platforms. Such representation enables posing many user-activity and project management questions as link prediction and time queries over the knowledge graph. In particular, we introduce two new datasets for i) interpolated time-conditioned link prediction and ii) extrapolated time-conditioned link/time prediction queries, each with distinguished properties. Our experiments on these datasets highlight the potential of adapting knowledge graphs to answer broad software engineering questions. Meanwhile, they also reveal the unsatisfactory performance of existing temporal models on extrapolated queries and time prediction queries in general. To overcome these shortcomings, we introduce an extension to current temporal models using relative temporal information with regard to past events.

2020-07-01

arXiv (preprint)

arxiv.org

Compositional Generalization by Factorizing Alignment and Translation

Jacob Russin

Jason Jo

R. O’Reilly

Yoshua Bengio

2020-06-30

Annual Meeting of the Association for Computational Linguistics (published)

doi.org

Counterexamples on the Monotonicity of Delay Optimal Strategies for Energy Harvesting Transmitters

Borna Sayedana

Aditya Mahajan

We consider cross-layer design of delay optimal transmission strategies for energy harvesting transmitters where the data and energy arrival processes are stochastic. Using Markov decision theory, we show that the value function is weakly increasing in the queue state and weakly decreasing in the battery state. It is natural to expect that the delay optimal policy should be weakly increasing in the queue and battery states. We show via counterexamples that this is not the case. In fact, we show that for some sample scenarios the delay optimal policy may perform 5–13% better than the best monotone policy.

2020-06-30

IEEE Wireless Communications Letters (published)

doi.org

arxiv.org

Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach

Wenyu Du

Zhouhan Lin

Yikang Shen

Timothy J. O’Donnell

Yoshua Bengio

Yue Zhang

It is commonly believed that knowledge of syntactic structure should improve language modeling. However, effectively and computationally efficiently incorporating syntactic structure into neural language models has been a challenging topic. In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth parse trees in a form called "syntactic distances", where information between these two separate objectives shares the same intermediate representation. Experimental results on the Penn Treebank and Chinese Treebank datasets show that when ground truth parse trees are provided as additional training signals, the model is able to achieve lower perplexity and induce trees with better quality.

2020-06-30

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (published)

doi.org

arxiv.org

Factorized embeddings learns rich and biologically meaningful embedding spaces using factorized tensor decomposition

Claude Perreault

The recent development of sequencing technologies revolutionized our understanding of the inner workings of the cell as well as the way disease is treated. A single RNA sequencing (RNA-Seq) experiment, however, measures tens of thousands of parameters simultaneously. While the results are information rich, data analysis provides a challenge. Dimensionality reduction methods help with this task by extracting patterns from the data by compressing it into compact vector representations. We present the factorized embeddings (FE) model, a self-supervised deep learning algorithm that learns simultaneously, by tensor factorization, gene and sample representation spaces. We ran the model on RNA-Seq data from two large-scale cohorts and observed that the sample representation captures information on single gene and global gene expression patterns. Moreover, we found that the gene representation space was organized such that tissue-specific genes, highly correlated genes as well as genes participating in the same GO terms were grouped. Finally, we compared the vector representation of samples learned by the FE model to other similar models on 49 regression tasks. We report that the representations trained with FE rank first or second in all of the tasks, surpassing, sometimes by a considerable margin, other representations. A toy example in the form of a Jupyter Notebook as well as the code and trained embeddings for this project can be found at: https://github.com/TrofimovAssya/FactorizedEmbeddings. Supplementary data are available at Bioinformatics online.

2020-06-30

Bioinform. (published)

doi.org

Handling Black Swan Events in Deep Learning with Diversely Extrapolated Neural Networks

Maxime Wabartha

Audrey Durand

Vincent François-Lavet

Joelle Pineau

By virtue of their expressive power, neural networks (NNs) are well suited to fitting large, complex datasets, yet they are also known to produce similar predictions for points outside the training distribution. As such, they are, like humans, under the influence of the Black Swan theory: models tend to be extremely "surprised" by rare events, leading to potentially disastrous consequences, while justifying these same events in hindsight. To avoid this pitfall, we introduce DENN, an ensemble approach building a set of Diversely Extrapolated Neural Networks that fits the training data and is able to generalize more diversely when extrapolating to novel data points. This leads DENN to output highly uncertain predictions for unexpected inputs. We achieve this by adding a diversity term in the loss function used to train the model, computed at specific inputs. We first illustrate the usefulness of the method on a low-dimensional regression problem. Then, we show how the loss can be adapted to tackle anomaly detection during classification, as well as safe imitation learning problems.

2020-06-30

International Joint Conference on Artificial Intelligence (published)

doi.org

Interactive Machine Comprehension with Information Seeking Agents

Xingdi Yuan

Jie Fu

Marc-Alexandre Côté

Yi Tay

Christopher Pal

Adam Trischler

Existing machine reading comprehension (MRC) models do not scale effectively to real-world applications like web-level information retrieval and question answering (QA). We argue that this stems from the nature of MRC datasets: most of these are static environments wherein the supporting documents and all necessary information are fully observed. In this paper, we propose a simple method that reframes existing MRC datasets as interactive, partially observable environments. Specifically, we “occlude” the majority of a document’s text and add context-sensitive commands that reveal “glimpses” of the hidden text to a model. We repurpose SQuAD and NewsQA as an initial case study, and then show how the interactive corpora can be used to train a model that seeks relevant information through sequential decision making. We believe that this setting can contribute to scaling models to web-level QA scenarios.

2020-06-30

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (published)

doi.org

arxiv.org
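The idea of occluding a document and revealing glimpses through commands, as described in the interactive MRC abstract, can be sketched as a tiny environment. This is an illustrative toy under stated assumptions: the class name `OccludedDocument`, the sentence-level glimpse, and the command strings `next`, `previous`, and `find <word>` are hypothetical choices, not the command set or interface defined in the paper.

```python
class OccludedDocument:
    """Partially observable view of a document: one sentence visible at a time."""

    def __init__(self, sentences):
        self.sentences = list(sentences)
        self.position = 0

    def observe(self):
        # The agent only ever sees the sentence at the current position.
        return self.sentences[self.position]

    def step(self, command):
        """Apply a command and return the newly revealed glimpse."""
        n = len(self.sentences)
        if command == "next":
            self.position = min(self.position + 1, n - 1)
        elif command == "previous":
            self.position = max(self.position - 1, 0)
        elif command.startswith("find "):
            query = command[len("find "):].lower()
            # Jump to the next sentence containing the query, wrapping around.
            for offset in range(1, n + 1):
                idx = (self.position + offset) % n
                if query in self.sentences[idx].lower():
                    self.position = idx
                    break
        else:
            raise ValueError(f"unknown command: {command!r}")
        return self.observe()


env = OccludedDocument([
    "Mila is based in Montreal.",
    "It studies machine learning.",
    "SQuAD is a question answering dataset.",
])
first = env.observe()
found = env.step("find squad")
back = env.step("previous")
```

An agent trained in such an environment has to decide which commands to issue before answering, which is the sequential decision-making aspect the abstract emphasizes.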

On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)

Vincent François-Lavet

Guillaume Rabusseau

Joelle Pineau

Damien Ernst

Raphael Fonteneau

When an agent has limited information on its environment, the suboptimality of an RL algorithm can be decomposed into the sum of two terms: a term related to an asymptotic bias (suboptimality with unlimited data) and a term due to overfitting (additional suboptimality due to limited data). In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources. In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.

2020-06-30

International Joint Conference on Artificial Intelligence (published)

doi.org
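The two-term decomposition described in the abstract on overfitting and asymptotic bias can be written as a telescoping identity. The notation here is an assumption made for illustration and may differ from the paper's: $\pi^*$ is the optimal policy with full access to the state, $\pi_D$ is the policy learned from a finite dataset $D$ with the chosen state representation, and $\pi_{D_\infty}$ is the policy the same procedure would return with unlimited data.

```latex
\mathbb{E}_{D}\!\left[ V^{\pi^*} - V^{\pi_D} \right]
  = \underbrace{V^{\pi^*} - V^{\pi_{D_\infty}}}_{\text{asymptotic bias}}
  + \underbrace{\mathbb{E}_{D}\!\left[ V^{\pi_{D_\infty}} - V^{\pi_D} \right]}_{\text{overfitting}}
```

Read this way, the abstract's tradeoff is that a smaller state representation tends to enlarge the first term while shrinking the second.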
