Publications

Active Domain Randomization

Bhairav Mehta

Manfred Diaz

Christopher Pal

Domain randomization is a popular technique for improving domain transfer, often used in a zero-shot setting when the target domain is unknown or cannot easily be used for training. In this work, we empirically examine the effects of domain randomization on agent generalization. Our experiments show that domain randomization may lead to suboptimal, high-variance policies, which we attribute to the uniform sampling of environment parameters. We propose Active Domain Randomization, a novel algorithm that learns a parameter sampling strategy. Our method looks for the most informative environment variations within the given randomization ranges by leveraging the discrepancies of policy rollouts in randomized and reference environment instances. We find that training more frequently on these instances leads to better overall agent generalization. In addition, when domain randomization and policy transfer fail, Active Domain Randomization offers more insight into the deficiencies of both the chosen parameter ranges and the learned policy, allowing for more focused debugging. Our experiments across various physics-based simulated tasks and a real-robot task show that this enhancement leads to more robust, consistent policies.

2020-05-11

Proceedings of the Conference on Robot Learning (published)

proceedings.mlr.press

Leveraging exploration in off-policy algorithms via normalizing flows

Bogdan Mazoure

Thang Doan

Audrey Durand

R Devon Hjelm

Joelle Pineau

The ability to discover approximately optimal policies in domains with sparse rewards is crucial to applying reinforcement learning (RL) in many real-world scenarios. Approaches such as neural density models and continuous exploration (e.g., Go-Explore) have been proposed to maintain the high exploration rate necessary to find high-performing and generalizable policies. Soft actor-critic (SAC) is another method for improving exploration that aims to combine efficient learning via off-policy updates while maximizing the policy entropy. In this work, we extend SAC to a richer class of probability distributions (e.g., multimodal) through normalizing flows (NF) and show that this significantly improves performance by accelerating the discovery of good policies while using much smaller policy representations. Our approach, which we call SAC-NF, is a simple, efficient, easy-to-implement modification and improvement to SAC on continuous control baselines such as MuJoCo and PyBullet Roboschool domains. Finally, SAC-NF does this while being significantly more parameter-efficient, using as few as 5.5% of the parameters of an equivalent SAC model.

2020-05-11

Proceedings of the Conference on Robot Learning (published)

doi.org

proceedings.mlr.press

Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments

Martin Weiss

Simon Chamorro

Roger Girgis

Margaux Luck

Samira E. Kahou

Joseph P. Cohen

Millions of blind and visually-impaired (BVI) people navigate urban environments every day, using smartphones for high-level path-planning and white canes or guide dogs for local information. However, many BVI people still struggle to travel to new places. In our endeavor to create a navigation assistant for the BVI, we found that existing Reinforcement Learning (RL) environments were unsuitable for the task. This work introduces SEVN, a sidewalk simulation environment and a neural network-based approach to creating a navigation agent. SEVN contains panoramic images with labels for house numbers, doors, and street name signs, and formulations for several navigation tasks. We study the performance of an RL algorithm (PPO) in this setting. Our policy model fuses multi-modal observations in the form of variable resolution images, visible text, and simulated GPS data to navigate to a goal door. We hope that this dataset, simulator, and experimental results will provide a foundation for further research into the creation of agents that can assist members of the BVI community with outdoor navigation.

2020-05-11

Proceedings of the Conference on Robot Learning (published)

doi.org

proceedings.mlr.press

Differential neural circuitry behind autism subtypes with imbalanced social-communicative and restricted repetitive behavior symptoms

Natasha Bertelsen

Isotta Landi

Richard A.I. Bethlehem

Jakob Seidlitz

Elena Maria Busuoli

Veronica Mandelli

Eleonora Satta

Stavros Trakoshis

Bonnie Auyeung

Prantik Kundu

Eva Loth

Guillaume Dumas

Sarah Baumeister

Christian Beckmann

Sven Bölte

Thomas Bourgeron

Tony Charman

Sarah Durston

Christine Ecker

Rosemary Holt

Mark Johnson

Emily J. H. Jones

Luke Mason

Andreas Meyer-Lindenberg

Carolin Moessnang

Marianne Oldehinkel

Antonio Persico

Julian Tillmann

Steven C. R. Williams

Will Spooren

Declan Murphy

Jan K. Buitelaar

Simon Baron-Cohen

Meng-Chuan Lai

Michael V. Lombardo

Social-communication (SC) and restricted repetitive behaviors (RRB) are autism diagnostic symptom domains. SC and RRB severity can markedly differ within and between individuals and may be underpinned by different neural circuitry and genetic mechanisms. Modeling SC-RRB balance could help identify how neural circuitry and genetic mechanisms map onto such phenotypic heterogeneity. Here we developed a phenotypic stratification model that makes highly accurate (97-99%) out-of-sample SC=RRB, SC>RRB, and RRB>SC subtype predictions. Applying this model to resting-state fMRI data from the EU-AIMS LEAP dataset (n=509), we find that while the phenotypic subtypes share many commonalities in terms of intrinsic functional connectivity, they also show subtype-specific qualitative differences compared to a typically developing (TD) group. Specifically, the somatomotor network is hypoconnected with perisylvian circuitry in SC>RRB and visual association circuitry in SC=RRB. The SC=RRB subtype also showed hyperconnectivity between medial motor and anterior salience circuitry. Genes that are highly expressed within these subtype-specific networks show a differential enrichment pattern with known ASD-associated genes, indicating that such circuits are affected by differing autism-associated genomic mechanisms. These results suggest that SC-RRB imbalance subtypes share some commonalities but also express subtle differences in functional neural circuitry and the genomic underpinnings behind such circuitry.

2020-05-09

bioRxiv (preprint)

doi.org

An Empirical Study of Human Behavioral Agents in Bandits, Contextual Bandits and Reinforcement Learning

Baihan Lin

Guillermo Cecchi

Djallel Bouneffouf

Jenna Reinen

Irina Rish

Artificial behavioral agents are often evaluated based on their consistent behaviors and performance in taking sequential actions in an environment to maximize some notion of cumulative reward. However, human decision making in real life usually involves different strategies and behavioral trajectories that lead to the same empirical outcome. Motivated by clinical literature on a wide range of neurological and psychiatric disorders, we propose here a more general and flexible parametric framework for sequential decision making that involves a two-stream reward processing mechanism. We demonstrated that this framework is flexible and unified enough to incorporate a family of problems spanning multi-armed bandits (MAB), contextual bandits (CB) and reinforcement learning (RL), which decompose the sequential decision making process at different levels. Inspired by the known reward processing abnormalities of many mental disorders, our clinically-inspired agents demonstrated interesting behavioral trajectories and comparable performance on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the PacMan game across different reward stationarities in a lifelong learning setting.

2020-05-09

(published)

www.semanticscholar.org

Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL

Baihan Lin

Guillermo Cecchi

Djallel Bouneffouf

Jenna Reinen

Irina Rish

2020-05-09

arXiv (preprint)

arxiv.org

Desirable features in a decision aid for prenatal screening – what do pregnant women and their partners think? A mixed methods pilot study

Titilayo Tatiana Agbadje

S. A. Rahimi

Mélissa Côté

Andrée-Anne Tremblay

Mariama Penda Diallo

Hélène Elidor

Alex Poulin Herron

Codjo Djignefa Djade

France Légaré

Background: To help pregnant women and their partners make informed value-congruent decisions about Down syndrome prenatal screening, our team developed two successive versions of a decision aid (DAv2017 and DAv2014). We aimed to assess pregnant women and their partners’ perceptions of the usefulness of the two DAs for preparing for decision making, their relative acceptability and their most desirable features. Methods: This is a mixed methods pilot study. We recruited study participants (women and their partners) when they were consulting for prenatal care at three clinical sites in Quebec City. To be eligible, women had to: (a) be at least 18 years old; (b) be more than 16 weeks pregnant or have given birth in the previous year; and (c) be able to speak and write in French or English. Both women and partners were invited to give their informed consent. We collected quantitative data on the usefulness of the DAs for preparing for decision making and their relative acceptability. We developed an interview grid based on the Technology Acceptance Model and Acceptability questionnaire to explore their perceptions of the most desirable features. We performed descriptive statistics and deductive analysis. Results: Overall, 23 couples and 16 individual women participated in the study. The majority of participants were between 25 and 34 years old (79% of women and 59% of partners) and highly educated (66.7% of women and 54% of partners had a university-level education). DAv2017 scored higher for usefulness for preparing for decision making (86.2 ± 13 out of 100 for DAv2017 and 77.7 ± 14 for DAv2014). For most dimensions, DAv2017 was more acceptable than DAv2014 (e.g. the amount of information was found “just right” by 80% of participants for DAv2017 against 56% for DAv2014). However, participants preferred the presentation and the values clarification exercise of DAv2014. In their opinion, neither DA presented information in a completely balanced manner. They suggested adding more information about raising Down syndrome children, replacing frequencies with percentages, different values clarification methods, and a section for the partner. Conclusions: A new user-centered version of the prenatal screening DA will integrate participants’ suggestions to reflect end users’ priorities.

2020-05-07

(published)

doi.org

Leveraging cluster backbones for improving MAP inference in statistical relational models

Mohamed Hamza Ibrahim

Christopher Pal

Gilles Pesant

2020-05-06

Annals of Mathematics and Artificial Intelligence (published)

doi.org

Option-Critic in Cooperative Multi-Agent Systems

Jhelum Chakravorty

Nadeem Ward

Julien Roy

Maxime Chevalier-Boisvert

Sumana Basu

Andrei Lupu

Doina Precup

In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems, using the options framework (Sutton et al., 1999). First, we address the planning problem for the decentralized POMDP represented by the multi-agent system, by introducing a common information approach. We use the notion of common beliefs and broadcasting to solve an equivalent centralized POMDP problem. Then, we propose the Distributed Option Critic (DOC) algorithm, which uses centralized option evaluation and decentralized intra-option improvement. We theoretically analyze the asymptotic convergence of DOC and build a new multi-agent environment to demonstrate its validity. Our experiments empirically show that DOC performs competitively against baselines and scales with the number of agents.

2020-05-04

International Joint Conference on Autonomous Agents and Multiagent Systems (published)

doi.org

arxiv.org

Multi-Task Self-Supervised Learning for Robust Speech Recognition

Mirco Ravanelli

Jianyuan Zhong

Santiago Pascual

Pawel Swietojanski

Joao Monteiro

Jan Trmal

Yoshua Bengio

Despite the growing interest in unsupervised learning, extracting meaningful knowledge from unlabelled audio remains an open challenge. To take a step in this direction, we recently proposed a problem-agnostic speech encoder (PASE) that combines a convolutional encoder followed by multiple neural networks, called workers, tasked to solve self-supervised problems (i.e., ones that do not require manual annotations as ground truth). PASE was shown to capture relevant speech information, including speaker voice-print and phonemes. This paper proposes PASE+, an improved version of PASE for robust speech recognition in noisy and reverberant environments. To this end, we employ an online speech distortion module that contaminates the input signals with a variety of random disturbances. We then propose a revised encoder that better learns short- and long-term speech dynamics with an efficient combination of recurrent and convolutional networks. Finally, we refine the set of workers used in self-supervision to encourage better cooperation. Results on TIMIT, DIRHA and CHiME-5 show that PASE+ significantly outperforms both the previous version of PASE and common acoustic features. Interestingly, PASE+ learns transferable representations suitable for highly mismatched acoustic conditions.

2020-05-03

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (published)

doi.org

arxiv.org

Suitable e-Health Solutions for Older Adults with Dementia or Mild Cognitive Impairment: Perceptions of Health and Social Care Providers in Quebec City

Marie-Pierre Gagnon

Mame Ndiaye

Mylène Boucher

Samantha Dequanter

Ronald Buyl

Ellen Gorus

Anne Bourbonnais

Anik Giguère

S. A. Rahimi

e-Health solutions offer the potential to improve the quality of life and safety of older adults with dementia or mild cognitive impairment (MCI). To make better decisions about using e-Health technologies, health professionals should be aware of and well informed about existing tools. Recent research shows a lack of knowledge about these technologies for older adults with dementia. In Quebec, the current market offering for these technologies is supply-based rather than need-based. This study is part of a larger project and aims to understand the perceptions and needs of health and social care providers regarding e-Health technologies for older adults with dementia or MCI. One focus group was carried out with six health and social care professionals at the St-Sacrement Hospital in Quebec City, Canada. The focus group enquired about the use of Information and Communication Technology (ICT) with older adults with cognitive impairment. Relevant examples of ICTs were presented to assess their knowledge level. The discussion was tape-recorded and transcripts were coded using NVivo software. Results revealed that aside from fall safety technologies, there is a lack of knowledge about other e-Health technologies for this population. Respondents acknowledged the value of ICTs and were willing to recommend some of them. Economic reasons, blind trust in ICTs and lack of confidence in patients’ capacity to use the solutions were the major limitations identified.

2020-05-02

Proceedings of the 6th International Conference on Information and Communication Technologies for Ageing Well and e-Health (published)

doi.org

Bringing proportional recovery into proportion: Bayesian modelling of post-stroke motor impairment

Anna K Bonkhoff

Thomas Hope

Danilo Bzdok

Adrian G Guggisberg

Rachel L Hawe

Sean P Dukelow

Anne K Rehme

Gereon R Fink

Christian Grefkes

Howard Bowman

Accurate predictions of motor impairment after stroke are of cardinal importance for the patient, clinician, and healthcare system. More than 10 years ago, the proportional recovery rule was introduced, promising high-fidelity predictions of recovery following stroke based only on the initially lost motor function, at least for a specific fraction of patients. However, emerging evidence suggests that this recovery rule is subject to various confounds and may apply less universally than previously assumed. Here, we systematically revisited stroke outcome predictions by applying strategies to avoid confounds and fitting hierarchical Bayesian models. We jointly analysed 385 post-stroke trajectories from six separate studies, one of the largest overall datasets of upper limb motor recovery. We addressed confounding ceiling effects by introducing a subset approach and ensured correct model estimation through synthetic data simulations. Subsequently, we used model comparisons to assess the underlying nature of recovery within our empirical recovery data. The first model comparison, relying on the conventional fraction of patients called ‘fitters’, pointed to a combination of proportional to lost function and constant recovery. ‘Proportional to lost’ here describes the original notion of proportionality, indicating greater recovery in case of a more severe initial impairment. This combination explained only 32% of the variance in recovery, which is in stark contrast to previous reports of >80%. When instead analysing the complete spectrum of subjects, ‘fitters’ and ‘non-fitters’, a combination of proportional to spared function and constant recovery was favoured, implying a more significant improvement in case of more preserved function. Explained variance was at 53%. Therefore, our quantitative findings suggest that motor recovery post-stroke may exhibit some characteristics of proportionality. However, the variance explained was substantially reduced compared to what has previously been reported. This finding motivates future research moving beyond behaviour scores alone to explain stroke recovery and establish robust and discriminating single-subject predictions.

2020-04-30

Brain (published)

doi.org
