Publications

Leveraging exploration in off-policy algorithms via normalizing flows

Bogdan Mazoure

Thang Doan

Exploration is a crucial component for discovering approximately optimal policies in most high-dimensional reinforcement learning (RL) settings with sparse rewards. Approaches such as neural density models and continuous exploration (e.g., Go-Explore) have been instrumental in recent advances. Soft actor-critic (SAC) is a method for improving exploration that aims to combine off-policy updates while maximizing the policy entropy. We extend SAC to a richer class of probability distributions through normalizing flows, which we show improves performance in exploration, sample complexity, and convergence. Finally, we show that not only the normalizing flow policy outperforms SAC on MuJoCo domains, it is also significantly lighter, using as low as 5.6% of the original network's parameters for similar performance.

2020-05-12

Proceedings of the Conference on Robot Learning (published)

proceedings.mlr.press

arxiv.org

Differential neural circuitry behind autism subtypes with imbalanced social-communicative and restricted repetitive behavior symptoms

Natasha Bertelsen

Isotta Landi

Richard A.I. Bethlehem

Jakob Seidlitz

Elena Maria Busuoli

Veronica Mandelli

Eleonora Satta

Stavros Trakoshis

Bonnie Auyeung

Prantik Kundu

Eva Loth

Guillaume Dumas

Sarah Baumeister

Christian Beckmann

Sven Bölte

Thomas Bourgeron

Tony Charman

Sarah Durston

Christine Ecker

Rosemary Holt

Mark Johnson

Emily J. H. Jones

Luke Mason

Andreas Meyer-Lindenberg

Carolin Moessnang

Marianne Oldehinkel

Antonio Persico

Julian Tillmann

Steven C. R. Williams

Will Spooren

Declan Murphy

Jan K. Buitelaar

Simon Baron-Cohen

Meng-Chuan Lai

Michael V. Lombardo

Social-communication (SC) and restricted repetitive behaviors (RRB) are autism diagnostic symptom domains. SC and RRB severity can markedly differ within and between individuals and may be underpinned by different neural circuitry and genetic mechanisms. Modeling SC-RRB balance could help identify how neural circuitry and genetic mechanisms map onto such phenotypic heterogeneity. Here we developed a phenotypic stratification model that makes highly accurate (97-99%) out-of-sample SC=RRB, SC>RRB, and RRB>SC subtype predictions. Applying this model to resting state fMRI data from the EU-AIMS LEAP dataset (n=509), we find that while the phenotypic subtypes share many commonalities in terms of intrinsic functional connectivity, they also show subtype-specific qualitative differences compared to a typically-developing group (TD). Specifically, the somatomotor network is hypoconnected with perisylvian circuitry in SC>RRB and visual association circuitry in SC=RRB. The SC=RRB subtype also showed hyperconnectivity between medial motor and anterior salience circuitry. Genes that are highly expressed within these subtype-specific networks show a differential enrichment pattern with known ASD associated genes, indicating that such circuits are affected by differing autism-associated genomic mechanisms. These results suggest that SC-RRB imbalance subtypes share some commonalities but also express subtle differences in functional neural circuitry and the genomic underpinnings behind such circuitry.

2020-05-10

bioRxiv (preprint)

doi.org

An Empirical Study of Human Behavioral Agents in Bandits, Contextual Bandits and Reinforcement Learning

Baihan Lin

Guillermo Cecchi

Djallel Bouneffouf

Jenna Reinen

Irina Rish

Artificial behavioral agents are often evaluated based on their consistent behaviors and performance to take sequential actions in an environment to maximize some notion of cumulative reward. However, human decision making in real life usually involves different strategies and behavioral trajectories that lead to the same empirical outcome. Motivated by clinical literature of a wide range of neurological and psychiatric disorders, we propose here a more general and flexible parametric framework for sequential decision making that involves a two-stream reward processing mechanism. We demonstrated that this framework is flexible and unified enough to incorporate a family of problems spanning multi-armed bandits (MAB), contextual bandits (CB) and reinforcement learning (RL), which decompose the sequential decision making process in different levels. Inspired by the known reward processing abnormalities of many mental disorders, our clinically-inspired agents demonstrated interesting behavioral trajectories and comparable performance on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the PacMan game across different reward stationarities in a lifelong learning setting.

2020-05-10

(published)

www.semanticscholar.org

Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL

Baihan Lin

Guillermo Cecchi

Djallel Bouneffouf

Jenna Reinen

Irina Rish

2020-05-10

ArXiv (preprint)

arxiv.org

Desirable features in a decision aid for prenatal screening – what do pregnant women and their partners think? A mixed methods pilot study

Titilayo Tatiana Agbadje

Samira Abbasgholizadeh-Rahimi

Mélissa Côté

Andrée-Anne Tremblay

Mariama Penda Diallo

Hélène Elidor

Alex Poulin Herron

Codjo Djignefa Djade

France Légaré

Background To help pregnant women and their partners make informed value-congruent decisions about Down syndrome prenatal screening, our team developed two successive versions of a decision aid (DAv2017 and DAv2014). We aimed to assess pregnant women and their partners’ perceptions of the usefulness of the two DAs for preparing for decision making, their relative acceptability and their most desirable features. Methods This is a mixed methods pilot study. We recruited participants of study (women and their partners) when consulting for prenatal care in three clinical sites in Quebec City. To be eligible, women had to: (a) be at least 18 years old; (b) be more than 16 weeks pregnant; or having given birth in the previous year and (c) be able to speak and write in French or English. Both women and partners were invited to give their informed consent. We collected quantitative data on the usefulness of the DAs for preparing for decision making and their relative acceptability. We developed an interview grid based on the Technology Acceptance Model and Acceptability questionnaire to explore their perceptions of the most desirable features. We performed descriptive statistics and deductive analysis. Results Overall, 23 couples and 16 individual women participated in the study. The majority of participants were between 25 and 34 years old (79% of women and 59% of partners) and highly educated (66.7% of women and 54% of partners had a university-level education). DAv2017 scored higher for usefulness for preparing for decision making (86.2 ± 13 out of 100 for DAv2017 and 77.7 ± 14 for DAv2014). For most dimensions, DAv2017 was more acceptable than DAv2014 (e.g. the amount of information was found “just right” by 80% of participants for DAv2017 against 56% for DAv2014). However, participants preferred the presentation and the values clarification exercise of DAv2014. In their opinion, neither DA presented information in a completely balanced manner. They suggested adding more information about raising Down syndrome children, replacing frequencies with percentages, different values clarification methods, and a section for the partner. Conclusions A new user-centered version of the prenatal screening DA will integrate participants’ suggestions to reflect end users’ priorities.

2020-05-08

(published)

doi.org

Suitable e-Health Solutions for Older Adults with Dementia or Mild Cognitive Impairment: Perceptions of Health and Social Care Providers in Quebec City

Marie-Pierre Gagnon

Mame Ndiaye

Mylène Boucher

Samantha Dequanter

Ronald Buyl

Ellen Gorus

Anne Bourbonnais

Anik Giguère

Samira Abbasgholizadeh-Rahimi

e-Health solutions offer a potential to improve the quality of life and safety of older adults with dementia or mild cognitive impairment (MCI). In making better decisions for using eHealth technologies, health professionals should be aware and well informed about existing tools. Recent research shows the lack of knowledge on these technologies for older adults with dementia. In Quebec, current market offer for these technologies is supply-based, and not need-based. This study is part of a larger project and aims to understand the perceptions and needs of health and social care providers regarding e-health technologies for older adults with dementia or MCI. One focus group was carried out with six health and social care professionals at the St-Sacrement Hospital in Quebec City, Canada. The focus group enquired about the use of Information and Communication Technology (ICT) with older adults with cognitive impairment. Relevant examples of ICTs were presented to assess their knowledge level. The discussion was tape-recorded and transcripts were coded using the Nvivo software. Results revealed that aside from fall safety technologies, there is a lack of knowledge about other e-Health technologies for this population. Respondents acknowledged the value of ICTs and were willing to recommend some of them. Economic reasons, blind trust on ICTs and lack of confidence in patients’ capacity to use the solutions were the major limitations identified.

2020-05-03

Proceedings of the 6th International Conference on Information and Communication Technologies for Ageing Well and e-Health (published)

doi.org

HipoRank: Incorporating Hierarchical and Positional Information into Graph-based Unsupervised Long Document Extractive Summarization

Yue Dong

Andrei Mircea

Jackie Cheung

We propose a novel graph-based ranking model for unsupervised extractive summarization of long documents. Graph-based ranking models typically represent documents as undirected fully-connected graphs, where a node is a sentence, an edge is weighted based on sentence-pair similarity, and sentence importance is measured via node centrality. Our method leverages positional and hierarchical information grounded in discourse structure to augment a document's graph representation with hierarchy and directionality. Experimental results on PubMed and arXiv datasets show that our approach outperforms strong unsupervised baselines by wide margins and performs comparably to some of the state-of-the-art supervised models that are trained on hundreds of thousands of examples. In addition, we find that our method provides comparable improvements with various distributional sentence representations; including BERT and RoBERTa models fine-tuned on sentence similarity.

2020-05-01

ArXiv (preprint)

arxiv.org

Decentralized Linear Quadratic Systems With Major and Minor Agents and Non-Gaussian Noise

Mohammad Afshari

Aditya Mahajan

A decentralized linear quadratic system with a major agent and a collection of minor agents is considered. The major agent affects the minor agents, but not vice versa. The state of the major agent is observed by all agents. In addition, the minor agents have a noisy observation of their local state. The noise process is not assumed to be Gaussian. The structures of the optimal strategy and the best linear strategy are characterized. It is shown that the major agent's optimal control action is a linear function of the major agent's minimum mean-squared error (MMSE) estimate of the system state while the minor agent's optimal control action is a linear function of the major agent's MMSE estimate of the system state and a “correction term” that depends on the difference of the minor agent's MMSE estimate of its local state and the major agent's MMSE estimate of the minor agent's local state. Since the noise is non-Gaussian, the minor agent's MMSE estimate is a nonlinear function of its observation. It is shown that replacing the minor agent's MMSE estimate with its linear least mean square estimate gives the best linear control strategy. The results are proved using a direct method based on conditional independence, common-information-based splitting of state and control actions, and simplifying the per-step cost based on conditional independence, orthogonality principle, and completion of squares.

2020-04-24

ArXiv (preprint)

doi.org

arxiv.org

ArguLens: Anatomy of Community Opinions On Usability Issues Using Argumentation Models

Wenting Wang

Deeksha M. Arya

Nicole Novielli

Jinghui Cheng

Jin Guo

In open-source software (OSS), the design of usability is often influenced by the discussions among community members on platforms such as issue tracking systems (ITSs). However, digesting the rich information embedded in issue discussions can be a major challenge due to the vast number and diversity of the comments. We propose and evaluate ArguLens, a conceptual framework and automated technique leveraging an argumentation model to support effective understanding and consolidation of community opinions in ITSs. Through content analysis, we anatomized highly discussed usability issues from a large, active OSS project, into their argumentation components and standpoints. We then experimented with supervised machine learning techniques for automated argument extraction. Finally, through a study with experienced ITS users, we show that the information provided by ArguLens supported the digestion of usability-related opinions and facilitated the review of lengthy issues. ArguLens provides the direction of designing valuable tools for high-level reasoning and effective discussion about usability.

2020-04-23

Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (published)

doi.org

arxiv.org

Inference for travel time on transportation networks

Mohamad Elmasri

Aurélie Labbe

Denis Larocque

Laurent Charlin

Travel time is essential for making travel decisions in real-world transportation networks. Understanding its distribution can resolve many fundamental problems in transportation. Empirically, single-edge travel-time is well studied, but how to aggregate such information over many edges to arrive at the distribution of travel time over a route is still daunting. A range of statistical tools have been developed for network analysis; tools to study statistical behaviors of processes on dynamical networks are still lacking. This paper develops a novel statistical perspective to specific type of mixing ergodic processes (travel time), that mimic the behavior of travel time on real-world networks. Under general conditions on the single-edge speed (resistance) distribution, we show that travel time, normalized by distance, follows a Gaussian distribution with universal mean and variance parameters. We propose efficient inference methods for such parameters, and consequently asymptotic universal confidence and prediction intervals of travel time. We further develop path(route)-specific parameters that enable tighter Gaussian-based prediction intervals. We illustrate our methods with a real-world case study using mobile GPS data, where we show that the route-specific and universal intervals both achieve the 95% theoretical coverage levels. Moreover, the route-specific prediction intervals result in tighter bounds that outperform competing models.

2020-04-23

(published)

www.semanticscholar.org

Prediction intervals for travel time on transportation networks

Mohamad Elmasri

Aurélie Labbe

Denis Larocque

Laurent Charlin

Estimating travel-time is essential for making travel decisions in transportation networks. Empirically, single road-segment travel-time is well studied, but how to aggregate such information over many edges to arrive at the distribution of travel time over a route is still theoretically challenging. Understanding travel-time distribution can help resolve many fundamental problems in transportation, quantifying travel uncertainty as an example. We develop a novel statistical perspective to specific types of dynamical processes that mimic the behavior of travel time on real-world networks. We show that, under general conditions, travel-time normalized by distance, follows a Gaussian distribution with route-invariant (universal) location and scale parameters. We develop efficient inference methods for such parameters, with which we propose asymptotic universal confidence and prediction intervals of travel time. We further develop our theory to include road-segment level information to construct route-specific location and scale parameter sequences that produce tighter route-specific Gaussian-based prediction intervals. We illustrate our methods with a real-world case study using precollected mobile GPS data, where we show that the route-specific and route-invariant intervals both achieve the 95% theoretical coverage levels, where the former result in tighter bounds that also outperform competing models.

2020-04-23

(published)

www.semanticscholar.org

Distinct roles of parvalbumin and somatostatin interneurons in gating the synchronization of spike times in the neocortex

Hyun Jae Jang

Hyowon Chung

James M. Rowland

Blake Richards

Michael M Kohl

Jeehyun Kwag

Sensory information–driven spikes are synchronized across cortical layers by distinct subtypes of interneurons. Synchronization of precise spike times across multiple neurons carries information about sensory stimuli. Inhibitory interneurons are suggested to promote this synchronization, but it is unclear whether distinct interneuron subtypes provide different contributions. To test this, we examined single-unit recordings from barrel cortex in vivo and used optogenetics to determine the contribution of parvalbumin (PV)– and somatostatin (SST)–positive interneurons to the synchronization of spike times across cortical layers. We found that PV interneurons preferentially promote the synchronization of spike times when instantaneous firing rates are low (<12 Hz), whereas SST interneurons preferentially promote the synchronization of spike times when instantaneous firing rates are high (>12 Hz). Furthermore, using a computational model, we demonstrate that these effects can be explained by PV and SST interneurons having preferential contributions to feedforward and feedback inhibition, respectively. Our findings demonstrate that distinct subtypes of inhibitory interneurons have frequency-selective roles in the spatiotemporal synchronization of precise spike times.

2020-04-22

Science Advances (published)

doi.org
