Publications
Leveraging exploration in off-policy algorithms via normalizing flows
Exploration is a crucial component for discovering approximately optimal policies in most high-dimensional reinforcement learning (RL) settings with sparse rewards. Approaches such as neural density models and continuous exploration (e.g., Go-Explore) have been instrumental in recent advances. Soft actor-critic (SAC) is a method for improving exploration that aims to combine off-policy updates while maximizing the policy entropy. We extend SAC to a richer class of probability distributions through normalizing flows, which we show improves performance in exploration, sample complexity, and convergence. Finally, we show that not only does the normalizing flow policy outperform SAC on MuJoCo domains, it is also significantly lighter, using as little as 5.6% of the original network's parameters for similar performance.
2020-05-12
Proceedings of the Conference on Robot Learning (published)
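As a rough illustration of the change-of-variables idea behind normalizing flow policies, the sketch below pushes a standard Gaussian sample through a single invertible map and tracks the log-density of the result. It is a minimal one-dimensional example under stated assumptions, not the architecture from the paper: the function name `flow_forward`, the parameters `scale` and `shift`, and the choice of an affine map followed by `tanh` are all hypothetical.

```python
import math

def flow_forward(z, scale, shift):
    """Map a base sample z ~ N(0, 1) to a bounded action and its log-density.

    Hypothetical one-layer flow: a = tanh(scale * z + shift).
    Change of variables: log p(a) = log p(z) - log |da/dz|.
    """
    # Log-density of z under the standard normal base distribution.
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    a = math.tanh(scale * z + shift)
    # da/dz = scale * (1 - tanh^2(.)); the small constant guards against log(0)
    # when the action saturates at +/-1.
    log_det = math.log(abs(scale)) + math.log(1.0 - a * a + 1e-12)
    return a, log_base - log_det

action, log_prob = flow_forward(0.3, scale=1.5, shift=0.0)
print(action, log_prob)
```

In an entropy-regularized method such as SAC, a log-density of this kind is what the entropy term is computed from; stacking several invertible layers would simply add one log-determinant term per layer.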
Social-communication (SC) and restricted repetitive behaviors (RRB) are autism diagnostic symptom domains. SC and RRB severity can markedly differ within and between individuals and may be underpinned by different neural circuitry and genetic mechanisms. Modeling SC-RRB balance could help identify how neural circuitry and genetic mechanisms map onto such phenotypic heterogeneity. Here we developed a phenotypic stratification model that makes highly accurate (97-99%) out-of-sample SC=RRB, SC>RRB, and RRB>SC subtype predictions. Applying this model to resting state fMRI data from the EU-AIMS LEAP dataset (n=509), we find that while the phenotypic subtypes share many commonalities in terms of intrinsic functional connectivity, they also show subtype-specific qualitative differences compared to a typically-developing group (TD). Specifically, the somatomotor network is hypoconnected with perisylvian circuitry in SC>RRB and visual association circuitry in SC=RRB. The SC=RRB subtype also showed hyperconnectivity between medial motor and anterior salience circuitry. Genes that are highly expressed within these subtype-specific networks show a differential enrichment pattern with known ASD associated genes, indicating that such circuits are affected by differing autism-associated genomic mechanisms. These results suggest that SC-RRB imbalance subtypes share some commonalities but also express subtle differences in functional neural circuitry and the genomic underpinnings behind such circuitry.
Artificial behavioral agents are often evaluated based on their consistent behaviors and performance in taking sequential actions in an environment to maximize some notion of cumulative reward. However, human decision making in real life usually involves different strategies and behavioral trajectories that lead to the same empirical outcome. Motivated by clinical literature of a wide range of neurological and psychiatric disorders, we propose here a more general and flexible parametric framework for sequential decision making that involves a two-stream reward processing mechanism. We demonstrated that this framework is flexible and unified enough to incorporate a family of problems spanning multi-armed bandits (MAB), contextual bandits (CB) and reinforcement learning (RL), which decompose the sequential decision making process at different levels. Inspired by the known reward processing abnormalities of many mental disorders, our clinically-inspired agents demonstrated interesting behavioral trajectories and comparable performance on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the PacMan game across different reward stationarities in a lifelong learning setting.
Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL
Background
To help pregnant women and their partners make informed value-congruent decisions about Down syndrome prenatal screening, our team developed two successive versions of a decision aid (DAv2017 and DAv2014). We aimed to assess pregnant women and their partners’ perceptions of the usefulness of the two DAs for preparing for decision making, their relative acceptability and their most desirable features.
Methods
This is a mixed methods pilot study. We recruited study participants (women and their partners) when they consulted for prenatal care at three clinical sites in Quebec City. To be eligible, women had to: (a) be at least 18 years old; (b) be more than 16 weeks pregnant or have given birth in the previous year; and (c) be able to speak and write in French or English. Both women and partners were invited to give their informed consent. We collected quantitative data on the usefulness of the DAs for preparing for decision making and their relative acceptability. We developed an interview grid based on the Technology Acceptance Model and Acceptability questionnaire to explore their perceptions of the most desirable features. We performed descriptive statistics and deductive analysis.
Results
Overall, 23 couples and 16 individual women participated in the study. The majority of participants were between 25 and 34 years old (79% of women and 59% of partners) and highly educated (66.7% of women and 54% of partners had a university-level education). DAv2017 scored higher for usefulness for preparing for decision making (86.2 ± 13 out of 100 for DAv2017 and 77.7 ± 14 for DAv2014). For most dimensions, DAv2017 was more acceptable than DAv2014 (e.g. the amount of information was found “just right” by 80% of participants for DAv2017 against 56% for DAv2014). However, participants preferred the presentation and the values clarification exercise of DAv2014. In their opinion, neither DA presented information in a completely balanced manner. They suggested adding more information about raising Down syndrome children, replacing frequencies with percentages, different values clarification methods, and a section for the partner.
Conclusions
A new user-centered version of the prenatal screening DA will integrate participants’ suggestions to reflect end users’ priorities.
Despite the growing interest in unsupervised learning, extracting meaningful knowledge from unlabelled audio remains an open challenge. To take a step in this direction, we recently proposed a problem-agnostic speech encoder (PASE) that combines a convolutional encoder followed by multiple neural networks, called workers, tasked to solve self-supervised problems (i.e., ones that do not require manual annotations as ground truth). PASE was shown to capture relevant speech information, including speaker voice-print and phonemes. This paper proposes PASE+, an improved version of PASE for robust speech recognition in noisy and reverberant environments. To this end, we employ an online speech distortion module that contaminates the input signals with a variety of random disturbances. We then propose a revised encoder that better learns short- and long-term speech dynamics with an efficient combination of recurrent and convolutional networks. Finally, we refine the set of workers used in self-supervision to encourage better cooperation. Results on TIMIT, DIRHA and CHiME-5 show that PASE+ significantly outperforms both the previous version of PASE and common acoustic features. Interestingly, PASE+ learns transferable representations suitable for highly mismatched acoustic conditions.
2020-05-04
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (published)
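One common kind of random disturbance for speech signals is additive noise mixed in at a chosen signal-to-noise ratio. The sketch below shows that single operation in plain Python. It is only an assumed example of what an online distortion step might look like, not the PASE+ distortion module itself; the function name `add_noise_at_snr` and its parameters are hypothetical, and Gaussian noise stands in for whatever noise sources are actually used.

```python
import math
import random

def add_noise_at_snr(signal, snr_db, rng=None):
    """Return signal plus Gaussian noise scaled to the requested SNR in dB.

    Assumes a non-empty, non-silent signal given as a list of floats.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the example is reproducible
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Average power of the clean signal and of the unscaled noise.
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    # SNR_dB = 10 * log10(p_signal / p_noise_scaled), solved for the noise gain.
    target_p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_p_noise / p_noise)
    return [x + gain * n for x, n in zip(signal, noise)]

clean = [math.sin(0.1 * i) for i in range(1000)]
noisy = add_noise_at_snr(clean, snr_db=10.0)
```

Drawing a fresh `snr_db` and a fresh noise realization for every training example is what would make such a step "online": the encoder never sees the same corrupted input twice.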
Suitable e-Health Solutions for Older Adults with Dementia or Mild Cognitive Impairment: Perceptions of Health and Social Care Providers in Quebec City
e-Health solutions offer a potential to improve the quality of life and safety of older adults with dementia or mild cognitive impairment (MCI). To make better decisions about using e-Health technologies, health professionals should be aware of and well informed about existing tools. Recent research shows a lack of knowledge of these technologies for older adults with dementia. In Quebec, the current market offering for these technologies is supply-based, not need-based. This study is part of a larger project and aims to understand the perceptions and needs of health and social care providers regarding e-Health technologies for older adults with dementia or MCI. One focus group was carried out with six health and social care professionals at the St-Sacrement Hospital in Quebec City, Canada. The focus group enquired about the use of Information and Communication Technology (ICT) with older adults with cognitive impairment. Relevant examples of ICTs were presented to assess their knowledge level. The discussion was tape-recorded and transcripts were coded using the Nvivo software. Results revealed that aside from fall safety technologies, there is a lack of knowledge about other e-Health technologies for this population. Respondents acknowledged the value of ICTs and were willing to recommend some of them. Economic reasons, blind trust in ICTs and lack of confidence in patients’ capacity to use the solutions were the major limitations identified.
2020-05-03
Proceedings of the 6th International Conference on Information and Communication Technologies for Ageing Well and e-Health (published)
We propose a novel graph-based ranking model for unsupervised extractive summarization of long documents. Graph-based ranking models typically represent documents as undirected fully-connected graphs, where a node is a sentence, an edge is weighted based on sentence-pair similarity, and sentence importance is measured via node centrality. Our method leverages positional and hierarchical information grounded in discourse structure to augment a document's graph representation with hierarchy and directionality. Experimental results on PubMed and arXiv datasets show that our approach outperforms strong unsupervised baselines by wide margins and performs comparably to some of the state-of-the-art supervised models that are trained on hundreds of thousands of examples. In addition, we find that our method provides comparable improvements with various distributional sentence representations, including BERT and RoBERTa models fine-tuned on sentence similarity.
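The baseline described above (sentences as nodes, similarity-weighted edges, centrality as importance) can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the model from the paper: it uses bag-of-words cosine similarity instead of learned sentence representations, and the `forward_weight` and `backward_weight` parameters are a hypothetical way to add directionality by weighting edges to later and earlier sentences differently.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_sentences(sentences, forward_weight=1.0, backward_weight=1.0):
    """Return sentence indices ordered from most to least central.

    Centrality of sentence i is the weighted sum of its similarities to all
    other sentences. With equal weights the graph is effectively undirected;
    unequal weights make edges to following and preceding sentences count
    differently.
    """
    vectors = [Counter(s.lower().split()) for s in sentences]
    scores = []
    for i, vi in enumerate(vectors):
        score = 0.0
        for j, vj in enumerate(vectors):
            if i == j:
                continue
            weight = forward_weight if j > i else backward_weight
            score += weight * cosine(vi, vj)
        scores.append(score)
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)

doc = ["the cat sat", "the cat sat on the mat", "dogs bark loudly"]
print(rank_sentences(doc))
print(rank_sentences(doc, backward_weight=0.0))
```

An extractive summary would then take the top-ranked indices and emit those sentences in document order.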
Equilibrium Propagation (EP) is a biologically inspired alternative algorithm to backpropagation (BP) for training neural networks. It applies to RNNs fed by a static input x that settle to a steady state, such as Hopfield networks. EP is similar to BP in that in the second phase of training, an error signal propagates backwards in the layers of the network, but contrary to BP, the learning rule of EP is spatially local. Nonetheless, EP suffers from two major limitations. On the one hand, due to its formulation in terms of real-time dynamics, EP entails long simulation times, which limits its applicability to practical tasks. On the other hand, the biological plausibility of EP is limited by the fact that its learning rule is not local in time: the synapse update is performed after the dynamics of the second phase have converged and requires information from the first phase that is no longer available physically. Our work addresses these two issues and aims at widening the spectrum of EP from standard machine learning models to more bio-realistic neural networks. First, we propose a discrete-time formulation of EP which simplifies the equations, speeds up training and extends EP to CNNs. Our CNN model achieves the best performance ever reported on MNIST with EP. Using the same discrete-time formulation, we introduce Continual Equilibrium Propagation (C-EP): the weights of the network are adjusted continually in the second phase of training using local information in space and time. We show that in the limit of slow changes of synaptic strengths and small nudging, C-EP is equivalent to BPTT (Theorem 1). We numerically demonstrate Theorem 1 and C-EP training on MNIST and generalize it to the bio-realistic situation of a neural network with asymmetric connections between neurons.