Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
Jenna Reinen
Desirable features in a decision aid for prenatal screening – what do pregnant women and their partners think? A mixed methods pilot study
Titilayo Tatiana Agbadje
Mélissa Côté
Andrée-Anne Tremblay
Mariama Penda Diallo
Hélène Elidor
Alex Poulin Herron
Codjo Djignefa Djade
France Légaré
Background To help pregnant women and their partners make informed value-congruent decisions about Down syndrome prenatal screening, our team developed two successive versions of a decision aid (DAv2017 and DAv2014). We aimed to assess pregnant women and their partners’ perceptions of the usefulness of the two DAs for preparing for decision making, their relative acceptability and their most desirable features. Methods This is a mixed methods pilot study. We recruited study participants (women and their partners) when they consulted for prenatal care at three clinical sites in Quebec City. To be eligible, women had to: (a) be at least 18 years old; (b) be more than 16 weeks pregnant or have given birth in the previous year; and (c) be able to speak and write in French or English. Both women and partners were invited to give their informed consent. We collected quantitative data on the usefulness of the DAs for preparing for decision making and their relative acceptability. We developed an interview grid based on the Technology Acceptance Model and Acceptability questionnaire to explore their perceptions of the most desirable features. We performed descriptive statistics and deductive analysis. Results Overall, 23 couples and 16 individual women participated in the study. The majority of participants were between 25 and 34 years old (79% of women and 59% of partners) and highly educated (66.7% of women and 54% of partners had a university-level education). DAv2017 scored higher for usefulness for preparing for decision making (86.2 ± 13 out of 100 for DAv2017 and 77.7 ± 14 for DAv2014). For most dimensions, DAv2017 was more acceptable than DAv2014 (e.g. the amount of information was found “just right” by 80% of participants for DAv2017 against 56% for DAv2014). However, participants preferred the presentation and the values clarification exercise of DAv2014. In their opinion, neither DA presented information in a completely balanced manner.
They suggested adding more information about raising Down syndrome children, replacing frequencies with percentages, different values clarification methods, and a section for the partner. Conclusions A new user-centered version of the prenatal screening DA will integrate participants’ suggestions to reflect end users’ priorities.
Multi-Task Self-Supervised Learning for Robust Speech Recognition
Jianyuan Zhong
Santiago Pascual
Pawel Swietojanski
Joao Monteiro
Jan Trmal
Despite the growing interest in unsupervised learning, extracting meaningful knowledge from unlabelled audio remains an open challenge. To take a step in this direction, we recently proposed a problem-agnostic speech encoder (PASE) that combines a convolutional encoder with multiple neural networks, called workers, tasked to solve self-supervised problems (i.e., ones that do not require manual annotations as ground truth). PASE was shown to capture relevant speech information, including speaker voice-print and phonemes. This paper proposes PASE+, an improved version of PASE for robust speech recognition in noisy and reverberant environments. To this end, we employ an online speech distortion module that contaminates the input signals with a variety of random disturbances. We then propose a revised encoder that better learns short- and long-term speech dynamics with an efficient combination of recurrent and convolutional networks. Finally, we refine the set of workers used in self-supervision to encourage better cooperation. Results on TIMIT, DIRHA and CHiME-5 show that PASE+ significantly outperforms both the previous version of PASE and common acoustic features. Interestingly, PASE+ learns transferable representations suitable for highly mismatched acoustic conditions.
Suitable e-Health Solutions for Older Adults with Dementia or Mild Cognitive Impairment: Perceptions of Health and Social Care Providers in Quebec City
Marie-Pierre Gagnon
Mame Ndiaye
Mylène Boucher
Samantha Dequanter
Ronald Buyl
Ellen Gorus
Anne Bourbonnais
Anik Giguère
e-Health solutions have the potential to improve the quality of life and safety of older adults with dementia or mild cognitive impairment (MCI). To make better decisions about using e-Health technologies, health professionals should be aware of and well informed about existing tools. Recent research shows a lack of knowledge about these technologies for older adults with dementia. In Quebec, the current market offer for these technologies is supply-based rather than need-based. This study is part of a larger project and aims to understand the perceptions and needs of health and social care providers regarding e-Health technologies for older adults with dementia or MCI. One focus group was carried out with six health and social care professionals at the St-Sacrement Hospital in Quebec City, Canada. The focus group enquired about the use of Information and Communication Technology (ICT) with older adults with cognitive impairment. Relevant examples of ICTs were presented to assess their knowledge level. The discussion was tape-recorded and transcripts were coded using the Nvivo software. Results revealed that, aside from fall safety technologies, there is a lack of knowledge about other e-Health technologies for this population. Respondents acknowledged the value of ICTs and were willing to recommend some of them. Economic reasons, blind trust in ICTs and lack of confidence in patients’ capacity to use the solutions were the major limitations identified.
HipoRank: Incorporating Hierarchical and Positional Information into Graph-based Unsupervised Long Document Extractive Summarization
Yue Dong
Andrei Mircea
We propose a novel graph-based ranking model for unsupervised extractive summarization of long documents. Graph-based ranking models typically represent documents as undirected fully-connected graphs, where a node is a sentence, an edge is weighted based on sentence-pair similarity, and sentence importance is measured via node centrality. Our method leverages positional and hierarchical information grounded in discourse structure to augment a document's graph representation with hierarchy and directionality. Experimental results on PubMed and arXiv datasets show that our approach outperforms strong unsupervised baselines by wide margins and performs comparably to some of the state-of-the-art supervised models that are trained on hundreds of thousands of examples. In addition, we find that our method provides comparable improvements with various distributional sentence representations, including BERT and RoBERTa models fine-tuned on sentence similarity.
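As a rough illustration of the idea of adding directionality to a sentence graph, the sketch below scores sentences by similarity-weighted centrality and discounts edges that point toward sentences farther from the document boundaries. It is a simplified toy, not the paper's formulation: the `boundary_score` function, the `lam_toward` and `lam_away` weights, and the use of plain cosine similarity are all assumptions made for the example, and the section-level hierarchy is omitted entirely.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def boundary_score(i, n):
    """Hypothetical positional prior: highest at the first and last sentence."""
    return 1.0 / (1 + min(i, n - 1 - i))


def directed_centrality(vectors, lam_toward=1.0, lam_away=0.5):
    """Score each sentence by position-weighted similarity to all others.

    An edge from sentence j contributes fully (lam_toward) to sentence i when
    i is at least as close to a boundary as j, and is discounted (lam_away)
    otherwise. Both weights are illustrative assumptions.
    """
    n = len(vectors)
    pos = [boundary_score(i, n) for i in range(n)]
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i == j:
                continue
            weight = lam_toward if pos[i] >= pos[j] else lam_away
            total += weight * cosine(vectors[i], vectors[j])
        scores.append(total)
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring sentences, ties broken by position."""
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])
```

With identical sentence vectors, the similarity term is constant and the ranking is driven purely by position, which makes the effect of the directional weighting easy to inspect.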
Continual Weight Updates and Convolutional Architectures for Equilibrium Propagation
Maxence Ernoult
Julie Grollier
Damien Querlioz
Benjamin Scellier
Equilibrium Propagation (EP) is a biologically inspired alternative algorithm to backpropagation (BP) for training neural networks. It applies to RNNs fed by a static input x that settle to a steady state, such as Hopfield networks. EP is similar to BP in that in the second phase of training, an error signal propagates backwards in the layers of the network, but contrary to BP, the learning rule of EP is spatially local. Nonetheless, EP suffers from two major limitations. On the one hand, due to its formulation in terms of real-time dynamics, EP entails long simulation times, which limits its applicability to practical tasks. On the other hand, the biological plausibility of EP is limited by the fact that its learning rule is not local in time: the synapse update is performed after the dynamics of the second phase have converged and requires information from the first phase that is no longer available physically. Our work addresses these two issues and aims at widening the spectrum of EP from standard machine learning models to more bio-realistic neural networks. First, we propose a discrete-time formulation of EP which simplifies the equations, speeds up training and extends EP to CNNs. Our CNN model achieves the best performance ever reported on MNIST with EP. Using the same discrete-time formulation, we introduce Continual Equilibrium Propagation (C-EP): the weights of the network are adjusted continually in the second phase of training using local information in space and time. We show that in the limit of slow changes of synaptic strengths and small nudging, C-EP is equivalent to BPTT (Theorem 1). We numerically demonstrate Theorem 1 and C-EP training on MNIST and generalize it to the bio-realistic situation of a neural network with asymmetric connections between neurons.
Decentralized Linear Quadratic Systems With Major and Minor Agents and Non-Gaussian Noise
Mohammad Afshari
A decentralized linear quadratic system with a major agent and a collection of minor agents is considered. The major agent affects the minor agents, but not vice versa. The state of the major agent is observed by all agents. In addition, the minor agents have a noisy observation of their local state. The noise process is not assumed to be Gaussian. The structures of the optimal strategy and the best linear strategy are characterized. It is shown that the major agent's optimal control action is a linear function of the major agent's minimum mean-squared error (MMSE) estimate of the system state while the minor agent's optimal control action is a linear function of the major agent's MMSE estimate of the system state and a “correction term” that depends on the difference of the minor agent's MMSE estimate of its local state and the major agent's MMSE estimate of the minor agent's local state. Since the noise is non-Gaussian, the minor agent's MMSE estimate is a nonlinear function of its observation. It is shown that replacing the minor agent's MMSE estimate with its linear least mean square estimate gives the best linear control strategy. The results are proved using a direct method based on conditional independence, common-information-based splitting of state and control actions, and simplifying the per-step cost based on conditional independence, orthogonality principle, and completion of squares.
ArguLens: Anatomy of Community Opinions On Usability Issues Using Argumentation Models
Wenting Wang
Deeksha M. Arya
Nicole Novielli
Jinghui Cheng
In open-source software (OSS), the design of usability is often influenced by the discussions among community members on platforms such as issue tracking systems (ITSs). However, digesting the rich information embedded in issue discussions can be a major challenge due to the vast number and diversity of the comments. We propose and evaluate ArguLens, a conceptual framework and automated technique leveraging an argumentation model to support effective understanding and consolidation of community opinions in ITSs. Through content analysis, we anatomized highly discussed usability issues from a large, active OSS project into their argumentation components and standpoints. We then experimented with supervised machine learning techniques for automated argument extraction. Finally, through a study with experienced ITS users, we show that the information provided by ArguLens supported the digestion of usability-related opinions and facilitated the review of lengthy issues. ArguLens provides direction for designing valuable tools for high-level reasoning and effective discussion about usability.
Inference for travel time on transportation networks
Mohamad Elmasri
Aurélie Labbe
Denis Larocque
Travel time is essential for making travel decisions in real-world transportation networks. Understanding its distribution can resolve many fundamental problems in transportation. Empirically, single-edge travel time is well studied, but how to aggregate such information over many edges to arrive at the distribution of travel time over a route is still daunting. A range of statistical tools have been developed for network analysis, but tools to study statistical behaviors of processes on dynamical networks are still lacking. This paper develops a novel statistical perspective on a specific type of mixing ergodic process (travel time) that mimics the behavior of travel time on real-world networks. Under general conditions on the single-edge speed (resistance) distribution, we show that travel time, normalized by distance, follows a Gaussian distribution with universal mean and variance parameters. We propose efficient inference methods for such parameters, and consequently asymptotic universal confidence and prediction intervals of travel time. We further develop path(route)-specific parameters that enable tighter Gaussian-based prediction intervals. We illustrate our methods with a real-world case study using mobile GPS data, where we show that the route-specific and universal intervals both achieve the 95% theoretical coverage levels. Moreover, the route-specific prediction intervals result in tighter bounds that outperform competing models.
Prediction intervals for travel time on transportation networks
Mohamad Elmasri
Aurélie Labbe
Denis Larocque
Estimating travel time is essential for making travel decisions in transportation networks. Empirically, single road-segment travel time is well studied, but how to aggregate such information over many edges to arrive at the distribution of travel time over a route is still theoretically challenging. Understanding the travel-time distribution can help resolve many fundamental problems in transportation, such as quantifying travel uncertainty. We develop a novel statistical perspective on specific types of dynamical processes that mimic the behavior of travel time on real-world networks. We show that, under general conditions, travel time normalized by distance follows a Gaussian distribution with route-invariant (universal) location and scale parameters. We develop efficient inference methods for such parameters, with which we propose asymptotic universal confidence and prediction intervals of travel time. We further develop our theory to include road-segment level information to construct route-specific location and scale parameter sequences that produce tighter route-specific Gaussian-based prediction intervals. We illustrate our methods with a real-world case study using precollected mobile GPS data, where we show that the route-specific and route-invariant intervals both achieve the 95% theoretical coverage levels, and that the former result in tighter bounds that also outperform competing models.
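To make the notion of a Gaussian-based prediction interval concrete, here is a minimal sketch. It assumes, purely for illustration, that travel time per unit distance on a route is normally distributed with known location `mu` and scale `sigma`; these names, the units, and the simple scaling by distance are assumptions of the example and do not reproduce the estimators or normalization developed in the paper.

```python
from statistics import NormalDist


def prediction_interval(distance_km, mu, sigma, level=0.95):
    """Two-sided Gaussian prediction interval for total travel time.

    Assumes (hypothetically) that travel time divided by distance follows
    N(mu, sigma^2), with mu and sigma in minutes per kilometre.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    # Standard normal quantile for the requested two-sided coverage.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    center = distance_km * mu
    half_width = z * sigma * distance_km
    return center - half_width, center + half_width


if __name__ == "__main__":
    low, high = prediction_interval(distance_km=10.0, mu=2.0, sigma=0.3)
    print(f"95% interval: {low:.2f} to {high:.2f} minutes")
```

Under this assumption, a smaller scale parameter directly yields a narrower interval, which is the sense in which route-specific parameters can tighten the bounds relative to route-invariant ones.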
Distinct roles of parvalbumin and somatostatin interneurons in gating the synchronization of spike times in the neocortex
Hyun Jae Jang
Hyowon Chung
James M. Rowland
Michael M Kohl
Jeehyun Kwag
Sensory information–driven spikes are synchronized across cortical layers by distinct subtypes of interneurons. Synchronization of precise spike times across multiple neurons carries information about sensory stimuli. Inhibitory interneurons are suggested to promote this synchronization, but it is unclear whether distinct interneuron subtypes provide different contributions. To test this, we examined single-unit recordings from barrel cortex in vivo and used optogenetics to determine the contribution of parvalbumin (PV)– and somatostatin (SST)–positive interneurons to the synchronization of spike times across cortical layers. We found that PV interneurons preferentially promote the synchronization of spike times when instantaneous firing rates are low (<12 Hz), whereas SST interneurons preferentially promote the synchronization of spike times when instantaneous firing rates are high (>12 Hz). Furthermore, using a computational model, we demonstrate that these effects can be explained by PV and SST interneurons having preferential contributions to feedforward and feedback inhibition, respectively. Our findings demonstrate that distinct subtypes of inhibitory interneurons have frequency-selective roles in the spatiotemporal synchronization of precise spike times.
To Write Code: The Cultural Fabrication of Programming Notation and Practice