Publications
On the Systematicity of Probing Contextualized Word Representations: The Case of Hypernymy in BERT.
We consider differentiable games where the goal is to find a Nash equilibrium. The machine learning community has recently started using variants of the gradient method (GD). Prime examples are extragradient (EG), the optimistic gradient method (OG) and consensus optimization (CO), which enjoy linear convergence in cases like bilinear games, where the standard GD fails. The full benefits of these relatively new methods are not known, as there is no unified analysis for both strongly monotone and bilinear games. We provide new analyses of EG's local and global convergence properties and use them to get a tighter global convergence rate for OG and CO. Our analysis covers the whole range of settings between bilinear and strongly monotone games. It reveals that these methods converge via different mechanisms at these extremes; in between, it exploits the most favorable mechanism for the given problem. We then prove that EG achieves the optimal rate for a wide class of algorithms with any number of extrapolations. Our tight analysis of EG's convergence rate in games shows that, unlike in convex minimization, EG may be much faster than GD.
2020-01-01
International Conference on Artificial Intelligence and Statistics (published)
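The contrast between GD and EG on a bilinear game can be sketched on the simplest possible instance. The snippet below is an illustrative toy, not the paper's analysis: it assumes the scalar game f(x, y) = x * y (x minimizes, y maximizes, equilibrium at the origin), and the step size, starting point, and function names are all hypothetical choices.

```python
import math

def gd_step(x, y, lr):
    # Simultaneous gradient descent-ascent on f(x, y) = x * y.
    # df/dx = y (x descends), df/dy = x (y ascends).
    return x - lr * y, y + lr * x

def eg_step(x, y, lr):
    # Extrapolation: take a look-ahead gradient step.
    x_half, y_half = x - lr * y, y + lr * x
    # Update: step from the ORIGINAL point using gradients
    # evaluated at the look-ahead point.
    return x - lr * y_half, y + lr * x_half

def run(step, x=1.0, y=1.0, lr=0.1, iters=200):
    # Returns the final distance to the equilibrium (0, 0).
    for _ in range(iters):
        x, y = step(x, y, lr)
    return math.hypot(x, y)

gd_dist = run(gd_step)
eg_dist = run(eg_step)
print(f"GD distance to equilibrium: {gd_dist:.3f}")
print(f"EG distance to equilibrium: {eg_dist:.3f}")
```

On this toy game each GD step multiplies the squared distance to the origin by 1 + lr², so the iterates spiral outward, while each EG step multiplies it by (1 - lr²)² + lr², which is below 1 for any step size between 0 and 1.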
Differential functional neural circuitry behind autism subtypes with marked imbalance between social-communicative and restricted repetitive behavior symptom domains
Social-communication (SC) and restricted repetitive behaviors (RRB) are autism diagnostic symptom domains. SC and RRB severity can markedly differ within and between individuals and is underpinned by different neural circuitry and genetic mechanisms. Modeling SC-RRB balance could help identify how neural circuitry and genetic mechanisms map onto such phenotypic heterogeneity. Here we developed a phenotypic stratification model that makes highly accurate (96-98%) out-of-sample SC=RRB, SC>RRB, and RRB>SC subtype predictions. Applying this model to resting-state fMRI data from the EU-AIMS LEAP dataset (n=509), we find replicable somatomotor-perisylvian hypoconnectivity in the SC>RRB subtype versus a typically-developing (TD) comparison group. In contrast, replicable motor-anterior salience hyperconnectivity is apparent in the SC=RRB subtype versus TD. Autism-associated genes affecting astrocytes, excitatory, and inhibitory neurons are highly expressed specifically within SC>RRB hypoconnected networks, but not SC=RRB hyperconnected networks. SC-RRB balance subtypes may indicate different paths individuals take from genome, through neural circuitry, to the clinical phenotype.
… (CIMH). Procedures were undertaken to optimize the MRI sequences for the best scanner-specific options, and phantoms and travelling heads were employed to assure standardization and quality assurance of the multisite image acquisition [20]. Structural images were obtained using a 5.5-minute MPRAGE sequence (TR=2300ms, TE=2.93ms, TI=900ms, voxel size=1.1x1.1x1.2mm, flip angle=9°, matrix size=256x256, FOV=270mm, 176 slices). An eight-to-ten-minute resting-state fMRI (rsfMRI) scan was acquired using a multi-echo planar imaging (ME-EPI) sequence [65,66]: TR=2300ms, TE~12ms, 31ms, and 48ms (slight variations are present across centers), flip angle=80°, matrix size=64x64, … (UMCU), 215 (KCL, CIMH), 265 (RUMC, UCAM).
Participants were to relax with eyes open and fixate on a cross presented on the screen for the duration of the rsfMRI scan.
Model-Driven Software Engineering encompasses various modelling formalisms for supporting software development. One such formalism is domain modelling, which bridges the gap between requirements expressed in natural language and analyzable, more concise domain models expressed in class diagrams. Due to the lack of modelling skills among novice modellers and time constraints in industrial projects, it is often not possible to build an accurate domain model manually. To address this challenge, we aim to develop an approach to extract domain models from problem descriptions written in natural language by combining rules based on natural language processing with machine learning. As a first step, we report on an automated and tool-supported approach that extracts domain models more accurately than existing approaches. In addition, the approach generates trace links for each model element of a domain model. The trace links enable novice modellers to execute queries on the extracted domain models to gain insights into the modelling decisions taken, and so to improve their modelling skills. Furthermore, to evaluate our approach, we propose a novel comparison metric and discuss our experimental design. Finally, we present a research agenda detailing research directions and discuss corresponding challenges.
2020-01-01
2020 IEEE 28th International Requirements Engineering Conference (RE) (published)
Background: Marked sex differences in autism prevalence accentuate the need to understand the role of biological sex-related factors in autism. Efforts to unravel sex differences in the brain organization of autism have, however, been challenged by the limited availability of female data. Methods: We addressed this gap by using a large sample of males and females with autism and neurotypical (NT) control individuals (ABIDE; Autism: 362 males, 82 females; NT: 409 males, 166 females; 7-18 years). Discovery analyses examined main effects of diagnosis, sex and their interaction across five resting-state fMRI (R-fMRI) metrics (voxel-level Z > 3.1, cluster-level P < 0.01, Gaussian random field corrected). Secondary analyses assessed the robustness of the results to different pre-processing approaches and their replicability in two independent samples: the EU-AIMS Longitudinal European Autism Project (LEAP) and the Gender Explorations of Neurogenetics and Development to Advance Autism Research (GENDAAR). Results: Discovery analyses in ABIDE revealed significant main effects across the intrinsic functional connectivity (iFC) of the posterior cingulate cortex, regional homogeneity and voxel-mirrored homotopic connectivity (VMHC) in several cortical regions, largely converging in the default network midline. Sex-by-diagnosis interactions were confined to the dorsolateral occipital cortex, with reduced VMHC in females with autism. All findings were robust to different pre-processing steps. Replicability in independent samples varied by R-fMRI measures and effects, with the targeted sex-by-diagnosis interaction being replicated in the larger of the two replication samples, EU-AIMS LEAP. Limitations: Given the lack of a priori harmonization among the discovery and replication datasets available to date, sample-related variation remained and may have affected replicability. Conclusions: Atypical cross-hemispheric interactions are neurobiologically relevant to autism. They likely result from the combination of sex…
The variational autoencoder (VAE) can learn the manifold of natural images on certain datasets, as evidenced by meaningful interpolation or extrapolation in the continuous latent space. However, on discrete data such as text, it is unclear whether unsupervised learning can discover a similar latent space that allows controllable manipulation. In this work, we find that sequence VAEs trained on text fail to properly decode when the latent codes are manipulated, because the modified codes often land in holes or vacant regions in the aggregated posterior latent space, where the decoding network fails to generalize. Both as a validation of the explanation and as a fix to the problem, we propose to constrain the posterior mean to a learned probability simplex, and perform manipulation within this simplex. Our proposed method mitigates the latent vacancy problem and achieves the first success in unsupervised learning of controllable representations for text. Empirically, our method outperforms unsupervised baselines and strong supervised approaches on text style transfer, and is capable of performing more flexible fine-grained control over text generation than existing methods.
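One way to picture "constraining the posterior mean to a probability simplex" is to express the mean as a convex combination of a few basis vectors. The sketch below is a loose illustration under stated assumptions, not the authors' model: the softmax parameterization, the fixed 2-D basis (which would be learned in practice), and the names `simplex_mean` and `shift_toward` are all hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax: non-negative weights that sum to 1.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def simplex_mean(logits, basis):
    # Assumed parameterization: the mean is a convex combination of the
    # basis vectors, so it cannot leave the simplex they span.
    weights = softmax(logits)
    dim = len(basis[0])
    mean = [sum(w * vec[d] for w, vec in zip(weights, basis)) for d in range(dim)]
    return weights, mean

def shift_toward(weights, k, alpha):
    # Manipulation inside the simplex: interpolate the weights toward
    # vertex k. The result is still a valid set of convex weights.
    return [(1 - alpha) * w + (alpha if i == k else 0.0)
            for i, w in enumerate(weights)]

# Hypothetical fixed basis; in a trained model these would be learned.
basis = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
weights, mean = simplex_mean([2.0, 0.5, -1.0], basis)
shifted = shift_toward(weights, k=1, alpha=0.5)
shifted_mean = [sum(w * vec[d] for w, vec in zip(shifted, basis)) for d in range(2)]
print(weights, mean)
print(shifted, shifted_mean)
```

Because every manipulated code is again a convex combination of the same basis vectors, it stays inside the region the decoder was trained on, which is the intuition behind avoiding vacant regions of the latent space.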
The ubiquitous nature of dialogue systems and their interaction with users generate an enormous amount of data. Can we improve chatbots using this data? A self-feeding chatbot improves itself by asking for natural language feedback when a user is dissatisfied with its response and uses this feedback as an additional training sample. However, user feedback in most cases contains extraneous sequences that hinder its usefulness as a training sample. In this work, we propose a generative adversarial model that converts noisy feedback into a plausible natural response in a conversation. The generator's goal is to convert the feedback into a response that answers the user's previous utterance and to fool the discriminator, which distinguishes feedback from natural responses. We show that augmenting the original training data with these modified feedback responses improves the original chatbot's performance from 69.94% to 75.96% in ranking correct responses on the PERSONACHAT dataset, a large improvement given that the original model is already trained on 131k samples.
2020-01-01
Conference on Empirical Methods in Natural Language Processing (published)
16p11.2 and 22q11.2 Copy Number Variants (CNVs) confer high risk for Autism Spectrum Disorder (ASD), schizophrenia (SZ), and Attention-Deficit-Hyperactivity-Disorder (ADHD), but their impact on functional connectivity (FC) remains unclear. We analyzed resting-state functional magnetic resonance imaging data from 101 CNV carriers, 755 individuals with idiopathic ASD, SZ, or ADHD and 1,072 controls. We used CNV FC-signatures to identify dimensions contributing to complex idiopathic conditions. CNVs had large mirror effects on FC at the global and regional level. Thalamus, somatomotor, and posterior insula regions played a critical role in dysconnectivity shared across deletions, duplications, idiopathic ASD, and SZ, but not ADHD. Individuals with higher similarity to deletion FC-signatures exhibited worse cognitive and behavioral symptoms. Deletion similarities identified at the connectivity level could be related to the redundant associations observed genome-wide between gene expression spatial patterns and FC-signatures. Results may explain why many CNVs affect a similar range of neuropsychiatric symptoms.
The standard approach for modeling partially observed systems is to model them as partially observable Markov decision processes (POMDPs) and obtain a dynamic program in terms of a belief state. The belief state formulation works well for planning but is not ideal for online reinforcement learning because the belief state depends on the model and, as such, is not observable when the model is unknown. In this paper, we present an alternative notion of an information state for obtaining a dynamic program in partially observed models. In particular, an information state is a sufficient statistic for the current reward which evolves in a controlled Markov manner. We show that such an information state leads to a dynamic programming decomposition. Then we present a notion of an approximate information state and present an approximate dynamic program based on the approximate information state. Approximate information states are defined in terms of properties that can be estimated using sampled trajectories. Therefore, they provide a constructive method for reinforcement learning in partially observed systems. We present one such construction and show that it performs better than the state of the art for three benchmark models.
2019-12-01
IEEE Conference on Decision and Control (published)
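The remark that the belief state "depends on the model" can be made concrete with the textbook Bayes-filter update for a discrete POMDP. The sketch below is a generic illustration, not code from the paper: the two-state model, the table layout `T[a][s][s2]` and `O[a][s2][o]`, and the function name `belief_update` are assumptions made for the example.

```python
def belief_update(belief, action, obs, T, O):
    """One Bayes-filter step for a discrete POMDP.

    T[a][s][s2] is the probability of moving from s to s2 under action a.
    O[a][s2][o] is the probability of observing o after landing in s2.
    Both tables are needed, which is why the belief cannot be computed
    when the model is unknown.
    """
    n = len(belief)
    # Prediction: push the current belief through the transition model.
    predicted = [sum(belief[s] * T[action][s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correction: weight each state by the likelihood of the observation.
    unnorm = [O[action][s2][obs] * predicted[s2] for s2 in range(n)]
    z = sum(unnorm)
    if z == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / z for u in unnorm]

# Hypothetical two-state, one-action, two-observation model.
T = [[[0.9, 0.1], [0.2, 0.8]]]
O = [[[0.8, 0.2], [0.3, 0.7]]]
new_belief = belief_update([0.5, 0.5], action=0, obs=0, T=T, O=O)
print(new_belief)
```

Every term in the update reads from `T` or `O`, so an agent that only sees sampled trajectories has no way to evaluate it directly; the information-state approach described in the abstract is motivated by exactly this limitation.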
Extending classical probabilistic reasoning using the quantum mechanical view of probability has been of recent interest, particularly in the development of hidden quantum Markov models (HQMMs) to model stochastic processes. However, there has been little progress in characterizing the expressiveness of such models and learning them from data. We tackle these problems by showing that HQMMs are a special subclass of the general class of observable operator models (OOMs) that do not suffer from the negative probability problem by design. We also provide a feasible retraction-based learning algorithm for HQMMs using constrained gradient descent on the Stiefel manifold of model parameters. We demonstrate that this approach is faster and scales to larger models than previous learning algorithms.
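The general shape of a retraction-based step on the Stiefel manifold (matrices with orthonormal columns) can be sketched as a gradient step followed by a map back onto the manifold. The code below is a simplified real-valued toy under stated assumptions: it uses a QR-style retraction via modified Gram-Schmidt, skips the tangent-space projection of the gradient, and uses a made-up gradient. The abstract does not say which retraction the authors use, and the names `qr_retract` and `retraction_step` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def qr_retract(cols):
    # Orthonormalize the columns with modified Gram-Schmidt, which yields
    # the Q factor of a QR decomposition (assumed retraction).
    ortho = []
    for c in cols:
        v = list(c)
        for q in ortho:
            proj = dot(q, v)
            v = [vi - proj * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        ortho.append([vi / norm for vi in v])
    return ortho

def retraction_step(cols, grad_cols, lr):
    # Euclidean gradient step, which generally leaves the manifold...
    moved = [[x - lr * g for x, g in zip(c, gc)]
             for c, gc in zip(cols, grad_cols)]
    # ...followed by a retraction that restores orthonormal columns.
    return qr_retract(moved)

# A 3x2 matrix stored as a list of columns, with a made-up gradient.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
G = [[0.0, 0.3, 0.5], [0.2, 0.0, -0.4]]
X_next = retraction_step(X, G, lr=0.1)
print(X_next)
```

The point of the retraction is that the orthonormality constraint holds exactly after every update, so no penalty term or post-hoc renormalization is needed during training.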