Publications

On the interplay between noise and curvature and its effect on optimization and generalization
Valentin Thomas
Fabian Pedregosa
Bart van Merriënboer
Pierre-Antoine Manzagol
The speed at which one can minimize an expected loss using stochastic methods depends on two properties: the curvature of the loss and the variance of the gradients. While most previous works focus on one or the other of these properties, we explore how their interaction affects optimization speed. Further, as the ultimate goal is good generalization performance, we clarify how both curvature and noise are relevant to properly estimate the generalization gap. Realizing that the limitations of some existing works stem from a confusion between these matrices, we also clarify the distinction between the Fisher matrix, the Hessian, and the covariance matrix of the gradients.
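To make the distinction between the three matrices concrete, here is a minimal sketch for logistic regression. It is not taken from the paper: the synthetic data, the evaluation point `w`, and all variable names are assumptions chosen for illustration, and the definitions used are the standard textbook ones (Hessian of the average negative log-likelihood, Fisher as the expected outer product of gradients under labels drawn from the model, and the centered covariance of per-example gradients under the observed labels).

```python
import numpy as np

# Synthetic binary classification data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, d = 500, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

# Evaluate all three matrices at an arbitrary point that is not the optimum.
w = np.array([2.0, 1.0, -1.0])
p = 1.0 / (1.0 + np.exp(-X @ w))  # model probabilities p(y=1 | x)

# Per-example gradients of the negative log-likelihood, using observed labels.
grads = (p - y)[:, None] * X
g_mean = grads.mean(axis=0)

# Covariance of the gradients under the data distribution (observed labels).
C = (grads - g_mean).T @ (grads - g_mean) / n

# Hessian of the average negative log-likelihood.
H = (X * (p * (1.0 - p))[:, None]).T @ X / n

# Fisher matrix: expected gradient outer product with labels drawn from the model.
# Label 1 (probability p) gives gradient (p - 1) x; label 0 gives gradient p x.
fisher_weights = p * (p - 1.0) ** 2 + (1.0 - p) * p ** 2
F = (X * fisher_weights[:, None]).T @ X / n

print("Hessian equals Fisher:", np.allclose(H, F))
print("Gradient covariance equals Fisher:", np.allclose(C, F))
```

In this particular model the Hessian and the Fisher matrix coincide, a known property of logistic regression, while the gradient covariance computed from observed labels generally differs from both when the model does not match the data.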
On the Systematicity of Probing Contextualized Word Representations: The Case of Hypernymy in BERT.
Abhilasha Ravichander
Eduard Hovy
Kaheer Suleman
Adam Trischler
A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Differentiable Games.
We consider differentiable games where the goal is to find a Nash equilibrium. The machine learning community has recently started using variants of the gradient method (GD). Prime examples are extragradient (EG), the optimistic gradient method (OG) and consensus optimization (CO), which enjoy linear convergence in cases like bilinear games, where the standard GD fails. The full benefits of these relatively new methods are not known as there is no unified analysis for both strongly monotone and bilinear games. We provide new analyses of EG's local and global convergence properties and use them to get a tighter global convergence rate for OG and CO. Our analysis covers the whole range of settings between bilinear and strongly monotone games. It reveals that these methods converge via different mechanisms at these extremes; in between, it exploits the most favorable mechanism for the given problem. We then prove that EG achieves the optimal rate for a wide class of algorithms with any number of extrapolations. Our tight analysis of EG's convergence rate in games shows that, unlike in convex minimization, EG may be much faster than GD.
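The contrast between GD and EG on bilinear games can be seen on the simplest possible case. The sketch below is an illustration under assumed settings, not code from the paper: it uses the scalar game min over x, max over y of f(x, y) = x * y, whose Nash equilibrium is (0, 0), with a hypothetical step size of 0.1 and starting point (1, 1).

```python
import math

def gd_step(x, y, lr):
    # Simultaneous gradient descent (on x) and ascent (on y) for f(x, y) = x * y.
    return x - lr * y, y + lr * x

def eg_step(x, y, lr):
    # Extragradient: first extrapolate to a look-ahead point...
    x_half, y_half = x - lr * y, y + lr * x
    # ...then update the original point with the gradient taken at the look-ahead point.
    return x - lr * y_half, y + lr * x_half

def distance_to_equilibrium(step, iters=200, lr=0.1):
    # Run the method from (1, 1) and return the distance to the equilibrium (0, 0).
    x, y = 1.0, 1.0
    for _ in range(iters):
        x, y = step(x, y, lr)
    return math.hypot(x, y)

gd_dist = distance_to_equilibrium(gd_step)
eg_dist = distance_to_equilibrium(eg_step)
print(f"GD distance after 200 steps: {gd_dist:.3f}")
print(f"EG distance after 200 steps: {eg_dist:.3f}")
```

On this game each GD step multiplies the squared distance to the equilibrium by 1 + lr^2, so the iterates spiral outward, while each EG step multiplies it by 1 - lr^2 + lr^4, which is below 1 for step sizes between 0 and 1, so the iterates contract at a linear rate.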
Differential functional neural circuitry behind autism subtypes with marked imbalance between social-communicative and restricted repetitive behavior symptom domains
Natasha Bertelsen
Isotta Landi
Richard A.I. Bethlehem
Jakob Seidlitz
Elena Maria Busuoli
Veronica Mandelli
Eleonora Satta
Stavros Trakoshis
Bonnie Auyeung
Prantik Kundu
Eva Loth
Sarah Baumeister
Christian Beckmann
Sven Bölte
Thomas Bourgeron
Tony Charman
Sarah Durston
Christine Ecker
Rosemary Holt
Mark Johnson
Emily J. H. Jones
Luke Mason
Andreas Meyer-Lindenberg
Carolin Moessnang
Marianne Oldehinkel
Antonio Persico
Julian Tillmann
Steven C. R. Williams
Will Spooren
Declan Murphy
Katherine Jan
Buitelaar
Simon Baron-Cohen
Meng-Chuan Lai
Michael V. Lombardo
Social-communication (SC) and restricted repetitive behaviors (RRB) are autism diagnostic symptom domains. SC and RRB severity can markedly differ within and between individuals and is underpinned by different neural circuitry and genetic mechanisms. Modeling SC-RRB balance could help identify how neural circuitry and genetic mechanisms map onto such phenotypic heterogeneity. Here we developed a phenotypic stratification model that makes highly accurate (96-98%) out-of-sample SC=RRB, SC>RRB, and RRB>SC subtype predictions. Applying this model to resting state fMRI data from the EU-AIMS LEAP dataset (n=509), we find replicable somatomotor-perisylvian hypoconnectivity in the SC>RRB subtype versus a typically-developing (TD) comparison group. In contrast, replicable motor-anterior salience hyperconnectivity is apparent in the SC=RRB subtype versus TD. Autism-associated genes affecting astrocytes, excitatory, and inhibitory neurons are highly expressed specifically within SC>RRB hypoconnected networks, but not SC=RRB hyperconnected networks. SC-RRB balance subtypes may indicate different paths individuals take from genome, neural circuitry, to the clinical phenotype.
… (CIMH). Procedures were undertaken to optimize the MRI sequences for the best scanner-specific options, and phantoms and travelling heads were employed to assure standardization and quality assurance of the multisite image acquisition [20]. Structural images were obtained using a 5.5 minute MPRAGE sequence (TR=2300ms, TE=2.93ms, T1=900ms, voxel size=1.1x1.1x1.2mm, flip angle=9°, matrix size=256x256, FOV=270mm, 176 slices). An eight-to-ten minute resting-state fMRI (rsfMRI) scan was acquired using a multi-echo planar imaging (ME-EPI) sequence [65,66]; TR=2300ms, TE~12ms, 31ms, and 48ms (slight variations are present across centers), flip angle=80°, matrix size=64x64, … (UMCU), 215 (KCL, CIMH), 265 (RUMC, UCAM).
… were to relax, with eyes open, and fixate on a cross presented on the screen for the duration of the rsfMRI scan.
Towards Queryable and Traceable Domain Models
Rijul Saini
Gunter Mussbacher
Jörg Kienzle
Model-Driven Software Engineering encompasses various modelling formalisms for supporting software development. One such formalism is domain modelling which bridges the gap between requirements expressed in natural language and analyzable and more concise domain models expressed in class diagrams. Due to the lack of modelling skills among novice modellers and time constraints in industrial projects, it is often not possible to build an accurate domain model manually. To address this challenge, we aim to develop an approach to extract domain models from problem descriptions written in natural language by combining rules based on natural language processing with machine learning. As a first step, we report on an automated and tool-supported approach with an accuracy of extracted domain models higher than existing approaches. In addition, the approach generates trace links for each model element of a domain model. The trace links enable novice modellers to execute queries on the extracted domain models to gain insights into the modelling decisions taken for improving their modelling skills. Furthermore, to evaluate our approach, we propose a novel comparison metric and discuss our experimental design. Finally, we present a research agenda detailing research directions and discuss corresponding challenges.
Towards robust and replicable sex differences in the intrinsic brain function of autism
Dorothea L. Floris
José O. A. Filho
Meng-Chuan Lai
Steve Giavasis
Marianne Oldehinkel
Maarten Mennes
Tony Charman
Julian Tillmann
Christine Ecker
Flavio Dell’Acqua
Tobias Banaschewski
Carolin Moessnang
Simon Baron-Cohen
Sarah Durston
Eva Loth
Declan Murphy
Jan K. Buitelaar
Christian Beckmann
Michael P. Milham
A. Martino
Background: Marked sex differences in autism prevalence accentuate the need to understand the role of biological sex-related factors in autism. Efforts to unravel sex differences in the brain organization of autism have, however, been challenged by the limited availability of female data. Methods: We addressed this gap by using a large sample of males and females with autism and neurotypical (NT) control individuals (ABIDE; Autism: 362 males, 82 females; NT: 409 males, 166 females; 7-18 years). Discovery analyses examined main effects of diagnosis, sex and their interaction across five resting-state fMRI (R-fMRI) metrics (voxel-level Z > 3.1, cluster-level P < 0.01, gaussian random field corrected). Secondary analyses assessed the robustness of the results to different pre-processing approaches and their replicability in two independent samples: the EU-AIMS Longitudinal European Autism Project (LEAP) and the Gender Explorations of Neurogenetics and Development to Advance Autism Research (GENDAAR). Results: Discovery analyses in ABIDE revealed significant main effects across the intrinsic functional connectivity (iFC) of the posterior cingulate cortex, regional homogeneity and voxel-mirrored homotopic connectivity (VMHC) in several cortical regions, largely converging in the default network midline. Sex-by-diagnosis interactions were confined to the dorsolateral occipital cortex, with reduced VMHC in females with autism. All findings were robust to different pre-processing steps. Replicability in independent samples varied by R-fMRI measures and effects with the targeted sex-by-diagnosis interaction being replicated in the larger of the two replication samples, EU-AIMS LEAP. Limitations: Given the lack of a priori harmonization among the discovery and replication datasets available to date, sample-related variation remained and may have affected replicability.
Conclusions: Atypical cross-hemispheric interactions are neurobiologically relevant to autism. They likely result from the combination of sex…
On Variational Learning of Controllable Representations for Text without Supervision
Peng Xu
Yanshuai Cao
The variational autoencoder (VAE) can learn the manifold of natural images on certain datasets, as evidenced by meaningful interpolating or extrapolating in the continuous latent space. However, on discrete data such as text, it is unclear if unsupervised learning can discover a similar latent space that allows controllable manipulation. In this work, we find that sequence VAEs trained on text fail to properly decode when the latent codes are manipulated, because the modified codes often land in holes or vacant regions in the aggregated posterior latent space, where the decoding network fails to generalize. Both as a validation of the explanation and as a fix to the problem, we propose to constrain the posterior mean to a learned probability simplex, and perform manipulation within this simplex. Our proposed method mitigates the latent vacancy problem and achieves the first success in unsupervised learning of controllable representations for text. Empirically, our method outperforms unsupervised baselines and strong supervised approaches on text style transfer, and is capable of performing more flexible fine-grained control over text generation than existing methods.
You could have said that instead: Improving Chatbots with Natural Language Feedback
Makesh Narsimhan Sreedhar
Kun Ni
The ubiquitous nature of dialogue systems and their interaction with users generate an enormous amount of data. Can we improve chatbots using this data? A self-feeding chatbot improves itself by asking for natural language feedback when a user is dissatisfied with its response and uses this feedback as an additional training sample. However, user feedback in most cases contains extraneous sequences hindering its usefulness as a training sample. In this work, we propose a generative adversarial model that converts noisy feedback into a plausible natural response in a conversation. The generator's goal is to convert the feedback into a response that answers the user's previous utterance and to fool the discriminator which distinguishes feedback from natural responses. We show that augmenting original training data with these modified feedback responses improves the original chatbot performance from 69.94% to 75.96% in ranking correct responses on the PERSONACHAT dataset, a large improvement given that the original model is already trained on 131k samples.
Interactive Psychometrics for Autism with the Human Dynamic Clamp: Interpersonal Synchrony from Sensory-motor to Socio-cognitive Domains
Florence Baillin
Aline Lefebvre
Amandine Pedoux
Yann Beauxis
Denis-Alexander Engemann
Anna Maruani
Frederique Amsellem
Thomas Bourgeron
Richard Delorme
Neuropsychiatric mutations delineate functional brain connectivity dimensions contributing to autism and schizophrenia
Clara A. Moreau
Sebastian Urchs
Pierre Orban
Catherine Schramm
Aurélie Labbe
Guillaume Huguet
Elise Douard
Pierre-Olivier Quirion
Amy Lin
Leila Kushan
Stephanie Grot
David Luck
Adrianna Mendrek
Stephane Potvin
Emmanuel Stip
Thomas Bourgeron
Alan C. Evans
Carrie E. Bearden
Sébastien Jacquemont
16p11.2 and 22q11.2 Copy Number Variants (CNVs) confer high risk for Autism Spectrum Disorder (ASD), schizophrenia (SZ), and Attention-Deficit-Hyperactivity-Disorder (ADHD), but their impact on functional connectivity (FC) remains unclear. We analyzed resting-state functional magnetic resonance imaging data from 101 CNV carriers, 755 individuals with idiopathic ASD, SZ, or ADHD and 1,072 controls. We used CNV FC-signatures to identify dimensions contributing to complex idiopathic conditions. CNVs had large mirror effects on FC at the global and regional level. Thalamus, somatomotor, and posterior insula regions played a critical role in dysconnectivity shared across deletions, duplications, idiopathic ASD, SZ but not ADHD. Individuals with higher similarity to deletion FC-signatures exhibited worse cognitive and behavioral symptoms. Deletion similarities identified at the connectivity level could be related to the redundant associations observed genome-wide between gene expression spatial patterns and FC-signatures. Results may explain why many CNVs affect a similar range of neuropsychiatric symptoms.
Approximate information state for partially observed systems
Jayakumar Subramanian
The standard approach for modeling partially observed systems is to model them as partially observable Markov decision processes (POMDPs) and obtain a dynamic program in terms of a belief state. The belief state formulation works well for planning but is not ideal for online reinforcement learning because the belief state depends on the model and, as such, is not observable when the model is unknown. In this paper, we present an alternative notion of an information state for obtaining a dynamic program in partially observed models. In particular, an information state is a sufficient statistic for the current reward which evolves in a controlled Markov manner. We show that such an information state leads to a dynamic programming decomposition. Then we present a notion of an approximate information state and present an approximate dynamic program based on the approximate information state. An approximate information state is defined in terms of properties that can be estimated using sampled trajectories. Therefore, it provides a constructive method for reinforcement learning in partially observed systems. We present one such construction and show that it performs better than the state of the art for three benchmark models.