Publications

A Survey of Data Augmentation Approaches for NLP
Steven Y. Feng
Varun Gangal
Jason Wei
Soroush Vosoughi
Teruko Mitamura
Eduard Hovy
Data augmentation has recently seen increased interest in NLP due to more work in low-resource domains, new tasks, and the popularity of large-scale neural networks that require large amounts of training data. Despite this recent upsurge, this area is still relatively underexplored, perhaps due to the challenges posed by the discrete nature of language data. In this paper, we present a comprehensive and unifying survey of data augmentation for NLP by summarizing the literature in a structured manner. We first introduce and motivate data augmentation for NLP, and then discuss major methodologically representative approaches. Next, we highlight techniques that are used for popular NLP applications and tasks. We conclude by outlining current challenges and directions for future research. Overall, our paper aims to clarify the landscape of existing literature in data augmentation for NLP and motivate additional work in this area. We also present a GitHub repository with a paper list that will be continuously updated at https://github.com/styfeng/DataAug4NLP
A systematic analysis of ICSD-3 diagnostic criteria and proposal for further structured iteration.
Christophe Gauld
Régis Lopez
Pierre A. Geoffroy
Charles Morin
Kelly Guichard
Elodie Giroux
Yves Dauvilliers
Pierre Philip
Jean‐Arthur Micoulaud‐Franchi
Temporal Profiles of Social Attention Are Different Across Development in Autistic and Neurotypical People.
Teresa Del Bianco
Luke Mason
Tony Charman
Julianne Tillman
Eva Loth
Hannah Hayward
F. Shic
Jan K. Buitelaar
Mark Johnson
Emily J. H. Jones
Jumana Ahmad
Sara Ambrosino
Tobias Banaschewski
Simon Baron-Cohen
Sarah Baumeister
Christian Beckmann
Sven Bölte
Thomas Bourgeron
Carsten Bours
M. Brammer
Daniel Brandeis
Claudia Brogna
Yvette de Bruijn
Ineke Cornelissen
Daisy Crawley
Flavio Dell’Acqua
Sarah Durston
Christine Ecker
Jessica Faulkner
Vincent Frouin
Pilar Garcés
David Goyard
Lindsay Ham
Joerg F. Hipp
Rosemary Holt
Meng-Chuan Lai
Xavier Liogier D’ardhuy
Michael V. Lombardo
David J. Lythgoe
René Mandl
Andre Marquand
Maarten Mennes
Andreas Meyer-Lindenberg
Carolin Moessnang
Nico Mueller
Declan Murphy
Beth Oakley
Larry O’Dwyer
Marianne Oldehinkel
Bob Oranje
Gahan Pandina
Antonio Persico
Barbara Ruggeri
Amber N. V. Ruigrok
Jessica Sabet
Roberto Sacco
Antonia San José Cáceres
Emily Simonoff
Will Spooren
Roberto Toro
Heike Tost
Jack Waldman
Steve C. R. Williams
Caroline Wooldridge
Marcel P. Zwiers
Why do sleep disorders belong to mental disorder classifications? A network analysis of the "Sleep-Wake Disorders" section of the DSM-5.
Christophe Gauld
Régis Lopez
Charles Morin
Julien Maquet
Aileen McGonigal
Pierre A. Geoffroy
Eric Fakra
Pierre Philip
Jean‐Arthur Micoulaud‐Franchi
Atlas-Based Quantification of DTI Measures in a Typically Developing Pediatric Spinal Cord
Shiva Shahrampour
Benjamin De Leener
Mahdi Alizadeh
D. Middleton
Laura Krisa
Adam E. Flanders
S. Faro
F. Mohamed
Automated Data-Driven Generation of Personalized Pedagogical Interventions in Intelligent Tutoring Systems
Ekaterina Kochmar
Dung D. Vu
Robert Belfer
Varun Gupta
Iulian V. Serban
Modelling Latent Translations for Cross-Lingual Transfer
Automatic multiclass intramedullary spinal cord tumor segmentation on MRI with deep learning
Charley Gros
Zhizheng Zhuo
Jie Zhang
Yunyun Duan
Yaou Liu
Exploration-Driven Representation Learning in Reinforcement Learning
Mingde Zhao
Marlos C. Machado
Sainbayar Sukhbaatar
Ludovic Denoyer
Alessandro Lazaric
Learning reward-agnostic representations is an emerging paradigm in reinforcement learning. These representations can be leveraged for several purposes ranging from reward shaping to skill discovery. Nevertheless, in order to learn such representations, existing methods often rely on assuming uniform access to the state space. Without such a privilege, the agent’s coverage of the environment can be limited, which hurts the quality of the learned representations. In this work, we introduce a method that explicitly couples representation learning with exploration when the agent is not provided with a uniform prior over the state space. Our method learns representations that constantly drive exploration while the data generated by the agent’s exploratory behavior drives the learning of better representations. We empirically validate our approach in goal-achieving tasks, demonstrating that the learned representation captures the dynamics of the environment, leads to more accurate value estimation, and to faster credit assignment, both when used for control and for reward shaping. Finally, the exploratory policy that emerges from our approach proves to be successful at continuous navigation tasks with sparse rewards.
Diffusion magnetic resonance imaging reveals tract‐specific microstructural correlates of electrophysiological impairments in non‐myelopathic and myelopathic spinal cord compression
René Labounek
Tomáš Horák
Magda Horáková
Petr Bednařík
Miloš Keřkovský
Jan Kočica
Tomáš Rohan
Christophe Lenglet
Julien Cohen‐Adad
Petr Hluštík
Eva Vlčková
Zdeněk Kadaňka
Josef Bednařík
Alena Svátková
Non‐myelopathic degenerative cervical spinal cord compression (NMDC) frequently occurs throughout aging and may progress to potentially irreversible degenerative cervical myelopathy (DCM). Whereas standard clinical magnetic resonance imaging (MRI) and electrophysiological measures assess compression severity and neurological dysfunction, respectively, underlying microstructural deficits still have to be established in NMDC and DCM patients. The study aims to establish tract‐specific diffusion MRI markers of electrophysiological deficits to predict the progression of asymptomatic NMDC to symptomatic DCM.
Combating False Negatives in Adversarial Imitation Learning
Léonard Boussioux
David Y. T. Hui
Maxime Chevalier-Boisvert
In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior. However, as the trained policy learns to be more successful, the negative examples (the ones produced by the agent) become increasingly similar to expert ones. Although the task is successfully accomplished in some of the agent's trajectories, the discriminator is trained to output low values for them. We hypothesize that this inconsistent training signal for the discriminator can impede its learning, and consequently lead to worse overall performance of the agent. We show experimental evidence for this hypothesis and that the ‘False Negatives’ (i.e. successful agent episodes) significantly hinder adversarial imitation learning, which is the first contribution of this paper. Then, we propose a method to alleviate the impact of false negatives and test it on the BabyAI environment. This method consistently improves sample efficiency over the baselines by at least an order of magnitude.
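The abstract does not say how the impact of false negatives is reduced. As a rough illustration of the problem only, the sketch below assumes one simple possibility: successful agent episodes are dropped from the discriminator's negative set. The `Episode` type, its `success` flag, and `discriminator_negatives` are hypothetical names introduced here, not part of the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    """Hypothetical container for one agent rollout."""
    observations: List[int]
    success: bool  # assumed to be reported by the environment


def discriminator_negatives(agent_episodes: List[Episode]) -> List[Episode]:
    """Keep only failed agent episodes as negatives.

    Successful agent episodes are the 'false negatives': labelling them
    as negatives would tell the discriminator that desired behaviour
    should score low. Dropping them is an assumption made for this
    sketch, not necessarily the method proposed in the paper.
    """
    return [ep for ep in agent_episodes if not ep.success]


episodes = [
    Episode(observations=[0, 1, 2], success=False),
    Episode(observations=[0, 3, 4], success=True),
    Episode(observations=[0, 1, 5], success=False),
]
negatives = discriminator_negatives(episodes)
```

Here only the two failed rollouts would be passed to the discriminator as negative examples.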
VirtualGAN: Reducing Mode Collapse in Generative Adversarial Networks Using Virtual Mapping
Adel Abusitta
Omar Abdel Wahab
Benjamin C. M. Fung
This paper introduces a new framework for reducing mode collapse in generative adversarial networks (GANs). The problem occurs when the generator learns to map several different input values (z) to the same output value, which makes the generator fail to capture all modes of the true data distribution. As a result, the diversity of synthetically produced data is lower than that of the real data. To address this problem, we propose a new and simple framework for training GANs based on the concept of virtual mapping. Our framework integrates two processes into GANs: merge and split. The merge process merges multiple data points (samples) into one before training the discriminator. In this way, the generator would be trained to capture the merged-data distribution rather than the (unmerged) data distribution. After the training, the split process is applied to the generator's output in order to split its contents and produce diverse modes. The proposed framework increases the chance of capturing diverse modes through enabling an indirect or virtual mapping between an input z value and multiple data points. This, in turn, enhances the chance of generating more diverse modes. Our results show the effectiveness of our framework compared to the existing approaches in terms of reducing the mode collapse problem.
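The merge and split steps can be pictured with a small sketch. The abstract does not specify how samples are merged, so the version below assumes the simplest choice, concatenating k flat feature vectors into one and slicing them apart again. The function names `merge` and `split` and the parameter `k` are illustrative, not taken from the paper.

```python
from typing import List, Sequence


def merge(samples: Sequence[Sequence[float]], k: int) -> List[List[float]]:
    """Concatenate every k consecutive samples into one merged sample.

    Assumption: merging is plain concatenation of flat feature vectors.
    """
    if k <= 0 or len(samples) % k != 0:
        raise ValueError("number of samples must be a positive multiple of k")
    merged = []
    for start in range(0, len(samples), k):
        group = samples[start:start + k]
        merged.append([value for sample in group for value in sample])
    return merged


def split(merged_sample: Sequence[float], k: int) -> List[List[float]]:
    """Slice one merged sample back into k equally sized data points."""
    if k <= 0 or len(merged_sample) % k != 0:
        raise ValueError("merged sample length must be a positive multiple of k")
    size = len(merged_sample) // k
    return [list(merged_sample[i * size:(i + 1) * size]) for i in range(k)]


real_samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
merged_batch = merge(real_samples, k=2)   # what the discriminator would see
recovered = split(merged_batch[0], k=2)   # applied to a generator output
```

Under this assumption a single input z yields one merged output, which the split step turns into k data points, giving the indirect mapping from one z to several samples that the abstract describes.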