Publications

Enquire One’s Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion
Suyuchen Wang
Ruihui Zhao
Xi Chen
Yefeng Zheng
Taxonomy is a hierarchically structured knowledge graph that plays a crucial role in machine intelligence. The taxonomy expansion task aims to find a position for a new term in an existing taxonomy to capture the emerging knowledge in the world and keep the taxonomy dynamically updated. Previous taxonomy expansion solutions neglect valuable information brought by the hierarchical structure and evaluate the correctness of merely an added edge, which downgrades the problem to node-pair scoring or mini-path classification. In this paper, we propose the Hierarchy Expansion Framework (HEF), which fully exploits the hierarchical structure’s properties to maximize the coherence of the expanded taxonomy. HEF makes use of the taxonomy’s hierarchical structure in multiple aspects: i) HEF utilizes subtrees containing the most relevant nodes as self-supervision data for a complete comparison of parental and sibling relations; ii) HEF adopts a coherence modeling module to evaluate the coherence of a taxonomy’s subtree by integrating hypernymy relation detection and several tree-exclusive features; iii) HEF introduces the Fitting Score for position selection, which explicitly evaluates both path and level selections and takes full advantage of parental relations to interchange information for disambiguation and self-correction. Extensive experiments show that by better exploiting the hierarchical structure and optimizing the taxonomy’s coherence, HEF vastly surpasses the prior state-of-the-art on three benchmark datasets by an average improvement of 46.7% in accuracy and 32.3% in mean reciprocal rank.
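A rough, hypothetical sketch of how a fitting-style score could combine the two kinds of evidence the abstract names (path selection and level selection); `taxonomy`, `path_score`, and `level_score` are illustrative stand-ins, not HEF’s actual components:

```python
# Hypothetical sketch: rate a candidate parent for a query term by how well the
# query fits the candidate's ancestor path and how well it fits at that depth.
def fitting_score(query, candidate, taxonomy, path_score, level_score):
    """Combine path-level and level-level evidence for attaching `query` under `candidate`."""
    ancestors = taxonomy.ancestors(candidate) + [candidate]  # root -> candidate
    # Path selection: every ancestor on the way down should look like a hypernym of the query.
    path = sum(path_score(query, a) for a in ancestors) / len(ancestors)
    # Level selection: the query should stop exactly below `candidate`.
    level = level_score(query, candidate)
    return path * level

# Usage: rank every candidate parent and attach the query under the arg-max.
# best = max(taxonomy.nodes, key=lambda c: fitting_score(q, c, taxonomy, ps, ls))
```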
The Surprising Performance of Simple Baselines for Misinformation Detection
Kellin Pelrine
Jacob Danovitch
As social media becomes increasingly prominent in our day-to-day lives, it is increasingly important to detect informative content and prevent the spread of disinformation and unverified rumours. While many sophisticated and successful models have been proposed in the literature, they are often compared with older NLP baselines such as SVMs, CNNs, and LSTMs. In this paper, we examine the performance of a broad set of modern transformer-based language models and show that with basic fine-tuning, these models are competitive with and can even significantly outperform recently proposed state-of-the-art methods. We present our framework as a baseline for creating and evaluating new methods for misinformation detection. We further study a comprehensive set of benchmark datasets, and discuss potential data leakage and the need for careful design of the experiments and understanding of datasets to account for confounding variables. As an extreme case example, we show that classifying based only on the first three digits of tweet IDs, which contain information on the date, gives state-of-the-art performance on a commonly used benchmark dataset for fake news detection, Twitter16. We provide a simple tool to detect this problem and suggest steps to mitigate it in future datasets.
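A minimal sketch of the leakage probe described above, using scikit-learn and hypothetical variable names: a classifier is trained on nothing but the leading digits of tweet IDs, and held-out accuracy well above chance signals date-based confounding rather than genuine content understanding.

```python
# Hypothetical leakage probe: can labels be predicted from tweet-ID prefixes alone?
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

def id_prefix_probe(tweet_ids, labels, n_digits=3):
    """Train a classifier on only the first `n_digits` of each tweet ID.

    High held-out accuracy indicates temporal leakage: snowflake-style IDs
    encode the posting time, so the prefix acts as a proxy for the date.
    """
    prefixes = [[str(t)[:n_digits]] for t in tweet_ids]
    X_tr, X_te, y_tr, y_te = train_test_split(
        prefixes, labels, test_size=0.3, random_state=0, stratify=labels
    )
    clf = make_pipeline(
        OneHotEncoder(handle_unknown="ignore"),
        LogisticRegression(max_iter=1000),
    )
    clf.fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))

# Usage (illustrative column names): if the probe scores far above the
# majority-class baseline, the dataset leaks date information into the labels.
# leak_acc = id_prefix_probe(df["tweet_id"], df["label"])
```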
Brainhack: Developing a culture of open, inclusive, community-driven neuroscience
Rémi Gau
Stephanie Noble
Katja Heuer
Katherine L. Bottenhorn
Isil P. Bilgin
Yu-Fang Yang
Julia M. Huntenburg
Johanna M.M. Bayer
Richard A.I. Bethlehem
Shawn A. Rhoads
Christoph Vogelbacher
V. Borghesani
Elizabeth Levitis
Hao-Ting Wang
Sofie Van Den Bossche
Xenia Kobeleva
Jon Haitz Legarreta
Samuel Guay
Selim Melvin Atay
Gael Varoquaux
Dorien C. Huijser
Malin S. Sandström
Peer Herholz
Samuel A. Nastase
AmanPreet Badhwar
Simon Schwab
Stefano Moia
Michael Dayan
Yasmine Bassil
Paula P. Brooks
Matteo Mancini
James M. Shine
David O’Connor
Xihe Xie
Davide Poggiali
Patrick Friedrich
Anibal S. Heinsfeld
Lydia Riedl
Roberto Toro
César Caballero-Gaudes
Anders Eklund
Kelly G. Garner
Christopher R. Nolan
Damion V. Demeter
Fernando A. Barrios
Junaid S. Merchant
Elizabeth A. McDevitt
Robert Oostenveld
R. Cameron Craddock
Ariel Rokem
Andrew Doyle
Satrajit S. Ghosh
Aki Nikolaidis
Olivia W. Stanley
Eneko Uruñuela
Nasim Anousheh
Aurina Arnatkeviciute
Guillaume Auzias
Dipankar Bachar
Elise Bannier
Ruggero Basanisi
Arshitha Basavaraj
Marco Bedini
R. Austin Benn
Kathryn Berluti
Steffen Bollmann
Saskia Bollmann
Claire Bradley
Jesse Brown
Augusto Buchweitz
Patrick Callahan
Micaela Y. Chan
Bramsh Q. Chandio
Theresa Cheng
Sidhant Chopra
Ai Wern Chung
Thomas G. Close
Etienne Combrisson
Giorgia Cona
R. Todd Constable
Claire Cury
Kamalaker Dadi
Pablo F. Damasceno
Samir Das
Fabrizio De Vico Fallani
Krista DeStasio
Erin W. Dickie
Lena Dorfschmidt
Eugene P. Duff
Elizabeth DuPre
Sarah Dziura
Nathalia B. Esper
Oscar Esteban
Shreyas Fadnavis
Guillaume Flandin
Jessica E. Flannery
John Flournoy
Stephanie J. Forkel
Alexandre R. Franco
Saampras Ganesan
Siyuan Gao
José C. García Alanis
Eleftherios Garyfallidis
Tristan Glatard
Enrico Glerean
Javier Gonzalez-Castillo
Cassandra D. Gould van Praag
Abigail S. Greene
Geetika Gupta
Catherine Alice Hahn
Yaroslav O. Halchenko
Daniel Handwerker
Thomas S. Hartmann
Valérie Hayot-Sasson
Stephan Heunis
Felix Hoffstaedter
Daniela M. Hohmann
Corey Horien
Horea-Ioan Ioanas
Alexandru Iordan
Chao Jiang
Michael Joseph
Jason Kai
Agâh Karakuzu
David N. Kennedy
Anisha Keshavan
Ali R. Khan
Gregory Kiar
P. Christiaan Klink
Vincent Koppelmans
Serge Koudoro
Angela R. Laird
Georg Langs
Marissa Laws
Roxane Licandro
Sook-Lei Liew
Tomislav Lipic
Krisanne Litinas
Daniel J. Lurie
Désirée Lussier
Christopher R. Madan
Lea-Theresa Mais
Sina Mansour L
J.P. Manzano-Patron
Dimitra Maoutsa
Matheus Marcon
Daniel S. Margulies
Giorgio Marinato
Daniele Marinazzo
Christopher J. Markiewicz
Camille Maumet
Felipe Meneguzzi
David Meunier
Michael P. Milham
Kathryn L. Mills
Davide Momi
Clara A. Moreau
Aysha Motala
Iska Moxon-Emre
Thomas E. Nichols
Dylan M. Nielson
Gustav Nilsonne
Lisa Novello
Caroline O’Brien
Emily Olafson
Lindsay D. Oliver
John A. Onofrey
Edwina R. Orchard
Kendra Oudyk
Patrick J. Park
Mahboobeh Parsapoor
Lorenzo Pasquini
Scott Peltier
Cyril R. Pernet
Rudolph Pienaar
Pedro Pinheiro-Chagas
Jean-Baptiste Poline
Anqi Qiu
Tiago Quendera
Laura C. Rice
Joscelin Rocha-Hidalgo
Saige Rutherford
Mathias Scharinger
Dustin Scheinost
Deena Shariq
Thomas B. Shaw
Viviana Siless
Molly Simmonite
Nikoloz Sirmpilatze
Hayli Spence
Julia Sprenger
Andrija Stajduhar
Martin Szinte
Sylvain Takerkart
Angela Tam
Link Tejavibulya
Michel Thiebaut de Schotten
Ina Thome
Laura Tomaz da Silva
Nicolas Traut
Lucina Q. Uddin
Antonino Vallesi
John W. VanMeter
Nandita Vijayakumar
Matteo Visconti di Oleggio Castello
Jakub Vohryzek
Jakša Vukojević
Kirstie Jane Whitaker
Lucy Whitmore
Steve Wideman
Suzanne T. Witt
Hua Xie
Ting Xu
Chao-Gan Yan
Fang-Cheng Yeh
B.T. Thomas Yeo
Xi-Nian Zuo
Explicitly Modeling Syntax in Language Models with Incremental Parsing and a Dynamic Oracle
Syntax is fundamental to our thinking about language. Failing to capture the structure of input language could lead to generalization problems and over-parametrization. In the present work, we propose a new syntax-aware language model: Syntactic Ordered Memory (SOM). The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model (left-to-right). To train the incremental parser and avoid exposure bias, we also propose a novel dynamic oracle, so that SOM is more robust to wrong parsing decisions. Experiments show that SOM can achieve strong results in language modeling, incremental parsing, and syntactic generalization tests while using fewer parameters than other models.
Imperfect also Deserves Reward: Multi-Level and Sequential Reward Modeling for Better Dialog Management
Zhengxu Hou
Ruihui Zhao
Zijing Ou
Yafei Liu
Xi Chen
Yefeng Zheng
For task-oriented dialog systems, training a Reinforcement Learning (RL) based Dialog Management module suffers from low sample efficiency and slow convergence speed due to the sparse rewards in RL. To solve this problem, many strategies have been proposed to give proper rewards when training RL, but their rewards lack interpretability and cannot accurately estimate the distribution of state-action pairs in real dialogs. In this paper, we propose a multi-level reward modeling approach that factorizes a reward into a three-level hierarchy: domain, act, and slot. Based on inverse adversarial reinforcement learning, our designed reward model can provide more accurate and explainable reward signals for state-action pairs. Extensive evaluations show that our approach can be applied to a wide range of reinforcement learning-based dialog systems and significantly improves both the performance and the speed of convergence.
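A minimal PyTorch sketch of the domain/act/slot factorization the abstract describes; the heads, dimensions, and the simple averaging rule are illustrative assumptions rather than the paper’s exact reward model.

```python
import torch.nn as nn

# Hypothetical three-level reward model: each head scores one level of a
# state-action encoding, and the levels are combined into a single shaped reward.
class MultiLevelReward(nn.Module):
    def __init__(self, state_action_dim, hidden=128):
        super().__init__()
        def head():
            return nn.Sequential(
                nn.Linear(state_action_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
            )
        self.domain_head = head()  # was the right domain addressed?
        self.act_head = head()     # was the right dialog act chosen?
        self.slot_head = head()    # were the right slots filled?

    def forward(self, state_action):
        r_domain = self.domain_head(state_action)
        r_act = self.act_head(state_action)
        r_slot = self.slot_head(state_action)
        # Partially correct actions still earn partial credit, which densifies
        # the otherwise sparse task-success reward.
        return (r_domain + r_act + r_slot) / 3.0
```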
Modeling Event Plausibility with Consistent Conceptual Abstraction
Ian Porada
Kaheer Suleman
Adam Trischler
Understanding by Understanding Not: Modeling Negation in Language Models
Negation is a core construction in natural language. Despite being very successful on many tasks, state-of-the-art pre-trained language models often handle negation incorrectly. To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus. By training BERT with the resulting combined objective we reduce the mean top-1 error rate to 4% on the negated LAMA dataset. We also see some improvements on the negated NLI benchmarks.
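A minimal sketch of combining a standard masked-LM likelihood term with an unlikelihood term on negated sentences, in the spirit of the objective described above; tensor layouts and the weighting factor `alpha` are assumptions.

```python
import torch
import torch.nn.functional as F

# Hypothetical combined objective: keep the usual loss on positive sentences and push
# the model away from the original token in negated contexts via an unlikelihood term.
def combined_negation_loss(pos_logits, pos_targets, neg_logits, neg_targets, alpha=1.0):
    # pos_logits / neg_logits: (batch, seq, vocab); targets: (batch, seq) with -100 ignored.
    likelihood = F.cross_entropy(
        pos_logits.view(-1, pos_logits.size(-1)), pos_targets.view(-1), ignore_index=-100
    )
    probs = F.softmax(neg_logits, dim=-1)
    mask = neg_targets != -100
    p_true = probs.gather(-1, neg_targets.clamp(min=0).unsqueeze(-1)).squeeze(-1)
    # Unlikelihood: minimize -log(1 - p(original token)) wherever the sentence is negated.
    unlikelihood = -(torch.log(1.0 - p_true + 1e-8) * mask).sum() / mask.sum().clamp(min=1)
    return likelihood + alpha * unlikelihood
```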
On the benefits of representation regularization in invariance based domain generalization
Changjian Shui
Boyu Wang
Gotta Go Fast When Generating Data with Score-Based Models
Alexia Jolicoeur-Martineau
Ke Li
Rémi Piché-Taillefer
Tal Kachman
Score-based (denoising diffusion) generative models have recently gained a lot of success in generating realistic and diverse data. These approaches define a forward diffusion process for transforming data to noise and generate data by reversing it (thereby going from noise to data). Unfortunately, current score-based models generate data very slowly due to the sheer number of score network evaluations required by numerical SDE solvers. In this work, we aim to accelerate this process by devising a more efficient SDE solver. Existing approaches rely on the Euler-Maruyama (EM) solver, which uses a fixed step size. We found that naively replacing it with other SDE solvers fares poorly: they either result in low-quality samples or become slower than EM. To get around this issue, we carefully devise an SDE solver with adaptive step sizes tailored to score-based generative models piece by piece. Our solver requires only two score function evaluations, rarely rejects samples, and leads to high-quality samples. Our approach generates data 2 to 10 times faster than EM while achieving better or equal sample quality. For high-resolution images, our method leads to significantly higher quality samples than all other methods tested. Our SDE solver has the benefit of requiring no step size tuning.
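A rough sketch, under assumptions, of what an adaptive-step reverse-SDE step with exactly two score evaluations can look like; this is not the paper’s solver, and `score_fn`, `g`, and the tolerances are placeholders.

```python
import torch

# Hypothetical adaptive step: an Euler-Maruyama proposal and a Heun-like proposal reuse
# the same two score evaluations; their difference serves as a local error estimate
# that drives acceptance and the next step size.
def adaptive_reverse_step(x, t, h, score_fn, g, rtol=1e-2, atol=1e-2):
    z = torch.randn_like(x)
    drift_a = -(g(t) ** 2) * score_fn(x, t)                     # VE-style reverse drift (assumption)
    x_euler = x - drift_a * h + g(t) * (h ** 0.5) * z           # low-order proposal
    drift_b = -(g(t - h) ** 2) * score_fn(x_euler, t - h)
    x_heun = x - 0.5 * (drift_a + drift_b) * h + g(t) * (h ** 0.5) * z  # higher-order proposal
    # Mixed absolute/relative tolerance controls acceptance and the next step size.
    tol = atol + rtol * torch.maximum(x_euler.abs(), x_heun.abs())
    err = float(((x_euler - x_heun) / tol).pow(2).mean().sqrt())
    h_next = h * min(max(0.9 * (err + 1e-8) ** -0.5, 0.2), 2.0)
    if err <= 1.0:                                              # accept: advance time
        return x_heun, t - h, h_next
    return x, t, h_next                                         # reject: retry with smaller step
```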
Noised Consistency Training for Text Summarization
J. Liu
Qianren Mao
Hao Peng
Hongdong Zhu
Jianxin Li
Neural abstractive summarization methods often require large quantities of labeled training data. However, labeling large amounts of summarization data is often prohibitive due to time, financial, and expertise constraints, which has limited the usefulness of summarization systems in practical applications. In this paper, we argue that this limitation can be overcome by a semi-supervised approach: consistency training, which leverages large amounts of unlabeled data to improve the performance of supervised learning over a small corpus. Consistency regularization constrains model predictions to be invariant to small noise applied to the input articles. By adding a noised unlabeled corpus to regularize training, this framework obtains comparable performance without using the full labeled dataset. In particular, we have verified that leveraging large amounts of unlabeled data decently improves the performance of supervised learning over a small labeled dataset.
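A minimal sketch of a consistency-training loss in the spirit described above: supervised cross-entropy on the small labeled set plus a KL term tying predictions on clean and noised unlabeled articles; `model`, `add_noise`, and the batch layout are assumptions.

```python
import torch
import torch.nn.functional as F

# Hypothetical semi-supervised objective combining a supervised term and a consistency term.
def consistency_training_loss(model, labeled_batch, unlabeled_batch, add_noise, lam=1.0):
    # Supervised term: ordinary teacher-forced cross-entropy on labeled article/summary pairs.
    sup_logits = model(labeled_batch["article"], labeled_batch["summary_in"])
    sup_loss = F.cross_entropy(
        sup_logits.view(-1, sup_logits.size(-1)),
        labeled_batch["summary_out"].view(-1),
        ignore_index=-100,
    )
    # Consistency term: predictions on clean vs. noised unlabeled articles should agree.
    with torch.no_grad():
        clean_logits = model(unlabeled_batch["article"], unlabeled_batch["decoder_in"])
    noised_logits = model(add_noise(unlabeled_batch["article"]), unlabeled_batch["decoder_in"])
    consistency = F.kl_div(
        F.log_softmax(noised_logits, dim=-1),
        F.softmax(clean_logits, dim=-1),
        reduction="batchmean",
    )
    return sup_loss + lam * consistency
```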
Learning Brain Dynamics With Coupled Low-Dimensional Nonlinear Oscillators and Deep Recurrent Networks
Germán Abrevaya
Aleksandr Y. Aravkin
Peng Zheng
Jean-Christophe Gagnon-Audet
James Kozloski
Pablo Polosecki
David Cox
Silvina Ponce Dawson
Guillermo Cecchi
Many natural systems, especially biological ones, exhibit complex multivariate nonlinear dynamical behaviors that can be hard to capture by linear autoregressive models. On the other hand, generic nonlinear models such as deep recurrent neural networks often require large amounts of training data, not always available in domains such as brain imaging; also, they often lack interpretability. Domain knowledge about the types of dynamics typically observed in such systems, such as a certain type of dynamical systems models, could complement purely data-driven techniques by providing a good prior. In this work, we consider a class of ordinary differential equation (ODE) models known as van der Pol (VDP) oscillators and evaluate their ability to capture a low-dimensional representation of neural activity measured by different brain imaging modalities, such as calcium imaging (CaI) and fMRI, in different living organisms: larval zebrafish, rat, and human. We develop a novel and efficient approach to the nontrivial problem of parameter estimation for a network of coupled dynamical systems from multivariate data and demonstrate that the resulting VDP models are both accurate and interpretable, as VDP's coupling matrix reveals anatomically meaningful excitatory and inhibitory interactions across different brain subsystems. VDP outperforms linear autoregressive models (VAR) in terms of both the data fit accuracy and the quality of insight provided by the coupling matrices and often tends to generalize better to unseen data when predicting future brain activity, being comparable to, and sometimes better than, recurrent neural networks (LSTMs). Finally, we demonstrate that our (generative) VDP model can also serve as a data-augmentation tool leading to marked improvements in predictive accuracy of recurrent neural networks. Thus, our work contributes to both basic and applied dimensions of neuroimaging: gaining scientific insights and improving brain-based predictive models, an area of potentially high practical importance in clinical diagnosis and neurotechnology.
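A minimal sketch of a network of coupled van der Pol oscillators, the class of ODE models evaluated here, simulated with SciPy; the coupling matrix, nonlinearity parameter, and network size are illustrative, and the paper’s parameter-estimation procedure is not shown.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical forward simulation of coupled VDP oscillators with linear coupling.
def coupled_vdp(t, state, W, mu):
    n = W.shape[0]
    x, v = state[:n], state[n:]                  # positions and velocities of each node
    dx = v
    dv = mu * (1.0 - x ** 2) * v - x + W @ x     # VDP dynamics plus coupling through W
    return np.concatenate([dx, dv])

n = 5
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((n, n))            # excitatory/inhibitory interactions (illustrative)
state0 = rng.standard_normal(2 * n)
sol = solve_ivp(coupled_vdp, (0.0, 50.0), state0, args=(W, 1.0), max_step=0.05)
# sol.y[:n] holds the simulated "activity" traces for each node.
```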
Inferring global-scale temporal latent topics from news reports to predict public health interventions for COVID-19
Zhi Wen
Guido Powell
Imane Chafi
Y. K. Li