Publications

RNN with Particle Flow for Probabilistic Spatio-temporal Forecasting

Soumyasundar Pal

Liheng Ma

Yingxue Zhang

M. Coates

Spatio-temporal forecasting has numerous applications in analyzing wireless, traffic, and financial networks. Many classical statistical mod… (see more)els often fall short in handling the complexity and high non-linearity present in time-series data. Recent advances in deep learning allow for better modelling of spatial and temporal dependencies. While most of these models focus on obtaining accurate point forecasts, they do not characterize the prediction uncertainty. In this work, we consider the time-series data as a random realization from a nonlinear state-space model and target Bayesian inference of the hidden states for probabilistic forecasting. We use particle flow as the tool for approximating the posterior distribution of the states, as it is shown to be highly effective in complex, high-dimensional settings. Thorough experimentation on several real world time-series datasets demonstrates that our approach provides better characterization of uncertainty while maintaining comparable accuracy to the state-of-the art point forecasting methods.

2021-06-10

ArXiv (preprint)

Rapid simultaneous acquisition of macromolecular tissue volume, susceptibility, and relaxometry maps

Fang Frank Yu

Susie Y. Huang

T. Witzel

Ashwin S. Kumar

Congyu Liao

Tanguy Duval

Julien Cohen-Adad

Berkin Bilgic

Purpose A major obstacle to the clinical implementation of quantitative MR is the lengthy acquisition time required to derive multi-contrast… (see more) parametric maps. We sought to reduce the acquisition time for quantitative susceptibility mapping (QSM) and macromolecular tissue volume (MTV) by acquiring both contrasts simultaneously by leveraging their redundancies. The Joint Virtual Coil concept with generalized autocalibrating partially parallel acquisitions (JVC-GRAPPA) was applied to reduce acquisition time further. Methods Three adult volunteers were imaged on a 3T scanner using a multi-echo 3D GRE sequence acquired at three head orientations. MTV, QSM, R2*, T1, and proton density maps were reconstructed. The same sequence (GRAPPA R=4) was performed in subject #1 with a single head orientation for comparison. Fully sampled data was acquired in subject #2, from which retrospective undersampling was performed (R=6 GRAPPA and R=9 JVC-GRAPPA). Prospective undersampling was performed in subject #3 (R=6 GRAPPA and R=9 JVC-GRAPPA) using gradient blips to shift k-space sampling in later echoes. Results Subject #1’s multi-orientation and single-orientation MTV maps were not significantly different based on RMSE. For subject #2, the retrospectively undersampled JVC-GRAPPA and GRAPPA generated similar results as fully sampled data. This approach was validated with the prospectively undersampled images in subject #3. Using QSM, R2*, and MTV, the contributions of myelin and iron content to susceptibility was estimated. Conclusion We have developed a novel strategy to simultaneously acquire data for the reconstruction of five intrinsically co-registered 1-mm isotropic resolution multi-parametric maps, with a scan time of 6 minutes using JVC-GRAPPA.

2021-06-09

bioRxiv (preprint)

Understanding Capacity Saturation in Incremental Learning

Shenyang Huang

Vincent Francois-Lavet

Guillaume Rabusseau

2021-06-08

Canadian Conference on AI (published)

Double-Linear Thompson Sampling for Context-Attentive Bandits

Djallel Bouneffouf

Raphael Feraud

Sohini Upadhyay

Yasaman Khazaeni

In this paper, we analyze and extend an online learning frame-work known as Context-Attentive Bandit, motivated by various practical applica… (see more)tions, from medical diagnosis to dialog systems, where due to observation costs only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent has a freedom to choose which variables to observe. We derive a novel algorithm, called Context-Attentive Thompson Sampling (CATS), which builds upon the Linear Thompson Sampling approach, adapting it to Context-Attentive Bandit setting. We provide a theoretical regret analysis and an extensive empirical evaluation demonstrating advantages of the proposed approach over several baseline methods on a variety of real-life datasets.

2021-06-06

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (published)

Toward Skills Dialog Orchestration with Online Learning

Djallel Bouneffouf

Raphael Feraud

Sohini Upadhyay

Mayank Agarwal

Yasaman Khazaeni

Building multi-domain AI agents is a challenging task and an open problem in the area of AI. Within the domain of dialog, the ability to orc… (see more)hestrate multiple independently trained dialog agents, or skills, to create a unified system is of particular significance. In this work, we study the task of online posterior dialog orchestration, where we define posterior orchestration as the task of selecting a subset of skills which most appropriately answer a user input using features extracted from both the user input and the individual skills. To account for the various costs associated with extracting skill features, we consider online posterior orchestration under a skill execution budget. We formalize this setting as Context Attentive Bandit with Observations (CABO), a variant of context attentive bandits, and evaluate it on proprietary conversational datasets.

2021-06-06

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (published)

Encoder-Decoder Neural Architecture Optimization for Keyword Spotting

Tong Mo

Bang Liu

2021-06-04

ArXiv (preprint)

SAND-mask: An Enhanced Gradient Masking Strategy for the Discovery of Invariances in Domain Generalization

Soroosh Shahtalebi

Jean-Christophe Gagnon-Audet

Touraj Laleh

Mojtaba Faramarzi

Kartik Ahuja

A major bottleneck in the real-world applications of machine learning models is their failure in generalizing to unseen domains whose data d… (see more)istribution is not i.i.d to the training domains. This failure often stems from learning non-generalizable features in the training domains that are spuriously correlated with the label of data. To address this shortcoming, there has been a growing surge of interest in learning good explanations that are hard to vary, which is studied under the notion of Out-of-Distribution (OOD) Generalization. The search for good explanations that are \textit{invariant} across different domains can be seen as finding local (global) minimas in the loss landscape that hold true across all of the training domains. In this paper, we propose a masking strategy, which determines a continuous weight based on the agreement of gradients that flow in each edge of network, in order to control the amount of update received by the edge in each step of optimization. Particularly, our proposed technique referred to as"Smoothed-AND (SAND)-masking", not only validates the agreement in the direction of gradients but also promotes the agreement among their magnitudes to further ensure the discovery of invariances across training domains. SAND-mask is validated over the Domainbed benchmark for domain generalization and significantly improves the state-of-the-art accuracy on the Colored MNIST dataset while providing competitive results on other domain generalization datasets.

2021-06-04

ArXiv (preprint)

Continual Learning in Deep Networks: an Analysis of the Last Layer

Timothee LESORT

Thomas George

We study how different output layers in a deep neural network learn and forget in continual learning settings. The following three factors… (see more) can affect catastrophic forgetting in the output layer: (1) weights modifications, (2) interference, and (3) projection drift. In this paper, our goal is to provide more insights into how changing the output layers may address (1) and (2). Some potential solutions to those issues are proposed and evaluated here in several continual learning scenarios. We show that the best-performing type of the output layer depends on the data distribution drifts and/or the amount of data available. In particular, in some cases where a standard linear layer would fail, it turns out that changing parameterization is sufficient in order to achieve a significantly better performance, whithout introducing a continual-learning algorithm and instead using the standard SGD to train a model. Our analysis and results shed light on the dynamics of the output layer in continual learning scenarios, and suggest a way of selecting the best type of output layer for a given scenario.

2021-06-03

ArXiv (preprint)

openreview.net

Enquire One’s Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion

Suyuchen Wang

Ruihui Zhao

Xi Chen

Yefeng Zheng

Bang Liu

Taxonomy is a hierarchically structured knowledge graph that plays a crucial role in machine intelligence. The taxonomy expansion task aims … (see more)to find a position for a new term in an existing taxonomy to capture the emerging knowledge in the world and keep the taxonomy dynamically updated. Previous taxonomy expansion solutions neglect valuable information brought by the hierarchical structure and evaluate the correctness of merely an added edge, which downgrade the problem to node-pair scoring or mini-path classification. In this paper, we propose the Hierarchy Expansion Framework (HEF), which fully exploits the hierarchical structure’s properties to maximize the coherence of expanded taxonomy. HEF makes use of taxonomy’s hierarchical structure in multiple aspects: i) HEF utilizes subtrees containing most relevant nodes as self-supervision data for a complete comparison of parental and sibling relations; ii) HEF adopts a coherence modeling module to evaluate the coherence of a taxonomy’s subtree by integrating hypernymy relation detection and several tree-exclusive features; iii) HEF introduces the Fitting Score for position selection, which explicitly evaluates both path and level selections and takes full advantage of parental relations to interchange information for disambiguation and self-correction. Extensive experiments show that by better exploiting the hierarchical structure and optimizing taxonomy’s coherence, HEF vastly surpasses the prior state-of-the-art on three benchmark datasets by an average improvement of 46.7% in accuracy and 32.3% in mean reciprocal rank.

2021-06-03

Proceedings of the Web Conference 2021 (published)

The Surprising Performance of Simple Baselines for Misinformation Detection

Kellin Pelrine

Jacob Danovitch

Reihaneh Rabbany

As social media becomes increasingly prominent in our day to day lives, it is increasingly important to detect informative content and preve… (see more)nt the spread of disinformation and unverified rumours. While many sophisticated and successful models have been proposed in the literature, they are often compared with older NLP baselines such as SVMs, CNNs, and LSTMs. In this paper, we examine the performance of a broad set of modern transformer-based language models and show that with basic fine-tuning, these models are competitive with and can even significantly outperform recently proposed state-of-the-art methods. We present our framework as a baseline for creating and evaluating new methods for misinformation detection. We further study a comprehensive set of benchmark datasets, and discuss potential data leakage and the need for careful design of the experiments and understanding of datasets to account for confounding variables. As an extreme case example, we show that classifying only based on the first three digits of tweet ids, which contain information on the date, gives state-of-the-art performance on a commonly used benchmark dataset for fake news detection –Twitter16. We provide a simple tool to detect this problem and suggest steps to mitigate it in future datasets.

2021-06-03

Proceedings of the Web Conference 2021 (published)

Brainhack: Developing a culture of open, inclusive, community-driven neuroscience

Rémi Gau

Stephanie Noble

Katja Heuer

Katherine L. Bottenhorn

Isil P. Bilgin

Yu-Fang Yang

Julia M. Huntenburg

Johanna M.M. Bayer

Richard A.I. Bethlehem

Shawn A. Rhoads

Christoph Vogelbacher

V. Borghesani

Elizabeth Levitis

Hao-Ting Wang

Sofie Van Den Bossche

Xenia Kobeleva

Jon Haitz Legarreta

Samuel Guay

Selim Melvin Atay

Gael Varoquaux … (see 199 more)

Dorien C. Huijser

Malin S. Sandström

Peer Herholz

Samuel A. Nastase

AmanPreet Badhwar

Guillaume Dumas

Simon Schwab

Stefano Moia

Michael Dayan

Yasmine Bassil

Paula P. Brooks

Matteo Mancini

James M. Shine

David O’Connor

Xihe Xie

Davide Poggiali

Patrick Friedrich

Anibal S. Heinsfeld

Lydia Riedl

Roberto Toro

César Caballero-Gaudes

Anders Eklund

Kelly G. Garner

Christopher R. Nolan

Damion V. Demeter

Fernando A. Barrios

Junaid S. Merchant

Elizabeth A. McDevitt

Robert Oostenveld

R. Cameron Craddock

Ariel Rokem

Andrew Doyle

Satrajit S. Ghosh

Aki Nikolaidis

Olivia W. Stanley

Eneko Uruñuela

Nasim Anousheh

Aurina Arnatkeviciute

Guillaume Auzias

Dipankar Bachar

Elise Bannier

Ruggero Basanisi

Arshitha Basavaraj

Marco Bedini

Pierre (Louis) Bellec

R. Austin Benn

Kathryn Berluti

Steffen Bollmann

Saskia Bollmann

Claire Bradley

Jesse Brown

Augusto Buchweitz

Patrick Callahan

Micaela Y. Chan

Bramsh Q. Chandio

Theresa Cheng

Sidhant Chopra

Ai Wern Chung

Thomas G. Close

Etienne Combrisson

Giorgia Cona

R. Todd Constable

Claire Cury

Kamalaker Dadi

Pablo F. Damasceno

Samir Das

Fabrizio De Vico Fallani

Krista DeStasio

Erin W. Dickie

Lena Dorfschmidt

Eugene P. Duff

Elizabeth DuPre

Sarah Dziura

Nathalia B. Esper

Oscar Esteban

Shreyas Fadnavis

Guillaume Flandin

Jessica E. Flannery

John Flournoy

Stephanie J. Forkel

Alexandre R. Franco

Saampras Ganesan

Siyuan Gao

José C. García Alanis

Eleftherios Garyfallidis

Tristan Glatard

Enrico Glerean

Javier Gonzalez-Castillo

Cassandra D. Gould van Praag

Abigail S. Greene

Geetika Gupta

Catherine Alice Hahn

Yaroslav O. Halchenko

Daniel Handwerker

Thomas S. Hartmann

Valérie Hayot-Sasson

Stephan Heunis

Felix Hoffstaedter

Daniela M. Hohmann

Corey Horien

Horea-Ioan Ioanas

Alexandru Iordan

Chao Jiang

Michael Joseph

Jason Kai

Agâh Karakuzu

David N. Kennedy

Anisha Keshavan

Ali R. Khan

Gregory Kiar

P. Christiaan Klink

Vincent Koppelmans

Serge Koudoro

Angela R. Laird

Georg Langs

Marissa Laws

Roxane Licandro

Sook-Lei Liew

Tomislav Lipic

Krisanne Litinas

Daniel J. Lurie

Désirée Lussier

Christopher R. Madan

Lea-Theresa Mais

Sina Mansour L

J.P. Manzano-Patron

Dimitra Maoutsa

Matheus Marcon

Daniel S. Margulies

Giorgio Marinato

Daniele Marinazzo

Christopher J. Markiewicz

Camille Maumet

Felipe Meneguzzi

David Meunier

Michael P. Milham

Kathryn L. Mills

Davide Momi

Clara A. Moreau

Aysha Motala

Iska Moxon-Emre

Thomas E. Nichols

Dylan M. Nielson

Gustav Nilsonne

Lisa Novello

Caroline O’Brien

Emily Olafson

Lindsay D. Oliver

John A. Onofrey

Edwina R. Orchard

Kendra Oudyk

Patrick J. Park

Mahboobeh Parsapoor

Lorenzo Pasquini

Scott Peltier

Cyril R. Pernet

Rudolph Pienaar

Pedro Pinheiro-Chagas

Jean-Baptiste Poline

Anqi Qiu

Tiago Quendera

Laura C. Rice

Joscelin Rocha-Hidalgo

Saige Rutherford

Mathias Scharinger

Dustin Scheinost

Deena Shariq

Thomas B. Shaw

Viviana Siless

Molly Simmonite

Nikoloz Sirmpilatze

Hayli Spence

Julia Sprenger

Andrija Stajduhar

Martin Szinte

Sylvain Takerkart

Angela Tam

Link Tejavibulya

Michel Thiebaut de Schotten

Ina Thome

Laura Tomaz da Silva

Nicolas Traut

Lucina Q. Uddin

Antonino Vallesi

John W. VanMeter

Nandita Vijayakumar

Matteo Visconti di Oleggio Castello

Jakub Vohryzek

Jakša Vukojević

Kirstie Jane Whitaker

Lucy Whitmore

Steve Wideman

Suzanne T. Witt

Hua Xie

Ting Xu

Chao-Gan Yan

Fang-Cheng Yeh

B.T. Thomas Yeo

Xi-Nian Zuo

2021-06-01

Neuron (published)