Publications

A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix

Thang Doan

Mehdi Bennani

Pierre Alquier

Continual learning (CL) is a setting in which an agent has to learn from an incoming stream of data during its entire lifetime. Although maj… (voir plus)or advances have been made in the field, one recurring problem which remains unsolved is that of Catastrophic Forgetting (CF). While the issue has been extensively studied empirically, little attention has been paid from a theoretical angle. In this paper, we show that the impact of CF increases as two tasks increasingly align. We introduce a measure of task similarity called the NTK overlap matrix which is at the core of CF. We analyze common projected gradient algorithms and demonstrate how they mitigate forgetting. Then, we propose a variant of Orthogonal Gradient Descent (OGD) which leverages structure of the data through Principal Component Analysis (PCA). Experiments support our theoretical findings and show how our method can help reduce CF on classical CL datasets.

2020-12-31

AISTATS (publié)

doi.org

proceedings.mlr.press

Toward Causal Representation Learning

Bernhard Schölkopf

Francesco Locatello

Stefan Bauer

Nan Rosemary Ke

Nal Kalchbrenner

Anirudh Goyal

Yoshua Bengio

The two fields of machine learning and graphical causality arose and are developed separately. However, there is, now, cross-pollination and… (voir plus) increasing interest in both fields to benefit from the advances of the other. In this article, we review fundamental concepts of causal inference and relate them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: we note that most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning, that is, the discovery of high-level causal variables from low-level observations. Finally, we delineate some implications of causality for machine learning and propose key research areas at the intersection of both communities.

2020-12-31

Proceedings of the IEEE (publié)

doi.org

Toward Tweet-Mining Framework for Extracting Terrorist Attack-Related Information and Reporting

Farkhund Iqbal

Rabia Batool

Benjamin C. M. Fung

Saiqa Aleem

Ahmed Abbasi

Abdul Rehman Javed

The widespread popularity of social networking is leading to the adoption of Twitter as an information dissemination tool. Existing research… (voir plus) has shown that information dissemination over Twitter has a much broader reach than traditional media and can be used for effective post-incident measures. People use informal language on Twitter, including acronyms, misspelled words, synonyms, transliteration, and ambiguous terms. This makes incident-related information extraction a non-trivial task. However, this information can be valuable for public safety organizations that need to respond in an emergency. This paper proposes an early event-related information extraction and reporting framework that monitors Twitter streams synthesizes event-specific information, e.g., a terrorist attack, and alerts law enforcement, emergency services, and media outlets. Specifically, the proposed framework, Tweet-to-Act (T2A), employs word embedding to transform tweets into a vector space model and then utilizes the Word Mover’s Distance (WMD) to cluster tweets for the identification of incidents. To extract reliable and valuable information from a large dataset of short and informal tweets, the proposed framework employs sequence labeling with bidirectional Long Short-Term Memory based Recurrent Neural Networks (bLSTM-RNN). Extensive experimental results suggest that our proposed framework, T2A, outperforms other state-of-the-art methods that use vector space modeling and distance calculation techniques, e.g., Euclidean and Cosine distance. T2A achieves an accuracy of 96% and an F1-score of 86.2% on real-life datasets.

2020-12-31

IEEE Access (publié)

doi.org

Towards a Trace-Preserving Tensor Network Representation of Quantum Channels

Siddarth Srinivasan

Sandesh M. Adhikary

Jacob Miller

Bibek Pokharel

Guillaume Rabusseau

Byron Boots

The problem of characterizing quantum channels arises in a number of contexts such as quantum process tomography and quantum error correctio… (voir plus)n. However, direct approaches to parameterizing and optimizing the Choi matrix representation of quantum channels face a curse of dimensionality: the number of parameters scales exponentially in the number of qubits. Recently, Torlai et al. [2020] proposed using locally puriﬁed density operators (LPDOs), a tensor network representation of Choi matrices, to overcome the unfavourable scaling in parameters. While the LPDO structure allows it to satisfy a ‘complete positivity’ (CP) constraint required of physically valid quantum channels, it makes no guarantees about a similarly required ‘trace preservation’ (TP) constraint. In practice, the TP constraint is violated, and the learned quantum channel may even be trace-increasing, which is non-physical. In this work, we present the problem of optimizing over TP LPDOs, discuss two approaches to characterizing the TP constraints on LPDOs, and outline the next steps for developing an optimization scheme.

2020-12-31

(publié)

www.semanticscholar.org

qu an tph ] 10 O ct 2 01 1 Quantum Communication in Rindler Spacetime

Kamil Brádler

P. Hayden

Prakash Panangaden

A state that an inertial observer in Minkowski space perceiv es to be the vacuum will appear to an accelerating observer to be a thermal ba … (voir plus)th of radiation. We study the impact of this Davies-Fulling-Unruh noise on comm unication, particularly quantum communication from an inertial sender to an ac celerating observer and private communication between two inertial observers i n the presence of an accelerating eavesdropper. In both cases, we establish com pact, tractable formulas for the associated communication capacities assuming enco dings that allow a single excitation in one of a fixed number of modes per use of the co mmunications channel. Our contributions include a rigorous presentatio n of the general theory of the private quantum capacity as well as a detailed analysis o f the structure of these channels, including their group-theoretic properties and proof that they are conjugate degradable. Connections between the Unruh channel a d optical amplifiers are also discussed.

2020-12-31

(publié)

www.semanticscholar.org

A Unified Few-Shot Classification Benchmark to Compare Transfer and Meta Learning Approaches

Vincent Dumoulin

Neil Houlsby

Utku Evci

Xiaohua Zhai

Ross Goroshin

Sylvain Gelly

Hugo Larochelle

Meta and transfer learning are two successful families of approaches to few-shot 1 learning. Despite highly related goals, state-of-the-art … (voir plus)advances in each family are 2 measured largely in isolation of each other. As a result of diverging evaluation 3 norms, a direct or thorough comparison of different approaches is challenging. 4 To bridge this gap, we introduce a few-shot classiﬁcation evaluation protocol 5 named VTAB+MD with the explicit goal of facilitating sharing of insights from 6 each community. We demonstrate its accessibility in practice by performing a 7 cross-family study of the best transfer and meta learners which report on both a 8 large-scale meta-learning benchmark (Meta-Dataset, MD), and a transfer learning 9 benchmark (Visual Task Adaptation Benchmark, VTAB). We ﬁnd that, on average, 10 large-scale transfer methods (Big Transfer, BiT) outperform competing approaches 11 on MD, even when trained only on ImageNet. In contrast, meta-learning approaches 12 struggle to compete on VTAB when trained and validated on MD. However, BiT 13 is not without limitations, and pushing for scale does not improve performance 14 on highly out-of-distribution MD tasks. We hope that this work contributes to 15 accelerating progress on few-shot learning research. 16

2020-12-31

NeurIPS Datasets and Benchmarks (publié)

openreview.net

Unifying Likelihood-free Inference with Black-box Sequence Design and Beyond

2020-12-31

arXiv.org (prépublication)

dblp.uni-trier.de

A Universal Representation Transformer Layer for Few-Shot Image Classification

Li Li

William L. Hamilton

Guodong Long

Jing Jiang

Hugo Larochelle

Few-shot classification aims to recognize unseen classes when presented with only a small number of samples. We consider the problem of mult… (voir plus)i-domain few-shot image classification, where unseen classes and examples come from diverse data sources. This problem has seen growing interest and has inspired the development of benchmarks such as Meta-Dataset. A key challenge in this multi-domain setting is to effectively integrate the feature representations from the diverse set of training domains. Here, we propose a Universal Representation Transformer (URT) layer, that meta-learns to leverage universal features for few-shot classification by dynamically re-weighting and composing the most appropriate domain-specific representations. In experiments, we show that URT sets a new state-of-the-art result on Meta-Dataset. Specifically, it achieves top-performance on the highest number of data sources compared to competing methods. We analyze variants of URT and present a visualization of the attention score heatmaps that sheds light on how the model performs cross-domain generalization.

2020-12-31

ICLR (publié)

openreview.net

Using Artificial Intelligence to Visualize the Impacts of Climate Change

Alexandra Luccioni

Victor Schmidt

Vahe Vardanyan

Yoshua Bengio

T. Rhyne

Public awareness and concern about climate change often do not match the magnitude of its threat to humans and our environment. One reason f… (voir plus)or this disagreement is that it is difficult to mentally simulate the effects of a process as complex as climate change and to have a concrete representation of the impact that our individual actions will have on our own future, especially if the consequences are long term and abstract. To overcome these challenges, we propose to use cutting-edge artificial intelligence (AI) approaches to develop an interactive personalized visualization tool, the AI climate impact visualizer. It will allow a user to enter an address—be it their house, their school, or their workplace—-and it will provide them with an AI-imagined possible visualization of the future of this location in 2050 following the detrimental effects of climate change such as floods, storms, and wildfires. This image will be accompanied by accessible information regarding the science behind climate change, i.e., why extreme weather events are becoming more frequent and what kinds of changes are happening on a local and global scale.

2020-12-31

IEEE Computer Graphics and Applications (publié)

doi.org

A Variational Perspective on Diffusion-Based Generative Models and Score Matching

Chin-wei Huang

Jae Hyun Lim

Aaron Courville

Discrete-time diffusion-based generative models and score matching methods have shown promising results in modeling high-dimensional image d… (voir plus)ata. Recently, Song et al. (2021) show that diffusion processes that transform data into noise can be reversed via learning the score function, i.e. the gradient of the log-density of the perturbed data. They propose to plug the learned score function into an inverse formula to define a generative diffusion process. Despite the empirical success, a theoretical underpinning of this procedure is still lacking. In this work, we approach the (continuous-time) generative diffusion directly and derive a variational framework for likelihood estimation, which includes continuous-time normalizing flows as a special case, and can be seen as an infinitely deep variational autoencoder. Under this framework, we show that minimizing the score-matching loss is equivalent to maximizing a lower bound of the likelihood of the plug-in reverse SDE proposed by Song et al. (2021), bridging the theoretical gap.

2020-12-31

Advances in Neural Information Processing Systems 34 (NeurIPS 2021) (publié)

openreview.net

What Makes Machine Reading Comprehension Questions Difﬁcult? Investigating Variation in Passage Sources and Question Types

Susan Bartlett

Grzegorz Kondrak

Max Bartolo

Alastair Roberts

Johannes Welbl

Steven Bird

Ewan Klein

Edward Loper

Samuel R. Bowman

George Dahl. 2021

What

Chao Pang

Junyuan Shang

Jiaxiang Liu

Xuyi Chen

Yanbin Zhao

Yuxiang Lu

Weixin Liu

Zhi-901 hua Wu

Weibao Gong … (voir 21 de plus)

Jianzhong Liang

Zhizhou Shang

Peng Sun

Ouyang Xuan

Dianhai

Houwen Tian

Hua Wu

Haifeng Wang

Adam Trischler

Tong Wang

Xingdi Yuan

Justin Har-908

Alessandro Sordoni

Philip Bachman

Adina Williams

Nikita Nangia

Zhilin Yang

Peng Qi

Saizheng Zhang

Yoshua Bengio

ing. In

For a natural language understanding bench-001 mark to be useful in research, it has to con-002 sist of examples that are diverse and difﬁ… (voir plus)-003 cult enough to discriminate among current and 004 near-future state-of-the-art systems. However, 005 we do not yet know how best to select pas-006 sages to collect a variety of challenging exam-007 ples. In this study, we crowdsource multiple-008 choice reading comprehension questions for 009 passages taken from seven qualitatively dis-010 tinct sources, analyzing what attributes of pas-011 sages contribute to the difﬁculty and question 012 types of the collected examples. To our sur-013 prise, we ﬁnd that passage source, length, and 014 readability measures do not signiﬁcantly affect 015 question difﬁculty. Through our manual anno-016 tation of seven reasoning types, we observe 017 several trends between passage sources and 018 reasoning types, e.g., logical reasoning is more 019 often required in questions written for techni-020 cal passages. These results suggest that when 021 creating a new benchmark dataset, selecting a 022 diverse set of passages can help ensure a di-023 verse range of question types, but that passage 024 difﬁculty need not be a priority. 025

2020-12-31

(publié)

www.semanticscholar.org

Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning

Alexander Wong

to

2020-12-31

arXiv.org (prépublication)

dblp.uni-trier.de