Publications

Predicting Unreliable Predictions by Shattering a Neural Network

Xu Ji

Andrea Vedaldi

Balaji Lakshminarayanan

Piecewise linear neural networks can be split into subfunctions, each with its own activation pattern, domain, and empirical error. Empirical error for the full network can be written as an expectation over empirical error of subfunctions. Constructing a generalization bound on subfunction empirical error indicates that the more densely a subfunction is surrounded by training samples in representation space, the more reliable its predictions are. Further, it suggests that models with fewer activation regions generalize better, and models that abstract knowledge to a greater degree generalize better, all else equal. We propose not only a theoretical framework to reason about subfunction error bounds but also a pragmatic way of approximately evaluating it, which we apply to predicting which samples the network will not successfully generalize to. We test our method on detection of misclassification and out-of-distribution samples, finding that it performs competitively in both cases. In short, some network activation patterns are associated with higher reliability than others, and these can be identified using subfunction error bounds.

2021-01-01

arXiv.org (preprint)

openreview.net

Preferential Temporal Difference Learning

Nishanth Anand

Doina Precup

2021-01-01

ICML (published)

proceedings.mlr.press

arxiv.org

Pretraining Representations for Data-Efficient Reinforcement Learning

Max Schwarzer

Nitarshan Rajkumar

Michael Noukhovitch

Ankesh Anand

Laurent Charlin

(Rex) Devon Hjelm

Philip Bachman

Aaron Courville

Data efficiency is a key challenge for deep reinforcement learning. We address this problem by using unlabeled data to pretrain an encoder which is then finetuned on a small amount of task-specific data. To encourage learning representations which capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience), our approach significantly surpasses prior work combining offline representation pretraining with task-specific finetuning, and compares favourably with other pretraining methods that require orders of magnitude more data. Our approach shows particular promise when combined with larger models as well as more diverse, task-aligned observational data, approaching human-level performance and data-efficiency on Atari in our best setting.

openreview.net

TRAFFICVIS: Fighting Human Trafficking through Visualization

Catalina Vajiac

Andreas Olligschlaeger

Yifei Li

Pratheeksha Nair

Meng-Chieh Lee

Namyong Park

Reihaneh Rabbany

Duen Horng Chau

Christos Faloutsos

Law enforcement can detect human trafficking (HT) in online escort websites by analyzing suspicious clusters of connected ads. Given such clusters, how can we interactively visualize potential evidence for law enforcement and domain experts? We present TRAFFICVIS, which, to our knowledge, is the first interface for cluster-level HT detection and labeling. It builds on state-of-the-art HT clustering algorithms by incorporating metadata as a signal of organized and potentially suspicious activity. Also, domain experts can label clusters as HT, spam, and more, efficiently creating labeled datasets to enable further HT research. TRAFFICVIS has been built in close collaboration with domain experts, who estimate that TRAFFICVIS provides a median 36x speedup over manual labeling.

Randomized Exploration in Reinforcement Learning with General Value Function Approximation

Haque Ishfaq

Qiwen Cui

Viet Bang Nguyen

Alex Ayoub

Zhuoran Yang

Zhaoran Wang

Doina Precup

Lin Yang

2021-01-01

International Conference on Machine Learning (published)

proceedings.mlr.press

Randomized Least Squares Policy Optimization

Haque Ishfaq

Zhuoran Yang

Andrei-Stefan Lupu

Viet Bang Nguyen

Lewis Liu

Riashat Islam

Zhaoran Wang

Doina Precup

Policy Optimization (PO) methods with function approximation are one of the most popular classes of Reinforcement Learning (RL) algorithms. However, designing provably efficient policy optimization algorithms remains a challenge. Recent work in this area has focused on incorporating upper confidence bound (UCB)-style bonuses to drive exploration in policy optimization. In this paper, we present Randomized Least Squares Policy Optimization (RLSPO) which is inspired by Thompson Sampling. We prove that, in an episodic linear kernel MDP setting, RLSPO achieves $\tilde{O}(d^{3/2} H^{3/2} \sqrt{T})$ worst-case (frequentist) regret, where H is the number of episodes, T is the total number of steps and d is the feature dimension. Finally, we evaluate RLSPO empirically and show that it is competitive with existing provably efficient PO algorithms.

A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems

Mukul Gagrani

Sagar Sudhakara

Aditya Mahajan

Ashutosh Nayyar

Yi Ouyang

We revisit the Thompson sampling algorithm to control an unknown linear quadratic (LQ) system recently proposed by Ouyang et al. [1]. The regret bound of the algorithm was derived under a technical assumption on the induced norm of the closed loop system. In this technical note, we show that by making a minor modification in the algorithm (in particular, ensuring that an episode does not end too soon), this technical assumption on the induced norm can be replaced by a milder assumption in terms of the spectral radius of the closed loop system. The modified algorithm has the same Bayesian regret of $\tilde{O}(\sqrt{T})$, where $T$ is the time-horizon and the $\tilde{O}(\cdot)$ notation hides logarithmic terms in $T$.

2021-01-01

arXiv.org (preprint)

dblp.uni-trier.de

Rethinking Graph Transformers with Spectral Attention

Devin Kreuzer

Dominique Beaini

William L. Hamilton

Vincent Létourneau

Prudencio Tossou

In recent years, the Transformer architecture has proven to be very successful in sequence processing, but its application to other data structures, such as graphs, has remained limited due to the difficulty of properly defining positions. Here, we present the

openreview.net

Routine Bandits: Minimizing Regret on Recurring Problems

Hassan Saber

Léo Saci

Odalric-Ambrym Maillard

Audrey Durand

2021-01-01

ECML/PKDD (published)

doi.org

Saliency is a Possible Red Herring When Diagnosing Poor Generalization

Joseph D Viviano

Becks Simpson

Francis Dutil

Yoshua Bengio

Joseph Paul Cohen

Poor generalization is one symptom of models that learn to predict target variables using spuriously-correlated image features present only in the training distribution instead of the true image features that denote a class. It is often thought that this can be diagnosed visually using attribution (aka saliency) maps. We study if this assumption is correct. In some prediction tasks, such as for medical images, one may have some images with masks drawn by a human expert, indicating a region of the image containing relevant information to make the prediction. We study multiple methods that take advantage of such auxiliary labels, by training networks to ignore distracting features which may be found outside of the region of interest. This mask information is only used during training and has an impact on generalization accuracy depending on the severity of the shift between the training and test distributions. Surprisingly, while these methods improve generalization performance in the presence of a covariate shift, there is no strong correspondence between the correction of attribution towards the features a human expert has labelled as important and generalization performance. These results suggest that the root cause of poor generalization may not always be spatially defined, and raise questions about the utility of masks as 'attribution priors' as well as saliency maps for explainable predictions.

2021-01-01

ICLR (published)

openreview.net

Scalable Change Point Detection for Dynamic Graphs

Shenyang Huang

Guillaume Rabusseau

Reihaneh Rabbany

Real world networks often evolve in complex ways over time. Understanding anomalies in dynamic networks is crucial for applications such as traffic accident detection, intrusion identification and detection of ecosystem disturbances. In this work, we focus on the problem of change point detection in dynamic graphs. The goal is to identify time steps where the graph structure deviates significantly from the norm. Despite the empirical success of recent methods, building a change point detection method for real world dynamic graphs, which often scale to millions of nodes, remains an open question. To fill this gap, we propose LADdos, a scalable method for change point detection in dynamic graphs. LADdos brings together ideas from two recent works: an accurate change point detection method for graphs called LAD [10] which detects the changes in the full Laplacian spectrum of the graph in each timestamp, and the general framework of network density of states (DOS) [5] which models the distribution of the singular values through efficient approximation methods. In experiments with two common graph models, the Stochastic Block Model (SBM) and the Barabási-Albert (BA) model, we show that LADdos has equal performance to LAD, which is the current state-of-the-art, while being orders of magnitude faster. For instance, on a dynamic graph with a total of 21 million edges over 150 timestamps, LADdos achieves a 100x speedup when compared to LAD.

Seeing things or seeing scenes: Investigating the capabilities of V&L models to align scene descriptions to images

Matt D Anderson

Erich W Graf

James H Elder

Peter Anderson

Xiaodong He

Chris Buehler

Mark Teney

Stephen Johnson

Gould Lei

Emily M. Bender

Timnit Gebru

Angelina McMillan-Major

Alexander Koller

Yonatan Bisk

Ari Holtzman

Jesse Thomason

Yoshua Bengio

Joyce Chai

Angeliki Lazaridou

Jonathan May

Aleksandr

Thomas Unterthiner

Mostafa Dehghani

Georg Minderer

Sylvain Heigold

Jakob Gelly

Uszkoreit Neil

Houlsby. 2020

Lisa Anne Hendricks

Gabriel Ilharco

Rowan Zellers

Ali Farhadi

John M. Henderson

Thomas L. Griffiths

Melissa L.-H. Võ

Jeremy M. Wolfe

Jianfeng Wang

Xiaowei Hu

Pengchuan Zhang

Roy Schwartz

Bolei Zhou

Àgata Lapedriza

Jianxiong Xiao

Hang Zhao

Xavier Puig

Sanja Fidler

Images can be described in terms of the objects they contain, or in terms of the types of scene or place that they instantiate. In this paper we address to what extent pretrained Vision and Language models can learn to align descriptions of both types with images. We compare 3 state-of-the-art models, VisualBERT, LXMERT and CLIP. We find that (i) V&L models are susceptible to stylistic biases acquired during pretraining; (ii) only CLIP performs consistently well on both object- and scene-level descriptions. A follow-up ablation study shows that CLIP uses object-level information in the visual modality to align with scene-level textual descriptions.
