Publications

Post-Editing Extractive Summaries by Definiteness Prediction

Jackie Chi Kit Cheung

Extractive summarization has been the mainstay of automatic summarization for decades. Despite all the progress, extractive summarizers still suffer from shortcomings including coreference issues arising from extracting sentences away from their original context in the source document. This affects the coherence and readability of extractive summaries. In this work, we propose a lightweight post-editing step for extractive summaries that centers around a single linguistic decision: the definiteness of noun phrases. We conduct human evaluation studies that show that human expert judges substantially prefer the output of our proposed system over the original summaries. Moreover, based on an automatic evaluation study, we provide evidence for our system’s ability to generate linguistic decisions that lead to improved extractive summaries. We also draw insights about how the automatic system is exploiting some local cues related to the writing style of the main article texts or summary texts to make the decisions, rather than reasoning about the contexts pragmatically.

2020-12-31

Conference on Empirical Methods in Natural Language Processing (published)

doi.org

Preferential Temporal Difference Learning

Nishanth Anand

Doina Precup

Temporal-Difference (TD) learning is a general and very useful tool for estimating the value function of a given policy, which in turn is required to find good policies. Generally speaking, TD learning updates states whenever they are visited. When the agent lands in a state, its value can be used to compute the TD-error, which is then propagated to other states. However, it may be interesting, when computing updates, to take into account other information than whether a state is visited or not. For example, some states might be more important than others (such as states which are frequently seen in a successful trajectory). Or, some states might have unreliable value estimates (for example, due to partial observability or lack of data), making their values less desirable as targets. We propose an approach to re-weighting states used in TD updates, both when they are the input and when they provide the target for the update. We prove that our approach converges with linear function approximation and illustrate its desirable empirical behaviour compared to other TD-style methods.

2020-12-31

ICML (published)

doi.org

proceedings.mlr.press
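
The idea in the Preferential Temporal Difference Learning abstract above, that not every visited state needs to be updated equally, can be illustrated with a small sketch. This is not the paper's algorithm: it is plain TD(0) on a toy random-walk chain, and the per-state weight `beta`, the function name `td0_weighted`, and the environment are all assumptions made for illustration. The sketch only re-weights a state when it is the input of the update, not when it serves as the target.

```python
import random


def td0_weighted(num_states=5, episodes=2000, alpha=0.05, gamma=1.0,
                 beta=None, seed=0):
    """TD(0) on a random-walk chain with a per-state update weight.

    States are 0..num_states-1. Each episode starts in the middle state,
    moves left or right with equal probability, and terminates when it
    steps off either end. The reward is 1 when exiting on the right and
    0 otherwise. One-hot features make this the tabular special case of
    linear function approximation.
    """
    rng = random.Random(seed)
    if beta is None:
        # Plain TD(0): every visited state is updated at full strength.
        beta = [1.0] * num_states
    w = [0.0] * num_states  # one weight per one-hot feature
    for _ in range(episodes):
        s = num_states // 2
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:
                reward, v_next, done = 0.0, 0.0, True
            elif s_next >= num_states:
                reward, v_next, done = 1.0, 0.0, True
            else:
                reward, v_next, done = 0.0, w[s_next], False
            td_error = reward + gamma * v_next - w[s]
            # Hypothetical preference beta[s] in [0, 1] scales how strongly
            # the visited state s is updated.
            w[s] += alpha * beta[s] * td_error
            if done:
                break
            s = s_next
    return w


if __name__ == "__main__":
    print(td0_weighted())                                # uniform weights
    print(td0_weighted(beta=[1.0, 1.0, 0.0, 1.0, 1.0]))  # middle state frozen
```

Setting a state's weight to 0 freezes its estimate at the initial value, while a weight of 1 recovers ordinary TD(0) for that state; intermediate values interpolate between the two.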

Pretraining Representations for Data-Efficient Reinforcement Learning

Philip Bachman

Data efficiency is a key challenge for deep reinforcement learning. We address this problem by using unlabeled data to pretrain an encoder which is then finetuned on a small amount of task-specific data. To encourage learning representations which capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience), our approach significantly surpasses prior work combining offline representation pretraining with task-specific finetuning, and compares favourably with other pretraining methods that require orders of magnitude more data. Our approach shows particular promise when combined with larger models as well as more diverse, task-aligned observational data, approaching human-level performance and data-efficiency on Atari in our best setting. We provide code associated with this work at https://github.com/mila-iqia/SGI.

2020-12-31

Advances in Neural Information Processing Systems 34 (NeurIPS 2021) (published)

doi.org

openreview.net

TRAFFICVIS: Fighting Human Trafficking through Visualization

Catalina Vajiac

Andreas Olligschlaeger

Yifei Li

Pratheeksha Nair

Meng-Chieh Lee

Namyong Park

Reihaneh Rabbany

Duen Horng Chau

Christos Faloutsos

Law enforcement can detect human trafficking (HT) in online escort websites by analyzing suspicious clusters of connected ads. Given such clusters, how can we interactively visualize potential evidence for law enforcement and domain experts? We present TRAFFICVIS, which, to our knowledge, is the first interface for cluster-level HT detection and labeling. It builds on state-of-the-art HT clustering algorithms by incorporating metadata as a signal of organized and potentially suspicious activity. Also, domain experts can label clusters as HT, spam, and more, efficiently creating labeled datasets to enable further HT research. TRAFFICVIS has been built in close collaboration with domain experts, who estimate that TRAFFICVIS provides a median 36x speedup over manual labeling.

2020-12-31

(published)

www.semanticscholar.org

Randomized Least Squares Policy Optimization

Haque Ishfaq

Zhuoran Yang

Andrei Lupu

Viet Bang Nguyen

Lewis Liu

Riashat Islam

Zhaoran Wang

Doina Precup

Policy Optimization (PO) methods with function approximation are one of the most popular classes of Reinforcement Learning (RL) algorithms. However, designing provably efficient policy optimization algorithms remains a challenge. Recent work in this area has focused on incorporating upper confidence bound (UCB)-style bonuses to drive exploration in policy optimization. In this paper, we present Randomized Least Squares Policy Optimization (RLSPO) which is inspired by Thompson Sampling. We prove that, in an episodic linear kernel MDP setting, RLSPO achieves $\tilde{O}(d^{3/2} H^{3/2} \sqrt{T})$ worst-case (frequentist) regret, where H is the number of episodes, T is the total number of steps and d is the feature dimension. Finally, we evaluate RLSPO empirically and show that it is competitive with existing provably efficient PO algorithms.

2020-12-31

(published)

www.semanticscholar.org

A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems

Mukul Gagrani

Sagar Sudhakara

Aditya Mahajan

Ashutosh Nayyar

Yi Ouyang

We revisit the Thompson sampling algorithm to control an unknown linear quadratic (LQ) system recently proposed by Ouyang et al. [1]. The regret bound of the algorithm was derived under a technical assumption on the induced norm of the closed loop system. In this technical note, we show that by making a minor modification in the algorithm (in particular, ensuring that an episode does not end too soon), this technical assumption on the induced norm can be replaced by a milder assumption in terms of the spectral radius of the closed loop system. The modified algorithm has the same Bayesian regret of $\tilde{O}(\sqrt{T})$, where T is the time-horizon and the $\tilde{O}(\cdot)$ notation hides logarithmic terms in T.

2020-12-31

arXiv.org (preprint)

dblp.uni-trier.de

Rethinking Graph Transformers with Spectral Attention

William L. Hamilton

In recent years, the Transformer architecture has proven to be very successful in sequence processing, but its application to other data structures, such as graphs, has remained limited due to the difficulty of properly defining positions. Here, we present the

2020-12-31

Advances in Neural Information Processing Systems 34 (NeurIPS 2021) (published)

doi.org

openreview.net

Routine Bandits: Minimizing Regret on Recurring Problems

Hassan Saber

Léo Saci

Odalric-Ambrym Maillard

Audrey Durand

2020-12-31

ECML/PKDD (published)

doi.org

Scalable Change Point Detection for Dynamic Graphs

Shenyang Huang

Guillaume Rabusseau

Reihaneh Rabbany

Real world networks often evolve in complex ways over time. Understanding anomalies in dynamic networks is crucial for applications such as traffic accident detection, intrusion identification and detection of ecosystem disturbances. In this work, we focus on the problem of change point detection in dynamic graphs. The goal is to identify time steps where the graph structure deviates significantly from the norm. Despite empirical success of recent methods, building a change point detection method for real world dynamic graphs, which often scale to millions of nodes, remains an open question. To fill this gap, we propose LADdos, a scalable method for change point detection in dynamic graphs. LADdos brings together ideas from two recent works: an accurate change point detection method for graphs called LAD [10] which detects the changes in the full Laplacian spectrum of the graph in each timestamp, and the general framework of network density of states (DOS) [5] which models the distribution of the singular values through efficient approximation methods. In experiments with two common graph models, the Stochastic Block Model (SBM) and the Barabási-Albert (BA) model, we show that LADdos has equal performance to LAD, which is the current state-of-the-art, while being orders of magnitude faster. For instance, on a dynamic graph with a total of 21 million edges over 150 timestamps, LADdos achieves a 100x speedup when compared to LAD.

2020-12-31

(published)

www.semanticscholar.org

Seeing things or seeing scenes: Investigating the capabilities of V&L models to align scene descriptions to images

M. D. Anderson

Erich W Graf

James H Elder

Peter Anderson

Xiaodong He

Chris Buehler

Mark Teney

Stephen Johnson

Gould Lei

Emily M. Bender

Timnit Gebru

Angelina McMillan-

Alexander Koller

Yonatan Bisk

Ari Holtzman

Jesse Thomason

Yoshua Bengio

Joyce Chai

Angeliki Lazaridou

Jonathan May

Aleksandr

Thomas Unterthiner

Mostafa Dehghani

Georg Minderer

Sylvain Heigold

Jakob Gelly

Uszkoreit Neil

Houlsby

Lisa Anne Hendricks

Gabriel Ilharco

Rowan Zellers

Ali Farhadi

John M. Henderson

Thomas L. Griffiths

Melissa L.-H. Võ

Jeremy M. Wolfe

Jianfeng Wang

Xiaowei Hu

Pengchuan Zhang

Roy Schwartz

Bolei Zhou

Àgata Lapedriza

Jianxiong Xiao

Hang Zhao

Xavier Puig

Sanja Fidler

Images can be described in terms of the objects they contain, or in terms of the types of scene or place that they instantiate. In this paper we address to what extent pretrained Vision and Language models can learn to align descriptions of both types with images. We compare 3 state-of-the-art models, VisualBERT, LXMERT and CLIP. We find that (i) V&L models are susceptible to stylistic biases acquired during pretraining; (ii) only CLIP performs consistently well on both object- and scene-level descriptions. A follow-up ablation study shows that CLIP uses object-level information in the visual modality to align with scene-level textual descriptions.

2020-12-31

(published)

www.semanticscholar.org

Signal diffusion along connectome gradients and inter-hub routing differentially contribute to dynamic human brain function

Bo-yong Park

Reinder Vos de Wael

Casey Paquola

Sara Larivière

Oualid Benkarim

Jessica Royer

Shahin Tavakol

Raul R. Cruces

Qiongling Li

Sofie L. Valk

Daniel S. Margulies

Bratislav Misic

Danilo Bzdok

Jonathan Smallwood

Boris C. Bernhardt

2020-12-31

NeuroImage (published)

doi.org

A Simple and Effective Model for Multi-Hop Question Generation

Jimmy Lei Ba

Jamie Ryan Kiros

Geoffrey E Hin-

Peter W. Battaglia

Jessica Blake

Chandler Hamrick

Victor Bapst

Alvaro Sanchez

Vinicius Zambaldi

M. Malinowski

Andrea Tacchetti

David Raposo

Tom B. Brown

Benjamin Mann

Nick Ryder

Melanie Subbiah

Jared Kaplan

Prafulla Dhariwal

Arvind Neelakantan

Pranav Shyam

Girish Sastry

William L. Hamilton

Nitish Srivastava

Geoffrey Hinton

Alex Krizhevsky

Ilya Sutskever

Ruslan Salakhutdinov

Gabriel Stanovsky

Julian Michael

Luke Zettlemoyer

Dan Su

Yan Xu

Wenliang Dai

Ziwei Ji

Tiezheng Yu

Minghao Tu

Kevin Huang

Guangtao Wang

Jing Huang

Ashish Vaswani

Noam M. Shazeer

Niki Parmar

Jakob Uszkoreit

Llion Jones

Aidan N. Gomez

Łukasz Kaiser

Illia Polosukhin

Petar Veličković

Guillem Cucurull

Arantxa Casanova

Adriana Romero

Pietro Liò

Yoshua Bengio

Johannes Welbl

Pontus Stenetorp

Yonghui Wu

Mike Schuster

Quoc Zhifeng Chen

Mohammad Le

Wolfgang Norouzi

Macherey

M. Krikun

Yuan Cao

Qin Gao

William W. Cohen

Jianxing Yu

Xiaojun Quan

Qinliang Su

Jian Yin

Yuyu Zhang

Hanjun Dai

Zornitsa Kozareva

Cheng Zhao

Chenyan Xiong

Corby Rosset

Xia

Paul Song

Bennett Saurabh

Tiwary

Yao Zhao

Xiaochuan Ni

Yuanyuan Ding

Qingyu Zhou

Nan Yang

Furu Wei

Chuanqi Tan

Previous research on automated question generation has almost exclusively focused on generating factoid questions whose answers can be extracted from a single document. However, there is an increasing interest in developing systems that are capable of more complex multi-hop question generation (QG), where answering the question requires reasoning over multiple documents. In this work, we propose a simple and effective approach based on the transformer model for multi-hop QG. Our approach consists of specialized input representations, a supporting sentence classification objective, and training data weighting. Prior work on multi-hop QG considers the simplified setting of shorter documents and also advocates the use of entity-based graph structures as essential ingredients in model design. On the contrary, we showcase that our model can scale to the challenging setting of longer documents as input, does not rely on graph structures, and substantially outperforms the state-of-the-art approaches as measured by automated metrics and human evaluation.

2020-12-31

(published)

www.semanticscholar.org
