Publications

On-the-Fly Attention Modularization for Neural Generation

Chandra Bhagavatula

Ximing Lu

Jena D. Hwang

Antoine Bosselut

Jackie CK Cheung

Yejin Choi

Despite considerable advancements with deep neural language models (LMs), neural text generation still suffers from de generation: generated… (see more) text is repetitive, generic, self-inconsistent, and lacking commonsense. The empirical analyses on sentence-level attention patterns reveal that neural text degeneration may be associated with insufﬁcient learning of inductive biases by the attention mechanism. Our ﬁndings motivate on-the-ﬂy attention modularization, a simple but effective method for injecting inductive biases into attention computation during inference. The resulting text produced by the language model with attention modularization can yield enhanced diversity and commonsense reasoning while maintaining ﬂuency and coherence.

2020-12-31

arXiv.org (preprint)

dblp.uni-trier.de

Optimal Spectral-Norm Approximate Minimization of Weighted Finite Automata

Borja Balle

We address the approximate minimization problem for weighted finite automata (WFAs) with weights in …

2020-12-31

arXiv (preprint)

doi.org

arxiv.org

Optimization of Artificial Neural Network Hyperparameters For Processing Retrospective Information

A. Rogachev

F. Scholle

Yann Lecun

Yoshua Bengio

I. L. Kashirin

M. Demchenko

. Justification of the selection of the architecture and hyperparameters of artificial neural networks (ANN), focused on solving various cla… (see more)sses of applied problems, is a scientific and methodological problem. Optimizing the selection of ANN hyperparameters allows you to improve the quality and speed of ANN training. Various methods of optimizing the selection of ANN hyper-parameters are known – the use of evolutionary calculations, genetic algorithms, etc., but they require the use of additional software. To optimize the process of selecting ANN hyperparameters, Google Research has developed the KerasTuner software tool. It is a platform for automated search of a set of optimal combinations of hyperparameters. In Kerastuner, you can use various methods - random search, Bayesian optimization, or Hyperband. In the numerical experiments conducted by the author, 14 hyperparameters were varied, including the number of blocks of convolutional layers and the filters forming them, the type of activation function, the parameters of the "dropout" layers, and others. The studied tools demonstrated high efficiency while simultaneously varying more than a dozen optimized parameters of the convolutional network. The calculation time on the Colaboratory platform for the various combined ANN architectures studied, including recurrent RNN networks, was several hours, even with the use of GPU graphics accelerators. For ANN, focused on the processing and recognition of retrospective information, an increase in the quality of recognition was achieved to 80 ... 95%.

2020-12-31

(published)

www.semanticscholar.org

Overview of the TREC 2021 Fair Ranking Track

Asia J. Biega

Fernando Diaz

Michael D. Ekstrand

Sebastian Kohlmeier

The TREC Fair Ranking Track aims to provide a platform for participants to develop and evaluate novel retrieval algorithms that can provide … (see more)a fair exposure to a mixture of demographics or attributes, such as ethnicity, that are represented by relevant documents in response to a search query. For example, particular demographics or attributes can be represented by the documents' topical content or authors. The 2021 Fair Ranking Track adopted a resource allocation task. The task focused on supporting Wikipedia editors who are looking to improve the encyclopedia's coverage of topics under the purview of a WikiProject. WikiProject coordinators and/or Wikipedia editors search for Wikipedia documents that are in need of editing to improve the quality of the article. The 2021 Fair Ranking track aimed to ensure that documents that are about, or somehow represent, certain protected characteristics receive a fair exposure to the Wikipedia editors, so that the documents have an fair opportunity of being improved and, therefore, be well-represented in Wikipedia. The under-representation of particular protected characteristics in Wikipedia can result in systematic biases that can have a negative human, social, and economic impact, particularly for disadvantaged or protected societal groups.

2020-12-31

TREC (published)

doi.org

arxiv.org

Personalized Medicine for OSA Syndrome in a Nutshell: Conceptual Clarification for Integration.

Christophe Gauld

Marie Darrason

Guillaume Dumas

Jean‐Arthur Micoulaud‐Franchi

2020-12-31

Chest (published)

doi.org

Post-Editing Extractive Summaries by Definiteness Prediction

Jad Kabbara

Jackie Chi Kit Cheung

Extractive summarization has been the mainstay of automatic summarization for decades. Despite all the progress, extractive summarizers stil… (see more)l suffer from shortcomings including coreference issues arising from extracting sentences away from their original context in the source document. This affects the coherence and readability of extractive summaries. In this work, we propose a lightweight post-editing step for extractive summaries that centers around a single linguistic decision: the definiteness of noun phrases. We conduct human evaluation studies that show that human expert judges substantially prefer the output of our proposed system over the original summaries. Moreover, based on an automatic evaluation study, we provide evidence for our system’s ability to generate linguistic decisions that lead to improved extractive summaries. We also draw insights about how the automatic system is exploiting some local cues related to the writing style of the main article texts or summary texts to make the decisions, rather than reasoning about the contexts pragmatically.

2020-12-31

Conference on Empirical Methods in Natural Language Processing (published)

doi.org

Predicting Infectiousness for Proactive Contact Tracing

Prateek Gupta

Nasim Rahaman

Pierre-Luc St. Charles

Hannah Alsdurf

Olexa Bilaniuk

David Buckeridge

Gaétan Marceau-Caron

Pierre-Luc Carrier

Joumana Ghosn

Satya Ortiz-Gagné

Chris Pal

Irina Rish

Bernhard Schölkopf … (see 3 more)

Abhinav Sharma

Jian Tang

Andrew Williams

The COVID-19 pandemic has spread rapidly worldwide, overwhelming manual contact tracing in many countries and resulting in widespread lockdo… (see more)wns for emergency containment. Large-scale digital contact tracing (DCT) has emerged as a potential solution to resume economic and social activity while minimizing spread of the virus. Various DCT methods have been proposed, each making trade-offs between privacy, mobility restrictions, and public health. The most common approach, binary contact tracing (BCT), models infection as a binary event, informed only by an individual's test results, with corresponding binary recommendations that either all or none of the individual's contacts quarantine. BCT ignores the inherent uncertainty in contacts and the infection process, which could be used to tailor messaging to high-risk individuals, and prompt proactive testing or earlier warnings. It also does not make use of observations such as symptoms or pre-existing medical conditions, which could be used to make more accurate infectiousness predictions. In this paper, we use a recently-proposed COVID-19 epidemiological simulator to develop and test methods that can be deployed to a smartphone to locally and proactively predict an individual's infectiousness (risk of infecting others) based on their contact history and other information, while respecting strong privacy constraints. Predictions are used to provide personalized recommendations to the individual via an app, as well as to send anonymized messages to the individual's contacts, who use this information to better predict their own infectiousness, an approach we call proactive contact tracing (PCT). We find a deep-learning based PCT method which improves over BCT for equivalent average mobility, suggesting PCT could help in safe re-opening and second-wave prevention.

2020-12-31

ICLR (published)

doi.org

openreview.net

Preferential Temporal Difference Learning

Nishanth Anand

Doina Precup

Temporal-Difference (TD) learning is a general and very useful tool for estimating the value function of a given policy, which in turn is re… (see more)quired to find good policies. Generally speaking, TD learning updates states whenever they are visited. When the agent lands in a state, its value can be used to compute the TD-error, which is then propagated to other states. However, it may be interesting, when computing updates, to take into account other information than whether a state is visited or not. For example, some states might be more important than others (such as states which are frequently seen in a successful trajectory). Or, some states might have unreliable value estimates (for example, due to partial observability or lack of data), making their values less desirable as targets. We propose an approach to re-weighting states used in TD updates, both when they are the input and when they provide the target for the update. We prove that our approach converges with linear function approximation and illustrate its desirable empirical behaviour compared to other TD-style methods.

2020-12-31

ICML (published)

doi.org

proceedings.mlr.press

Pretraining Representations for Data-Efficient Reinforcement Learning

Philip Bachman

Data efficiency is a key challenge for deep reinforcement learning. We address this problem by using unlabeled data to pretrain an encoder w… (see more)hich is then finetuned on a small amount of task-specific data. To encourage learning representations which capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience), our approach significantly surpasses prior work combining offline representation pretraining with task-specific finetuning, and compares favourably with other pretraining methods that require orders of magnitude more data. Our approach shows particular promise when combined with larger models as well as more diverse, task-aligned observational data -- approaching human-level performance and data-efficiency on Atari in our best setting. We provide code associated with this work at https://github.com/mila-iqia/SGI.

2020-12-31

Advances in Neural Information Processing Systems 34 (NeurIPS 2021) (published)

doi.org

openreview.net

RAFFIC V IS : Fighting Human Trafﬁcking through Visualization

Catalina Vajiac

Andreas Olligschlaeger

Yifei Li

Pratheeksha Nair

Meng-Chieh Lee

Namyong Park

Reihaneh Rabbany

Duen Horng Chau

Christos Faloutsos

Law enforcement can detect human trafficking (HT) in online escort websites by analyzing suspicious clusters of connected ads. Given such cl… (see more)usters, how can we interactively visualize potential evidence for law enforcement and domain experts? We present TRAFFICVIS, which, to our knowledge, is the first interface for cluster-level HT detection and labeling. It builds on state-of-the-art HT clustering algorithms by incorporating metadata as a signal of organized and potentially suspicious activity. Also, domain experts can label clusters as HT, spam, and more, efficiently creating labeled datasets to enable further HT research. TRAFFICVIS has been built in close collaboration with domain experts, who estimate that TRAFFICVIS provides a median 36x speedup over manual labeling.

2020-12-31

(published)

www.semanticscholar.org

Randomized Least Squares Policy Optimization

Haque Ishfaq

Zhuoran Yang

Andrei Lupu

Viet Bang Nguyen

Lewis Liu

Riashat Islam

Zhaoran Wang

Doina Precup

Policy Optimization (PO) methods with function approximation are one of the most popular classes of Reinforcement Learning (RL) algorithms. … (see more)However, designing provably efﬁcient policy optimization algorithms remains a challenge. Recent work in this area has focused on incorporating upper conﬁdence bound (UCB)-style bonuses to drive exploration in policy optimization. In this paper, we present Randomized Least Squares Policy Optimization (RLSPO) which is inspired by Thompson Sampling. We prove that, in an episodic linear kernel MDP setting, RLSPO achieves (cid:101) O ( d 3 / 2 H 3 / 2 √ T ) worst-case (frequentist) regret, where H is the number of episodes, T is the total number of steps and d is the feature dimension. Finally, we evaluate RLSPO empirically and show that it is competitive with existing provably efﬁcient PO algorithms.

2020-12-31

(published)

www.semanticscholar.org

A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems

Mukul Gagrani

Sagar Sudhakara

Aditya Mahajan

Ashutosh Nayyar

Yi Ouyang

—We revisit the Thompson sampling algorithm to control an unknown linear quadratic (LQ) system recently proposed by Ouyang et al. [1]. The… (see more) regret bound of the algorithm was derived under a technical assumption on the induced norm of the closed loop system. In this technical note, we show that by making a minor modiﬁcation in the algorithm (in particular, ensuring that an episode does not end too soon), this technical assumption on the induced norm can be replaced by a milder assumption in terms of the spectral radius of the closed loop system. The modiﬁed algorithm has the same Bayesian regret of ˜ O ( √ T ) , where T is the time-horizon and the ˜ O ( · ) notation hides logarithmic terms in T .

2020-12-31

arXiv.org (preprint)

dblp.uni-trier.de

Mila on Udemy

AI Policy Fellowship Publications

Mila Ventures Launchpad

Publications

Mila on Udemy

AI Policy Fellowship Publications

Mila Ventures Launchpad

Popular keywords:

Publications