Publications

Towards Policy-Guided Conversational Recommendation with Dialogue Acts

Paul Crook

Y-Lan Boureau

J. Weston

Akbar Karimi

Leonardo Rossi

Andrea Prati

Wenqiang Lei

Xiangnan He

Qingyun Yisong Miao

Richang Wu

Min-Yen Hong

Kan Tat-Seng

Raymond Li

Samira Ebrahimi Kahou

Hannes Schulz

Zujie Liang

Huang Hu

Can Xu

Jian Miao

Lizi Liao

Ryuichi Takanobu

Yunshan Ma

Xun Yang

Wenchang Ma

Minlie Huang

Minghao Tu

Iulian Serban

Aaron C. Courville

David Silver

Julian Schrittwieser

K. Simonyan

Ioannis Antonoglou

Aja Huang

A. Guez

Hanlin Zhu

O. Vinyals

Igor Babuschkin

Junyoung Chung

M. Mathieu

Max Jaderberg

Wojciech M. Czarnecki

A. Dudzik

Petko Georgiev

Richard Powell

T. Ewalds

Dan Horgan

M. Kroiss

Ivo Danihelka

J. Agapiou

Junhyuk Oh

Valentin Dalibard

David Choi

L. Sifre

Yury Sulsky

Sasha Vezhnevets

James Molloy

Trevor Cai

D. Budden

T. Paine

Caglar Gulçehre

Ziyu Wang

Tobias Pfaff

Tobias Pohlen

2021-12-31

(published)

www.semanticscholar.org

Trajectory balance: Improved credit assignment in GFlowNets

Nikolay Malkin

Moksh Jain

Emmanuel Bengio

Chen Sun

Yoshua Bengio

Generative flow networks (GFlowNets) are a method for learning a stochastic policy for generating compositional objects, such as graphs or strings, from a given unnormalized density by sequences of actions, where many possible action sequences may lead to the same object. We find previously proposed learning objectives for GFlowNets, flow matching and detailed balance, which are analogous to temporal difference learning, to be prone to inefficient credit propagation across long action sequences. We thus propose a new learning objective for GFlowNets, trajectory balance, as a more efficient alternative to previously used objectives. We prove that any global minimizer of the trajectory balance objective can define a policy that samples exactly from the target distribution. In experiments on four distinct domains, we empirically demonstrate the benefits of the trajectory balance objective for GFlowNet convergence, diversity of generated samples, and robustness to long action sequences and large action spaces.

2021-12-31

Neural Information Processing Systems (published)

doi.org

openreview.net

Trajectory of Mini-Batch Momentum: Batch Size Saturation and Convergence in High Dimensions

Kiwon Lee

Andrew N. Cheng

Courtney Paquette

Elliot Paquette

We analyze the dynamics of large batch stochastic gradient descent with momentum (SGD+M) on the least squares problem when both the number of samples and dimensions are large. In this setting, we show that the dynamics of SGD+M converge to a deterministic discrete Volterra equation as dimension increases, which we analyze. We identify a stability measurement, the implicit conditioning ratio (ICR), which regulates the ability of SGD+M to accelerate the algorithm. When the batch size exceeds this ICR, SGD+M converges linearly at a rate of

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

Understanding Generalization via Leave-One-Out Conditional Mutual Information

Mahdi Haghifam

Shay Moran

Daniel M. Roy

Gintare Karolina Dziugaite

2021-12-31

ISIT (published)

doi.org

arxiv.org

Understanding the Evolution of Linear Regions in Deep Reinforcement Learning

Setareh Cohan

Nam Hee Gordon Kim

David Rolnick

Michiel van de Panne

Policies produced by deep reinforcement learning are typically characterised by their learning curves, but they remain poorly understood in many other respects. ReLU-based policies result in a partitioning of the input space into piecewise linear regions. We seek to understand how observed region counts and their densities evolve during deep reinforcement learning using empirical results that span a range of continuous control tasks and policy network dimensions. Intuitively, we may expect that during training, the region density increases in the areas that are frequently visited by the policy, thereby affording fine-grained control. We use recent theoretical and empirical results for the linear regions induced by neural networks in supervised learning settings for grounding and comparison of our results. Empirically, we find that the region density increases only moderately throughout training, as measured along fixed trajectories coming from the final policy. However, the trajectories themselves also increase in length during training, and thus the region densities decrease as seen from the perspective of the current trajectory. Our findings suggest that the complexity of deep reinforcement learning policies does not principally emerge from a significant growth in the complexity of functions observed on-and-around trajectories of the policy.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

Usefulness of School Absenteeism Data for Predicting Influenza Outbreaks

Joseph R. Egger

A. Hoen

John S. Brownstein

David L Buckeridge

Donald R. Olson

Kevin James Konty

and second-round PCR were 94°C for 3 min, followed by 40 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 2 min. Expected amplification products were 458 bp (PCR-1) and 304 bp (PCR-2). Using dilutions of a synthetic template corresponding to the target sequence, we estimated the sensitivity of the amplification assay to be 5 copies of target sequence by limiting-dilution assay. Negative (sterile water) and positive controls (synthetic template dilutions) were

2021-12-31

(published)

www.semanticscholar.org

Vision-Language Pretraining: Current Trends and the Future

Aishwarya Agrawal

Damien Teney

Aida Nematzadeh

In the last few years, there has been an increased interest in building multimodal (vision-language) models that are pretrained on larger but noisier datasets where the two modalities (e.g., image and text) loosely correspond to each other (e.g., Lu et al., 2019; Radford et al., 2021). Given a task (such as visual question answering), these models are then often fine-tuned on task-specific supervised datasets (e.g., Lu et al., 2019; Chen et al., 2020; Tan and Bansal, 2019; Li et al., 2020a,b). In addition to the larger pretraining datasets, the transformer architecture (Vaswani et al., 2017) and in particular self-attention applied to two modalities are responsible for the impressive performance of the recent pretrained models on downstream tasks (Hendricks et al., 2021). In this tutorial, we focus on recent vision-language pretraining paradigms. Our goal is to first provide the background on image–language datasets, benchmarks, and modeling innovations before the multimodal pretraining area. Next we discuss the different families of models used for vision-language pretraining, highlighting their strengths and shortcomings. Finally, we discuss the limits of vision-language pretraining through statistical learning, and the need for alternative approaches such as causal representation learning.

2021-12-31

Annual Meeting of the Association for Computational Linguistics (published)

doi.org

Washing The Unwashable: On The (Im)possibility of Fairwashing Detection

Ali Shahin Shamsabadi

Mohammad Yaghini

Natalie Dullerud

Sierra Wyllie

Ulrich Matchi Aïvodji

Aisha Alaagib Alryeh Mkean

Sébastien Gambs

Nicolas Papernot

The use of black-box models (e.g., deep neural networks) in high-stakes decision-making systems, whose internal logic is complex, raises the need for providing explanations about their decisions. Model explanation techniques mitigate this problem by generating an interpretable and high-fidelity surrogate model (e.g., a logistic regressor or decision tree) to explain the logic of black-box models. In this work, we investigate the issue of fairwashing, in which model explanation techniques are manipulated to rationalize decisions taken by an unfair black-box model using deceptive surrogate models. More precisely, we theoretically characterize and analyze fairwashing, proving that this phenomenon is difficult to avoid due to an irreducible factor: the unfairness of the black-box model. Based on the theory developed, we propose a novel technique, called FRAUD-Detect (FaiRness AUDit Detection), to detect fairwashed models by measuring a divergence over subpopulation-wise fidelity measures of the interpretable model. We empirically demonstrate that this divergence is significantly larger in purposefully fairwashed interpretable models than in honest ones. Furthermore, we show that our detector is robust to an informed adversary trying to bypass our detector. The code implementing FRAUD-Detect is available at https://github.com/cleverhans-lab/FRAUD-Detect.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

openreview.net

Weakly Supervised Representation Learning with Sparse Perturbations

Kartik Ahuja

Jason Hartford

Yoshua Bengio

The theory of representation learning aims to build methods that provably invert the data generating process with minimal domain knowledge or any source of supervision. Most prior approaches require strong distributional assumptions on the latent variables and weak supervision (auxiliary information such as timestamps) to provide provable identification guarantees. In this work, we show that if one has weak supervision from observations generated by sparse perturbations of the latent variables (e.g., images in a reinforcement learning environment where actions move individual sprites), identification is achievable under unknown continuous latent distributions. We show that if the perturbations are applied only on mutually exclusive blocks of latents, we identify the latents up to those blocks. We also show that if these perturbation blocks overlap, we identify latents up to the smallest blocks shared across perturbations. Consequently, if there are blocks that intersect in one latent variable only, then such latents are identified up to permutation and scaling. We propose a natural estimation procedure based on this theory and illustrate it on low-dimensional synthetic and image-based experiments.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

What does it mean to be an AI Ethicist: An ontology of existing roles

Shalaleh Rismani

AJung Moon

With the increasing adoption of Artificial Intelligence systems (AIS) in various applications and the growing efforts to regulate such systems, a new set of occupations has emerged in the industry. This new set of roles takes different titles and holds varying responsibilities. However, the individuals in these roles are tasked with interpreting and operationalizing best practices for developing ethical and safe AI systems. We will broadly refer to this new set of occupations as AI ethicists and recognize that they often hold a specific role in the intersection of technology development, business needs, and societal implications. In this work, we examine what it means to be an AI ethicist in the industry and propose an ontology of existing roles under this broad title along with their required competencies. We create this ontology by examining the job postings for such roles over the past two years and conduct expert interviews with fourteen individuals who currently hold such a role in the industry. The proposed ontology will inform executives and leaders who are looking to build responsible AI teams and provide educators the necessary information for creating new learning objectives and curricula.

2021-12-31

arXiv.org (preprint)

doi.org

Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning

Samin Yeasar Arnob

Riashat Islam

Doina Precup

2021-12-30

arXiv (preprint)

arxiv.org