Publications

Towards Policy-Guided Conversational Recommendation with Dialogue Acts
Paul Crook
Y-Lan Boureau
J. Weston
Akbar Karimi
Leonardo Rossi
Andrea Prati
Wenqiang Lei
Xiangnan He
Qingyun Yisong Miao
Richang Wu
Min-Yen Hong
Kan Tat-Seng
Raymond Li
Hannes Schulz
Zujie Liang
Huang Hu
Can Xu
Jian Miao
Lizi Liao … (voir 47 de plus)
Ryuichi Takanobu
Yunshan Ma
Xun Yang
Wenchang Ma
Minlie Huang
Minghao Tu
Iulian Serban
Aaron C. Courville
David Silver
Julian Schrittwieser
K. Simonyan
Ioannis Antonoglou
Aja Huang
A. Guez
Hanlin Zhu
O. Vinyals
Igor Babuschkin
M. Mathieu
Max Jaderberg
Wojciech M. Czar-725 necki
A. Dudzik
Petko Georgiev
Richard Powell
T. Ewalds
Dan Horgan
M. Kroiss
Ivo Danihelka
J. Agapiou
Junhyuk Oh
Valentin Dalibard
David Choi
L. Sifre
Yury Sulsky
Sasha Vezhnevets
James Molloy
Trevor Cai
D. Budden
T. Paine
Ziyu Wang
Tobias Pfaff
Tobias Pohlen
Trajectory balance: Improved credit assignment in GFlowNets
Nikolay Malkin
Chen Sun
Generative flow networks (GFlowNets) are a method for learning a stochastic policy for generating compositional objects, such as graphs or s… (voir plus)trings, from a given unnormalized density by sequences of actions, where many possible action sequences may lead to the same object. We find previously proposed learning objectives for GFlowNets, flow matching and detailed balance, which are analogous to temporal difference learning, to be prone to inefficient credit propagation across long action sequences. We thus propose a new learning objective for GFlowNets, trajectory balance, as a more efficient alternative to previously used objectives. We prove that any global minimizer of the trajectory balance objective can define a policy that samples exactly from the target distribution. In experiments on four distinct domains, we empirically demonstrate the benefits of the trajectory balance objective for GFlowNet convergence, diversity of generated samples, and robustness to long action sequences and large action spaces.
Trajectory of Mini-Batch Momentum: Batch Size Saturation and Convergence in High Dimensions
Kiwon Lee
Andrew N. Cheng
We analyze the dynamics of large batch stochastic gradient descent with momentum (SGD+M) on the least squares problem when both the number o… (voir plus)f samples and dimensions are large. In this setting, we show that the dynamics of SGD+M converge to a deterministic discrete Volterra equation as dimension increases, which we analyze. We identify a stability measurement, the implicit conditioning ratio (ICR), which regulates the ability of SGD+M to accelerate the algorithm. When the batch size exceeds this ICR, SGD+M converges linearly at a rate of
Understanding Generalization via Leave-One-Out Conditional Mutual Information
MAHDI HAGHIFAM
Shay Moran
Daniel M. Roy
Understanding the Evolution of Linear Regions in Deep Reinforcement Learning
Setareh Cohan
Nam Hee Gordon Kim
Michiel van de Panne
Policies produced by deep reinforcement learning are typically characterised by their learning curves, but they remain poorly understood in … (voir plus)many other respects. ReLU-based policies result in a partitioning of the input space into piecewise linear regions. We seek to understand how observed region counts and their densities evolve during deep reinforcement learning using empirical results that span a range of continuous control tasks and policy network dimensions. Intuitively, we may expect that during training, the region density increases in the areas that are frequently visited by the policy, thereby affording fine-grained control. We use recent theoretical and empirical results for the linear regions induced by neural networks in supervised learning settings for grounding and comparison of our results. Empirically, we find that the region density increases only moderately throughout training, as measured along fixed trajectories coming from the final policy. However, the trajectories themselves also increase in length during training, and thus the region densities decrease as seen from the perspective of the current trajectory. Our findings suggest that the complexity of deep reinforcement learning policies does not principally emerge from a significant growth in the complexity of functions observed on-and-around trajectories of the policy.
Usefulness of School Absenteeism Data for Predicting Infl uenza Outbreaks,
Joseph R. Egger
A. Hoen
John S. Brownstein
David L Buckeridge
Donald R. Olson
Kevin James Konty
and second-round PCR were 94°C for 3 min, followed by 40 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 2 min. Expected amplifi ca… (voir plus)tion products were 458 bp (PCR-1) and 304 bp (PCR-2). Using dilutions of a synthetic template corresponding to the target sequence, we estimated the sensitivity of the amplifi cation assay to be 5 copies of target sequence by limiting-dilution assay. Negative (sterile water) and positive controls (synthetic template dilutions) were
Vision-Language Pretraining: Current Trends and the Future
Damien Teney
Aida Nematzadeh
In the last few years, there has been an increased interest in building multimodal (vision-language) models that are pretrained on larger bu… (voir plus)t noisier datasets where the two modalities (e.g., image and text) loosely correspond to each other (e.g., Lu et al., 2019; Radford et al., 2021). Given a task (such as visual question answering), these models are then often fine-tuned on task-specific supervised datasets. (e.g., Lu et al., 2019; Chen et al.,2020; Tan and Bansal, 2019; Li et al., 2020a,b). In addition to the larger pretraining datasets, the transformer architecture (Vaswani et al., 2017) and in particular self-attention applied to two modalities are responsible for the impressive performance of the recent pretrained models on downstream tasks (Hendricks et al., 2021). In this tutorial, we focus on recent vision-language pretraining paradigms. Our goal is to first provide the background on image–language datasets, benchmarks, and modeling innovations before the multimodal pretraining area. Next we discuss the different family of models used for vision-language pretraining, highlighting their strengths and shortcomings. Finally, we discuss the limits of vision-language pretraining through statistical learning, and the need for alternative approaches such as causal representation learning.
Washing The Unwashable : On The (Im)possibility of Fairwashing Detection
Ali Shahin Shamsabadi
Mohammad Yaghini
Natalie Dullerud
Sierra Wyllie
Ulrich Matchi Aïvodji
Aisha Alaagib Alryeh Mkean
Sébastien Gambs
Nicolas Papernot
The use of black-box models (e.g., deep neural networks) in high-stakes decision-making systems, whose internal logic is complex, raises the… (voir plus) need for providing explanations about their decisions. Model explanation techniques mitigate this problem by generating an interpretable and high-fidelity surrogate model (e.g., a logistic regressor or decision tree) to explain the logic of black-box models. In this work, we investigate the issue of fairwashing, in which model explanation techniques are manipulated to rationalize decisions taken by an unfair black-box model using deceptive surrogate models. More precisely, we theoretically characterize and analyze fairwashing, proving that this phenomenon is difficult to avoid due to an irreducible factor---the unfairness of the black-box model. Based on the theory developed, we propose a novel technique, called FRAUD-Detect (FaiRness AUDit Detection), to detect fairwashed models by measuring a divergence over subpopulation-wise fidelity measures of the interpretable model. We empirically demonstrate that this divergence is significantly larger in purposefully fairwashed interpretable models than in honest ones. Furthermore, we show that our detector is robust to an informed adversary trying to bypass our detector. The code implementing FRAUD-Detect is available at https://github.com/cleverhans-lab/FRAUD-Detect.
Weakly Supervised Representation Learning with Sparse Perturbations
Jason Hartford
The theory of representation learning aims to build methods that provably invert the data generating process with minimal domain knowledge o… (voir plus)r any source of supervision. Most prior approaches require strong distributional assumptions on the latent variables and weak supervision (auxiliary information such as timestamps) to provide provable identification guarantees. In this work, we show that if one has weak supervision from observations generated by sparse perturbations of the latent variables--e.g. images in a reinforcement learning environment where actions move individual sprites--identification is achievable under unknown continuous latent distributions. We show that if the perturbations are applied only on mutually exclusive blocks of latents, we identify the latents up to those blocks. We also show that if these perturbation blocks overlap, we identify latents up to the smallest blocks shared across perturbations. Consequently, if there are blocks that intersect in one latent variable only, then such latents are identified up to permutation and scaling. We propose a natural estimation procedure based on this theory and illustrate it on low-dimensional synthetic and image-based experiments.
What does it mean to be an AI Ethicist: An ontology of existing roles
With the increasing adoption of Artificial Intelligence systems (AIS) in various application and the growing efforts to regulate such system… (voir plus)s, a new set of occupations has emerged in the industry. This new set of roles take different titles and hold varying responsibilities. However, the individuals in these roles are tasked with interpreting and operationalizing best practices for developing ethical and safe AI systems. We will broadly refer to this new set of occupations as AI ethicists and recognize that they often hold a specific role in the intersection of technology development, business needs, and societal implications. In this work, we examine what it means to be an AI ethicist in the industry and propose an ontology of existing roles under this broad title along with their required competencies. We create this ontology by examining the job postings for such roles over the past two years and conduct expert interviews with fourteen individuals who currently hold such a role in the industry. The proposed ontology will inform executives and leaders who are looking to build responsible AI teams and provide educators the necessary information for creating new learning objectives and curriculum.
Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning
Single-Shot Pruning for Offline Reinforcement Learning
Riyasat Ohib
Sergey Plis