Publications

On the (Im)Possibility of Estimating Various Notions of Differential Privacy (short paper)
Daniele Gorla
Louis Jalouzot
Federica Granese
Catuscia Palamidessi
We analyze to what extent final users can infer information about the level of protection of their data when the data obfuscation mechanism … (voir plus)is a priori unknown to them (the so-called “black-box" scenario). In particular, we delve into the investigation of two notions of local differential privacy (LDP), namely 𝜀 -LDP and Rényi LDP. On one hand, we prove that, without any assumption on the underlying distributions, it is not possible to have an algorithm able to infer the level of data protection with provable guarantees. On the other hand, we demonstrate that, under reasonable assumptions (namely, Lipschitzness of the involved densities on a closed interval), such guarantees exist and can be achieved by a simple histogram-based estimator.
On the Limitations of Elo: Real-World Games, are Transitive, not Additive
Quentin Bertrand
Wojciech M. Czarnecki
Real-world competitive games, such as chess, go, or StarCraft II, rely on Elo models to measure the strength of their players. Since these g… (voir plus)ames are not fully transitive, using Elo implicitly assumes they have a strong transitive component that can correctly be identified and extracted. In this study, we investigate the challenge of identifying the strength of the transitive component in games. First, we show that Elo models can fail to extract this transitive component, even in elementary transitive games. Then, based on this observation, we propose an extension of the Elo score: we end up with a disc ranking system that assigns each player two scores, which we refer to as skill and consistency. Finally, we propose an empirical validation on payoff matrices coming from real-world games played by bots and humans.
The race to understand immunopathology in COVID-19: Perspectives on the impact of quantitative approaches to understand within-host interactions
Sonia Gazeau
Xiaoyan Deng
Hsu Kiang Ooi
Fatima Mostefai
Jane Heffernan
Adrianne L. Jenner
Morgan Craig
The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents
Xing Han Lu
Harm de Vries
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation
Mark Rowland
Yunhao Tang
Clare Lyle
Remi Munos
Will Dabney
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation
Mark Rowland
Yunhao Tang
Clare Lyle
Remi Munos
Will Dabney
We study the problem of temporal-difference-based policy evaluation in reinforcement learning. In particular, we analyse the use of a distri… (voir plus)butional reinforcement learning algorithm, quantile temporal-difference learning (QTD), for this task. We reach the surprising conclusion that even if a practitioner has no interest in the return distribution beyond the mean, QTD (which learns predictions about the full distribution of returns) may offer performance superior to approaches such as classical TD learning, which predict only the mean return, even in the tabular setting.
A theory of continuous generative flow networks
Salem Lahlou
Tristan Deleu
Pablo Lemos
Dinghuai Zhang
Alexandra Volokhova
Alex Hernandez-Garcia
Lena Nehale Ezzine
Nikolay Malkin
A theory of continuous generative flow networks
Salem Lahlou
Tristan Deleu
Pablo Lemos
Dinghuai Zhang
Alexandra Volokhova
Alex Hernandez-Garcia
Lena Nehale Ezzine
Nikolay Malkin
Generative flow networks (GFlowNets) are amortized variational inference algorithms that are trained to sample from unnormalized target dist… (voir plus)ributions over compositional objects. A key limitation of GFlowNets until this time has been that they are restricted to discrete spaces. We present a theory for generalized GFlowNets, which encompasses both existing discrete GFlowNets and ones with continuous or hybrid state spaces, and perform experiments with two goals in mind. First, we illustrate critical points of the theory and the importance of various assumptions. Second, we empirically demonstrate how observations about discrete GFlowNets transfer to the continuous case and show strong results compared to non-GFlowNet baselines on several previously studied tasks. This work greatly widens the perspectives for the application of GFlowNets in probabilistic inference and various modeling settings.
Towards Detecting Contextual Real-Time Toxicity for In-Game Chat
Zachary Yang
Nicolas Grenon-Godbout
Real-time toxicity detection in online environments poses a significant challenge, due to the increasing prevalence of social media and gami… (voir plus)ng platforms. We introduce ToxBuster, a simple and scalable model that reliably detects toxic content in real-time for a line of chat by including chat history and metadata. ToxBuster consistently outperforms conventional toxicity models across popular multiplayer games, including Rainbow Six Siege, For Honor, and DOTA 2. We conduct an ablation study to assess the importance of each model component and explore ToxBuster's transferability across the datasets. Furthermore, we showcase ToxBuster's efficacy in post-game moderation, successfully flagging 82.1% of chat-reported players at a precision level of 90.0%. Additionally, we show how an additional 6% of unreported toxic players can be proactively moderated.
Towards Learning to Imitate from a Single Video Demonstration
Florian Golemo
Agents that can learn to imitate given video observation -- \emph{without direct access to state or action information} are more applicable … (voir plus)to learning in the natural world. However, formulating a reinforcement learning (RL) agent that facilitates this goal remains a significant challenge. We approach this challenge using contrastive training to learn a reward function comparing an agent's behaviour with a single demonstration. We use a Siamese recurrent neural network architecture to learn rewards in space and time between motion clips while training an RL policy to minimize this distance. Through experimentation, we also find that the inclusion of multi-task data and additional image encoding losses improve the temporal consistency of the learned rewards and, as a result, significantly improves policy learning. We demonstrate our approach on simulated humanoid, dog, and raptor agents in 2D and a quadruped and a humanoid in 3D. We show that our method outperforms current state-of-the-art techniques in these environments and can learn to imitate from a single video demonstration.
Towards Reliable Neural Specifications
Chuqin Geng
Nham Le
Xiaojie Xu
Zhaoyue Wang
Arie Gurfinkel
Towards Reliable Neural Specifications
Chuqin Geng
Nham Le
Xiaojie Xu
Zhaoyue Wang
Arie Gurfinkel