Publications

Inferring Dynamic Regulatory Interaction Graphs From Time Series Data With Perturbations
Dhananjay Bhaskar
Daniel Sumner Magruder
Matheo Morales
Edward De Brouwer
Aarthi Venkat
Frederik Wenkel
Smita Krishnaswamy
Inferring multiple consensus trees and supertrees using clustering: a review
Gayane S. Barseghyan
Nadia Tahiri
An Intentional Forgetting-Driven Self-Healing Method for Deep Reinforcement Learning Systems
Ahmed Haj Yahmed
Rached Bouchoucha
Houssem Ben Braiek
Deep reinforcement learning (DRL) is increasingly applied in large-scale productions like Netflix and Facebook. As with most data-driven sys… (voir plus)tems, DRL systems can exhibit undesirable behaviors due to environmental drifts, which often occur in constantly-changing production settings. Continual Learning (CL) is the inherent self-healing approach for adapting the DRL agent in response to the environment's conditions shifts. However, successive shifts of considerable magnitude may cause the production environment to drift from its original state. Recent studies have shown that these environmental drifts tend to drive CL into long, or even unsuccessful, healing cycles, which arise from inefficiencies such as catastrophic forgetting, warm-starting failure, and slow convergence. In this paper, we propose Dr. DRL, an effective self-healing approach for DRL systems that integrates a novel mechanism of intentional forgetting into vanilla CL (i.e., standard CL) to overcome its main issues. Dr. DRL deliberately erases the DRL system's minor behaviors to systematically prioritize the adaptation of the key problem-solving skills. Using well-established DRL algorithms, Dr. DRL is compared with vanilla CL on various drifted environments. Dr. DRL is able to reduce, on average, the healing time and fine-tuning episodes by, respectively, 18.74% and 17.72%. Dr. DRL successfully helps agents to adapt to 19.63% of drifted environments left unsolved by vanilla CL while maintaining and even enhancing by up to 45% the obtained rewards for drifted environments that are resolved by both approaches.
Invasion of Ukraine Discourse on TikTok Dataset
Benjamin D. Steel
Sara J. Parker
Derek Ruths
We present a dataset of videos and comments from the social media platform TikTok, centred around the invasion of Ukraine in 2022, an event … (voir plus)that launched TikTok into the geopolitical arena. The discourse around the invasion exposed myriad political behaviours and dynamics that are unexplored on this platform. To this end we provide a mass scale language and interaction dataset for further research into these processes. An initial investigation of language and social interaction dynamics are explored in this paper. The dataset and the library used to collect it are open sourced to the public.
Invited commentary on Stoehr J et al: The personal impact of involvement in international global health outreach: A national survey of former operation smile student volunteers.
Iorl: Inductive-Offline-Reinforcement-Learning for Traffic Signal Control Warmstarting
François-Xavier Devailly
Denis Larocque
A Kernel Perspective on Behavioural Metrics for Markov Decision Processes
Tyler Kastner
Mark Rowland
We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels. We define a ne… (voir plus)w metric under this lens that is provably equivalent to the recently introduced MICo distance (Castro et al., 2021). The kernel perspective enables us to provide new theoretical results, including value-function bounds and low-distortion finite-dimensional Euclidean embeddings, which are crucial when using behavioural metrics for reinforcement learning representations. We complement our theory with strong empirical results that demonstrate the effectiveness of these methods in practice.
Lag-Llama: Towards Foundation Models for Time Series Forecasting
Kashif Rasul
Arjun Ashok
Andrew Robert Williams
Arian Khorasani
George Adamopoulos
Rishika Bhagwatkar
Marin Biloš
Hena Ghonia
N. Hassen
Anderson Schneider
Sahil Garg
Yuriy Nevmyvaka
Aiming to build foundation models for time-series forecasting and study their scaling behavior, we present here our work-in-progress on Lag-… (voir plus)Llama , a general-purpose univariate probabilistic time-series forecasting model trained on a large collection of time-series data. The model shows good zero-shot prediction capabilities on unseen “out-of-distribution” time-series datasets, outperforming supervised baselines. We use smoothly broken power-laws [7] to fit and predict model scaling behavior. The open source code is made available at https://github
LEAD: Min-Max Optimization from a Physical Perspective
Reyhane Askari Hemmat
Amartya Mitra
Adversarial formulations have rekindled interest in two-player min-max games. A central obstacle in the optimization of such games is the ro… (voir plus)tational dynamics that hinder their convergence. In this paper, we show that game optimization shares dynamic properties with particle systems subject to multiple forces, and one can leverage tools from physics to improve optimization dynamics. Inspired by the physical framework, we propose LEAD, an optimizer for min-max games. Next, using Lyapunov stability theory from dynamical systems as well as spectral analysis, we study LEAD’s convergence properties in continuous and discrete time settings for a class of quadratic min-max games to demonstrate linear convergence to the Nash equilibrium. Finally, we empirically evaluate our method on synthetic setups and CIFAR-10 image generation to demonstrate improvements in GAN training.
Learning GFlowNets from partial episodes for improved convergence and stability
Kanika Madan
Jarrid Rector-Brooks
Maksym Korablyov
Emmanuel Bengio
Moksh J. Jain
Andrei Cristian Nica
Tom Bosc
Nikolay Malkin
Generative flow networks (GFlowNets) are a family of algorithms for training a sequential sampler of discrete objects under an unnormalized … (voir plus)target density and have been successfully used for various probabilistic modeling tasks. Existing training objectives for GFlowNets are either local to states or transitions, or propagate a reward signal over an entire sampling trajectory. We argue that these alternatives represent opposite ends of a gradient bias-variance tradeoff and propose a way to exploit this tradeoff to mitigate its harmful effects. Inspired by the TD(
Learning Syntactic Monoids from Samples by extending known Algorithms for learning State Machines
Simon Dieck
Sicco Verwer
François Coste
Faissal Ouardi
For the inference of regular languages, most current methods learn a version of deterministic finite automata. Syntactic monoids are an alte… (voir plus)rnative representation of regular languages, which have some advantages over automata. For example, traces can be parsed starting from any index and the star-freeness of the language they represent can be checked in polynomial time. But, to date, there existed no passive learning algorithm for syntactic monoids. In this paper, we prove that known state-merging algorithms for learning deterministic finite automata can be instrumented to learn syntactic monoids instead, by using as the input a special structure proposed in this paper: the interfix-graph. Further, we introduce a method to encode frequencies on the interfix-graph, such that models can also be learned from only positive traces. We implemented this structure and performed experiments with both traditional data and data containing only positive traces. As such this work answers basic theoretical and experimental questions regarding a novel passive learning algorithm for syntactic monoids.
On learning Whittle index policy for restless bandits with scalable regret
Nima Akbarzadeh
Reinforcement learning is an attractive approach to learn good resource allocation and scheduling policies based on data when the system mod… (voir plus)el is unknown. However, the cumulative regret of most RL algorithms scales as ˜ O(S