Publications

On the Transfer of Object-Centric Representation Learning.
Aniket Rajiv Didolkar
Andrii Zadaianchuk
Michael Curtis Mozer
Georg Martius
Maximilian Seitzer
Towards General-Purpose Model-Free Reinforcement Learning
Yuandong Tian
Michael G. Rabbat
Reinforcement learning (RL) promises a framework for near-universal problem-solving. In practice, however, RL algorithms are often tailored to specific benchmarks, relying on carefully tuned hyperparameters and algorithmic choices. Recently, powerful model-based RL methods have shown impressive general results across benchmarks but come at the cost of increased complexity and slow run times, limiting their broader applicability. In this paper, we attempt to find a unifying model-free deep RL algorithm that can address a diverse class of domains and problem settings. To achieve this, we leverage model-based representations that approximately linearize the value function, taking advantage of the denser task objectives used by model-based RL while avoiding the costs associated with planning or simulated trajectories. We evaluate our algorithm, MR.Q, on a variety of common RL benchmarks with a single set of hyperparameters and show a competitive performance against domain-specific and general baselines, providing a concrete step towards building general-purpose model-free deep RL algorithms.
Towards Improving Exploration Through Sibling Augmented GFlowNets
Towards Interpreting Visual Information Processing in Vision-Language Models
Clement Neo
Luke Ong
Philip Torr
Mor Geva
David M. Krueger
Fazl Barez
Towards whole-genome inference of polygenic scores with fast and memory-efficient algorithms
Chirayu Anant Haryan
Simon Gravel
Sanchit Misra
Yuemei Li
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning
Zhaohan Daniel Guo
Bernardo Avila Pires
Yunhao Tang
Clare Lyle
Mark Rowland
Nicolas Heess
Diana Borsa
Arthur Guez
Will Dabney
Enhancing Privacy in the Early Detection of Sexual Predators Through Federated Learning and Differential Privacy
The increased screen time and isolation caused by the COVID-19 pandemic have led to a significant surge in cases of online grooming, which is the use of strategies by predators to lure children into sexual exploitation. Previous efforts to detect grooming in industry and academia have involved accessing and monitoring private conversations through centrally-trained models or sending private conversations to a global server. In this work, we implement a privacy-preserving pipeline for the early detection of sexual predators. We leverage federated learning and differential privacy in order to create safer online spaces for children while respecting their privacy. We investigate various privacy-preserving implementations and discuss their benefits and shortcomings. Our extensive evaluation using real-world data proves that privacy and utility can coexist with only a slight reduction in utility.
Supervised Large Neighbourhood Search for MIPs
Charly Robinson La Rocca
Jean-François Cordeau
Large Neighbourhood Search (LNS) is a powerful heuristic framework for solving Mixed-Integer Programming (MIP) problems. However, designing effective variable selection strategies in LNS remains challenging, especially for diverse sets of problems. In this paper, we propose an approach that integrates Machine Learning (ML) within the destroy operator of LNS for MIPs with a focus on minimal offline training. We implement a modular LNS matheuristic as a test bench to compare different LNS heuristics, including our ML-enhanced LNS. Experimental results on the MIPLIB 2017 dataset demonstrate that the matheuristic can significantly improve the performance of state-of-the-art solvers like Gurobi and SCIP. We conduct analyses on noisy oracles to explore the impact of prediction accuracy on solution quality. Additionally, we develop techniques to enhance the ML model through loss adjustments and sampling routines. Our findings suggest that while random LNS remains competitive, our Supervised LNS (SLNS) outperforms other baselines and helps set the foundation for future research on ML for LNS methods that are both efficient and general.
CHIRP: A Fine-Grained Benchmark for Open-Ended Response Evaluation in Vision-Language Models
Daniel Z Kaplan
Qirui Sun
Jonathan Siu Chi Lim
Quentin Gregory Anthony
Edwin Fennell
The proliferation of Vision-Language Models (VLMs) in the past several years calls for rigorous and comprehensive evaluation methods and benchmarks. This work analyzes existing VLM evaluation techniques, including automated metrics, AI-based assessments, and human evaluations across diverse tasks. We first introduce Robin - a novel suite of VLMs that we built by combining Large Language Models (LLMs) and Vision Encoders (VEs) at multiple scales, and use Robin to identify shortcomings of current evaluation approaches across scales. Next, to overcome the identified limitations, we introduce CHIRP - a new long-form response benchmark we developed for more robust and complete VLM evaluation. We provide open access to the Robin training code, model suite, and CHIRP benchmark to promote reproducibility and advance VLM research.
AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages
Shamsuddeen Hassan Muhammad
Idris Abdulmumin
Abinew Ayele
Ibrahim Ahmad
Saminu Mohammad Aliyu
Nelson Odhiambo Onyango
Lilian D. A. Wanzare
Samuel Rutunda
Lukman Jibril Aliyu
Esubalew Alemneh
Oumaima Hourrane
Hagos Gebremichael
Elyas Abdi Ismail
Meriem Beloucif
Ebrahim Chekol Jibril
Andiswa Bukula
Rooweither Mabuya
Salomey Osei
Abigail Oppong
Tadesse Belay
Tadesse Kebede Guge
Tesfa Tegegne Asfaw
Chiamaka Ijeoma Chukwuneke
Paul Rottger
Seid Muhie Yimam
Nedjma OUSIDHOUM
Hate speech and abusive language are global phenomena that need socio-cultural background knowledge to be understood, identified, and moderated. However, in many regions of the Global South, there have been several documented occurrences of (1) absence of moderation and (2) censorship due to the reliance on keyword spotting out of context. Further, high-profile individuals have frequently been at the center of the moderation process, while large and targeted hate speech campaigns against minorities have been overlooked. These limitations are mainly due to the lack of high-quality data in the local languages and the failure to include local communities in the collection, annotation, and moderation processes. To address this issue, we present AfriHate: a multilingual collection of hate speech and abusive language datasets in 15 African languages. Each instance in AfriHate is annotated by native speakers familiar with the local culture. We report the challenges related to the construction of the datasets and present various classification baseline results with and without using LLMs. The datasets, individual annotations, and hate speech and offensive language lexicons are available on https://github.com/AfriHate/AfriHate
Integrating food webs in species distribution models can improve ecological niche estimation and predictions
Giovanni Poggiato
Jérémy Andréoletti
Wilfried Thuiller
Multi-agent deep reinforcement learning with online and fair optimal dispatch of EV aggregators
Anoosh Dini
Keyhan Sheshyekani
The growing popularity of electric vehicles (EVs) and the unpredictable behavior of EV owners have attracted attention to real-time coordination of EV charging management. This paper presents a hierarchical structure for charging management of EVs by integrating fairness and efficiency concepts within the operations of the distribution system operator (DSO) while utilizing a multi-agent deep reinforcement learning (MADRL) framework to tackle the complexities of energy purchasing and distribution among EV aggregators (EVAs). At the upper level, the DSO calculates the maximum allowable power for each EVA based on power flow constraints to ensure grid safety. Then, it finds the optimal efficiency-Jain tradeoff (EJT) point, where it sells the highest energy amount while ensuring equitable energy distribution. At the lower level, each EVA first acts as an agent employing a double deep Q-network (DDQN) with adaptive learning rates and prioritized experience replay to determine optimal energy purchases from the DSO. Then, the real-time smart dispatch (RSD) controller prioritizes EVs for energy dispatch based on relevant EV information. Findings indicate that the proposed enhanced DDQN outperforms deep deterministic policy gradient (DDPG) and proximal policy optimization (PPO) in cumulative rewards and convergence speed. Finally, the framework’s performance is evaluated against uncontrolled charging and the first-come, first-served (FCFS) scenario using the 118-bus distribution system, demonstrating superior performance in maintaining safe operation of the grid while reducing charging costs for EVAs. Additionally, the framework’s integration with renewable energy sources (RESs), such as photovoltaic (PV), demonstrates its potential to enhance grid reliability.
• Introduces a scalable MADRL framework for real-time EV charging and energy distribution.
• Ensures fairness via an Efficiency-Jain Tradeoff (EJT) strategy at the DSO level.
• Enhances agent convergence with DDQN using adaptive learning rates and prioritized replay.
• Preserves stakeholder privacy with decentralized control and minimal data sharing.
• Balances grid reliability with equitable energy allocation under dynamic uncertainties.
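As a reading aid for the efficiency-Jain tradeoff mentioned above, the following minimal sketch computes the standard Jain fairness index, J(x) = (sum of x)^2 / (n * sum of x^2), for a list of allocations. It is not the paper's EJT procedure: the function name `jain_index`, the per-EVA energy values, and the convention for an all-zero allocation are assumptions made only for illustration.

```python
from typing import Sequence


def jain_index(allocations: Sequence[float]) -> float:
    """Standard Jain fairness index: 1.0 means perfectly equal shares,
    1/n means a single participant receives everything."""
    n = len(allocations)
    if n == 0:
        raise ValueError("allocations must not be empty")
    total = sum(allocations)
    sum_sq = sum(x * x for x in allocations)
    if sum_sq == 0:
        # Assumption of this sketch: an all-zero allocation counts as equal.
        return 1.0
    return (total * total) / (n * sum_sq)


# Hypothetical energy amounts (kWh) sold to four EV aggregators.
equal_split = [25.0, 25.0, 25.0, 25.0]
skewed_split = [70.0, 10.0, 10.0, 10.0]

print(jain_index(equal_split))
print(jain_index(skewed_split))
```

Both example splits deliver the same total energy, but the skewed one scores lower on the index, which is the kind of tension between total energy sold and equitable distribution that the abstract attributes to the EJT point.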