Publications

Efficient Deep Reinforcement Learning-Based Supplementary Damping Control With a Coordinated RMS Training and EMT Testing Scheme

Tao Xue

Mingxuan Zhao

Ilhan Kocar

Mohsen Ghafouri

Antoine Lesage-Landry

Siqi Bu

Ziqing Zhu

Inverter-based resources (IBRs) can cause instability in weak AC grids. While supplementary damping controllers (SDCs) effectively mitigate … (voir plus)this instability, they are typically designed for specific resonance frequencies but struggle with large shifts caused by changing grid conditions. This paper proposes a deep reinforcement learning-based agent (DRL Agent) as an adaptive SDC to handle shifted resonance frequencies. To address the time-consuming nature of training DRL Agents in electromagnetic transient (EMT) simulations, we coordinate fast root mean square (RMS) and EMT simulations. Resonance frequencies of the weak grid instability are accurately reproduced by RMS simulations to support the training process. The DRL Agent’s efficacy is tested in unseen scenarios outside the training dataset. We then iteratively improve the DRL Agent’s performance by modifying the reward function and hyper-parameters.

2025-08-01

IEEE Transactions on Power Delivery (publié)

doi.org

Efficient Deep Reinforcement Learning-Based Supplementary Damping Control With a Coordinated RMS Training and EMT Testing Scheme

Tao Xue

Mingxuan Zhao

Ilhan Kocar

Mohsen Ghafouri

Antoine Lesage-Landry

Siqi Bu

Ziqing Zhu

Inverter-based resources (IBRs) can cause instability in weak AC grids. While supplementary damping controllers (SDCs) effectively mitigate … (voir plus)this instability, they are typically designed for specific resonance frequencies but struggle with large shifts caused by changing grid conditions. This paper proposes a deep reinforcement learning-based agent (DRL Agent) as an adaptive SDC to handle shifted resonance frequencies. To address the time-consuming nature of training DRL Agents in electromagnetic transient (EMT) simulations, we coordinate fast root mean square (RMS) and EMT simulations. Resonance frequencies of the weak grid instability are accurately reproduced by RMS simulations to support the training process. The DRL Agent’s efficacy is tested in unseen scenarios outside the training dataset. We then iteratively improve the DRL Agent’s performance by modifying the reward function and hyper-parameters.

2025-08-01

IEEE Transactions on Power Delivery (publié)

doi.org

The Promise of RL for Autoregressive Image Editing

Saba Ahmadi

Rabiul Awal

Ankur Sikarwar

Amirhossein Kazemnejad

Ge Ya Luo

Juan A. Rodriguez

Sai Rajeswar

We explore three strategies to enhance performance on a wide range of image editing tasks: supervised fine-tuning (SFT), reinforcement learn… (voir plus)ing (RL), and Chain-of-Thought (CoT) reasoning. In order to study all these components in one consistent framework, we adopt an autoregressive multimodal model that processes textual and visual tokens in a unified manner. We find RL combined with a large multi-modal LLM verifier to be the most effective of these strategies. As a result, we release EARL: Editing with Autoregression and RL, a strong RL-based image editing model that performs competitively on a diverse range of edits compared to strong baselines, despite using much less training data. Thus, EARL pushes the frontier of autoregressive multimodal models on image editing. We release our code, training data, and trained models at https://github.com/mair-lab/EARL.

2025-08-01

ArXiv (prépublication)

arxiv.org

Advancing science- and evidence-based AI policy.

Rishi Bommasani

Sanjeev Arora

Jennifer Chayes

Yejin Choi

Mariano-Florentino Cuéllar

Li Fei-Fei

Daniel E. Ho

Dan Jurafsky

Sanmi Koyejo

Hima Lakkaraju

Arvind Narayanan

Alondra Nelson

Emma Pierson

Joelle Pineau

Scott R. Singer

Gael Varoquaux

Suresh Venkatasubramanian

Ion Stoica

Percy Liang

Dawn Song

Policy must be informed by, but also facilitate the generation of, scientific evidence.

2025-07-31

ArXiv (prépublication)

doi.org

arxiv.org

Advancing science- and evidence-based AI policy.

Rishi Bommasani

Sanjeev Arora

Jennifer Chayes

Yejin Choi

Mariano-Florentino Cuéllar

Li Fei-Fei

Daniel E. Ho

Dan Jurafsky

Sanmi Koyejo

Hima Lakkaraju

Arvind Narayanan

Alondra Nelson

Emma Pierson

Joelle Pineau

Scott Singer

Gael Varoquaux

Suresh Venkatasubramanian

Ion Stoica

Percy Liang

Dawn Song

2025-07-31

Science (publié)

doi.org

arxiv.org

Audio Prototypical Network for Controllable Music Recommendation

Traditional recommendation systems represent user preferences in dense representations obtained through black-box encoder models. While thes… (voir plus)e models often provide strong recommendation performance, they lack interpretability for users, leaving users unable to understand or control the system’s modeling of their preferences. This limitation is especially challenging in music recommendation, where user preferences are highly personal and often evolve based on nuanced qualities like mood, genre, tempo, or instrumentation. In this paper, we propose an audio prototypical network for controllable music recommendation. This network expresses user preferences in terms of prototypes representative of semantically meaningful features pertaining to musical qualities. We show that the model obtains competitive recommendation performance compared to popular baseline models while also providing interpretable and controllable user profiles.

2025-07-31

ArXiv (prépublication)

openreview.net

Computing Approximate Nash Equilibria for Integer Programming Games

Aloïs Duguet

Margarida Carvalho

Gabriele Dragotto

Sandra-ulrich Ngueveu

2025-07-31

Optimization Letters (publié)

doi.org

arxiv.org

Evaluating and Improving LitLLMs with Deep Research

Gaurav Sahu

Shubham Agarwal

Abhay Puri

Issam Hadj Laradji

Krishnamurthy Dj Dvijotham

Jason Stanley

Laurent Charlin

Chris Pal

Literature reviews are an essential component of scientific research, but they remain time-intensive and challenging to write, especially du… (voir plus)e to the recent influx of research papers. This paper explores the zero-shot abilities of recent Large Language Models (LLMs) in assisting with the writing of literature reviews based on an abstract. We decompose the task into two components: (1) Retrieving related works given a query abstract and (2) Writing a literature review based on the retrieved results. We analyze how effective LLMs are for both components. For retrieval, we introduce a novel two-step search strategy that first uses an LLM to extract meaningful keywords from the abstract of a paper and then retrieves potentially relevant papers by querying an external knowledge base. Additionally, we study a prompting-based re-ranking mechanism with attribution and show that re-ranking doubles the normalized recall compared to naive search methods while providing insights into the LLM's decision-making process. In the generation phase, we propose a two-step approach that first outlines a plan for the review and then executes steps in the plan to generate the actual review. To evaluate different LLM-based literature review methods, we create test sets from arXiv papers using a protocol designed for rolling use with newly released LLMs to avoid test set contamination in zero-shot evaluations. We release this evaluation protocol to promote additional research and development in this regard. Our empirical results suggest that LLMs show promising potential for writing literature reviews when the task is decomposed into smaller components of retrieval and planning. Particularly, our ``Deep Research" retrieval variant improves coverage by over 5x compared to standard keyword search, addressing a key bottleneck in the pipeline. Further, we demonstrate that our planning-based approach achieves higher-quality reviews by minimizing hallucinated references in the generated review by 18-26\% compared to existing simpler LLM-based generation methods.

2025-07-31

colmweb.org/COLM/2025/Workshop/LM4Sci (publié)

openreview.net

Towards a General Recipe for Combinatorial Optimization with Multi-Filter GNNs

Michael Perlmutter

2025-07-30

Proceedings of the Third Learning on Graphs Conference (publié)

proceedings.mlr.press

openreview.net

UTG: Towards a Unified View of Snapshot and Event Based Models for Temporal Graphs

Emanuele Rossi

2025-07-30

Proceedings of the Third Learning on Graphs Conference (publié)

doi.org

openreview.net

Capacity-Constrained Continual Learning

Zheng Wen

Doina Precup

Benjamin Van Roy

Satinder Singh

2025-07-29

ArXiv (prépublication)

arxiv.org

Capacity-Constrained Continual Learning

Zheng Wen

Doina Precup

Benjamin Van Roy

Satinder Singh

Any agents we can possibly build are subject to capacity constraints, as memory and compute resources are inherently finite. However, compar… (voir plus)atively little attention has been dedicated to understanding how agents with limited capacity should allocate their resources for optimal performance. The goal of this paper is to shed some light on this question by studying a simple yet relevant continual learning problem: the capacity-constrained linear-quadratic-Gaussian (LQG) sequential prediction problem. We derive a solution to this problem under appropriate technical conditions. Moreover, for problems that can be decomposed into a set of sub-problems, we also demonstrate how to optimally allocate capacity across these sub-problems in the steady state. We view the results of this paper as a first step in the systematic theoretical study of learning under capacity constraints.

2025-07-29

ArXiv (prépublication)

arxiv.org

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Publications

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Mots-clés populaires:

Publications