Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Multimedia Player
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
Efficient Deep Reinforcement Learning-Based Supplementary Damping Control With a Coordinated RMS Training and EMT Testing Scheme
Inverter-based resources (IBRs) can cause instability in weak AC grids. While supplementary damping controllers (SDCs) effectively mitigate … (voir plus)this instability, they are typically designed for specific resonance frequencies but struggle with large shifts caused by changing grid conditions. This paper proposes a deep reinforcement learning-based agent (DRL Agent) as an adaptive SDC to handle shifted resonance frequencies. To address the time-consuming nature of training DRL Agents in electromagnetic transient (EMT) simulations, we coordinate fast root mean square (RMS) and EMT simulations. Resonance frequencies of the weak grid instability are accurately reproduced by RMS simulations to support the training process. The DRL Agent’s efficacy is tested in unseen scenarios outside the training dataset. We then iteratively improve the DRL Agent’s performance by modifying the reward function and hyper-parameters.
Inverter-based resources (IBRs) can cause instability in weak AC grids. While supplementary damping controllers (SDCs) effectively mitigate … (voir plus)this instability, they are typically designed for specific resonance frequencies but struggle with large shifts caused by changing grid conditions. This paper proposes a deep reinforcement learning-based agent (DRL Agent) as an adaptive SDC to handle shifted resonance frequencies. To address the time-consuming nature of training DRL Agents in electromagnetic transient (EMT) simulations, we coordinate fast root mean square (RMS) and EMT simulations. Resonance frequencies of the weak grid instability are accurately reproduced by RMS simulations to support the training process. The DRL Agent’s efficacy is tested in unseen scenarios outside the training dataset. We then iteratively improve the DRL Agent’s performance by modifying the reward function and hyper-parameters.
We explore three strategies to enhance performance on a wide range of image editing tasks: supervised fine-tuning (SFT), reinforcement learn… (voir plus)ing (RL), and Chain-of-Thought (CoT) reasoning. In order to study all these components in one consistent framework, we adopt an autoregressive multimodal model that processes textual and visual tokens in a unified manner. We find RL combined with a large multi-modal LLM verifier to be the most effective of these strategies. As a result, we release EARL: Editing with Autoregression and RL, a strong RL-based image editing model that performs competitively on a diverse range of edits compared to strong baselines, despite using much less training data. Thus, EARL pushes the frontier of autoregressive multimodal models on image editing. We release our code, training data, and trained models at https://github.com/mair-lab/EARL.
Traditional recommendation systems represent user preferences in dense representations obtained through black-box encoder models. While thes… (voir plus)e models often provide strong recommendation performance, they lack interpretability for users, leaving users unable to understand or control the system’s modeling of their preferences. This limitation is especially challenging in music recommendation, where user preferences are highly personal and often evolve based on nuanced qualities like mood, genre, tempo, or instrumentation.
In this paper, we propose an audio prototypical network for controllable music recommendation. This network expresses user preferences in terms of prototypes representative of semantically meaningful features pertaining to musical qualities. We show that the model obtains competitive recommendation performance compared to popular baseline models while also providing interpretable and controllable user profiles.
Literature reviews are an essential component of scientific research, but they remain time-intensive and challenging to write, especially du… (voir plus)e to the recent influx of research papers. This paper explores the zero-shot abilities of recent Large Language Models (LLMs) in assisting with the writing of literature reviews based on an abstract.
We decompose the task into two components: (1) Retrieving related works given a query abstract and (2) Writing a literature review based on the retrieved results. We analyze how effective LLMs are for both components.
For retrieval, we introduce a novel two-step search strategy that first uses an LLM to extract meaningful keywords from the abstract of a paper and then retrieves potentially relevant papers by querying an external knowledge base. Additionally, we study a prompting-based re-ranking mechanism with attribution and show that re-ranking doubles the normalized recall compared to naive search methods while providing insights into the LLM's decision-making process.
In the generation phase, we propose a two-step approach that first outlines a plan for the review and then executes steps in the plan to generate the actual review.
To evaluate different LLM-based literature review methods, we create test sets from arXiv papers using a protocol designed for rolling use with newly released LLMs to avoid test set contamination in zero-shot evaluations.
We release this evaluation protocol to promote additional research and development in this regard.
Our empirical results suggest that LLMs show promising potential for writing literature reviews when the task is decomposed into smaller components of retrieval and planning.
Particularly, our ``Deep Research" retrieval variant improves coverage by over 5x compared to standard keyword search, addressing a key bottleneck in the pipeline.
Further, we demonstrate that our planning-based approach achieves higher-quality reviews by minimizing hallucinated references in the generated review by 18-26\% compared to existing simpler LLM-based generation methods.
Any agents we can possibly build are subject to capacity constraints, as memory and compute resources are inherently finite. However, compar… (voir plus)atively little attention has been dedicated to understanding how agents with limited capacity should allocate their resources for optimal performance. The goal of this paper is to shed some light on this question by studying a simple yet relevant continual learning problem: the capacity-constrained linear-quadratic-Gaussian (LQG) sequential prediction problem. We derive a solution to this problem under appropriate technical conditions. Moreover, for problems that can be decomposed into a set of sub-problems, we also demonstrate how to optimally allocate capacity across these sub-problems in the steady state. We view the results of this paper as a first step in the systematic theoretical study of learning under capacity constraints.