RL4Med-DDPO: Reinforcement Learning for Controlled Guidance Towards Diverse Medical Image Generation using Vision-Language Foundation Models
Parham Saremi
Amar Kumar
Mohammed Mohammed
Zahra Tehraninasab
SafeArena: Evaluating the Safety of Autonomous Web Agents
Ada Defne Tur
Nicholas Meade
Xing Han Lu
Alejandra Zambrano
Arkil Patel
Esin Durmus
Spandana Gella
Karolina Stańczak
Self-adaptive cyber defense for sustainable IoT: A DRL-based IDS optimizing security and energy efficiency
Saeid Jamshidi
Ashkan Amirnia
Amin Nikanjam
Kawser Wazed Nafi
Samira Keivanpour
SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection
Shamsuddeen Hassan Muhammad
Nedjma OUSIDHOUM
Idris Abdulmumin
Seid Muhie Yimam
Jan Philip Wahle
Terry Lima Ruas
Meriem Beloucif
Christine de Kock
Tadesse Belay
Ibrahim Ahmad
Nirmal Surange
Daniela Teodorescu
Alham Fikri Aji
Felermino Ali
Vladimir Araujo
Abinew Ayele
Oana Ignat
Alexander Panchenko
Yi Zhou … (1 more)
Saif M. Mohammad
Societal Alignment Frameworks Can Improve LLM Alignment
Karolina Stańczak
Nicholas Meade
Mehar Bhatia
Hattie Zhou
Konstantin Bottinger
Jeremy Barnes
Jason Stanley
Jessica Montgomery
Richard Zemel
Nicolas Papernot
Denis Therien
Timothy P. Lillicrap
Ana Marasović
Sylvie Delacroix
Gillian K. Hadfield
Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared values, a process termed alignment. However, aligning LLMs remains challenging due to the inherent disconnect between the complexity of human values and the narrow nature of the technological approaches designed to address them. Current alignment methods often lead to misspecified objectives, reflecting the broader issue of incomplete contracts: the impracticality of specifying a contract between a model developer and the model that accounts for every scenario in LLM alignment. In this paper, we argue that improving LLM alignment requires incorporating insights from societal alignment frameworks, including social, economic, and contractual alignment, and discuss potential solutions drawn from these domains. Given the role of uncertainty within societal alignment frameworks, we then investigate how it manifests in LLM alignment. We end our discussion by offering an alternative view on LLM alignment, framing the underspecified nature of its objectives as an opportunity rather than seeking to perfect their specification. Beyond technical improvements in LLM alignment, we discuss the need for participatory alignment interface designs.
Spectral State Space Model for Rotation-Invariant Visual Representation Learning
Sahar Dastani
Ali Bahri
Moslem Yazdanpanah
Mehrdad Noori
David Osowiechi
Gustavo Adolfo Vargas Hakim
Farzad Beizaee
Milad Cheraghalikhani
Arnab Kumar Mondal
Christian Desrosiers
Tractable Representations for Convergent Approximation of Distributional HJB Equations
Julie Alhosh
Harley Wiltzer
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction
Shravan Nayak
Xiangru Jian
Kevin Qinghong Lin
Juan A. Rodriguez
Montek Kalsi
Rabiul Awal
M. T. Özsu
David Vazquez
Perouz Taslakian
Spandana Gella
Sai Rajeswar
Human Annotator
Unveiling Inefficiencies in LLM-Generated Code: Toward a Comprehensive Taxonomy
Altaf Allah Abbassi
Leuson Da Silva
Amin Nikanjam
PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion
Amar Kumar
Anita Kriz
Mohammad Havaei
Developing reliable and generalizable deep learning systems for medical imaging faces significant obstacles due to spurious correlations, data imbalances, and limited text annotations in datasets. Addressing these challenges requires architectures robust to the unique complexities posed by medical imaging data. The rapid advancements in vision-language foundation models within the natural image domain prompt the question of how they can be adapted for medical imaging tasks. In this work, we present PRISM, a framework that leverages foundation models to generate high-resolution, language-guided medical image counterfactuals using Stable Diffusion. Our approach demonstrates unprecedented precision in selectively modifying spurious correlations (the medical devices) and disease features, enabling the removal and addition of specific attributes while preserving other image characteristics. Through extensive evaluation, we show how PRISM advances counterfactual generation and enables the development of more robust downstream classifiers for clinically deployable solutions. To facilitate broader adoption and research, we make our code publicly available at https://github.com/Amarkr1/PRISM.
Steering Large Language Model Activations in Sparse Spaces
Reza Bayat
Ali Rahimi-Kalahroudi
Mohammad Pezeshki
A key challenge in AI alignment is guiding large language models (LLMs) to follow desired behaviors at test time. Activation steering, which modifies internal model activations during inference, offers a potential solution. However, prior work in dense activation spaces struggles with superposition, wherein multiple features become entangled, limiting interpretability and precise control. In contrast, sparse representations provide an untapped opportunity for more interpretable behavior modulation. In this work, we introduce sparse activation steering (SAS), a method that leverages sparse autoencoders (SAEs) to steer LLM behavior in sparse spaces. By isolating behavior-specific features through a contrastive prompt-pairing approach, we define a set of features that can selectively reinforce or suppress behaviors. Experiments on Gemma 2 LLMs show that SAS vectors enable nuanced behavioral modulation and finer-grained control. Furthermore, scaling SAEs improves monosemanticity of SAS vectors, suggesting more reliable and interpretable interventions.
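As a rough illustration of the idea in this abstract, the toy sketch below builds a steering vector in a sparse feature space from contrastive activation sets and adds its decoded form back to a dense activation. It is not the authors' implementation: the random weights stand in for a trained SAE, and the names `encode`, `decode`, `steering_vector`, `steer`, the `top_k` selection rule, and the `strength` parameter are all assumptions made for illustration.

```python
import random

random.seed(0)
D_MODEL, D_SAE = 4, 12  # hypothetical toy sizes, not taken from the paper

# Random weights stand in for a trained sparse autoencoder (assumption).
W_ENC = [[random.gauss(0, 1) for _ in range(D_SAE)] for _ in range(D_MODEL)]
W_DEC = [[random.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(D_SAE)]


def encode(x):
    """Dense activation -> non-negative sparse features (linear map + ReLU)."""
    return [max(0.0, sum(x[i] * W_ENC[i][j] for i in range(D_MODEL)))
            for j in range(D_SAE)]


def decode(f):
    """Sparse features -> dense activation space."""
    return [sum(f[j] * W_DEC[j][i] for j in range(D_SAE))
            for i in range(D_MODEL)]


def mean_features(acts):
    """Average SAE feature activations over a set of dense activations."""
    feats = [encode(a) for a in acts]
    return [sum(col) / len(feats) for col in zip(*feats)]


def steering_vector(pos_acts, neg_acts, top_k=3):
    """Mean feature difference between the two contrastive sets,
    keeping only the top_k features by magnitude (assumed selection rule)."""
    pos, neg = mean_features(pos_acts), mean_features(neg_acts)
    diff = [p - n for p, n in zip(pos, neg)]
    ranked = sorted(range(D_SAE), key=lambda j: abs(diff[j]), reverse=True)
    keep = set(ranked[:top_k])
    return [d if j in keep else 0.0 for j, d in enumerate(diff)]


def steer(x, sparse_vec, strength=1.0):
    """Add the decoded steering direction to a dense activation.
    Positive strength reinforces the behavior; negative suppresses it."""
    delta = decode(sparse_vec)
    return [xi + strength * di for xi, di in zip(x, delta)]
```

In this sketch, `pos_acts` and `neg_acts` would be activations collected from prompt pairs that do and do not exhibit the target behavior; restricting the vector to a few features is one simple way to keep the intervention sparse.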