Steering Large Language Model Activations in Sparse Spaces
Reza Bayat
Ali Rahimi-Kalahroudi
Mohammad Pezeshki
A key challenge in AI alignment is guiding large language models (LLMs) to follow desired behaviors at test time. Activation steering, which modifies internal model activations during inference, offers a potential solution. However, prior work in dense activation spaces struggles with superposition, wherein multiple features become entangled, limiting interpretability and precise control. In contrast, sparse representations provide an untapped opportunity for more interpretable behavior modulation. In this work, we introduce sparse activation steering (SAS), a method that leverages sparse autoencoders (SAEs) to steer LLM behavior in sparse spaces. By isolating behavior-specific features through a contrastive prompt-pairing approach, we define a set of features that can selectively reinforce or suppress behaviors. Experiments on Gemma 2 LLMs show that SAS vectors enable nuanced behavioral modulation and finer-grained control. Furthermore, scaling SAEs improves the monosemanticity of SAS vectors, suggesting more reliable and interpretable interventions.
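As a rough illustration of the contrastive prompt-pairing idea described in the abstract, the sketch below builds a sparse steering vector from the mean difference of SAE codes and applies it during inference. It assumes a pretrained SAE object exposing `encode`/`decode`; that interface and the `top_k`/`alpha` parameters are illustrative, not the paper's API.

```python
import torch

def sas_vector(sae, acts_pos, acts_neg, top_k=32):
    """Build a sparse activation steering (SAS) vector from contrastive
    prompt pairs: encode activations into the SAE's sparse space, take the
    mean difference, and keep only the strongest features. `sae.encode` /
    `sae.decode` are assumed interfaces, not the authors' released API."""
    z_pos = sae.encode(acts_pos).mean(dim=0)  # mean sparse code, desired behavior
    z_neg = sae.encode(acts_neg).mean(dim=0)  # mean sparse code, opposite behavior
    diff = z_pos - z_neg
    # Keep the top-k behavior-specific features; zero the rest to stay sparse.
    idx = diff.abs().topk(top_k).indices
    steer = torch.zeros_like(diff)
    steer[idx] = diff[idx]
    return steer

def apply_sas(sae, resid, steer, alpha=4.0):
    """Steer in sparse space, then decode back to the dense residual stream.
    Positive alpha reinforces the behavior; negative alpha suppresses it."""
    z = sae.encode(resid)
    return sae.decode(z + alpha * steer)
```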
Assessing the adoption of security policies by developers in Terraform across different cloud providers
Alexandre Verdet
Mohammad Hamdaqa
Leuson Da Silva
LIVS: A Pluralistic Alignment Dataset for Inclusive Public Spaces
Rashid A. Mushkani
Shravan Nayak
Hugo Berard
Allison Cohen
Hadrien Bertrand
We introduce the Local Intersectional Visual Spaces (LIVS) dataset, a benchmark for multi-criteria alignment of text-to-image (T2I) models in inclusive urban planning. Developed through a two-year participatory process with 30 community organizations, LIVS encodes diverse spatial preferences across 634 initial concepts, consolidated through 37,710 pairwise comparisons into six core criteria: Accessibility, Safety, Comfort, Invitingness, Inclusivity, and Diversity. Using Direct Preference Optimization (DPO) to fine-tune Stable Diffusion XL, we observed a measurable increase in alignment with community preferences, though a significant proportion of neutral ratings highlights the complexity of modeling intersectional needs. Additionally, as annotation volume increases, accuracy shifts further toward the DPO-tuned model, suggesting that larger-scale preference data enhances fine-tuning effectiveness. LIVS underscores the necessity of integrating context-specific, stakeholder-driven criteria into generative modeling and provides a resource for evaluating AI alignment methodologies across diverse socio-spatial contexts.
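For readers unfamiliar with the preference objective mentioned above, here is the standard scalar DPO loss on a single pairwise comparison. The paper applies a diffusion-adapted variant of this objective to Stable Diffusion XL, so this Python sketch shows only the core idea; all names are illustrative.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Direct Preference Optimization on one pairwise comparison: push the
    fine-tuned model to prefer the chosen sample (w) over the rejected one
    (l) relative to a frozen reference model."""
    ratio_w = logp_w - ref_logp_w  # log-prob improvement on the preferred sample
    ratio_l = logp_l - ref_logp_l  # log-prob improvement on the rejected sample
    return -F.logsigmoid(beta * (ratio_w - ratio_l)).mean()
```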
Societal Alignment Frameworks Can Improve LLM Alignment
Karolina Stańczak
Nicholas Meade
Mehar Bhatia
Hattie Zhou
Konstantin Bottinger
Jeremy Barnes
Jason Stanley
Jessica Montgomery
Richard Zemel
Nicolas Papernot
Denis Therien
Timothy P. Lillicrap
Ana Marasović
Sylvie Delacroix
Gillian K. Hadfield
Combining Sampling Methods with Attractor Dynamics in Spiking Models of Head-Direction Systems
Vojko Pjanovic
Jacob Zavatone-Veth
Sander Keemink
Michele Nardin
Uncertainty is a fundamental aspect of the natural environment, requiring the brain to infer and integrate noisy signals to guide behavior effectively. Sampling-based inference has been proposed as a mechanism for dealing with uncertainty, particularly in early sensory processing. However, it is unclear how to reconcile sampling-based methods with operational principles of higher-order brain areas, such as attractor dynamics of persistent neural representations. In this study, we present a spiking neural network model for the head-direction (HD) system that combines sampling-based inference with attractor dynamics. To achieve this, we derive the required spiking neural network dynamics and interactions to perform sampling from a large family of probability distributions—including variables encoded with Poisson noise. We then propose a method that allows the network to update its estimate of the current head direction by integrating angular velocity samples—derived from noisy inputs—with a pull towards a circular manifold, thereby maintaining consistent attractor dynamics. This model makes specific, testable predictions about the HD system that can be examined in future neurophysiological experiments: it predicts correlated subthreshold voltage fluctuations; distinctive short- and long-term firing correlations among neurons; and characteristic statistics of the movement of the neural activity “bump” representing the head direction. Overall, our approach extends previous theories on probabilistic sampling with spiking neurons, offers a novel perspective on the computations responsible for orientation and navigation, and supports the hypothesis that sampling-based methods can be combined with attractor dynamics to provide a viable framework for studying neural dynamics across the brain.
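As a rate-based caricature of the mechanism the abstract describes (the spiking derivation is the paper's contribution), the sketch below combines a noisy angular-velocity sample with a pull toward the ring manifold on which the head-direction estimate lives. All constants and variable names are illustrative.

```python
import numpy as np

def hd_sample_step(x, omega_obs, dt=1e-3, sigma_omega=0.5, lam=5.0,
                   sigma_x=0.05, rng=np.random.default_rng()):
    """One update of a 2D state encoding head direction: (1) a sampled,
    noisy angular velocity rotates the state, and (2) an attractor term
    pulls the state back onto the unit circle (the ring manifold)."""
    omega = omega_obs + sigma_omega * rng.standard_normal()   # velocity sample
    c, s = np.cos(omega * dt), np.sin(omega * dt)
    x = np.array([c * x[0] - s * x[1], s * x[0] + c * x[1]])  # rotate by sample
    r = np.linalg.norm(x)
    x += lam * dt * (1.0 - r) * (x / r)                       # pull to |x| = 1
    x += sigma_x * np.sqrt(dt) * rng.standard_normal(2)       # sampling noise
    return x
```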
Considerations and recommendations from the ISMRM Diffusion Study Group for preclinical diffusion MRI: Part 3 -- Ex vivo imaging: data processing, comparisons with microscopy, and tractography
Kurt G Schilling
Amy F D Howard
Francesco Grussu
Andrada Ianus
Brian Hansen
Rachel L. C. Barrett
Manisha Aggarwal
Stijn Michielse
Fatima Nasrallah
W. Syeda
Nian Wang
Jelle Veraart
Alard J. Roebroeck
Andrew F Bagdasarian
Cornelius Eichner
Farshid Sepehrband
Jan Zimmermann
L. Soustelle
Christien Bowman
Benjamin C. Tendler
A. Hertanu
Ben Jeurissen
M. Verhoye
L. Frydman
Y. van de Looij
David C. Hike
Jeff F. Dunn
Karla L. Miller
Bennett A. Landman
N. Shemesh
Adam Anderson
Emilie McKinnon
Shawna Farquharson
Flavio Dell'Acqua
C. Pierpaoli
Ivana Drobnjak
Alexander Leemans
K. Harkins
Maxime Descoteaux
Duan Xu
Hao Huang
Mathieu D. Santin
Samuel C. Grant
Andre Obenaus
Gene S Kim
Dan Wu
D. Le Bihan
S. Blackband
Luisa Ciobanu
E. Fieremans
Ruiliang Bai
T. Leergaard
Jiangyang Zhang
T. Dyrby
G. A. Johnson
Matthew D. Budde
Ileana Ozana Jelescu
NeoBERT: A Next-Generation BERT
Lola Le Breton
Quentin Fournier
Mariam El Mezouar
Recent innovations in architecture, pre-training, and fine-tuning have led to the remarkable in-context learning and reasoning abilities of large auto-regressive language models such as LLaMA and DeepSeek. In contrast, encoders like BERT and RoBERTa have not seen the same level of progress despite being foundational for many downstream NLP applications. To bridge this gap, we introduce NeoBERT, a next-generation encoder that redefines the capabilities of bidirectional models by integrating state-of-the-art advancements in architecture, modern data, and optimized pre-training methodologies. NeoBERT is designed for seamless adoption: it serves as a plug-and-play replacement for existing base models, relies on an optimal depth-to-width ratio, and leverages an extended context length of 4,096 tokens. Despite its compact 250M parameter footprint, it achieves state-of-the-art results on the massive MTEB benchmark, outperforming BERT large, RoBERTa large, NomicBERT, and ModernBERT under identical fine-tuning conditions. In addition, we rigorously evaluate the impact of each modification on GLUE and design a uniform fine-tuning and evaluation framework for MTEB. We release all code, data, checkpoints, and training scripts to accelerate research and real-world adoption.
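Since NeoBERT is positioned as a plug-and-play replacement for existing base encoders, loading it should look like loading any Hugging Face encoder. The checkpoint identifier below is an assumption for illustration, not confirmed by the abstract; substitute the identifier from the actual release.

```python
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint id; replace with the identifier from the NeoBERT release.
name = "chandar-lab/NeoBERT"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModel.from_pretrained(name, trust_remote_code=True)

# The 4,096-token context window is the headline change for drop-in use.
batch = tokenizer("NeoBERT is a drop-in bidirectional encoder.",
                  return_tensors="pt", truncation=True, max_length=4096)
embeddings = model(**batch).last_hidden_state  # (1, seq_len, hidden_dim)
```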
Origin of Nonlinear Circular Photocurrent in 2D Semiconductor MoS₂
Yanchong Zhao
Fengyu Chen
Jing Liang
Mohammad Saeed Bahramy
Mingwei Yang
Yao Guang
Xiaomei Li
Zheng Wei
Jiaojiao Zhao
Mengzhou Liao
Cheng Shen
Qinqin Wang
Rong Yang
Kenji Watanabe
Takashi Taniguchi
Zhiheng Huang
Dongxia Shi
Kaihui Liu
Zhipei Sun
Ji Feng
Luojun Du
Guangyu Zhang