Publications

A Coin Flip for Safety: LLM Judges Fail to Reliably Measure Adversarial Robustness
Moritz Ladenburger
Tim Beyer
Stephan Günnemann
Automated "LLM-as-a-Judge" frameworks have become the de facto standard for scalable evaluation across natural language processing. For instance, in safety evaluation, these judges are relied upon to evaluate harmfulness in order to benchmark the robustness of safety against adversarial attacks. However, we show that existing validation protocols fail to account for substantial distribution shifts inherent to red-teaming: diverse victim models exhibit distinct generation styles, attacks distort output patterns, and semantic ambiguity varies significantly across jailbreak scenarios. Through a comprehensive audit using 6642 human-verified labels, we reveal that the unpredictable interaction of these shifts often causes judge performance to degrade to near random chance. This stands in stark contrast to the high human agreement reported in prior work. Crucially, we find that many attacks inflate their success rates by exploiting judge insufficiencies rather than eliciting genuinely harmful content. To enable more reliable evaluation, we propose ReliableBench, a benchmark of behaviors that remain more consistently judgeable, and JudgeStressTest, a dataset designed to expose judge failures. (Data in supplement).
$\mu$LO: Compute-Efficient Meta-Generalization of Learned Optimizers
Learned optimizers (LOs) have the potential to significantly reduce the wall-clock training time of neural networks. However, they can struggle to optimize unseen tasks (*meta-generalize*), especially when training networks wider than those seen during meta-training. To address this, we derive the Maximal Update Parametrization (
Contractive Diffusion Policies: Robust Action Diffusion via Contractive Score-Based Sampling with Differential Equations
Charlotte Morissette
Anas El Houssaini
Diffusion policies have emerged as powerful generative models for offline policy learning, whose sampling process can be rigorously characterized by a score function guiding a Stochastic Differential Equation (SDE). However, the same score-based SDE modeling that grants diffusion policies the flexibility to learn diverse behavior also incurs solver and score-matching errors, large data requirements, and inconsistencies in action generation. While less critical in image generation, these inaccuracies compound and lead to failure in continuous control settings. We introduce Contractive Diffusion Policies (CDPs) to induce contractive behavior in the diffusion sampling dynamics. Contraction pulls nearby flows closer to enhance robustness against solver and score-matching errors while reducing unwanted action variance. We develop an in-depth theoretical analysis along with a practical implementation recipe to incorporate CDPs into existing diffusion policy architectures with minimal modification and computational cost. We evaluate CDPs for offline learning by conducting extensive experiments in simulation and real-world settings. Across benchmarks, CDPs often outperform baseline policies, with pronounced benefits under data scarcity.
Decoding Dynamic Visual Experience from Calcium Imaging via Cell-Pattern-Aware SSL
Sangyoon Bae
Mehdi Azabou
Blake A Richards
Jiook Cha
Self-supervised learning (SSL) holds a great deal of promise for applications in neuroscience, due to the lack of large-scale, consistently labeled neural datasets. However, most neural datasets contain heterogeneous populations that mix stable, predictable cells with highly stochastic, stimulus-contingent ones, which has made it hard to identify consistent activity patterns during SSL. As a result, self-supervised pretraining has yet to show clear signs of benefits from scale on neural data. Here, we present a novel approach to self-supervised pretraining, POYO-SSL, that exploits the heterogeneity of neural data to improve pretraining and achieve benefits of scale. Specifically, in POYO-SSL we pretrain only on predictable neurons (identified on the pretraining split via simple higher-order statistics, namely skewness and kurtosis), then we fine-tune on the unpredictable population for downstream tasks. On the Allen Brain Observatory dataset, this strategy yields approximately 12-13% relative gains over from-scratch training and exhibits smooth, monotonic scaling with model size. In contrast, existing state-of-the-art baselines plateau or destabilize as model size increases. By making predictability an explicit metric for crafting the data diet, POYO-SSL turns heterogeneity from a liability into an asset, providing a robust, biologically grounded recipe for scalable neural decoding and a path toward foundation models of neural dynamics.
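To make the selection step above concrete, here is a minimal sketch of splitting neurons by skewness and kurtosis. The abstract does not state the actual criterion, so the function names, the thresholds, and even the direction of the rule (low asymmetry and light tails counted as "predictable") are illustrative assumptions, not the authors' method.

```python
def skewness_and_excess_kurtosis(trace):
    """Population skewness and excess kurtosis of one activity trace."""
    n = len(trace)
    mean = sum(trace) / n
    m2 = sum((v - mean) ** 2 for v in trace) / n
    if m2 == 0.0:
        return 0.0, 0.0  # constant trace: no asymmetry or tails to measure
    m3 = sum((v - mean) ** 3 for v in trace) / n
    m4 = sum((v - mean) ** 4 for v in trace) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def select_predictable(traces, max_abs_skew=1.0, max_excess_kurtosis=3.0):
    """Return indices of traces passing the (assumed) threshold rule."""
    selected = []
    for idx, trace in enumerate(traces):
        skew, kurt = skewness_and_excess_kurtosis(trace)
        # Hypothetical rule: thresholds and direction are assumptions.
        if abs(skew) <= max_abs_skew and kurt <= max_excess_kurtosis:
            selected.append(idx)
    return selected


# Toy traces: a regular oscillation and a sparse, bursty cell.
traces = [
    [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0],
]
pretrain_idx = select_predictable(traces)
finetune_idx = [i for i in range(len(traces)) if i not in pretrain_idx]
```

In this toy split the oscillating trace would go to pretraining and the bursty one to fine-tuning; per the abstract, the statistics would be computed on the pretraining split only.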
A Deep Learning and Inertia-Aware Load Shedding Framework for Mitigating Load-Altering Attacks
Anoosh Dini
Keyhan Sheshyekani
The widespread integration of information and communication technologies into modern power systems has increased their vulnerability to cyber-physical threats, such as load-altering attacks (LAA). These attacks can cause rapid load changes, potentially triggering protective mechanisms like under-frequency load shedding (UFLS). Existing approaches for mitigating these attacks are limited, and they mostly rely on preventive measures or neglect system dynamics. In this paper, we propose a novel online framework for the detection and mitigation of LAAs that addresses these limitations. The detection component employs a convolutional neural network–long short-term memory autoencoder (CNN-LSTM AE) architecture to capture anomalies in load consumption data. For mitigation, we propose an inertia-aware load shedding scheme that dynamically adjusts the shedding amount based on the real-time frequency and the magnitude of the attack. This approach prevents overshedding caused by predefined UFLS relay settings and mitigates undershedding by considering the system’s real-time inertia. To this end, a variable forgetting factor recursive least squares (VFF-RLS) algorithm is proposed, which can track inertia variations within a few seconds. The proposed framework is compatible with both synchronous generator-based and converter-interfaced generator-dominated grids. Simulations indicate the effectiveness of the proposed framework in maintaining frequency stability under a wide range of attack scenarios.
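As a rough illustration of inertia tracking with recursive least squares, the sketch below fits the simplified per-unit swing relation `imbalance = 2 * H * rocof` (damping ignored). It uses a constant forgetting factor, not the variable forgetting factor of the abstract, whose adaptation rule is not given here. The function name, parameter values, and the synthetic noise-free data are all assumptions.

```python
def rls_inertia(rocof, imbalance, forgetting=0.95, p0=1e6):
    """Scalar RLS estimate of H from imbalance = 2 * H * rocof.

    rocof:      rate of change of frequency samples (per unit per second)
    imbalance:  power imbalance samples (per unit)
    forgetting: constant forgetting factor (an assumption; the abstract
                describes a variable one)
    """
    theta, p = 0.0, p0  # theta estimates 2 * H; p is the scalar covariance
    history = []
    for phi, y in zip(rocof, imbalance):
        gain = p * phi / (forgetting + phi * p * phi)
        theta += gain * (y - phi * theta)       # correct by prediction error
        p = (p - gain * phi * p) / forgetting   # discount old information
        history.append(theta / 2.0)
    return history


# Synthetic data: inertia constant drops from 5 s to 3 s halfway through.
rocof = [0.02 if k % 2 == 0 else -0.02 for k in range(400)]
true_h = [5.0] * 200 + [3.0] * 200
imbalance = [2.0 * h * r for h, r in zip(true_h, rocof)]
estimates = rls_inertia(rocof, imbalance)
```

A smaller forgetting factor would follow the drop faster at the cost of more sensitivity to measurement noise, which is the trade-off a variable forgetting factor aims to manage.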
Deep neural networks divide and conquer dihedral multiplication
We find multilayer perceptrons and transformers both universally learn an instantiation of the same divide-and-conquer algorithm that requires only a logarithmic number of neural representations to solve dihedral multiplication. Clustering neurons based on similar activation behaviour reveals remarkably clear structure: each neural representation corresponds to a Cayley graph. To our knowledge, this is the first work that fully characterizes and describes all neural representations that are learnable on a dataset, while prior work on group multiplications studied neuron-level behavior, or preliminarily investigated cluster behavior. Thus, we can understand the algorithm networks universally learn at three levels of abstraction: 1) Neurons activate on coset or approximate coset structure of the dihedral group. 2) Groups of neurons together form neural representations that act to divide the dataset into different subproblems, being Cayley graphs, where the equivalence class of the answer is computed. 3) The global algorithm then linearly combines each neural representation (subproblem) together at the logits. This work provides a deep case study and provides the community with a very well understood toy model for interpretability, as well as makes steps toward proving the conjecture that DNNs will divide and conquer all group multiplication tasks.
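For readers unfamiliar with the task, the following sketch builds the dihedral multiplication table that such a network would be trained on. The encoding of an element as a pair `(k, f)`, meaning `r^k s^f` with rotation `r` and reflection `s`, and the function names are assumptions made for illustration; the abstract does not specify the authors' encoding.

```python
from itertools import product


def dihedral_mul(x, y, n):
    """Multiply two elements of the dihedral group D_n (order 2n).

    Each element is (k, f) = r^k s^f with 0 <= k < n and f in {0, 1}.
    Uses the relation s r = r^(-1) s, so a reflection on the left
    flips the sign of the right-hand rotation exponent.
    """
    a, b = x
    c, d = y
    sign = -1 if b else 1
    return ((a + sign * c) % n, (b + d) % 2)


def multiplication_table(n):
    """All (x, y) -> x * y pairs: one supervised example per entry."""
    elems = [(k, f) for f in (0, 1) for k in range(n)]
    return {(x, y): dihedral_mul(x, y, n) for x, y in product(elems, repeat=2)}


table = multiplication_table(5)
```

Each key is an input pair and each value is the label, so the full dataset for D_n has (2n)^2 examples.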
Development and Deployment of Jami: Experiences with a Mobile Distributed Communication Platform
Tianyi Yang
Adrien Béraud
Tao Yang
Sébastien Blin
Cyrille Béraud
Qiao Xiang
Xue Liu
Diffusion tractography outside the brain: the road less travelled
Kurt G. Schilling
Irvin Teh
Richard Dortch
Ibrahim Ibrahim
Nian Wang
Bruce Damon
Rory L. Cochran
Alexander Leemans
Diffusion tractography is a powerful MRI technique for mapping fibrous tissue architecture, traditionally applied to the white matter of the brain. This report surveys the growing application of tractography to anatomical structures outside the brain, a domain that presents both unique challenges and unique opportunities. We examine its use in the heart, spinal cord, peripheral nerves, brachial plexus, kidney, skeletal muscle, and prostate. For each region, we detail the necessary methodological adaptations for acquisition, modeling, and processing, and highlight the unique anatomical information that can be derived for research and clinical applications. While significant challenges remain - spanning technical hurdles like physiological motion and susceptibility artifacts, to biological complexities like lower anisotropy and the interpretation of streamline validity - tractography beyond the brain provides invaluable, non-invasive insights into tissue micro-organization, opening a new frontier for biomedical imaging.
Discovering Diverse Behaviors via Temporal Contrastive Learning
Catherine Ji
Benjamin Eysenbach
Effective exploration in reinforcement learning requires not only tracking where an agent has been, but also understanding how the agent perceives and represents the world. To learn powerful representations, an agent should actively explore states that contribute to its knowledge of the environment. Temporal representations can capture the information necessary to solve a wide range of potential tasks while avoiding the computational cost associated with full state reconstruction. In this paper, we propose an exploration method that leverages temporal contrastive representations to guide exploration, prioritizing states with unpredictable future outcomes. We demonstrate that such representations can enable the learning of complex exploratory behaviors in locomotion, manipulation, and embodied-AI tasks, revealing capabilities and behaviors that traditionally require extrinsic rewards. Unlike approaches that rely on explicit distance learning or episodic memory mechanisms (e.g., quasimetric-based methods), our method builds directly on temporal similarities, yielding a simpler yet effective strategy for exploration.
DisTaC: Conditioning Task Vectors via Distillation for Robust Model Merging
Kotaro Yoshida
Yuji Naraki
Takafumi Horie
Ryotaro Shimizu
Model merging has emerged as an efficient and flexible paradigm for multi-task learning, with numerous methods being proposed in recent years. However, these state-of-the-art techniques are typically evaluated on benchmark suites that are highly favorable to model merging, and their robustness in more realistic settings remains largely unexplored. In this work, we first investigate the vulnerabilities of model-merging methods and pinpoint the source-model characteristics that critically underlie them. Specifically, we identify two factors that are particularly harmful to the merging process: (1) disparities in task vector norms, and (2) the low confidence of the source models. To address this issue, we propose **DisTaC** (**Dis**tillation for **Ta**sk vector **C**onditioning), a novel method that pre-conditions these problematic task vectors before the merge. DisTaC leverages knowledge distillation to adjust a task vector's norm and increase source-model confidence while preserving its essential task-specific knowledge. Our extensive experiments demonstrate that by pre-conditioning task vectors with DisTaC, state-of-the-art merging techniques can successfully integrate models that exhibit these harmful traits, where they would otherwise fail, and achieve significant performance gains.
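The norm-disparity issue can be seen in a small sketch. It computes task vectors (fine-tuned weights minus pretrained weights), compares their norms, and merges them with plain task arithmetic. Task arithmetic is used here only as a simple stand-in for the merging methods the abstract refers to; DisTaC's distillation step is not implemented. The toy weights and function names are assumptions.

```python
import math


def task_vector(finetuned, pretrained):
    """Element-wise difference between fine-tuned and pretrained weights."""
    return [f - p for f, p in zip(finetuned, pretrained)]


def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))


def task_arithmetic_merge(pretrained, task_vectors, scale=1.0):
    """Add the scaled sum of task vectors back onto the pretrained weights."""
    summed = [sum(col) for col in zip(*task_vectors)]
    return [p + scale * s for p, s in zip(pretrained, summed)]


# Toy flattened weights for a pretrained model and two fine-tuned models.
pretrained = [0.0, 0.0, 0.0]
model_a = [3.0, 4.0, 0.0]  # large update
model_b = [0.0, 0.0, 0.5]  # small update

tau_a = task_vector(model_a, pretrained)
tau_b = task_vector(model_b, pretrained)
norm_ratio = l2_norm(tau_a) / l2_norm(tau_b)
merged = task_arithmetic_merge(pretrained, [tau_a, tau_b], scale=0.5)
```

With a tenfold norm gap, a single shared scale leaves task B's contribution small relative to task A's, which is the kind of disparity the abstract says pre-conditioning is meant to remove.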
DSL: Dual-Stage List Decoding for Polar Turbo Product Coded Massive MIMO Towards 6G Extreme Connectivity
Jian Zheng
Yu Tian
Huayi Zhou
Xiaosi Tan
Yutai Sun
Warren J. Gross
Xiaohu You
Chuan Zhang
Targeting sixth generation (6G) extreme connectivity, massive multiple-input multiple-output (mMIMO) systems face challenges in achieving both high data rates and low latency. Polar turbo product codes (polar-TPCs) with near-optimal error correction performance under constrained latency are promising for 6G mMIMO applications. In this letter, we propose a dual-stage list (DSL) decoding scheme for polar-TPC coded mMIMO systems, leveraging a turbo list decoder followed by a tree-search-based post-processing method to enhance receiver performance. Moreover, we introduce a two-phase codeword validity check strategy to address the complexity challenges. Empirical results demonstrate that the receiver with our proposed DSL decoding outperforms existing receivers, achieving a superior trade-off between performance and complexity.
Dual Optimistic Ascent (PI Control) is the Augmented Lagrangian Method in Disguise
Constrained optimization is a powerful framework for enforcing requirements on neural networks. These constrained deep learning problems are typically solved using first-order methods on their min-max Lagrangian formulation, but such approaches often suffer from oscillations and can fail to find all local solutions. While the Augmented Lagrangian method (ALM) addresses these issues, practitioners often favor dual optimistic ascent schemes (PI control) on the standard Lagrangian, which perform well empirically but lack formal guarantees. In this paper, we establish a previously unknown equivalence between these approaches: dual optimistic ascent on the Lagrangian is equivalent to gradient descent-ascent on the Augmented Lagrangian. This finding allows us to transfer the robust theoretical guarantees of the ALM to the dual optimistic setting, proving it converges linearly to all local solutions. Furthermore, the equivalence provides principled guidance for tuning the optimism hyper-parameter. Our work closes a critical gap between the empirical success of dual optimistic methods and their theoretical foundation.
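The stated equivalence can be checked numerically on a toy problem: minimize 0.5 (x - 2)^2 subject to x - 1 = 0, whose solution is x = 1 with multiplier 1. The sketch below assumes one particular parametrization and update order (dual step first, then a primal step using the new multiplier, with the optimism coefficient `omega` playing the role of the penalty coefficient `c`). The paper's precise statement may differ, and all names and step sizes are assumptions.

```python
def f_grad(x):
    return x - 2.0  # gradient of the objective 0.5 * (x - 2)^2


def g(x):
    return x - 1.0  # equality constraint g(x) = 0


def g_grad(x):
    return 1.0


def optimistic_on_lagrangian(steps, alpha=0.1, eta=0.1, omega=1.0):
    """Primal descent on the plain Lagrangian with a PI-style dual update."""
    x, lam, g_prev = 0.0, 0.0, 0.0
    xs = []
    for _ in range(steps):
        g_now = g(x)
        # Integral term eta * g plus optimism (proportional) correction.
        lam = lam + eta * g_now + omega * (g_now - g_prev)
        x = x - alpha * (f_grad(x) + lam * g_grad(x))
        g_prev = g_now
        xs.append(x)
    return xs, lam


def gda_on_augmented_lagrangian(steps, alpha=0.1, eta=0.1, c=1.0):
    """Plain dual ascent with primal descent on f + theta*g + (c/2)*g^2."""
    x, theta = 0.0, 0.0
    xs = []
    for _ in range(steps):
        theta = theta + eta * g(x)
        x = x - alpha * (f_grad(x) + (theta + c * g(x)) * g_grad(x))
        xs.append(x)
    return xs, theta


xs_opt, lam_final = optimistic_on_lagrangian(500)
xs_alm, theta_final = gda_on_augmented_lagrangian(500)
max_gap = max(abs(a - b) for a, b in zip(xs_opt, xs_alm))
```

Under these assumptions the two primal trajectories coincide up to floating-point error, because the optimistic multiplier equals the running integral of the constraint plus `omega` times the current violation, which is exactly the effective multiplier in the augmented Lagrangian gradient.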