Publications

A Coin Flip for Safety: LLM Judges Fail to Reliably Measure Adversarial Robustness

Moritz Ladenburger

Tim Beyer

Stephan Günnemann

Automated \enquote{LLM-as-a-Judge} frameworks have become the de facto standard for scalable evaluation across natural language processing. … (see more)For instance, in safety evaluation, these judges are relied upon to evaluate harmfulness in order to benchmark the robustness of safety against adversarial attacks. However, we show that existing validation protocols fail to account for substantial distribution shifts inherent to red-teaming: diverse victim models exhibit distinct generation styles, attacks distort output patterns, and semantic ambiguity varies significantly across jailbreak scenarios. Through a comprehensive audit using 6642 human-verified labels, we reveal that the unpredictable interaction of these shifts often causes judge performance to degrade to near random chance. This stands in stark contrast to the high human agreement reported in prior work. Crucially, we find that many attacks inflate their success rates by exploiting judge insufficiencies rather than eliciting genuinely harmful content. To enable more reliable evaluation, we propose ReliableBench, a benchmark of behaviors that remain more consistently judgeable, and JudgeStressTest, a dataset designed to expose judge failures. (Data in supplement).

2025-12-31

International Conference on Machine Learning (Accept (regular))

doi.org

openreview.net

$\mu$LO: Compute-Efficient Meta-Generalization of Learned Optimizers

Benjamin Therien

Charles-Etienne Joseph

Learned optimizers (LOs) have the potential to significantly reduce the wall-clock training time of neural networks. However, they can strug… (see more)gle to optimize unseen tasks (*meta-generalize*), especially when training networks wider than those seen during meta-training. To address this, we derive the Maximal Update Parametrization (

2025-12-31

International Conference on Learning Representations (Accept (Poster))

openreview.net

Contractive Diffusion Policies: Robust Action Diffusion via Contractive Score-Based Sampling with Differential Equations

Charlotte Morissette

Anas El Houssaini

Diffusion policies have emerged as powerful generative models for offline policy learning, whose sampling process can be rigorously characte… (see more)rized by a score function guiding a Stochastic Differential Equation (SDE). However, the same score-based SDE modeling that grants diffusion policies the flexibility to learn diverse behavior also incurs solver and score-matching errors, large data requirements, and inconsistencies in action generation. While less critical in image generation, these inaccuracies compound and lead to failure in continuous control settings. We introduce Contractive Diffusion Policies (CDPs) to induce contractive behavior in the diffusion sampling dynamics. Contraction pulls nearby flows closer to enhance robustness against solver and score-matching errors while reducing unwanted action variance. We develop an in-depth theoretical analysis along with a practical implementation recipe to incorporate CDPs into existing diffusion policy architectures with minimal modification and computational cost. We evaluate CDPs for offline learning by conducting extensive experiments in simulation and real-world settings. Across benchmarks, CDPs often outperform baseline policies, with pronounced benefits under data scarcity.

2025-12-31

arXiv (preprint)

doi.org

openreview.net

Decoding Dynamic Visual Experience from Calcium Imaging via Cell-Pattern-Aware SSL

Sangyoon Bae

Mehdi Azabou

Blake A Richards

Jiook Cha

Self-supervised learning (SSL) holds a great deal of promise for applications in neuroscience, due to the lack of large-scale, consistently … (see more)labeled neural datasets. However, most neural datasets contain heterogeneous populations that mix stable, predictable cells with highly stochastic, stimulus-contingent ones, which has made it hard to identify consistent activity patterns during SSL. As a result, self-supervised pretraining has yet to show clear signs of benefits from scale on neural data. Here, we present a novel approach to self-supervised pretraining, POYO-SSL that exploits the heterogeneity of neural data to improve pretraining and achieve benefits of scale. Specifically, in POYO-SSL we pretrain only on predictable neurons---identified on the pretraining split via simple higher-order statistics (skewness and kurtosis)---then we fine-tune on the unpredictable population for downstream tasks. On the Allen Brain Observatory dataset, this strategy yields approximately 12--13\% relative gains over from-scratch training and exhibits smooth, monotonic scaling with model size. In contrast, existing state-of-the-art baselines plateau or destabilize as model size increases. By making predictability an explicit metric for crafting the data diet, POYO-SSL turns heterogeneity from a liability into an asset, providing a robust, biologically grounded recipe for scalable neural decoding and a path toward foundation models of neural dynamics.

2025-12-31

International Conference on Learning Representations (Accept (Poster))

openreview.net

A Deep Learning and Inertia-Aware Load Shedding Framework for Mitigating Load-Altering Attacks

Anoosh Dini

Keyhan Sheshyekani

Hanane Dagdougui

The widespread integration of information and communication technologies into modern power systems has increased their vulnerability to cybe… (see more)r-physical threats, such as load-altering attacks (LAA). These attacks can cause rapid load changes, potentially triggering protective mechanisms like under-frequency load shedding (UFLS). Existing approaches for mitigating these attacks are limited, and they mostly rely on preventive measures or neglect system dynamics. In this paper, we propose a novel online framework for the detection and mitigation of LAAs that addresses these limitations. The detection component employs a convolutional neural network–long short-term memory autoencoder (CNN-LSTM AE) architecture to capture anomalies in load consumption data. For mitigation, we propose an inertia-aware load shedding scheme that dynamically adjusts the shedding amount based on the real-time frequency and the magnitude of the attack. This approach prevents overshedding caused by predefined UFLS relay settings and mitigates undershedding by considering the system’s real-time inertia. To this end, a variable forgetting factor recursive least squares (VFF-RLS) algorithm is proposed, which can track inertia variations within a few seconds. The proposed framework is compatible with both synchronous generator-based and converter-interfaced generator-dominated grids. Simulations indicate the effectiveness of the proposed framework in maintaining frequency stability under a wide range of attack scenarios.

2025-12-31

IEEE Access (published)

doi.org

Deep neural networks divide and conquer dihedral multiplication

Sihui Wei

Gavin McCracken

Gabriela Moisescu-Pareja

Harley Wiltzer

Doina Precup

Irina Rish

Jonathan Love

We find multilayer perceptrons and transformers both universally learn an instantiation of the same divide-and-conquer algorithm that requir… (see more)es only a logarithmic number of neural representations to solve dihedral multiplication. Clustering neurons based on similar activation behaviour reveals remarkably clear structure: each neural representation corresponds to a Cayley graph. To our knowledge, this is the first work that fully characterizes and describes all neural representations that are learnable on a dataset, while prior work on group multiplications studied neuron-level behavior, or preliminarily investigated cluster behavior. Thus, we can understand the algorithm networks universally learn at three levels of abstraction: 1) Neurons activate on coset or approximate coset structure of the dihedral group. 2) Groups of neurons together form neural representations that act to divide the dataset into different subproblems, being Cayley graphs, where the equivalence class of the answer is computed. 3) The global algorithm then linearly combines each neural representation (subproblem) together at the logits. This work provides a deep case study and provides the community with a very well understood toy model for interpretability, as well as makes steps toward proving the conjecture that DNNs will divide and conquer all group multiplication tasks.

2025-12-31

International Conference on Machine Learning (Accept (regular))

openreview.net

Development and Deployment of Jami: Experiences with a Mobile Distributed Communication Platform

Yili Jin

Tianyi Yang

Adrien Béraud

Tao Yang

Sébastien Blin

Cyrille Béraud

Qiao Xiang

Xue Liu

2025-12-31

IEEE Transactions on Mobile Computing (published)

doi.org

Diffusion tractography outside the brain: the road less travelled

Kurt G. Schilling

Irvin Teh

Julien Cohen-Adad

Richard Dortch

Ibrahim Ibrahim

Nian Wang

Bruce Damon

Rory L. Cochran

Alexander Leemans

Diffusion tractography is a powerful MRI technique for mapping fibrous tissue architecture, traditionally applied to the white matter of the… (see more) brain. This report surveys the growing application of tractography to anatomical structures outside the brain, a domain that presents both unique challenges and unique opportunities. We examine its use in the heart, spinal cord, peripheral nerves, brachial plexus, kidney, skeletal muscle, and prostate. For each region, we detail the necessary methodological adaptations for acquisition, modeling, and processing, and highlight the unique anatomical information that can be derived for research and clinical applications. While significant challenges remain - spanning technical hurdles like physiological motion and susceptibility artifacts, to biological complexities like lower anisotropy and the interpretation of streamline validity - tractography beyond the brain provides invaluable, non-invasive insights into tissue micro-organization, opening a new frontier for biomedical imaging.

2025-12-31

Brain Structure and Function (published)

doi.org

Discovering Diverse Behaviors via Temporal Contrastive Learning

Faisal Mohamed

Catherine Ji

Benjamin Eysenbach

Glen Berseth

Effective exploration in reinforcement learning requires not only tracking where an agent has been, but also understanding how the agent per… (see more)ceives and represents the world. To learn powerful representations, an agent should actively explore states that contribute to its knowledge of the environment. Temporal representations can capture the information necessary to solve a wide range of potential tasks while avoiding the computational cost associated with full state reconstruction. In this paper, we propose an exploration method that leverages temporal contrastive representations to guide exploration, prioritizing states with unpredictable future outcomes. We demonstrate that such representations can enable the learning of complex exploratory behaviors in locomotion, manipulation, and embodied-AI tasks, revealing capabilities and behaviors that traditionally require extrinsic rewards. Unlike approaches that rely on explicit distance learning or episodic memory mechanisms (e.g., quasimetric-based methods), our method builds directly on temporal similarities, yielding a simpler yet effective strategy for exploration.

2025-12-31

International Conference on Learning Representations (Accept (Poster))

openreview.net

DisTaC: Conditioning Task Vectors via Distillation for Robust Model Merging

Kotaro Yoshida

Yuji Naraki

Takafumi Horie

Ryotaro Shimizu

Ioannis Mitliagkas

Hiroki Naganuma

Model merging has emerged as an efficient and flexible paradigm for multi-task learning, with numerous methods being proposed in recent year… (see more)s. However, these state-of-the-art techniques are typically evaluated on benchmark suites that are highly favorable to model merging, and their robustness in more realistic settings remains largely unexplored. In this work, we first investigate the vulnerabilities of model-merging methods and pinpoint the source-model characteristics that critically underlie them. Specifically, we identify two factors that are particularly harmful to the merging process: (1) disparities in task vector norms, and (2) the low confidence of the source models. To address this issue, we propose **DisTaC** (**Dis**tillation for **Ta**sk vector **C**onditioning), a novel method that pre-conditions these problematic task vectors before the merge. DisTaC leverages knowledge distillation to adjust a task vector's norm and increase source-model confidence while preserving its essential task-specific knowledge. Our extensive experiments demonstrate that by pre-conditioning task vectors with DisTaC, state-of-the-art merging techniques can successfully integrate models that exhibit these harmful traits, where they would otherwise fail, and achieve significant performance gains.

2025-12-31

International Conference on Learning Representations (Accept (Poster))

doi.org

openreview.net

DSL: Dual-Stage List Decoding for Polar Turbo Product Coded Massive MIMO Towards 6 G Extreme Connectivity

Jian Zheng

Yu Tian

Huayi Zhou

Xiaosi Tan

Yutai Sun

Warren J. Gross

Xiaohu You

Chuan Zhang

Targeting sixth generation (6 G) extreme connectivity, massive multiple-input multiple-output (mMIMO) systems face challenges in achieving b… (see more)oth high data rates and low latency. Polar turbo product codes (polar-TPCs) with near-optimal error correction performance under constrained latency are promising for 6 G mMIMO applications. In this letter, we propose a dual-stage list (DSL) decoding scheme for polar-TPC coded mMIMO systems, leveraging a turbo list decoder followed by a tree-search-based post-processing method to enhance receiver performance. Moreover, we introduce a two-phase codeword validity check strategy to address the complexity challenges. Empirical results demonstrate that the receiver with our proposed DSL decoding outperforms existing receivers, achieving a superior trade-off between performance and complexity.

2025-12-31

IEEE Transactions on Vehicular Technology (published)

doi.org

Dual Optimistic Ascent (PI Control) is the Augmented Lagrangian Method in Disguise

Juan Ramirez

Simon Lacoste-Julien

Constrained optimization is a powerful framework for enforcing requirements on neural networks. These constrained deep learning problems are… (see more) typically solved using first-order methods on their min-max Lagrangian formulation, but such approaches often suffer from oscillations and can fail to find all local solutions. While the Augmented Lagrangian method (ALM) addresses these issues, practitioners often favor dual optimistic ascent schemes (PI control) on the standard Lagrangian, which perform well empirically but lack formal guarantees. In this paper, we establish a previously unknown equivalence between these approaches: dual optimistic ascent on the Lagrangian is equivalent to gradient descent-ascent on the Augmented Lagrangian. This finding allows us to transfer the robust theoretical guarantees of the ALM to the dual optimistic setting, proving it converges linearly to all local solutions. Furthermore, the equivalence provides principled guidance for tuning the optimism hyper-parameter. Our work closes a critical gap between the empirical success of dual optimistic methods and their theoretical foundation.

2025-12-31

International Conference on Learning Representations (Accept (Poster))

doi.org

openreview.net

AI Policy Fellowship Publications

Mila Ventures Launchpad

AI Policy Compass

Publications

AI Policy Fellowship Publications

Mila Ventures Launchpad

AI Policy Compass

Popular keywords:

Publications