
Nino Scherrer

Alumni

Publications

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
Johannes von Oswald
Seijin Kobayashi
Luca Versari
Songlin Yang
Maximilian Schlegel
Kaitlin Maile
Yanick Schimpf
Oliver Sieberling
Alexander Meulemans
Rif A. Saurous
Charlotte Frenkel
Blaise Agüera y Arcas
João Sacramento
Sequence modeling is currently dominated by causal transformer architectures that use softmax self-attention. Although widely adopted, transformers require memory and compute that grow linearly with sequence length during inference. A recent stream of work has linearized the softmax operation, resulting in powerful recurrent neural network (RNN) models with constant memory and compute costs, such as DeltaNet, Mamba, and xLSTM. These models can be unified by noting that their recurrent layer dynamics can all be derived from an in-context regression objective, approximately optimized through an online learning rule. Here, we join this line of work and introduce a numerically stable, chunkwise-parallelizable version of the recently proposed Mesa layer (von Oswald et al., 2024), and study it in language modeling at the billion-parameter scale. This layer likewise stems from an in-context loss, but one that is now minimized to optimality at every time point using a fast conjugate gradient solver. Through an extensive suite of experiments, we show that optimal test-time training enables reaching lower language modeling perplexity and higher downstream benchmark performance than previous RNNs, especially on tasks requiring long-context understanding. This performance gain comes at the cost of additional FLOPs spent at inference time. Our results are therefore intriguingly related to recent trends of increasing test-time compute to improve performance -- here by spending compute to solve sequential optimization problems within the neural network itself.
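To make the recurrence concrete: per von Oswald et al. (2024), the Mesa layer outputs the exact minimizer of an in-context ridge regression at each step, which reduces to o_t = S_t H_t^{-1} q_t with S_t = sum_i v_i k_i^T and H_t = sum_i k_i k_i^T + lam*I, where the linear solve is done by conjugate gradient. Below is a minimal sequential numpy sketch of that recurrence -- not the paper's numerically stable, chunkwise-parallel implementation, and all function and variable names are ours:

```python
import numpy as np

def conjugate_gradient(A, b, n_steps=10, tol=1e-6):
    """Solve A x = b for symmetric positive-definite A."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs_old = r @ r
    if np.sqrt(rs_old) < tol:
        return x
    for _ in range(n_steps):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def mesa_layer(keys, values, queries, lam=1.0, cg_steps=10):
    """Sequential mesa-layer forward pass. At step t it returns
    S_t @ H_t^{-1} @ q_t, the optimum of the in-context ridge loss
    sum_i ||v_i - W k_i||^2 + lam ||W||^2 evaluated at q_t."""
    d_k = keys.shape[1]
    H = lam * np.eye(d_k)                 # regularized Gram matrix
    S = np.zeros((values.shape[1], d_k))  # cross-moment of values and keys
    outputs = []
    for k, v, q in zip(keys, values, queries):
        H += np.outer(k, k)
        S += np.outer(v, k)
        outputs.append(S @ conjugate_gradient(H, q, n_steps=cg_steps))
    return np.stack(outputs)
```

The extra inference-time FLOPs the abstract mentions are visible here: each token pays for a small conjugate-gradient solve, while the state (H, S) stays constant-size.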
Multi-agent cooperation through learning-aware policy gradients
Alexander Meulemans
Seijin Kobayashi
Johannes von Oswald
Blaise Agüera y Arcas
João Sacramento
Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning. How can we achieve cooperation among self-interested, independent learning agents? Promising recent work has shown that in certain tasks cooperation can be established between learning-aware agents who model the learning dynamics of each other. Here, we present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning, which takes into account that other agents are themselves learning through trial and error based on multiple noisy trials. We then leverage efficient sequence models to condition behavior on long observation histories that contain traces of the learning dynamics of other agents. Training long-context policies with our algorithm leads to cooperative behavior and high returns on standard social dilemmas, including a challenging environment where temporally extended action coordination is required. Finally, we derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.
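As a toy illustration of the setting (not the paper's algorithm), the sketch below trains a logistic policy with plain REINFORCE on the iterated prisoner's dilemma against a hand-coded naive learner that reciprocates our actions. Because each gradient is taken over a whole meta-episode, the return already reflects how our behavior shapes the opponent's learning, so no higher derivatives are needed. The payoffs, the opponent's update rule, and the one-number history summary are all our own simplifications:

```python
import numpy as np

rng = np.random.default_rng(0)

# Iterated prisoner's dilemma; actions: 0 = cooperate, 1 = defect.
# PAYOFF[my_action, their_action] is my payoff.
PAYOFF = np.array([[3.0, 0.0],
                   [5.0, 1.0]])

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def meta_episode(theta, T=30, opp_lr=0.5):
    """Play T rounds against a naive reciprocating learner whose defection
    logit drifts toward our last action. Our policy conditions on the
    opponent's running cooperation rate -- a scalar stand-in for the long
    observation histories used in the paper."""
    opp_logit, n_opp_coop, coop_rate = 0.0, 0, 0.5
    grads, rewards = [], []
    for t in range(1, T + 1):
        x = np.array([1.0, coop_rate])
        p = sigmoid(theta @ x)                   # our P(defect)
        a = int(rng.random() < p)
        opp_a = int(rng.random() < sigmoid(opp_logit))
        rewards.append(PAYOFF[a, opp_a])
        grads.append((a - p) * x)                # d log pi(a|x) / d theta
        opp_logit += opp_lr * (1.0 if a == 1 else -1.0)  # naive learning step
        n_opp_coop += 1 - opp_a
        coop_rate = n_opp_coop / t
    G = np.cumsum(rewards[::-1])[::-1]           # reward-to-go
    return sum(g * gt for g, gt in zip(grads, G)), float(np.sum(rewards))

# Plain REINFORCE over whole meta-episodes: defecting pays +2 now but makes
# the opponent defect later, so the episode return should push the policy
# toward cooperation -- learning-awareness without higher derivatives.
theta = np.zeros(2)
for _ in range(2000):
    grad, ret = meta_episode(theta)
    theta += 0.005 * grad
```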
Introducing v0.5 of the AI Safety Benchmark from MLCommons
Bertie Vidgen
Adarsh Agrawal
Ahmed M. Ahmed
Victor Akinwande
Namir Al-nuaimi
Najla Alfaraj
Elie Alhajjar
Lora Aroyo
Trupti Bavalatti
Borhane Blili-Hamelin
K. Bollacker
Rishi Bommasani
Marisa Ferrara Boston
Siméon Campos
Kal Chakra
Canyu Chen
Cody Coleman
Zacharie Delpierre Coudert
Leon Strømberg Derczynski
Debojyoti Dutta
Ian Eisenberg
James R. Ezick
Heather Frase
Brian Fuller
Ram Gandikota
Agasthya Gangavarapu
Ananya Gangavarapu
James Gealy
Rajat Ghosh
James Goel
Usman Gohar
Sujata Goswami
Scott A. Hale
Wiebke Hutiri
Joseph Marvin Imperial
Surgan Jandial
Nicholas C. Judd
Felix Juefei-Xu
Bhavya Kailkhura
Hannah Rose Kirk
Kevin Klyman
Chris Knotz
Michael Kuchnik
Shachi H. Kumar
Chris Lengerich
Bo Li
Zeyi Liao
Eileen Peters Long
Victor Lu
Yifan Mai
Priyanka Mary Mammen
Kelvin Manyeki
Sean McGregor
Virendra Mehta
Shafee Mohammed
Emanuel Moss
Lama Nachman
Dinesh Jinenhally Naganna
Amin Nikanjam
Besmira Nushi
Luis Oala
Iftach Orr
Alicia Parrish
Çiğdem Patlak
William Pietri
Forough Poursabzi-Sangdeh
Eleonora Presani
Fabrizio Puletti
Paul Röttger
Saurav Sahay
Tim Santos
Alice Schoenauer Sebag
Patrick Schramowski
Abolfazl Shahbazi
Vin Sharma
Xudong Shen
Vamsi Sistla
Leonard Tang
Davide Testuggine
Vithursan Thangarasa
Elizabeth A Watkins
Rebecca Weiss
Christopher A. Welty
Tyler Wilbers
Adina Williams
Carole-Jean Wu
Poonam Yadav
Xianjun Yang
Yi Zeng
Wenhui Zhang
Fedor Zhdanov
Jiacheng Zhu
Percy Liang
Peter Mattson
Joaquin Vanschoren
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-purpose assistant in English) and a limited set of personas (i.e., typical users, malicious users, and vulnerable users). We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark. We plan to release version 1.0 of the AI Safety Benchmark by the end of 2024. The v1.0 benchmark will provide meaningful insights into the safety of AI systems. However, the v0.5 benchmark should not be used to assess the safety of AI systems. We have sought to fully document the limitations, flaws, and challenges of v0.5. This release of v0.5 of the AI Safety Benchmark includes (1) a principled approach to specifying and constructing the benchmark, which comprises use cases, types of systems under test (SUTs), language and context, personas, tests, and test items; (2) a taxonomy of 13 hazard categories with definitions and subcategories; (3) tests for seven of the hazard categories, each comprising a unique set of test items, i.e., prompts -- 43,090 test items in total, created with templates; (4) a grading system for AI systems against the benchmark; (5) an openly available platform and downloadable tool, called ModelBench, that can be used to evaluate the safety of AI systems on the benchmark; (6) an example evaluation report which benchmarks the performance of over a dozen openly available chat-tuned language models; and (7) a test specification for the benchmark.
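As a rough illustration of component (4), the toy aggregator below maps per-category safe-response rates to coarse grades and takes the worst category as the headline grade. The category names, thresholds, and worst-case aggregation rule here are all our own assumptions, not the MLCommons grading scheme:

```python
from collections import defaultdict

# Hypothetical per-prompt results for a system under test (SUT):
# (hazard_category, response_was_safe) pairs.
results = [
    ("violent_crimes", True), ("violent_crimes", True),
    ("hate", True), ("hate", False),
    ("indiscriminate_weapons", True),
]

def grade(results, hi=0.99, mid=0.9):
    """Toy aggregation: per-category safe-response rate mapped to a coarse
    grade; the overall grade is the worst category, so one weak hazard
    area dominates the headline result."""
    by_cat = defaultdict(list)
    for cat, safe in results:
        by_cat[cat].append(safe)
    def bucket(rate):
        return "high" if rate >= hi else "moderate" if rate >= mid else "low"
    grades = {c: bucket(sum(v) / len(v)) for c, v in by_cat.items()}
    order = {"low": 0, "moderate": 1, "high": 2}
    return grades, min(grades.values(), key=order.get)

print(grade(results))
# ({'violent_crimes': 'high', 'hate': 'low', 'indiscriminate_weapons': 'high'}, 'low')
```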
On the Generalization and Adaption Performance of Causal Models
Learning Neural Causal Models with Active Interventions
Yashas Annadani
Patrick Schwab
Bernhard Schölkopf
Michael Curtis Mozer
Nan Rosemary Ke
Discovering causal structures from data is a challenging inference problem of fundamental importance in all areas of science. The appealing scaling properties of neural networks have recently led to a surge of interest in differentiable neural network-based methods for learning causal structures from data. So far, differentiable causal discovery has focused on static datasets of observational or interventional origin. In this work, we introduce an active intervention-targeting mechanism which enables quick identification of the underlying causal structure of the data-generating process. Our method significantly reduces the required number of interactions compared with random intervention targeting and is applicable to both discrete and continuous optimization formulations of learning the underlying directed acyclic graph (DAG) from data. We examine the proposed method across multiple frameworks in a wide range of settings and demonstrate superior performance on multiple benchmarks, from simulated to real-world data.
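A generic sketch of active intervention targeting (not the paper's exact scoring rule): given a handful of rival linear-Gaussian SCM hypotheses, intervene on the node where their predicted interventional outcomes disagree most, since that experiment is the most informative. All names and the disagreement score below are our own illustration:

```python
import numpy as np

def interventional_mean(W, j, v=2.0):
    """Mean of a linear-Gaussian SCM X_i = sum_k W[k, i] X_k + eps_i
    under the hard intervention do(X_j = v), solved in closed form."""
    d = W.shape[0]
    A = np.eye(d) - W.T
    A[j, :] = 0.0          # sever X_j from its parents
    A[j, j] = 1.0
    b = np.zeros(d)
    b[j] = v
    return np.linalg.solve(A, b)

def pick_intervention(ensemble, v=2.0):
    """Choose the node whose intervention makes the candidate graphs'
    predicted means disagree the most (variance across the ensemble)."""
    d = ensemble[0].shape[0]
    scores = [np.var([interventional_mean(W, j, v) for W in ensemble],
                     axis=0).sum() for j in range(d)]
    return int(np.argmax(scores))

# Rival hypotheses over three variables: X0 -> X1 with X2 isolated,
# versus X1 -> X0 with X2 isolated. Observational data cannot tell them
# apart; intervening on X0 (or X1) can, while intervening on X2 cannot.
A = np.zeros((3, 3)); A[0, 1] = 1.0
B = np.zeros((3, 3)); B[1, 0] = 1.0
print(pick_intervention([A, B]))   # picks 0 or 1, never the uninformative 2
```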
Variational Causal Networks: Approximate Bayesian Inference over Causal Structures
Yashas Annadani
Jonas Rothfuss
Alexandre Lacoste
Learning the causal structure that underlies data is a crucial step towards robust real-world decision making. The majority of existing work in causal inference focuses on determining a single directed acyclic graph (DAG) or a Markov equivalence class thereof. However, a crucial aspect of acting intelligently upon causal knowledge inferred from finite data is reasoning about its uncertainty. For instance, planning interventions to find out more about the causal mechanisms that govern our data requires quantifying epistemic uncertainty over DAGs. While Bayesian causal inference allows us to do so, the posterior over DAGs becomes intractable even for a small number of variables. Aiming to overcome this issue, we propose a form of variational inference over the graphs of Structural Causal Models (SCMs). To this end, we introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs. Its number of parameters does not grow exponentially with the number of variables, and it can be tractably learned by maximising an Evidence Lower Bound (ELBO). In our experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.
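A minimal sketch of the idea, in our own notation: parameterize a Bernoulli distribution over edge indicators autoregressively and ascend a score-function estimate of the ELBO. Unlike the paper, this sketch fixes a topological ordering for simplicity (so every sample is acyclic by construction), and the graph score log_joint -- e.g. a Bayesian marginal likelihood plus a graph prior -- is assumed given rather than implemented:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class EdgeAR:
    """Autoregressive Bernoulli family over the m = d(d-1)/2 candidate
    edges of a DAG with a fixed node ordering: the logit of edge t
    depends linearly on the edges sampled before it. Parameter count is
    O(m^2), not exponential in the number of variables."""

    def __init__(self, m):
        self.b = np.zeros(m)          # per-edge bias logits
        self.W = np.zeros((m, m))     # row t uses only columns < t

    def sample(self):
        m = self.b.size
        e, ps = np.zeros(m), np.zeros(m)
        for t in range(m):
            ps[t] = sigmoid(self.b[t] + self.W[t, :t] @ e[:t])
            e[t] = float(rng.random() < ps[t])
        logq = np.sum(np.log(np.where(e == 1, ps, 1 - ps)))
        return e, ps, logq

    def reinforce_step(self, log_joint, lr=0.05, n=32):
        """One score-function ascent step on the ELBO
        E_q[log p(D, G) - log q(G)], with a mean baseline for variance
        reduction. `log_joint(e)` scores an edge vector (assumed given)."""
        samples = [self.sample() for _ in range(n)]
        f = np.array([log_joint(e) - logq for e, _, logq in samples])
        adv = f - f.mean()
        for (e, ps, _), a in zip(samples, adv):
            g = (e - ps) * a                          # advantage-scaled d log q / d logit
            self.b += lr * g / n
            self.W += lr * np.tril(np.outer(g, e), k=-1) / n
        return f.mean()                               # ELBO estimate
```

Because log q(G) is available in closed form for every sample, the ELBO and its gradient are tractable even though the space of DAGs is super-exponential, which is the key property the abstract highlights.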