Rejecting Hallucinated State Targets during Planning
Harry Zhao
Mingde Zhao
Tristan Sylvain
Romain Laroche
Rootlets-based registration to the spinal cord PAM50 template
Sandrine Bédard
Jan Valošek
Valeria Oliva
Kenneth A. Weber
SafeArena: Evaluating the Safety of Autonomous Web Agents
Ada Defne Tur
Nicholas Meade
Xing Han Lu
Alejandra Zambrano
Arkil Patel
Esin Durmus
Spandana Gella
Karolina Stanczak
Scaling Trends in Language Model Robustness
Nikolaus H. R. Howe
Ian R. McKenzie
Oskar John Hollinsworth
Michał Zając
Tom Tseng
Aaron David Tucker
Adam Gleave
SCAR: Shapley Credit Assignment for More Efficient RLHF
Meng Cao
Shuyuan Zhang
Xiaojun Chang
SDLog: A Deep Learning Framework for Detecting Sensitive Information in Software Logs
Roozbeh Aghili
Xingfang Wu
Heng Li
Search-Based Correction of Reasoning Chains for Language Models
Minsu Kim
Jean-Pierre R. Falet
Oliver E. Richardson
Xiaoyin Chen
Moksh J. Jain
Sungjin Ahn
Sungsoo Ahn
Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models
Lucas Berry
Axel Brando
Wei-Di Chang
Juan Higuera
Self-Evolving Curriculum for LLM Reasoning
Xiaoyin Chen
Jiarui Lu
Minsu Kim
Dinghuai Zhang
Alexandre Piché
Nicolas Gontier
Ehsan Kamalloo
Self-Play $Q$-Learners Can Provably Collude in the Iterated Prisoner's Dilemma
Quentin Bertrand
Juan Agustin Duque
Emilio Calvano
A growing body of computational studies shows that simple machine learning agents converge to cooperative behaviors in social dilemmas, such as collusive price-setting in oligopoly markets, raising questions about what drives this outcome. In this work, we provide theoretical foundations for this phenomenon in the context of self-play multi-agent Q-learners in the iterated prisoner’s dilemma. We characterize broad conditions under which such agents provably learn the cooperative Pavlov (win-stay, lose-shift) policy rather than the Pareto-dominated “always defect” policy. We validate our theoretical results through additional experiments, demonstrating their robustness across a broader class of deep learning algorithms.
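For readers unfamiliar with the Pavlov policy named in the abstract, the following is a minimal illustrative sketch, not the paper's Q-learning analysis. It assumes the usual one-step-memory formulation, in which win-stay, lose-shift amounts to cooperating exactly when both players chose the same action in the previous round. The function names `pavlov`, `always_defect`, and `play` are hypothetical.

```python
C, D = "C", "D"  # cooperate, defect

def pavlov(my_last, opp_last):
    """Win-stay, lose-shift: cooperate iff both players matched last round."""
    return C if my_last == opp_last else D

def always_defect(my_last, opp_last):
    """The Pareto-dominated baseline: defect regardless of history."""
    return D

def play(policy_a, policy_b, start, rounds):
    """Iterate two one-step-memory policies from a given starting action pair."""
    history = [start]
    a, b = start
    for _ in range(rounds):
        # Both players move simultaneously, each seeing the previous round.
        a, b = policy_a(a, b), policy_b(b, a)
        history.append((a, b))
    return history

# Two Pavlov players recover mutual cooperation after a one-sided defection.
print(play(pavlov, pavlov, (D, C), 3))
```

Under these assumptions, a one-sided defection is followed by one round of mutual defection and then a return to mutual cooperation, whereas two `always_defect` players never leave mutual defection.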
SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting
Yitian Zhang
Liheng Ma
Antonios Valkanas
Boris Oreshkin
Koopman operator theory provides a framework for nonlinear dynamical system analysis and time-series forecasting by mapping dynamics to a space of real-valued measurement functions, enabling a linear operator representation. Despite the advantage of linearity, the operator is generally infinite-dimensional. Therefore, the objective is to learn measurement functions that yield a tractable finite-dimensional Koopman operator approximation. In this work, we establish a connection between Koopman operator approximation and linear Recurrent Neural Networks (RNNs), which have recently demonstrated remarkable success in sequence modeling. We show that by considering an extended state consisting of lagged observations, we can establish an equivalence between a structured Koopman operator and linear RNN updates. Building on this connection, we present SKOLR, which integrates a learnable spectral decomposition of the input signal with a multilayer perceptron (MLP) as the measurement functions and implements a structured Koopman operator via a highly parallel linear RNN stack. Numerical experiments on various forecasting benchmarks and dynamical systems show that this streamlined, Koopman-theory-based design delivers exceptional performance. Our code is available at: https://github.com/networkslab/SKOLR.
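The link between lagged observations and linear RNN updates rests on a simple algebraic fact that can be checked numerically. The sketch below is not the SKOLR implementation (it has no spectral decomposition and no MLP). It only shows that a linear recurrence h_k = W h_{k-1} + U v_k, started from zero, gives the same state as an explicit sum over lagged inputs, sum_s W^s U v_{k-s}. The matrices `W` and `U`, the toy inputs, and all function names are assumptions chosen for illustration.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def vec_add(u, v):
    return [a + b for a, b in zip(u, v)]

def linear_rnn(W, U, inputs):
    """Run h_k = W h_{k-1} + U v_k from h = 0 and return the final state."""
    h = [0.0] * len(W)
    for v in inputs:
        h = vec_add(matvec(W, h), matvec(U, v))
    return h

def lagged_sum(W, U, inputs):
    """Compute the same state as an explicit sum over lags: sum_s W^s U v_{k-s}."""
    total = [0.0] * len(W)
    for s, v in enumerate(reversed(inputs)):  # s = 0 is the most recent input
        term = matvec(U, v)
        for _ in range(s):  # apply W once per step of lag
            term = matvec(W, term)
        total = vec_add(total, term)
    return total

# Hypothetical toy values: 2-dimensional state, scalar inputs.
W = [[0.5, 0.1], [0.0, 0.8]]
U = [[1.0], [2.0]]
inputs = [[1.0], [0.0], [-1.0]]

h_rec = linear_rnn(W, U, inputs)
h_lag = lagged_sum(W, U, inputs)
print(h_rec, h_lag)
```

Because the recurrence has no nonlinearity, the final state is a linear function of the stacked lagged inputs, which is the sense in which a linear RNN acts as a linear operator on an extended state.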
Structure-Aligned Protein Language Model
Can Chen
David Heurtel-Depeiges
Robert M. Vernon
Christopher J. Langmead
Quentin Fournier