Mark Coates

SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting

Boris Oreshkin

Koopman operator theory provides a framework for nonlinear dynamical system analysis and time-series forecasting by mapping dynamics to a sp… (see more)ace of real-valued measurement functions, enabling a linear operator representation. Despite the advantage of linearity, the operator is generally infinite-dimensional. Therefore, the objective is to learn measurement functions that yield a tractable finite-dimensional Koopman operator approximation. In this work, we establish a connection between Koopman operator approximation and linear Recurrent Neural Networks (RNNs), which have recently demonstrated remarkable success in sequence modeling. We show that by considering an extended state consisting of lagged observations, we can establish an equivalence between a structured Koopman operator and linear RNN updates. Building on this connection, we present SKOLR, which integrates a learnable spectral decomposition of the input signal with a multilayer perceptron (MLP) as the measurement functions and implements a structured Koopman operator via a highly parallel linear RNN stack. Numerical experiments on various forecasting benchmarks and dynamical systems show that this streamlined, Koopman-theory-based design delivers exceptional performance. Our code is available at: https://github.com/networkslab/SKOLR.

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting

Boris Oreshkin

Koopman operator theory provides a framework for nonlinear dynamical system analysis and time-series forecasting by mapping dynamics to a sp… (see more)ace of real-valued measurement functions, enabling a linear operator representation. Despite the advantage of linearity, the operator is generally infinite-dimensional. Therefore, the objective is to learn measurement functions that yield a tractable finite-dimensional Koopman operator approximation. In this work, we establish a connection between Koopman operator approximation and linear Recurrent Neural Networks (RNNs), which have recently demonstrated remarkable success in sequence modeling. We show that by considering an extended state consisting of lagged observations, we can establish an equivalence between a structured Koopman operator and linear RNN updates. Building on this connection, we present SKOLR, which integrates a learnable spectral decomposition of the input signal with a multilayer perceptron (MLP) as the measurement functions and implements a structured Koopman operator via a highly parallel linear RNN stack. Numerical experiments on various forecasting benchmarks and dynamical systems show that this streamlined, Koopman-theory-based design delivers exceptional performance. Our code is available at: https://github.com/networkslab/SKOLR.

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

doi.org

openreview.net

When to retrain a machine learning model

Florence Regol

Leo Schwinn

Kyle Sprague

Mark Coates

Thomas Markovich

A significant challenge in maintaining real-world machine learning models is responding to the continuous and unpredictable evolution of dat… (see more)a. Most practitioners are faced with the difficult question: when should I retrain or update my machine learning model? This seemingly straightforward problem is particularly challenging for three reasons: 1) decisions must be made based on very limited information - we usually have access to only a few examples, 2) the nature, extent, and impact of the distribution shift are unknown, and 3) it involves specifying a cost ratio between retraining and poor performance, which can be hard to characterize. Existing works address certain aspects of this problem, but none offer a comprehensive solution. Distribution shift detection falls short as it cannot account for the cost trade-off; the scarcity of the data, paired with its unusual structure, makes it a poor fit for existing offline reinforcement learning methods, and the online learning formulation overlooks key practical considerations. To address this, we present a principled formulation of the retraining problem and propose an uncertainty-based method that makes decisions by continually forecasting the evolution of model performance evaluated with a bounded metric. Our experiments, addressing classification tasks, show that the method consistently outperforms existing baselines on 7 datasets. We thoroughly assess its robustness to varying cost trade-off values and mis-specified cost trade-offs.

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

When to retrain a machine learning model

Florence Regol

Leo Schwinn

Kyle Sprague

Mark Coates

Thomas Markovich

A significant challenge in maintaining real-world machine learning models is responding to the continuous and unpredictable evolution of dat… (see more)a. Most practitioners are faced with the difficult question: when should I retrain or update my machine learning model? This seemingly straightforward problem is particularly challenging for three reasons: 1) decisions must be made based on very limited information - we usually have access to only a few examples, 2) the nature, extent, and impact of the distribution shift are unknown, and 3) it involves specifying a cost ratio between retraining and poor performance, which can be hard to characterize. Existing works address certain aspects of this problem, but none offer a comprehensive solution. Distribution shift detection falls short as it cannot account for the cost trade-off; the scarcity of the data, paired with its unusual structure, makes it a poor fit for existing offline reinforcement learning methods, and the online learning formulation overlooks key practical considerations. To address this, we present a principled formulation of the retraining problem and propose an uncertainty-based method that makes decisions by continually forecasting the evolution of model performance evaluated with a bounded metric. Our experiments, addressing classification tasks, show that the method consistently outperforms existing baselines on 7 datasets. We thoroughly assess its robustness to varying cost trade-off values and mis-specified cost trade-offs.

2025-10-06

Proceedings of the 42nd International Conference on Machine Learning (published)

proceedings.mlr.press

openreview.net

FEval-TTC: Fair Evaluation Protocol for Test-Time Compute

Pavel Rumiantsev

Soumyasundar Pal

Yingxue Zhang

Mark Coates

The performance of Large Language Models (LLMs) and the associated dollar costs of API calls can fluctuate over time, potentially invalidati… (see more)ng conclusions drawn in prior research. To address this, we propose a _**F**air **Eval**uation protocol for **T**est-**T**ime **C**ompute_ (FEval-TTC), designed to ensure consistent assessment of test-time compute (TTC) methods, regardless of such fluctuations. FEval-TTC focuses on evaluation of TTC methods that utilize underlying Chains-of-Thought (CoT). It supports evaluations across multiple LLMs on a diverse set of mathematical and commonsense reasoning datasets. The few-shot prompting and answer extraction processes are standardized across datasets, reducing both time and monetary overhead for researchers. Furthermore, we provide a cost modeling procedure that estimates both the token and dollar cost per query, facilitating equitable comparisons of prevalent TTC methods. We open-source FEval-TTC for public use at [anonymized code link](https://drive.google.com/file/d/1DUeteFA7lnx5MubuR0lh6OPN6XKfpqGC/view?usp=sharing).

2025-09-23

NeurIPS.cc/2025/Workshop/LLM_Evaluation (poster)

openreview.net

FEval-TTC: Fair Evaluation Protocol for Test-Time Compute

Pavel Rumiantsev

Soumyasundar Pal

Yingxue Zhang

Mark Coates

The performance of Large Language Models (LLMs) and the associated dollar costs of API calls can fluctuate over time, potentially invalidati… (see more)ng conclusions drawn in prior research. To address this, we propose a _**F**air **Eval**uation protocol for **T**est-**T**ime **C**ompute_ (FEval-TTC), designed to ensure consistent assessment of test-time compute (TTC) methods, regardless of such fluctuations. FEval-TTC focuses on evaluation of TTC methods that utilize underlying Chains-of-Thought (CoT). It supports evaluations across multiple LLMs on a diverse set of mathematical and commonsense reasoning datasets. The few-shot prompting and answer extraction processes are standardized across datasets, reducing both time and monetary overhead for researchers. Furthermore, we provide a cost modeling procedure that estimates both the token and dollar cost per query, facilitating equitable comparisons of prevalent TTC methods. We open-source FEval-TTC for public use at [anonymized code link](https://drive.google.com/file/d/1DUeteFA7lnx5MubuR0lh6OPN6XKfpqGC/view?usp=sharing).

2025-09-23

NeurIPS.cc/2025/Workshop/LLM_Evaluation (poster)

openreview.net

Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data

Jiaming Zhou

Abbas Ghaddar

Ge Zhang

Liheng Ma

Yaochen Hu

Soumyasundar Pal

Mark Coates

Jianye HAO

B. Wang

Yingxue Zhang

2025-07-24

colmweb.org/COLM/2025/Workshop/XLLM-Reason-Plan (published)

doi.org

openreview.net

Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data

Jiaming Zhou

Abbas Ghaddar

Ge Zhang

Liheng Ma

Yaochen Hu

Soumyasundar Pal

B. Wang

Jianye HAO

Mark Coates

Yingxue Zhang

Despite recent advances in training and prompt- ing strategies for Large Language Models (LLMs), these models continue to face chal- lenges … (see more)with complex logical reasoning tasks that involve long reasoning chains. In this work, we explore the potential and limitations of using graph-based synthetic reasoning data as training signals to enhance LLMs’ reasoning capabilities. Our extensive experiments, con- ducted on two established natural language rea- soning tasks—inductive reasoning and spatial reasoning—demonstrate that supervised fine- tuning (SFT) with synthetic graph-based rea- soning data effectively enhances LLMs’ rea- soning performance, without compromising their effectiveness on other standard evaluation benchmarks.

2025-07-24

colmweb.org/COLM/2025/Workshop/XLLM-Reason-Plan (published)

openreview.net

Extracting and Following Paths for Robust Relational Reasoning with Large Language Models

Ge Zhang

Mohammad Alomrani

Hongjian Gu

Jiaming Zhou

Yaochen Hu

B. Wang

Qun Liu

Mark Coates

Yingxue Zhang

Jianye HAO

2025-07-04

KDD.org/2025/Workshop/SKnow-LLM (oral)

openreview.net

Robust and Interpretable Relational Reasoning with Large Language Models and Symbolic Solvers

Ge Zhang

Mohammad Alomrani

Hongjian Gu

Jiaming Zhou

Yaochen Hu

B. Wang

Qun Liu

Mark Coates

Yingxue Zhang

Jianye HAO

Large language models (LLMs) possess vast semantic knowledge but often struggle with complex reasoning tasks, particularly in relational rea… (see more)soning problems such as kinship or spatial reasoning. In this paper, we present Path-of-Thoughts (PoT), a novel framework designed to tackle relation reasoning by decomposing the task into three key stages: graph extraction, path identification, and reasoning. Unlike previous approaches, PoT efficiently extracts a task-agnostic graph that identifies crucial entities, relations, and attributes within the problem context. Subsequently, PoT identifies relevant reasoning chains within the graph corresponding to the posed question, facilitating inference of potential answers. Experimental evaluations on four benchmark datasets, demanding long reasoning chains, demonstrate that PoT surpasses state-of-the-art baselines by a significant margin (maximum 21.3\%) without necessitating fine-tuning or extensive LLM calls. Furthermore, as opposed to prior neuro-symbolic methods, PoT exhibits improved resilience against LLM errors by leveraging the compositional nature of graphs.

2025-06-30

ICML.cc/2025/Workshop/R2-FM (poster)

openreview.net

SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting

Boris Oreshkin

Koopman operator theory provides a framework for nonlinear dynamical system analysis and time-series forecasting by mapping dynamics to a sp… (see more)ace of real-valued measurement functions, enabling a linear operator representation. Despite the advantage of linearity, the operator is generally infinite-dimensional. Therefore, the objective is to learn measurement functions that yield a tractable finite-dimensional Koopman operator approximation. In this work, we establish a connection between Koopman operator approximation and linear Recurrent Neural Networks (RNNs), which have recently demonstrated remarkable success in sequence modeling. We show that by considering an extended state consisting of lagged observations, we can establish an equivalence between a structured Koopman operator and linear RNN updates. Building on this connection, we present SKOLR, which integrates a learnable spectral decomposition of the input signal with a multilayer perceptron (MLP) as the measurement functions and implements a structured Koopman operator via a highly parallel linear RNN stack. Numerical experiments on various forecasting benchmarks and dynamical systems show that this streamlined, Koopman-theory-based design delivers exceptional performance.

2025-06-17

ArXiv (preprint)

doi.org

arxiv.org

One Demo Is All It Takes: Planning Domain Derivation with LLMs from A Single Demonstration

Jinbang Huang

Yixin Xiao

Zhanguang Zhang

Mark Coates

Jianye HAO

Yingxue Zhang

Pre-trained Large Language Models (LLMs) have shown promise in solving planning problems but often struggle to ensure plan correctness, espe… (see more)cially for long-horizon tasks. Meanwhile, traditional robotic task and motion planning (TAMP) frameworks address these challenges more reliably by combining high-level symbolic search with low-level motion planning. However, TAMP relies on the availability of planning domains that typically involve substantial manual effort and domain expertise, limiting its generalizability. We introduce Planning Domain Derivation with LLMs (PDDLLM), a novel approach that combines simulated physical interaction with LLM reasoning to improve planning performance. The method reduces reliance on humans by inferring planning domains from a single annotated task-execution demonstration. Unlike prior domain-inference methods that rely on partially predefined or language descriptions of planning domains, PDDLLM constructs domains entirely from scratch and automatically integrates them with low-level motion planning skills, enabling fully automated long-horizon planning. PDDLLM is evaluated on over 1,200 diverse tasks spanning nine environments and benchmarked against six LLM-based planning baselines, demonstrating superior planning performance, lower token costs, and successful deployment on multiple robot platforms.

2025-05-28

thecvf.com/CVPR/2025/Workshop/FMEA (oral)

openreview.net

Speed Science

Leading in a New Era

Supervision Requests

Mark Coates

Biography

Current Students

Blog Posts

Publications

Speed Science

Leading in a New Era

Supervision Requests

Popular keywords:

Mark Coates

Biography

Current Students

Blog Posts

Publications