Publications

Monitoring morphometric drift in lifelong learning segmentation of the spinal cord

Enamundram Naga Karthik

Sandrine Bédard

Jan Valosek

Christoph S. Aigner

Elise Bannier

Josef Bednařík

Virginie Callot

Anna Combes

Armin Curt

Gergely David

Falk Eippert

Lynn Farner

Michael G Fehlings

Patrick Freund

Tobias Granberg

Cristina Granziera

RHSCIR Network Imaging Group

Ulrike Horn

Tomáš Horák

Suzanne Humphreys

Markus Hupp

Anne Kerbrat

Nawal Kinany

Shannon Kolind

Petr Kudlička

Anna Lebret

Lisa Eunyoung Lee

Caterina Mainero

Allan R. Martin

Megan McGrath

Govind Nair

Kristin P. O’Grady

Jiwon Oh

Russell Ouellette

Nikolai Pfender

Dario Pfyffer

P. Pradat

Alexandre Prat

Emanuele Pravatà

Daniel S. Reich

Ilaria Ricchi

Naama Rotem-Kohavi

Simon Schading-Sassenhausen

Maryam Seif

Andrew C. Smith

Seth Aaron Smith

Grace Sweeney

Roger Tam

Anthony Traboulsee

Constantina A. Treaba

Charidimos Tsagkas

Zachary Vavasour

Dimitri Van De Ville

Kenneth A. Weber

Sarath Chandar

Julien Cohen-Adad

2025-06-12

lifelong-ml.cc/CoLLAs/2025/Workshop_Track (published)

doi.org

openreview.net

Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving

Luke Rowe

Rodrigue De Schaetzen

Roger Girgis

Chris Pal

Liam Paull

We present Poutine, a 3B-parameter vision-language model (VLM) tailored for end-to-end autonomous driving in long-tail driving scenarios. Poutine is trained in two stages. To obtain strong base driving capabilities, we train Poutine-Base in a self-supervised vision-language-trajectory (VLT) next-token prediction fashion on 83 hours of CoVLA nominal driving and 11 hours of Waymo long-tail driving. Accompanying language annotations are auto-generated with a 72B-parameter VLM. Poutine is obtained by fine-tuning Poutine-Base with Group Relative Policy Optimization (GRPO) using fewer than 500 preference-labeled frames from the Waymo validation set. We show that both VLT pretraining and RL fine-tuning are critical to attain strong driving performance in the long tail. Poutine-Base achieves a rater-feedback score (RFS) of 8.12 on the validation set, nearly matching Waymo's expert ground-truth RFS. The final Poutine model achieves an RFS of 7.99 on the official Waymo test set, placing 1st in the 2025 Waymo Vision-Based End-to-End Driving Challenge by a significant margin. These results highlight the promise of scalable VLT pre-training and lightweight RL fine-tuning to enable robust and generalizable autonomy.

2025-06-12

ArXiv (preprint)

doi.org

arxiv.org

Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving

Luke Rowe

Rodrigue De Schaetzen

Roger Girgis

Chris Pal

Liam Paull

Maintaining good driving behavior in out-of-distribution scenarios remains a critical challenge in autonomous driving. A promising direction is to leverage the generalist knowledge and reasoning capabilities of large language models by treating unusual driving scenarios as a logical reasoning task. In this work, we present Poutine, a method that uses an off-the-shelf 3B-parameter vision-language model (VLM), without any additional components, to achieve robust end-to-end autonomous driving via a simple and scalable training recipe. To learn strong base driving capabilities, we first train Poutine-Base using self-supervised next-token prediction over vision, language, and trajectory (VLT) tokens, leveraging both nominal and long-tail driving data. In the second stage, we fine-tune Poutine-Base using Group Relative Policy Optimization (GRPO) with a small set of human preference-labeled examples. We evaluate our approach on the Waymo end-to-end driving benchmark curated for long-tail scenarios. The final Poutine model achieves an RFS of 7.99 on the test set, placing 1st in the 2025 Waymo Vision-Based End-to-End Driving Challenge by a significant margin. Our results suggest that handcrafted tokenizers or custom architectural components added to base VLMs in prior work are not necessary to achieve strong driving performance. Instead, this work highlights the potential of scalable VLT pretraining combined with lightweight RL fine-tuning to enable robust and generalizable autonomous driving.

2025-06-12

ArXiv (preprint)

doi.org

arxiv.org

PyLO: Towards Accessible Learned Optimizers in PyTorch

Paul Janson

Benjamin Therien

Quentin Anthony

Xiaolong Huang

Abhinav Moudgil

Eugene Belilovsky

Learned optimizers have been an active research topic over the past decade, with increasing progress toward practical, general-purpose optimizers that can serve as drop-in replacements for widely used methods like Adam. However, recent advances -- such as VeLO, which was meta-trained for 4000 TPU-months -- remain largely inaccessible to the broader community, in part due to their reliance on JAX and the absence of user-friendly packages for applying the optimizers after meta-training. To address this gap, we introduce PyLO, a PyTorch-based library that brings learned optimizers to the broader machine learning community through familiar, widely adopted workflows. Unlike prior work focused on synthetic or convex tasks, our emphasis is on applying learned optimization to real-world large-scale pre-training tasks. Our release includes a CUDA-accelerated version of the small_fc_lopt learned optimizer architecture from Metz et al. (2022a), delivering substantial speedups -- from 39.36 to 205.59 samples/sec throughput for training ViT B/16 with batch size 32. PyLO also allows us to easily combine learned optimizers with existing optimization tools such as learning rate schedules and weight decay. When doing so, we find that learned optimizers can substantially benefit. Our code is available at https://github.com/Belilovsky-Lab/pylo

2025-06-12

ArXiv (preprint)

arxiv.org

On Selecting Robust Approaches for Learning Predictive Biomarkers in Metabolomics Data Sets

Pier-Luc Plante

Metabolomics, the study of small molecules within biological systems, offers insights into metabolic processes and, consequently, holds great promise for advancing health outcomes. Biomarker discovery in metabolomics represents a significant challenge, notably due to the high dimensionality of the data. Recent work has addressed this problem by analyzing the most important variables in machine learning models. Unfortunately, this approach relies on prior hypotheses about the structure of the data and may overlook simple patterns. To assess the true usefulness of machine learning methods, we evaluate them on a collection of 835 metabolomics data sets. This effort provides valuable insights for metabolomics researchers regarding where and when to use machine learning. It also establishes a benchmark for the evaluation of future methods. Nonetheless, the results emphasize the high diversity of data sets in metabolomics and the complexity of finding biologically relevant biomarkers. As a result, we propose a novel approach applicable across all data sets, offering guidance for future analyses. This method involves directly comparing univariate and multivariate models. We demonstrate through selected examples how this approach can guide data analysis across diverse data set structures, representative of the observed variability. Code and data are available for research purposes.

2025-06-12

Analytical Chemistry (published)

doi.org

Advancements in Affective and Behavior Analysis: The 8th ABAW Workshop and Competition

Dimitrios Kollias

Panagiotis Tzirakis

Alan Cowen

Stefanos Zafeiriou

Irene Kotsia

Eric Granger

Marco Pedersoli

Simon Bacon

Alice Baird

Chris Gagne

Chunchang Shao

Guanyu Hu

Soufiane Belharbi

Muhammad Haseeb Aslam

The 8th Affective & Behavior Analysis in-the-Wild (ABAW) Workshop at CVPR 2025 focuses on advancing the understanding and modeling of human affective and behavioral patterns in real-world scenarios. It serves as a platform for interdisciplinary collaboration, showcasing the latest methodologies and applications in affective computing and behavior analysis. A core feature of the workshop is the ABAW Competition, which tackles critical challenges in human affect and behavior recognition essential for developing human-centered AI technologies. The 8th ABAW Competition features six challenges: (1) estimation of two continuous affect dimensions (valence and arousal), (2) recognition of eight mutually exclusive classes (the 7 basic expressions and a category 'other'), (3) detection of twelve action units, (4) recognition of seven mutually exclusive compound expressions, (5) estimation of emotional mimicry intensity across six dimensions, and (6) recognition of the presence or absence of ambivalence/hesitancy. These challenges leverage datasets such as Aff-Wild2, C-EXPR-DB, HUME-Vidmimic2, and BAH, providing a comprehensive benchmark for evaluating affective behavior analysis models. Each challenge is assessed using specialized performance metrics, including Concordance Correlation Coefficient, F1-score, and Pearson's correlation. This paper provides an overview of the competition, detailing the datasets, pre-processing methodologies, evaluation criteria, baseline models, and the top-performing teams in each challenge, including the performance they obtained. Further details on the competition are available at: https://affective-behavior-analysis-inthe-wild.github.io/8th.

2025-06-11

2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (published)

doi.org

Amortized Sampling with Transferable Normalizing Flows

Charlie B. Tan

Majdi Hassan

Leon Klein

Saifuddin Syed

Dominique Beaini

Michael M. Bronstein

Alexander Tong

Kirill Neklyudov

2025-06-11

ICML.cc/2025/Workshop/GenBio (spotlight)

doi.org

openreview.net

Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training

The ever-growing availability of unlabeled data presents both opportunities and challenges for training artificial intelligence systems. While self-supervised learning (SSL) has emerged as a powerful paradigm for extracting meaningful representations from vast amounts of unlabeled data, existing methods still struggle to adapt to the non-stationary, non-IID nature of real-world data streams without forgetting previously learned knowledge. Recent works have adopted a repeated cosine annealing schedule for large-scale continual pre-training; however, these schedules (1) inherently cause forgetting during the re-warming phase and (2) have not been systematically compared to existing continual SSL methods. In this work, we systematically compare the widely used cosine schedule with the recently proposed infinite learning rate schedule and empirically find the latter to be a more effective alternative. Our extensive empirical evaluation across diverse image and language datasets demonstrates that the infinite learning rate schedule consistently enhances continual pre-training performance compared to a repeated cosine decay without being restricted to a fixed iteration budget. For instance, in a small-scale MAE pre-training setup, it outperforms several strong baselines from the literature. We then scale up our experiments to larger MAE pre-training and autoregressive language model pre-training. Our results show that the infinite learning rate schedule remains effective at scale, surpassing repeated cosine decay for both MAE pre-training and zero-shot LM benchmarks.

2025-06-11

ICML.cc/2025/Workshop/ES-FoMo-III (published)

doi.org

openreview.net

Causal Climate Emulation with Bayesian Filtering

Sebastian H. M. Hickman

Ilija Trajković

Julia Kaltenborn

Francis Pelletier

Alex Archibald

Yaniv Gurwicz

Peer Nowack

David Rolnick

Julien Boussard

Traditional models of climate change use complex systems of coupled equations to simulate physical processes across the Earth system. These simulations are highly computationally expensive, limiting our predictions of climate change and analyses of its causes and effects. Machine learning has the potential to quickly emulate data from climate models, but current approaches are not able to incorporate physics-informed causal relationships. Here, we develop an interpretable climate model emulator based on causal representation learning. We derive a physics-informed approach including a Bayesian filter for stable long-term autoregressive emulation. We demonstrate that our emulator learns accurate climate dynamics, and we show the importance of each one of its components on a realistic synthetic dataset and data from two widely deployed climate models.

2025-06-11

ArXiv (preprint)

doi.org

arxiv.org

Fast Monte Carlo Tree Diffusion: 100x Speedup via Parallel Sparse Planning

Jaesik Yoon

Hyeonseo Cho

Yoshua Bengio

Sungjin Ahn

Diffusion models have recently emerged as a powerful approach for trajectory planning. However, their inherently non-sequential nature limits their effectiveness in long-horizon reasoning tasks at test time. The recently proposed Monte Carlo Tree Diffusion (MCTD) offers a promising solution by combining diffusion with tree-based search, achieving state-of-the-art performance on complex planning problems. Despite its strengths, our analysis shows that MCTD incurs substantial computational overhead due to the sequential nature of tree search and the cost of iterative denoising. To address this, we propose Fast-MCTD, a more efficient variant that preserves the strengths of MCTD while significantly improving its speed and scalability. Fast-MCTD integrates two techniques: Parallel MCTD, which enables parallel rollouts via delayed tree updates and redundancy-aware selection; and Sparse MCTD, which reduces rollout length through trajectory coarsening. Experiments show that Fast-MCTD achieves up to 100x speedup over standard MCTD while maintaining or improving planning performance. Remarkably, it even outperforms Diffuser in inference speed on some tasks, despite Diffuser requiring no search and yielding weaker solutions. These results position Fast-MCTD as a practical and scalable solution for diffusion-based inference-time reasoning.

2025-06-11

ArXiv (preprint)

arxiv.org
