Publications

Long-Horizon Model-Based Offline Reinforcement Learning Without Conservatism

2025-12-03

arXiv (preprint)

doi.org

openreview.net

A Survey of Bugs in AI-Generated Code

Ruofan Gao

Amjed Tahir

Peng Liang

Teo Susnjak

Foutse Khomh

Developers are widely using AI code-generation models, aiming to increase productivity and efficiency. However, there are also quality concerns regarding the AI-generated code. The generated code is produced by models trained on publicly available code, which are known to contain bugs and quality issues. Those issues can cause trust and maintenance challenges during the development process. Several quality issues associated with AI-generated code have been reported, including bugs and defects. However, these findings are often scattered and lack a systematic summary. A comprehensive review is currently lacking to reveal the types and distribution of these errors, possible remediation strategies, as well as their correlation with the specific models. In this paper, we systematically analyze the existing AI-generated code literature to establish an overall understanding of bugs and defects in generated code, providing a reference for future model improvement and quality assessment. We aim to understand the nature and extent of bugs in AI-generated code, and provide a classification of bug types and patterns present in code generated by different models. We also discuss possible fixes and mitigation strategies adopted to eliminate bugs from the generated code.

2025-12-03

arXiv (preprint)

doi.org

arxiv.org

Cognitive cartography of mammalian brains using meta-analysis of AI experts

Andrea I. Luppi

Hana Ali

Zhen-Qi Liu

Filip Milisav

Alessandro Gozzi

Danilo Bzdok

Bratislav Misic

The complexity of the brain is increasingly mirrored by the complexity of the neuroscientific literature, yet no individual mind can fully grasp the diversity of scales, methodologies and model organisms. Where human experts flag, the latest AI models excel: large language models can seamlessly integrate knowledge across scientific domains. Here we show how large language models can systematically and quantitatively synthesise literature-wide neuroscientific knowledge about the cognitive operations and dysfunctions associated with each brain region. Meta-analysis of AI experts reveals structure-function mappings to which existing meta-analytic frameworks are blind, demonstrated by lesions and direct intracranial stimulation. It also unlocks the possibility of extending quantitative literature meta-analysis and decoding of brain maps to other model organisms beyond human. As proof of concept, we integrate LLM meta-analysis with species-specific transcriptomics in human, macaque, and mouse, to discover an evolutionarily conserved molecular circuit for cognition. Altogether, meta-analysis of AI experts can fundamentally catalyze neuroscientific discovery by overcoming the barrier of data aggregation from heterogeneous studies, finally bringing together a scattered literature to identify emergent patterns and latent insights across disparate subfields, modalities, and species.

2025-12-02

bioRxiv (preprint)

doi.org

Curly Flow Matching for Learning Non-gradient Field Dynamics

Katarina Petrović

Lazar Atanackovic

Viggo Moro

Kacper Kapuśniak

İsmail İlkan Ceylan

Michael Bronstein

Avishek Joey Bose

Alexander Tong

Modeling the transport dynamics of natural processes from population-level observations is a ubiquitous problem in the natural sciences. Such models rely on key assumptions about the underlying process in order to enable faithful learning of governing dynamics that mimic the actual system behavior. The de facto assumption in current approaches relies on the principle of least action that results in gradient field dynamics and leads to trajectories minimizing an energy functional between two probability measures. However, many real-world systems, such as cell cycles in single-cell RNA, are known to exhibit non-gradient, periodic behavior, which fundamentally cannot be captured by current state-of-the-art methods such as flow and bridge matching. In this paper, we introduce Curly Flow Matching (Curly-FM), a novel approach that is capable of learning non-gradient field dynamics by designing and solving a Schrödinger bridge problem with a non-zero drift reference process (in stark contrast to typical zero-drift reference processes), which is constructed using inferred velocities in addition to population snapshot data. We showcase Curly-FM by solving the trajectory inference problems for single cells, computational fluid dynamics, and ocean currents with approximate velocities. We demonstrate that Curly-FM can learn trajectories that better match both the reference process and population marginals. Curly-FM expands flow matching models beyond the modeling of populations and towards the modeling of known periodic behavior in physical systems. Our code repository is accessible at: https://github.com/kpetrovicc/curly-flow-matching.git

2025-12-02

Conference on Neural Information Processing Systems (Accept (poster))

openreview.net

Diffusion Tree Sampling: Scalable inference-time alignment of diffusion models

Adapting a pretrained diffusion model to new objectives at inference time remains an open problem in generative modeling. Existing steering methods suffer from inaccurate value estimation, especially at high noise levels, which biases guidance. Moreover, information from past runs is not reused to improve sample quality, resulting in inefficient use of compute. Inspired by the success of Monte Carlo Tree Search, we address these limitations by casting inference-time alignment as a search problem that reuses past computations. We introduce a tree-based approach that samples from the reward-aligned target density by propagating terminal rewards back through the diffusion chain and iteratively refining value estimates with each additional generation. Our proposed method, Diffusion Tree Sampling (DTS), produces asymptotically exact samples from the target distribution in the limit of infinite rollouts, and its greedy variant, Diffusion Tree Search (DTS

2025-12-02

Neural Information Processing Systems (Accept (poster))

doi.org

openreview.net

Discovering Latent Graphs with GFlowNets for Diverse Conditional Image Generation

Bailey Trang

Parham Saremi

Alan Q. Wang

Fangrui Huang

Zahra TehraniNasab

Amar Kumar

Tal Arbel

Li Fei-Fei

Ehsan Adeli

Capturing diversity is crucial in conditional and prompt-based image generation, particularly when conditions contain uncertainty that can lead to multiple plausible outputs. To generate diverse images reflecting this diversity, traditional methods often modify random seeds, making it difficult to discern meaningful differences between samples, or diversify the input prompt, which is limited in verbally interpretable diversity. We propose Rainbow, a novel conditional image generation framework, applicable to any pretrained conditional generative model, that addresses inherent condition/prompt uncertainty and generates diverse plausible images. Rainbow is based on a simple yet effective idea: decomposing the input condition into diverse latent representations, each capturing an aspect of the uncertainty and generating a distinct image. First, we integrate a latent graph, parameterized by Generative Flow Networks (GFlowNets), into the prompt representation computation. Second, leveraging GFlowNets' advanced graph sampling capabilities to capture uncertainty and output diverse trajectories over the graph, we produce multiple trajectories that collectively represent the input condition, leading to diverse condition representations and corresponding output images. Evaluations on natural image and medical image datasets demonstrate Rainbow's improvement in both diversity and fidelity across image synthesis, image generation, and counterfactual generation tasks.

2025-12-02

Neural Information Processing Systems (Accept (poster))

doi.org

openreview.net

From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers

As generative AI systems become competent and democratized in science, business, and government, deeper insight into their failure modes now poses an acute need. The occasional volatility in their behavior, such as the propensity of transformer models to hallucinate, impedes trust and adoption of emerging AI solutions in high-stakes areas. In the present work, we establish how and when hallucinations arise in pre-trained transformer models through concept representations captured by sparse autoencoders, under scenarios with experimentally controlled uncertainty in the input space. Our systematic experiments reveal that the number of semantic concepts used by the transformer model grows as the input information becomes increasingly unstructured. In the face of growing uncertainty in the input space, the transformer model becomes prone to activate coherent yet input-insensitive semantic features, leading to hallucinated output. At its extreme, for pure-noise inputs, we identify a wide variety of robustly triggered and meaningful concepts in the intermediate activations of pre-trained transformer models, whose functional integrity we confirm through targeted steering. We also show that hallucinations in the output of a transformer model can be reliably predicted from the concept patterns embedded in transformer layer activations. This collection of insights on transformer internal processing mechanics has immediate consequences for aligning AI models with human values, AI safety, opening the attack surface for potential adversarial attacks, and providing a basis for automatic quantification of a model's hallucination risk.

2025-12-02

Conference on Neural Information Processing Systems (Accept (poster))

doi.org

openreview.net

Geometry-Aware Edge Pooling for Graph Neural Networks

Katharina Limbeck

Lydia Mezrag

Guy Wolf

Bastian Rieck

Graph Neural Networks (GNNs) have shown significant success for graph-based tasks. Motivated by the prevalence of large datasets in real-world applications, pooling layers are crucial components of GNNs. By reducing the size of input graphs, pooling enables faster training and potentially better generalisation. However, existing pooling operations often optimise for the learning task at the expense of discarding fundamental graph structures, thus reducing interpretability. This leads to unreliable performance across dataset types, downstream tasks and pooling ratios. Addressing these concerns, we propose novel graph pooling layers for structure-aware pooling via edge collapses. Our methods leverage diffusion geometry and iteratively reduce a graph's size while preserving both its metric structure and its structural diversity. We guide pooling using magnitude, an isometry-invariant diversity measure, which permits us to control the fidelity of the pooling process. Further, we use the spread of a metric space as a faster and more stable alternative ensuring computational efficiency. Empirical results demonstrate that our methods (i) achieve top performance compared to alternative pooling layers across a range of diverse graph classification tasks, (ii) preserve key spectral properties of the input graphs, and (iii) retain high accuracy across varying pooling ratios.

2025-12-02

Conference on Neural Information Processing Systems (Accept (poster))

doi.org

openreview.net

PointMAC: Meta-Learned Adaptation for Robust Test-Time Point Cloud Completion

Linlian Jiang

Rui Ma

Li Gu

Ziqiang Wang

Xinxin Zuo

Yang Wang

Point cloud completion is essential for robust 3D perception in safety-critical applications such as robotics and augmented reality. However, existing models perform static inference and rely heavily on inductive biases learned during training, limiting their ability to adapt to novel structural patterns and sensor-induced distortions at test time. To address this limitation, we propose PointMAC, a meta-learned framework for robust test-time adaptation in point cloud completion. It enables sample-specific refinement without requiring additional supervision. Our method optimizes the completion model under two self-supervised auxiliary objectives that simulate structural and sensor-level incompleteness. A meta-auxiliary learning strategy based on Model-Agnostic Meta-Learning (MAML) ensures that adaptation driven by auxiliary objectives is consistently aligned with the primary completion task. During inference, we adapt the shared encoder on-the-fly by optimizing auxiliary losses, with the decoder kept fixed. To further stabilize adaptation, we introduce Adaptive

2025-12-02

Conference on Neural Information Processing Systems (Accept (poster))

doi.org

openreview.net

Understanding Softmax Attention Layers: Exact Mean-Field Analysis on a Toy Problem

Elvis Dohmatob

Self-attention has emerged as a fundamental component driving the success of modern transformer architectures, which power large language models and various applications. However, a theoretical understanding of how such models actually work is still under active development. The recent work of (Marion et al., 2025) introduced the so-called "single-location regression" problem, which can provably be solved by a simplified self-attention layer but not by linear models, thereby demonstrating a striking functional separation. A rigorous analysis of self-attention with softmax for this problem is challenging due to the coupled nature of the model. In the present work, we use ideas from the classical random energy model in statistical physics to analyze softmax self-attention on the single-location problem. Our analysis yields exact analytic expressions for the population risk in terms of the overlaps between the learned model parameters and those of an oracle. Moreover, we derive a detailed description of the gradient descent dynamics for these overlaps and prove that, under broad conditions, the dynamics converge to the unique oracle attractor. Our work not only advances our understanding of self-attention but also provides key theoretical ideas that are likely to find use in further analyses of even more complex transformer architectures.

2025-12-02

Conference on Neural Information Processing Systems (Accept (poster))

openreview.net

Deploying Geospatial Foundation Models in the Real World: Lessons from WorldCereal

Christina Butsko

Gabriel Tseng

Kristof Van Tricht

Giorgia Milli

David Rolnick

Ruben Cartuyvels

Inbal Becker Reshef

Zoltan Szantoi

Hannah Kerner

The increasing availability of geospatial foundation models has the potential to transform remote sensing applications such as land cover classification, environmental monitoring, and change detection. Despite promising benchmark results, the deployment of these models in operational settings is challenging and rare. Standardized evaluation tasks often fail to capture real-world complexities relevant for end-user adoption such as data heterogeneity, resource constraints, and application-specific requirements. This paper presents a structured approach to integrate geospatial foundation models into operational mapping systems. Our protocol has three key steps: defining application requirements, adapting the model to domain-specific data and conducting rigorous empirical testing. Using the Presto model in a case study for crop mapping, we demonstrate that fine-tuning a pre-trained model significantly improves performance over conventional supervised methods. Our results highlight the model’s strong spatial and temporal generalization capabilities. Our protocol provides a replicable blueprint for practitioners and lays the groundwork for future research to operationalize foundation models in diverse remote sensing applications. Application of the protocol to the WorldCereal global crop-mapping system showcases the framework’s scalability.

2025-12-01

Proceedings of The TerraBytes ICML Workshop: Towards global datasets and models for Earth Observation (published)

doi.org

proceedings.mlr.press

The Cloud-Based Geospatial Benchmark: Challenges and LLM Evaluation

Jeffrey A. Cardille

Renee Johnston

Simon Ilyushchenko

Johan Kartiwa

Zahra Shamsi

Matthew Abraham

Khashayar Azad

Kainath Ahmed

Emma Bergeron Quick

Nuala Caughie

Noah Jencz

Karen Dyson

Andrea Puzzi Nicolau

Maria Fernanda Lopez-Ornelas

David Saah

Michael Brenner

Subhashini Venugopalan

Sameera S Ponda

2025-12-01

Proceedings of The TerraBytes ICML Workshop: Towards global datasets and models for Earth Observation (published)

proceedings.mlr.press
