Publications

Burst firing optimizes invariant coding of natural communication signals by electrosensory neural populations

Michael G. Metzen

Amin Akhshi

Anmar Khadra

Maurice J. Chacron

Accurate perception of objects within the environment independent of context is essential for the survival of an organism. While neurons that respond in an invariant manner to different stimulus waveforms resulting from identity-preserving transformations of objects are thought to provide a neural correlate of context-independent perception, how such responses emerge in the brain remains poorly understood. Here, we demonstrate that burst firing in neural populations can give rise to an invariant representation of highly heterogeneous natural communication stimuli. Multi-unit recordings from central sensory neural populations showed that considering burst spike trains led to invariant representations at the population but not the single neuron level. Computational modeling further revealed that optimal invariance is achieved at burst firing levels seen experimentally. Taken together, our results demonstrate an important function for burst firing toward establishing invariant representations of sensory input in neural populations.

2025-04-30

iScience (published)

doi.org

Context is Key: A Benchmark for Forecasting with Essential Textual Information

Arjun Ashok

Andrew R. Williams

Étienne Marcotte

Valentina Zantedeschi

Jithendaraa Subramanian

Alexandre Lacoste

Forecasting is a critical task in decision-making across numerous domains. While historical numerical data provide a start, they fail to convey the complete context for reliable and accurate predictions. Human forecasters frequently rely on additional information, such as background knowledge and constraints, which can efficiently be communicated through natural language. However, in spite of recent progress with LLM-based forecasters, their ability to effectively integrate this textual information remains an open question. To address this, we introduce "Context is Key" (CiK), a time-series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context, requiring models to integrate both modalities; crucially, every task in CiK requires understanding textual context to be solved successfully. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters, and propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark. Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings. This benchmark aims to advance multimodal forecasting by promoting models that are both accurate and accessible to decision-makers with varied technical expertise. The benchmark can be visualized at https://servicenow.github.io/context-is-key-forecasting/v0/.

2025-04-30

International Conference on Machine Learning (poster)

doi.org

proceedings.mlr.press

Diminished social memory and hippocampal correlates of social interactions in chronic social defeat stress susceptibility

Amanda Larosa

Tian Rui Zhang

Alice S. Wong

Cyrus Y.H. Fung

Xiong Ling Yun (Jenny) Long

Prabhjeet Singh

Benjamin C. M. Fung

Tak Pan Wong

The susceptibility to chronic stress has been associated with depression, a mood disorder which highly implicates the hippocampus. Hippocampal contribution to stress susceptibility has been supported by findings in mice following chronic social defeat stress (CSDS). However, little is known of the role of hippocampal activity in determining the development of stress susceptibility. We used the UCLA miniscope to longitudinally measure the activity of dorsal CA1 hippocampal neurons across CSDS. Apart from examining the representation of social information by these neurons, we also compared social memory in mice that were susceptible or resilient to CSDS. We observed more stable dCA1 correlates of social interaction and social memory in CSDS resilience. Such changes were absent in CSDS susceptible mice and accompanied by greater social memory impairments. CSDS susceptibility may be supported by hippocampal social cognitive processes, reflected in diminished hippocampal representations of social information and a greater impairment in social memory.

2025-04-30

Biological Psychiatry Global Open Science (published)

doi.org

Discovering Symbolic Cognitive Models from Human and Animal Behavior

Pablo Samuel Castro

Nenad Tomasev

Ankit Anand

Navodita Sharma

Rishika Mohanta

Aparna Dev

Kuba Perlin

Siddhant Jain

Kyle Levin

Noemi Elteto

Will Dabney

Alexander Novikov

Glenn C Turner

Maria K Eckstein

Nathaniel D. Daw

Kevin J Miller

Kim Stachenfeld

Symbolic models play a key role in cognitive science, expressing computationally precise hypotheses about how the brain implements a cognitive process. Identifying an appropriate model typically requires a great deal of effort and ingenuity on the part of a human scientist. Here, we adapt FunSearch (Romera-Paredes et al. 2024), a recently developed tool that uses Large Language Models (LLMs) in an evolutionary algorithm, to automatically discover symbolic cognitive models that accurately capture human and animal behavior. We consider datasets from three species performing a classic reward-learning task that has been the focus of substantial modeling effort, and find that the discovered programs outperform state-of-the-art cognitive models for each. The discovered programs can readily be interpreted as hypotheses about human and animal cognition, instantiating interpretable symbolic learning and decision-making algorithms. Broadly, these results demonstrate the viability of using LLM-powered program synthesis to propose novel scientific hypotheses regarding mechanisms of human and animal cognition.

2025-04-30

ICML.cc/2025/Conference (poster)

proceedings.mlr.press

Does learning the right latent variables necessarily improve in-context learning?

Large autoregressive models like Transformers can solve tasks through in-context learning (ICL) without learning new weights, suggesting avenues for efficiently solving new tasks. For many tasks, e.g., linear regression, the data factorizes: examples are independent given a task latent that generates the data, e.g., linear coefficients. While an optimal predictor leverages this factorization by inferring task latents, it is unclear if Transformers implicitly do so or if they instead exploit heuristics and statistical shortcuts enabled by attention layers. Both scenarios have inspired active ongoing work. In this paper, we systematically investigate the effect of explicitly inferring task latents. We minimally modify the Transformer architecture with a bottleneck designed to prevent shortcuts in favor of more structured solutions, and then compare performance against standard Transformers across various ICL tasks. Contrary to intuition and some recent works, we find little discernible difference between the two; biasing towards task-relevant latent variables does not lead to better out-of-distribution performance, in general. Curiously, we find that while the bottleneck effectively learns to extract latent task variables from context, downstream processing struggles to utilize them for robust prediction. Our study highlights the intrinsic limitations of Transformers in achieving structured ICL solutions that generalize, and shows that while inferring the right latents aids interpretability, it is not sufficient to alleviate this problem.

2025-04-30

International Conference on Machine Learning (poster)

doi.org

proceedings.mlr.press

Estimation of Head Motion in Structural MRI and its Impact on Cortical Thickness Measurements in Retrospective Data

C Bricout

S Ebrahimi Kahou

S Bouix

Motion-related artifacts are inevitable in Magnetic Resonance Imaging (MRI) and can bias automated neuroanatomical metrics such as cortical thickness. These biases can interfere with statistical analysis which is a major concern as motion has been shown to be more prominent in certain populations such as children or individuals with ADHD. Manual review cannot objectively quantify motion in anatomical scans, and existing quantitative automated approaches often require specialized hardware or custom acquisition protocols. Here, we train a 3D convolutional neural network to estimate a summary motion metric in retrospective routine research scans by leveraging a large training dataset of synthetically motion-corrupted volumes. We validate our method with one held-out site from our training cohort and with 14 fully independent datasets, including one with manual ratings, achieving a representative

2025-04-30

arXiv (published)

doi.org

arxiv.org

FLAM: Frame-Wise Language-Audio Modeling

Yusong Wu

Christos Tsirigotis

Ke Chen

Cheng-Zhi Anna Huang

Aaron Courville

Oriol Nieto

Prem Seetharaman

Justin Salamon

Recent multi-modal audio-language models (ALMs) excel at text-audio retrieval but struggle with frame-wise audio understanding. Prior works use temporal-aware labels or unsupervised training to improve frame-wise capabilities, but they still lack fine-grained labeling capability to pinpoint when an event occurs. While traditional sound event detection models can precisely localize events, they are limited to pre-defined categories, making them ineffective for real-world scenarios with out-of-distribution events. In this work, we introduce FLAM, an open-vocabulary contrastive audio-language model capable of localizing specific sound events. FLAM employs a memory-efficient and calibrated frame-wise objective with logit adjustment to address spurious correlations, such as event dependencies and label imbalances during training. To enable frame-wise supervision, we leverage a large-scale dataset with diverse audio events, LLM-generated captions and simulation. Experimental results and case studies demonstrate that FLAM significantly improves the open-vocabulary localization capability while maintaining strong performance in global retrieval and downstream tasks.

2025-04-30

ICML.cc/2025/Conference (poster)

doi.org

proceedings.mlr.press

AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N

Tianyu Zhang

Andrew Williams

Phillip Wozny

Kai-Hendrik Cohrs

Koen Ponse

Marco Jiralerspong

Soham Phade

Sunil Srinivasa

Lu Li

Yang Zhang

Prateek Gupta

Erman Acar

Irina Rish

Yoshua Bengio

Stephan Zheng

Comprehensive global cooperation is essential to limit global temperature increases while continuing economic development, e.g., reducing severe inequality or achieving long-term economic growth. Achieving long-term cooperation on climate change mitigation with n strategic agents poses a complex game-theoretic problem. For example, agents may negotiate and reach climate agreements, but there is no central authority to enforce adherence to those agreements. Hence, it is critical to design negotiation and agreement frameworks that foster cooperation, allow all agents to meet their individual policy objectives, and incentivize long-term adherence. This is an interdisciplinary challenge that calls for collaboration between researchers in machine learning, economics, climate science, law, policy, ethics, and other fields. In particular, we argue that machine learning is a critical tool to address the complexity of this domain. To facilitate this research, here we introduce RICE-N, a multi-region integrated assessment model that simulates the global climate and economy, and which can be used to design and evaluate the strategic outcomes for different negotiation and agreement frameworks. We also describe how to use multi-agent reinforcement learning to train rational agents using RICE-N. This framework underpins AI for Global Climate Cooperation, a working group collaboration and competition on climate negotiation and agreement design. Here, we invite the scientific community to design and evaluate their solutions using RICE-N, machine learning, economic intuition, and other domain knowledge. More information can be found on www.ai4climatecoop.org.

2025-04-30

International Conference on Machine Learning (poster)

doi.org

proceedings.mlr.press

Generative AI: Hype, Hope, and Responsible Use in Science and Everyday Life

Doina Precup

2025-04-30

Biological Psychiatry (published)

doi.org

Half Search Space is All You Need

Pavel Rumiantsev

Mark J. Coates

2025-04-30

arXiv (published)

doi.org

arxiv.org

HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts

Neil He

Rishabh Anand

Hiren Madhu

Ali Maatouk

Smita Krishnaswamy

Leandros Tassiulas

Menglin Yang

Rex Ying

2025-04-30

arXiv (published)

doi.org

arxiv.org

Impact of through‐slice gradient optimization for dynamic slice‐wise shimming in the cervico‐thoracic spinal cord

Arnaud Breheret

Alexandre D'Astous

Yixin Ma

Jason P. Stockmann

Julien Cohen‐Adad

This study investigates the effectiveness of through‐slice gradient optimization in dynamic slice‐wise B0 shimming of the cervico‐thoracic spinal cord to enhance signal recovery in gradient‐echo (GRE) EPI sequences commonly used in functional MRI studies. Six volunteers underwent MRI acquisitions with dynamic shim updating (DSU) using a custom‐built 15‐channel AC/DC coil at 3 T. A magnetization‐prepared rapid gradient echo was acquired to segment the spine and to provide a clear image of the anatomical region of interest in the figures. GRE B0 field maps were used to measure field homogeneity before and after shimming; the pre‐shimming field map was used for optimization. Shimmed fields were dynamically applied to GRE–echo planar imaging acquisitions simulating functional MRI acquisitions under two shimming conditions: DSU with and without through‐slice gradient consideration. DSU with through‐slice gradient optimization increased the temporal signal‐to‐noise ratio at the T2 vertebral level by 201% compared with volume‐wise shim and by 28% compared with DSU without through‐slice. The residual geometric distortions were similar between DSU with and without through‐slice gradient optimization. A high signal loss penalty parameter was effective in simulations for reducing through‐slice gradient‐induced signal loss but led to instability and reduced image quality in actual acquisitions due to excessive in‐plane B0 inhomogeneities. Introducing a carefully balanced through‐slice gradient parameter in slice‐wise shimming substantially improves signal recovery in axial GRE images of the spinal cord, without compromising in‐plane homogeneity. This effective approach can advance spinal cord functional MRI applications at high field strengths.

2025-04-30

Magnetic Resonance in Medicine (published)

doi.org
