Publications

Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

Lazar Atanackovic

Xi Zhang

Brandon Amos

Mathieu Blanchette

Leo J Lee

Yoshua Bengio

Alexander Tong

Kirill Neklyudov

Numerous biological and physical processes can be modeled as systems of interacting entities evolving continuously over time, e.g., the dynamics of communicating cells or physical particles. Learning the dynamics of such systems is essential for predicting the temporal evolution of populations across novel samples and unseen environments. Flow-based models allow for learning these dynamics at the population level: they model the evolution of the entire distribution of samples. However, current flow-based models are limited to a single initial population and a set of predefined conditions which describe different dynamics. We argue that multiple processes in natural sciences have to be represented as vector fields on the Wasserstein manifold of probability densities. That is, the change of the population at any moment in time depends on the population itself due to the interactions between samples. In particular, this is crucial for personalized medicine where the development of diseases and their respective treatment response depends on the microenvironment of cells specific to each patient. We propose Meta Flow Matching (MFM), a practical approach to integrating along these vector fields on the Wasserstein manifold by amortizing the flow model over the initial populations. Namely, we embed the population of samples using a Graph Neural Network (GNN) and use these embeddings to train a Flow Matching model. This gives MFM the ability to generalize over the initial distributions, unlike previously proposed methods. We demonstrate the ability of MFM to improve prediction of individual treatment responses on a large-scale multi-patient single-cell drug screen dataset.

2025-01-22

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

MMTEB: Massive Multilingual Text Embedding Benchmark

Kenneth Enevoldsen

Isaac Chung

Imene Kerboua

Márton Kardos

Ashwin Mathur

David Stap

Jay Gala

Wissam Siblini

Dominik Krzemiński

Genta Indra Winata

Saba Sturua

Saiteja Utpala

Mathieu Ciancone

Marion Schaeffer

Diganta Misra

Shreeya Dhakal

Jonathan Rystrøm

Roman Solomatin

Ömer Veysel Çağatan

Akash Kundu

Martin Bernstorff

Shitao Xiao

Akshita Sukhlecha

Bhavish Pahwa

Rafał Poświata

Kranthi Kiran GV

Shawon Ashraf

Daniel Auras

Björn Plüster

Jan Philipp Harries

Loïc Magne

Isabelle Mohr

Dawei Zhu

Hippolyte Gisserot-Boukhlef

Tom Aarsen

Jan Kostkan

Konrad Wojtasik

Taemin Lee

Marek Suppa

Crystina Zhang

Roberta Rocca

Mohammed Hamdy

Andrianos Michail

John Yang

Manuel Faysse

Aleksei Vatolin

Nandan Thakur

Manan Dey

Dipam Vasani

Pranjal A Chitale

Simone Tedeschi

Nguyen Tai

Artem Snegirev

Mariya Hendriksen

Michael Günther

Mengzhou Xia

Weijia Shi

Xing Han Lu

Jordan Clive

Gayatri K

Maksimova Anna

Silvan Wehrli

Maria Tikhonova

Henil Shalin Panchal

Aleksandr Abramov

Malte Ostendorff

Zheng Liu

Simon Clematide

Lester James Validad Miranda

Alena Fenogenova

Guangyu Song

Ruqiya Bin Safi

Wen-Ding Li

Alessia Borghini

Federico Cassano

Lasse Hansen

Sara Hooker

Chenghao Xiao

Vaibhav Adlakha

Orion Weller

Siva Reddy

Niklas Muennighoff

Text embeddings are typically evaluated on a narrow set of tasks, limited in terms of languages, domains, and task types. To circumvent this limitation and to provide a more comprehensive evaluation, we introduce the Massive Multilingual Text Embedding Benchmark (MMTEB), a large-scale, community-driven initiative expanding MTEB to over 500 quality-controlled evaluation tasks across 1,000+ languages. MMTEB includes a wide range of challenging novel tasks such as instruction following, long-document retrieval, and code retrieval, and represents the largest multilingual collection of evaluation tasks for embedding models to date. We use this collection to construct multiple highly multilingual benchmarks. We evaluate a representative set of models on these benchmarks. Our findings indicate that, while LLM-based models can achieve state-of-the-art performance on a subset of languages, the best-performing publicly available model across languages is the notably smaller multilingual-e5-large-instruct. Massive benchmarks often impose high computational demands, limiting accessibility, particularly for low-resource communities. To address this, we downsample tasks based on inter-task correlation (i.e., selecting only a diverse set of tasks) while preserving relative rankings. We further optimize tasks such as retrieval by sampling hard negatives, creating smaller but effective splits. These optimizations allow us to introduce benchmarks at a significantly lower computational cost. For instance, we introduce a new zero-shot English benchmark that maintains a similar ordering at a fraction of the cost.

2025-01-22

ICLR.cc/2025/Conference (poster)

openreview.net

MMTEB: Massive Multilingual Text Embedding Benchmark

Kenneth Enevoldsen

Isaac Chung

Imene Kerboua

Márton Kardos

Ashwin Mathur

David Stap

Jay Gala

Wissam Siblini

Dominik Krzemiński

Genta Indra Winata

Saba Sturua

Saiteja Utpala

Mathieu Ciancone

Marion Schaeffer

Gabriel Sequeira

Diganta Misra

Shreeya Dhakal

Jonathan Rystrøm

Roman Solomatin

Ömer Veysel Çağatan

Akash Kundu

Martin Bernstorff

Shitao Xiao

Akshita Sukhlecha

Bhavish Pahwa

Rafał Poświata

Kranthi Kiran GV

Shawon Ashraf

Daniel Auras

Björn Plüster

Jan Philipp Harries

Loïc Magne

Isabelle Mohr

Mariya Hendriksen

Dawei Zhu

Hippolyte Gisserot-Boukhlef

Tom Aarsen

Jan Kostkan

Konrad Wojtasik

Taemin Lee

Marek Suppa

Crystina Zhang

Roberta Rocca

Mohammed Hamdy

Andrianos Michail

John Yang

Manuel Faysse

Aleksei Vatolin

Nandan Thakur

Manan Dey

Dipam Vasani

Pranjal A Chitale

Simone Tedeschi

Nguyen Tai

Artem Snegirev

Michael Günther

Mengzhou Xia

Weijia Shi

Xing Han Lu

Jordan Clive

Gayatri K

Maksimova Anna

Silvan Wehrli

Maria Tikhonova

Henil Shalin Panchal

Aleksandr Abramov

Malte Ostendorff

Zheng Liu

Simon Clematide

Lester James Validad Miranda

Alena Fenogenova

Guangyu Song

Ruqiya Bin Safi

Wen-Ding Li

Alessia Borghini

Federico Cassano

Hongjin Su

Jimmy Lin

Howard Yen

Lasse Hansen

Sara Hooker

Chenghao Xiao

Vaibhav Adlakha

Orion Weller

Siva Reddy

Niklas Muennighoff

Text embeddings are typically evaluated on a narrow set of tasks, limited in terms of languages, domains, and task types. To circumvent this limitation and to provide a more comprehensive evaluation, we introduce the Massive Multilingual Text Embedding Benchmark (MMTEB), a large-scale, community-driven initiative expanding MTEB to over 500 quality-controlled evaluation tasks across 1,000+ languages. MMTEB includes a wide range of challenging novel tasks such as instruction following, long-document retrieval, and code retrieval, and represents the largest multilingual collection of evaluation tasks for embedding models to date. We use this collection to construct multiple highly multilingual benchmarks. We evaluate a representative set of models on these benchmarks. Our findings indicate that, while LLM-based models can achieve state-of-the-art performance on a subset of languages, the best-performing publicly available model across languages is the notably smaller multilingual-e5-large-instruct. Massive benchmarks often impose high computational demands, limiting accessibility, particularly for low-resource communities. To address this, we downsample tasks based on inter-task correlation (i.e., selecting only a diverse set of tasks) while preserving relative rankings. We further optimize tasks such as retrieval by sampling hard negatives, creating smaller but effective splits. These optimizations allow us to introduce benchmarks at a significantly lower computational cost. For instance, we introduce a new zero-shot English benchmark that maintains a similar ordering at a fraction of the cost.

2025-01-22

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

MODL: Multilearner Online Deep Learning

Antonios Valkanas

Boris Oreshkin

Mark Coates

Online deep learning solves the problem of learning from streams of data, reconciling two opposing objectives: learn fast and learn deep. Existing work focuses almost exclusively on exploring pure deep learning solutions, which are much better suited to handle the "deep" than the "fast" part of the online learning equation. In our work, we propose a different paradigm, based on a hybrid multilearner approach. First, we develop a fast online logistic regression learner. This learner does not rely on backpropagation. Instead, it uses closed form recursive updates of model parameters, handling the fast learning part of the online learning problem. We then analyze the existing online deep learning theory and show that the widespread ODL approach, currently operating at complexity

2025-01-22

aistats.org/AISTATS/2025/Conference (poster)

doi.org

openreview.net

Multi-agent cooperation through learning-aware policy gradients

Alexander Meulemans

Seijin Kobayashi

Johannes Von Oswald

Nino Scherrer

Eric Elmoznino

Blake Richards

Guillaume Lajoie

Blaise Aguera y Arcas

João Sacramento

Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning. How can we achieve cooperation among self-interested, independent learning agents? Promising recent work has shown that in certain tasks cooperation can be established between learning-aware agents who model the learning dynamics of each other. Here, we present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning, which takes into account that other agents are themselves learning through trial and error based on multiple noisy trials. We then leverage efficient sequence models to condition behavior on long observation histories that contain traces of the learning dynamics of other agents. Training long-context policies with our algorithm leads to cooperative behavior and high returns on standard social dilemmas, including a challenging environment where temporally-extended action coordination is required. Finally, we derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.

2025-01-22

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

Multi-session, multi-task neural decoding from distinct cell-types and brain regions

Mehdi Azabou

Krystal Xuejing Pan

Vinam Arora

Ian Jarratt Knight

Eva L Dyer

Blake Richards

Recent work has shown that scale is important for improved brain decoding, with more data leading to greater decoding accuracy. However, large-scale decoding across many different datasets is challenging because neural circuits are heterogeneous: each brain region contains a unique mix of cellular sub-types, and the responses to different stimuli are diverse across regions and sub-types. It is unknown whether it is possible to pre-train and transfer brain decoding models between distinct tasks, cellular sub-types, and brain regions. To address these questions, we developed a multi-task transformer architecture and trained it on the entirety of the Allen Institute's Brain Observatory dataset. This dataset contains responses from over 100,000 neurons in 6 areas of the brains of mice, observed with two-photon calcium imaging, recorded while the mice observed different types of visual stimuli. Our results demonstrate that transfer is indeed possible: combining data from different sources is beneficial for a number of downstream decoding tasks. In addition, we can transfer the model between regions and sub-types, demonstrating that there is in fact common information in diverse circuits that can be extracted by an appropriately designed model. Interestingly, we found that the model's latent representations showed clear distinctions between different brain regions and cellular sub-types, even though it was never given any information about these distinctions. Altogether, our work demonstrates that training a large-scale neural decoding model on diverse data is possible, and this provides a means of studying the differences and similarities between heterogeneous neural circuits.

2025-01-22

ICLR.cc/2025/Conference (spotlight)

openreview.net

Neuroplastic Expansion in Deep Reinforcement Learning

Jiashun Liu

Johan Samir Obando Ceron

Aaron Courville

Ling Pan

2025-01-22

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching

Arnav Kumar Jain

Harley Wiltzer

Jesse Farebrother

Irina Rish

Glen Berseth

Sanjiban Choudhury

2025-01-22

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

Optimizing Return Distributions with Distributional Dynamic Programming

Bernardo Avila Pires

Mark Rowland

Diana Borsa

Zhaohan Daniel Guo

Khimya Khetarpal

Andre Barreto

David Abel

Remi Munos

Will Dabney

We introduce distributional dynamic programming (DP) methods for optimizing statistical functionals of the return distribution, with standard reinforcement learning as a special case. Previous distributional DP methods could optimize the same class of expected utilities as classic DP. To go beyond expected utilities, we combine distributional DP with stock augmentation, a technique previously introduced for classic DP in the context of risk-sensitive RL, where the MDP state is augmented with a statistic of the rewards obtained so far (since the first time step). We find that a number of recently studied problems can be formulated as stock-augmented return distribution optimization, and we show that we can use distributional DP to solve them. We analyze distributional value and policy iteration, with bounds and a study of what objectives these distributional DP methods can or cannot optimize. We describe a number of applications outlining how to use distributional DP to solve different stock-augmented return distribution optimization problems, for example maximizing conditional value-at-risk, and homeostatic regulation. To highlight the practical potential of stock-augmented return distribution optimization and distributional DP, we combine the core ideas of distributional value iteration with the deep RL agent DQN, and empirically evaluate it for solving instances of the applications discussed.

2025-01-22

ArXiv (preprint)

doi.org

arxiv.org

OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning

Xiaoqiang Wang

Bang Liu

Large language models (LLMs) and large multimodal models (LMMs) have shown great potential in automating complex tasks like web browsing and gaming. However, their ability to generalize across diverse applications remains limited, hindering broader utility. To address this challenge, we present OSCAR: Operating System Control via state-Aware reasoning and Re-planning. OSCAR is a generalist agent designed to autonomously navigate and interact with various desktop and mobile applications through standardized controls, such as mouse and keyboard inputs, while processing screen images to fulfill user commands. OSCAR translates human instructions into executable Python code, enabling precise control over graphical user interfaces (GUIs). To enhance stability and adaptability, OSCAR operates as a state machine, equipped with error-handling mechanisms and dynamic task re-planning, allowing it to efficiently adjust to real-time feedback and exceptions. We demonstrate OSCAR’s effectiveness through extensive experiments on diverse benchmarks across desktop and mobile platforms, where it transforms complex workflows into simple natural language commands, significantly boosting user productivity. Our code will be open-source upon publication.

2025-01-22

ICLR.cc/2025/Conference (poster)

openreview.net

ParetoFlow: Guided Flows in Multi-Objective Optimization

Ye Yuan

Can Chen

Chris Pal

Xue (Steve) Liu

In offline multi-objective optimization (MOO), we leverage an offline dataset of designs and their associated labels to simultaneously minimize multiple objectives. This setting more closely mirrors complex real-world problems compared to single-objective optimization. Recent works mainly employ evolutionary algorithms and Bayesian optimization, with limited attention given to the generative modeling capabilities inherent in such data. In this study, we explore generative modeling in offline MOO through flow matching, noted for its effectiveness and efficiency. We introduce ParetoFlow, specifically designed to guide flow sampling to approximate the Pareto front. Traditional predictor (classifier) guidance is inadequate for this purpose because it models only a single objective. In response, we propose a multi-objective predictor guidance module that assigns each sample a weight vector, representing a weighted distribution across multiple objective predictions. A local filtering scheme is introduced to address non-convex Pareto fronts. These weights uniformly cover the entire objective space, effectively directing sample generation towards the Pareto front. Since distributions with similar weights tend to generate similar samples, we introduce a neighboring evolution module to foster knowledge sharing among neighboring distributions. This module generates offspring from these distributions, and selects the most promising one for the next iteration. Our method achieves state-of-the-art performance across various tasks.

2025-01-22

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net
