Publications

AFlow: Automating Agentic Workflow Generation

Jiayi Zhang

Jinyu Xiang

Zhaoyang Yu

Fengwei Teng

Xiong-Hui Chen

Jiaqi Chen

Mingchen Zhuge

Xin Cheng

Sirui Hong

Jinlin Wang

Bingnan Zheng

Bang Liu

Yuyu Luo

Chenglin Wu

Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains, typically by employing … (voir plus)agentic workflows that follow detailed instructions and operational sequences. However, constructing these workflows requires significant human effort, limiting scalability and generalizability. Recent research has sought to automate the generation and optimization of these workflows, but existing methods still rely on initial manual setup and fall short of achieving fully automated and effective workflow generation. To address this challenge, we reformulate workflow optimization as a search problem over code-represented workflows, where LLM-invoking nodes are connected by edges. We introduce AFLOW, an automated framework that efficiently explores this space using Monte Carlo Tree Search, iteratively refining workflows through code modification, tree-structured experience, and execution feedback. Empirical evaluations across six benchmark datasets demonstrate AFLOW's efficacy, yielding a 5.7% average improvement over state-of-the-art baselines. Furthermore, AFLOW enables smaller models to outperform GPT-4o on specific tasks at 4.55% of its inference cost in dollars. The code is available at https://github.com/geekan/MetaGPT.

2025-01-21

ICLR.cc/2025/Conference (présentation orale)

openreview.net

Ant Colony Sampling with GFlowNets for Combinatorial Optimization

Jiwoo Son

Jinkyoo Park

We present the Generative Flow Ant Colony Sampler (GFACS), a novel meta-heuristic method that hierarchically combines amortized inference an… (voir plus)d parallel stochastic search. Our method first leverages Generative Flow Networks (GFlowNets) to amortize a \emph{multi-modal} prior distribution over combinatorial solution space that encompasses both high-reward and diversified solutions. This prior is iteratively updated via parallel stochastic search in the spirit of Ant Colony Optimization (ACO), leading to the posterior distribution that generates near-optimal solutions. Extensive experiments across seven combinatorial optimization problems demonstrate GFACS's promising performances.

2025-01-21

aistats.org/AISTATS/2025/Conference (poster)

doi.org

proceedings.mlr.press

AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly

Hongyu Guo

Yoshua Bengio

Shengchao Liu

2025-01-21

ICLR.cc/2025/Conference (poster)

openreview.net

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models

Shengyi Huang

The dominant paradigm for RLHF is online and on-policy RL: synchronously generating from the large language model (LLM) policy, labelling wi… (voir plus)th a reward model, and learning using feedback on the LLM's own outputs. While performant, this paradigm is computationally inefficient. Inspired by classical deep RL literature, we propose separating generation and learning in RLHF. This enables asynchronous generation of new samples while simultaneously training on old samples, leading to faster training and more compute-optimal scaling. However, asynchronous training relies on an underexplored regime, online but off-policy RLHF: learning on samples from previous iterations of our model which give a worse training signal. We tackle the fundamental challenge in this regime: how much off-policyness can we tolerate for asynchronous training to speed up learning but maintain performance? Among several RLHF algorithms we test, online DPO is found to be most robust to off-policy data, and robustness increases with the scale of the policy model. We study further compute optimizations for asynchronous RLHF but find that they come at a performance cost, giving rise to a trade-off. We verify the scalability of asynchronous RLHF by training a general-purpose chatbot from LLaMA 3.1 8B on an instruction-following task ~40% faster than a synchronous run while matching final performance. Finally, we extend our results to math and reasoning to demonstrate asynchronous RL can finetune Rho 1B on GSM8k ~70% faster while matching synchronous accuracy.

2025-01-21

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

Beyond FVD: An Enhanced Metric for Evaluating Video Generation Distribution Quality

Ge Ya Luo

Gian Mario Favero

Zhi Hao Luo

Alexia Jolicoeur-Martineau

Christopher Pal

The Fréchet Video Distance (FVD) is a widely adopted metric for evaluating video generation distribution quality. However, its effectivenes… (voir plus)s relies on critical assumptions. Our analysis reveals three significant limitations: (1) the non-Gaussianity of the Inflated 3D Convnet (I3D) feature space; (2) the insensitivity of I3D features to temporal distortions; (3) the impractical sample sizes required for reliable estimation. These findings undermine FVD's reliability and show that FVD falls short as a standalone metric for video generation evaluation. After extensive analysis of a wide range of metrics and backbone architectures, we propose JEDi, the JEPA Embedding Distance, based on features derived from a Joint Embedding Predictive Architecture, measured using Maximum Mean Discrepancy with polynomial kernel. Our experiments on multiple open-source datasets show clear evidence that it is a superior alternative to the widely used FVD metric, requiring only 16% of the samples to reach its steady value, while increasing alignment with human evaluation by 34%, on average.Project page: https://oooolga.github.io/JEDi.github.io/.

2025-01-21

ICLR.cc/2025/Conference (poster)

openreview.net

BigDocs: An Open Dataset for Training Multi-modal Models on Document and Code Tasks

Juan Rodriguez

Xiangru Jian

Siba Smarak Panigrahi

Akshay Kalkunte

Amirhossein Abaskohi

Pierre-Andre Noel

Sanket Biswas … (voir 23 de plus)

Sara Shanian

Ying Zhang

Noah Bolger

Kurt MacDonald

Simon Fauvel

Sathwik Tejaswi

Srinivas Sunkara

Joao Monteiro

Krishnamurthy Dj Dvijotham

Torsten Scholak

Nicolas Chapados

Sepideh Kharagani

Sean Hughes

M. Özsu

Christopher Pal

Sai Rajeswar

Multimodal AI has the potential to significantly enhance document-understanding tasks, such as processing receipts, understanding workflows,… (voir plus) extracting data from documents, and summarizing reports. Code generation tasks that require long-structured outputs can also be enhanced by multimodality. Despite this, their use in commercial applications is often limited due to limited access to training data and restrictive licensing, which hinders open access. To address these limitations, we introduce BigDocs-7.5M, a high-quality, open-access dataset comprising 7.5 million multimodal documents across 30 tasks. We use an efficient data curation process to ensure our data is high-quality and license-permissive. Our process emphasizes accountability, responsibility, and transparency through filtering rules, traceable metadata, and careful content analysis. Additionally, we introduce BigDocs-Bench, a benchmark suite with 10 novel tasks where we create datasets that reflect real-world use cases involving reasoning over Graphical User Interfaces (GUI) and code generation from images. Our experiments show that training with BigDocs-Bench improves average performance up to 25.8% over closed-source GPT-4o in document reasoning and structured output tasks such as Screenshot2HTML or Image2Latex generation. Finally, human evaluations showed a preference for outputs from models trained on BigDocs over GPT-4o. This suggests that BigDocs can help both academics and the open-source community utilize and improve AI tools to enhance multimodal capabilities and document reasoning. The project is hosted at https://bigdocs.github.io .

2025-01-21

International Conference on Learning Representations (poster)

doi.org

openreview.net

Boosting Latent Diffusion with Perceptual Objectives

Tariq Berrada Ifriqi

Pietro Astolfi

Jakob Verbeek

Melissa Hall

Marton Havasi

Michal Drozdzal

Yohann Benchetrit

Adriana Romero-Soriano

Karteek Alahari

Latent diffusion models (LDMs) power state-of-the-art high-resolution generative image models. LDMs learn the data distribution in the laten… (voir plus)t space of an autoencoder (AE) and produce images by mapping the generated latents into RGB image space using the AE decoder. While this approach allows for efficient model training and sampling, it induces a disconnect between the training of the diffusion model and the decoder, resulting in a loss of detail in the generated images. To remediate this disconnect, we propose to leverage the internal features of the decoder to define a latent perceptual loss (LPL). This loss encourages the models to create sharper and more realistic images. Our loss can be seamlessly integrated with common autoencoders used in latent diffusion models, and can be applied to different generative modeling paradigms such as DDPM with epsilon and velocity prediction, as well as flow matching. Extensive experiments with models trained on three datasets at 256 and 512 resolution show improved quantitative -- with boosts between 6% and 20% in FID -- and qualitative results when using our perceptual loss.

2025-01-21

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling

Matthew Fortier

Mats L. Richter

Oliver Sonnentag

Chris Pal

Terrestrial carbon fluxes provide vital information about our biosphere's health and its capacity to absorb anthropogenic CO…

2025-01-21

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

Celo: Training Versatile Learned Optimizers on a Compute Diet

Learned optimization has emerged as a promising alternative to hand-crafted optimizers, with the potential to discover stronger learned upda… (voir plus)te rules that enable faster, hyperparameter-free training of neural networks. A critical element for practically useful learned optimizers, that can be used off-the-shelf after meta-training, is strong meta-generalization: the ability to apply the optimizers to new tasks. Recent state-of-the-art work in learned optimizers, VeLO (Metz et al., 2022), requires a large number of highly diverse meta-training tasks along with massive computational resources, 4000 TPU months, to achieve meta-generalization. This makes further improvements to such learned optimizers impractical. In this work, we identify several key elements in learned optimizer architectures and meta-training procedures that can lead to strong meta-generalization. We also propose evaluation metrics to reliably assess quantitative performance of an optimizer at scale on a set of evaluation tasks. Our proposed approach, Celo, makes a significant leap in improving the meta-generalization performance of learned optimizers and also outperforms tuned state-of-the-art optimizers on a diverse set of out-of-distribution tasks, despite being meta-trained for just 24 GPU hours.

2025-01-21

ArXiv (prépublication)

doi.org

openreview.net

Credit-Based Self Organizing Maps: Training Deep Topographic Networks with Minimal Performance Degradation

Amir Ozhan Dehghani

Xinyu Qian

Asa Farahani

Pouya Bashivan

2025-01-21

ICLR.cc/2025/Conference (poster)

openreview.net

Don't Flatten, Tokenize! Unlocking the Key to SoftMoE's Efficacy in Deep RL

Ghada Sokar

Johan Obando-Ceron

Aaron Courville

Hugo Larochelle

Pablo Samuel Castro

The use of deep neural networks in reinforcement learning (RL) often suffers from performance degradation as model size increases. While sof… (voir plus)t mixtures of experts (SoftMoEs) have recently shown promise in mitigating this issue for online RL, the reasons behind their effectiveness remain largely unknown. In this work we provide an in-depth analysis identifying the key factors driving this performance gain. We discover the surprising result that tokenizing the encoder output, rather than the use of multiple experts, is what is behind the efficacy of SoftMoEs. Indeed, we demonstrate that even with an appropriately scaled single expert, we are able to maintain the performance gains, largely thanks to tokenization.

2025-01-21

ICLR.cc/2025/Conference (spotlight)

doi.org

openreview.net

Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces

DiJia Su

Sainbayar Sukhbaatar

Michael G. Rabbat

Yuandong Tian

Qinqing Zheng

2025-01-21

ICLR.cc/2025/Conference (poster)

doi.org

openreview.net

TRAIL : IA responsable pour les professionnels et les leaders

Fondateur en résidence Mila Ventures

Avantage IA : productivité dans la fonction publique

Publications

TRAIL : IA responsable pour les professionnels et les leaders

Fondateur en résidence Mila Ventures

Avantage IA : productivité dans la fonction publique

Mots-clés populaires:

Publications