Publications

Adaptive Local Training in Federated Learning

Donald Shenaj

Pietro Zanuttigh

Federated learning is a machine learning paradigm where multiple clients collaboratively train a global model by exchanging their locally tr… (voir plus)ained model weights instead of raw data. In the standard setting, every client trains the local model for the same number of epochs. We introduce ALT (Adaptive Local Training), a simple yet effective feedback mechanism that can be exploited at the client side to limit unnecessary and degrading computations. ALT dynamically adjusts the number of training epochs for each client based on the similarity between their local representations and the global one, ensuring that well-aligned clients can train longer without experiencing client drift. We evaluated ALT on federated partitions of the CIFAR-10 and Tiny-ImageNet datasets, demonstrating its effectiveness in improving model convergence and stability.

2025-03-05

ICLR.cc/2025/Workshop/MCDC (accepté)

openreview.net

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Ahmed Masry

Juan A. Rodriguez

Tianyu Zhang

Suyuchen Wang

Chao Wang

Aarash Feizi

Akshay Kalkunte Suresh

Abhay Puri

Xiangru Jian

Pierre-Andre Noel

Sathwik Tejaswi Madhusudhan

Enamul Hoque

Issam Hadj Laradji

David Vazquez

Perouz Taslakian … (voir 2 de plus)

Spandana Gella

Sai Rajeswar

Aligning visual features with language embeddings is a key challenge in vision-language models (VLMs). The performance of such models hinges… (voir plus) on having a good connector that maps visual features generated by a vision encoder to a shared embedding space with the LLM while preserving semantic similarity. Existing connectors, such as multilayer perceptrons (MLPs), often produce out-of-distribution or noisy inputs, leading to misalignment between the modalities. In this work, we propose a novel vision-text alignment method, AlignVLM, that maps visual features to a weighted average of LLM text embeddings. Our approach leverages the linguistic priors encoded by the LLM to ensure that visual features are mapped to regions of the space that the LLM can effectively interpret. AlignVLM is particularly effective for document understanding tasks, where scanned document images must be accurately mapped to their textual content. Our extensive experiments show that AlignVLM achieves state-of-the-art performance compared to prior alignment methods. We provide further analysis demonstrating improved vision-text feature alignment and robustness to noise.

2025-03-05

ICLR.cc/2025/Workshop/Re-Align (poster)

doi.org

openreview.net

Cracking the Code of Action: A Generative Approach to Affordances for Reinforcement Learning

David Venuto

Agents that can autonomously navigate the web through a graphical user interface (GUI) using a unified action space (e.g., mouse and keyboar… (voir plus)d actions) can require very large amounts of domain-specific expert demonstrations to achieve good performance. Low sample efficiency is often exacerbated in sparse-reward and large-action-space environments, such as a web GUI, where only a few actions are relevant in any given situation. In this work, we consider the low-data regime, with limited or no access to expert behavior. To enable sample-efficient learning, we explore the effect of constraining the action space through intent-based affordances -- i.e., considering in any situation only the subset of actions that achieve a desired outcome. We propose **Code as Generative Affordances**

2025-03-05

ICLR.cc/2025/Workshop/DL4C (publié)

openreview.net

Cracking the Code of Action: a Generative Approach to Affordances for Reinforcement Learning

David Venuto

Agents that can autonomously navigate the web through a graphical user interface (GUI) using a unified action space (e.g., mouse and keyboar… (voir plus)d actions) can require very large amounts of domain-specific expert demonstrations to achieve good performance. Low sample efficiency is often exacerbated in sparse-reward and large-action-space environments, such as a web GUI, where only a few actions are relevant in any given situation. In this work, we consider the low-data regime, with limited or no access to expert behavior. To enable sample-efficient learning, we explore the effect of constraining the action space through *intent-based affordances* -- i.e., considering in any situation only the subset of actions that achieve a desired outcome. We propose **Code as Generative Affordances (

2025-03-05

ICLR.cc/2025/Workshop/DL4C (publié)

doi.org

openreview.net

Design of Ligand-Binding Proteins with Atomic Flow Matching

Junqi Liu

Shaoning Li

Chence Shi

Zhi Yang

Jian Tang

2025-03-05

ICLR.cc/2025/Workshop/GEM (publié)

openreview.net

Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts

Minseon Kim

Lucas Caccia

Merging parameter-efficient task experts has recently gained growing attention as a way to build modular architectures that can be rapidly a… (voir plus)dapted on the fly for specific downstream tasks, without requiring additional fine-tuning. Typically, LoRA (Low-Rank Adaptation) serves as the foundational building block of such parameter-efficient modular architectures, leveraging low-rank weight structures to reduce the number of trainable parameters. In this paper, we study the properties of sparse adapters, which train only a subset of weights in the base neural network, as potential building blocks of modular architectures. First, we propose a simple method for training highly effective sparse adapters, which is conceptually simpler than existing methods in the literature and surprisingly outperforms both LoRA and full fine-tuning in our setting. Next, we investigate the merging properties of these sparse adapters by merging adapters for up to 20 natural language processing tasks, thus scaling beyond what is usually studied in the literature. Our findings demonstrate that sparse adapters yield superior in-distribution performance post-merging compared to LoRA or full model merging. Achieving strong held-out performance remains a challenge for all methods considered.

2025-03-05

ICLR.cc/2025/Workshop/MCDC (accepté)

openreview.net

From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions

Ruben Weijers

Denton Wu

Hannah Betts

Tamara Jacod

Yuxiang Guan

Vidya Sujaya

Kushal Dev

Toshali Goel

William Delooze

Reihaneh Rabbany

Ying Wu

Jean-François Godbout

Kellin Pelrine

Generative AI has the potential to transform personalization and accessibility of education. However, it raises serious concerns about accur… (voir plus)acy and helping students become independent critical thinkers. In this study, we designed a helpful yet fallible AI "Peer" to help students correct fundamental physics misconceptions related to Newtonian mechanic concepts. In contrast to approaches that seek near-perfect accuracy to create an authoritative AI tutor or teacher, we directly inform students that this AI can answer up to 40\% of questions incorrectly. In a randomized controlled trial with 165 students, those who engaged in targeted dialogue with the AI Peer achieved post-test scores that were, on average, 10.5 percentage points higher—with over 20 percentage points higher normalized gain—than a control group that discussed physics history. Qualitative feedback indicated that 91% of the treatment group's AI interactions were rated as helpful. Furthermore, by comparing student performance on pre- and post-test questions about the same concept, along with experts' annotations of the AI interactions, we find initial evidence suggesting the improvement in performance does not depend on the correctness of the AI. With further research, the AI Peer paradigm described here could open new possibilities for how we learn, adapt to, and grow with AI.

2025-03-05

ICLR.cc/2025/Workshop/Bi-Align (poster)

doi.org

openreview.net

A Generative Approach to LLM Harmfulness Detection with Red Flag Tokens

Mehrnaz Mofakhami

Most safety training methods for large-language models (LLMs) based on fine-tuning rely on dramatically changing the output distribution of … (voir plus)the model when faced with a harmful request, shifting it from an unsafe answer to a refusal to respond. These methods inherently compromise model capabilities and might make auto-regressive models vulnerable to attacks that make likely an initial token of affirmative response. To avoid that, we propose to expand the model's vocabulary with a special token we call a *red flag token* (

2025-03-05

ICLR.cc/2025/Workshop/BuildingTrust (accepté)

openreview.net

Grokking Beyond the Euclidean Norm of Model Parameters

Tikeng Notsawo Pascal Junior

Guillaume Dumas

Guillaume Rabusseau

Grokking refers to a delayed generalization following overfitting when optimizing artificial neural networks with gradient-based methods. I… (voir plus)n this work, we demonstrate that grokking can be induced by regularization, either explicit or implicit. More precisely, we show that when there exists a model with a property

2025-03-05

ICLR.cc/2025/Workshop/SLLM (publié)

openreview.net

Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

Zhanke Zhou

Xuan Li

Zhaocheng Zhu

Mikhail Galkin

Xiao Feng

Sanmi Koyejo

Jian Tang

Bo Han

Numerous applications of large language models (LLMs) rely on their ability to perform step-by-step reasoning. However, the reasoning behavi… (voir plus)or of LLMs remains poorly understood, posing challenges to research, development, and safety. To address this gap, we introduce landscape of thoughts-the first visualization tool for users to inspect the reasoning paths of chain-of-thought and its derivatives on any multi-choice dataset. Specifically, we represent the states in a reasoning path as feature vectors that quantify their distances to all answer choices. These features are then visualized in two-dimensional plots using t-SNE. Qualitative analysis shows that the landscape of thoughts effectively distinguishes between strong and weak models, correct and incorrect answers, as well as different reasoning tasks. It also uncovers undesirable reasoning patterns, such as low consistency and high uncertainty. Additionally, users can adapt our tool to a neural model that predicts any property they observe. We showcase this advantage by adapting our tool to a lightweight verifier, which significantly improves reasoning by evaluating the correctness of reasoning paths.

2025-03-05

ICLR.cc/2025/Workshop/LLM_Reason_and_Plan (publié)

openreview.net

Learning Decision Trees as Amortized Structure Inference

Mohammed Mahfoud

Ghait Boukachab

Michał Koziarski

Alex Hernandez-Garcia

Stefan Bauer

Yoshua Bengio

Nikolay Malkin

2025-03-05

ICLR.cc/2025/Workshop/FPI (poster)

doi.org

openreview.net

Learning to Defer for Causal Discovery with Imperfect Experts

Oscar Clivio

Divyat Mahajan

Perouz Taslakian

Sara Magliacane

Ioannis Mitliagkas

Valentina Zantedeschi

Alexandre Drouin

Integrating expert knowledge, e.g. from large language models, into causal discovery algorithms can be challenging when the knowledge is not… (voir plus) guaranteed to be correct. Expert recommendations may contradict data-driven results, and their reliability can vary significantly depending on the domain or specific query. Existing methods based on soft constraints or inconsistencies in predicted causal relationships fail to account for these variations in expertise. To remedy this, we propose L2D-CD, a method for gauging the correctness of expert recommendations and optimally combining them with data-driven causal discovery results. By adapting learning-to-defer (L2D) algorithms for pairwise causal discovery (CD), we learn a deferral function that selects whether to rely on classical causal discovery methods using numerical data or expert recommendations based on textual meta-data. We evaluate L2D-CD on the canonical Tübingen pairs dataset and demonstrate its superior performance compared to both the causal discovery method and the expert used in isolation. Moreover, our approach identifies domains where the expert's performance is strong or weak. Finally, we outline a strategy for generalizing this approach to causal discovery on graphs with more than two variables, paving the way for further research in this area.

2025-03-05

ICLR.cc/2025/Workshop/LLM_Reason_and_Plan (publié)

doi.org

openreview.net

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Hugo Larochelle nommé directeur scientifique de Mila

Publications

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Hugo Larochelle nommé directeur scientifique de Mila

Mots-clés populaires:

Publications