Publications

Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts

Samin Yeasar Arnob

Zhan Su

Minseon Kim

Oleksiy Ostapenko

Lucas Caccia

Merging parameter-efficient task experts has recently gained growing attention as a way to build modular architectures that can be rapidly adapted on the fly for specific downstream tasks, without requiring additional fine-tuning. Typically, LoRA (Low-Rank Adaptation) serves as the foundational building block of such parameter-efficient modular architectures, leveraging low-rank weight structures to reduce the number of trainable parameters. In this paper, we study the properties of sparse adapters, which train only a subset of weights in the base neural network, as potential building blocks of modular architectures. First, we propose a simple method for training highly effective sparse adapters, which is conceptually simpler than existing methods in the literature and surprisingly outperforms both LoRA and full fine-tuning in our setting. Next, we investigate the merging properties of these sparse adapters by merging adapters for up to 20 natural language processing tasks, thus scaling beyond what is usually studied in the literature. Our findings demonstrate that sparse adapters yield superior in-distribution performance post-merging compared to LoRA or full model merging. Achieving strong held-out performance remains a challenge for all methods considered.
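To make the idea of merging sparse adapters concrete, here is a minimal sketch. It assumes each adapter is stored as a mapping from a trained weight position to its delta, and it merges by averaging each delta over the adapters that trained that position. The abstract does not state the paper's merging rule, so this averaging scheme, the `(name, position)` addressing, and the function names are all illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical addressing scheme: (parameter name, flat position in that parameter).
Index = Tuple[str, int]
# A sparse adapter stores a weight delta only for the positions it trained.
SparseAdapter = Dict[Index, float]


def merge_sparse_adapters(adapters: List[SparseAdapter]) -> SparseAdapter:
    """Average each delta over the adapters that trained that position (assumed rule)."""
    sums: Dict[Index, float] = {}
    counts: Dict[Index, int] = {}
    for adapter in adapters:
        for idx, delta in adapter.items():
            sums[idx] = sums.get(idx, 0.0) + delta
            counts[idx] = counts.get(idx, 0) + 1
    return {idx: sums[idx] / counts[idx] for idx in sums}


def apply_adapter(base: Dict[str, List[float]], adapter: SparseAdapter) -> Dict[str, List[float]]:
    """Return a copy of the base weights with the sparse deltas added in."""
    merged = {name: list(values) for name, values in base.items()}
    for (name, pos), delta in adapter.items():
        merged[name][pos] += delta
    return merged
```

Positions that no adapter trained keep their base values, which is what makes the adapters sparse.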

2025-03-05

ICLR.cc/2025/Workshop/MCDC (accepted)

openreview.net

From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions

Ruben Weijers

Denton Wu

Hannah Betts

Tamara Jacod

Yuxiang Guan

Vidya Sujaya

Kushal Dev

Toshali Goel

William Delooze

Reihaneh Rabbany

Ying Wu

Jean-François Godbout

Kellin Pelrine

Generative AI has the potential to transform personalization and accessibility of education. However, it raises serious concerns about accuracy and helping students become independent critical thinkers. In this study, we designed a helpful yet fallible AI "Peer" to help students correct fundamental physics misconceptions related to Newtonian mechanics concepts. In contrast to approaches that seek near-perfect accuracy to create an authoritative AI tutor or teacher, we directly inform students that this AI can answer up to 40% of questions incorrectly. In a randomized controlled trial with 165 students, those who engaged in targeted dialogue with the AI Peer achieved post-test scores that were, on average, 10.5 percentage points higher (with over 20 percentage points higher normalized gain) than a control group that discussed physics history. Qualitative feedback indicated that 91% of the treatment group's AI interactions were rated as helpful. Furthermore, by comparing student performance on pre- and post-test questions about the same concept, along with experts' annotations of the AI interactions, we find initial evidence suggesting the improvement in performance does not depend on the correctness of the AI. With further research, the AI Peer paradigm described here could open new possibilities for how we learn, adapt to, and grow with AI.
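The sketch below shows one common way to compute normalized gain. It assumes the standard definition from physics education research: the achieved improvement divided by the maximum possible improvement. The abstract does not spell out the formula used in the study, so the definition, the function name, and the 0-100 score scale are assumptions.

```python
def normalized_gain(pre: float, post: float, max_score: float = 100.0) -> float:
    """Fraction of the available improvement that was achieved (assumed definition)."""
    if pre >= max_score:
        # No room to improve, so the ratio is undefined.
        raise ValueError("pre-test score must be below the maximum score")
    return (post - pre) / (max_score - pre)
```

For example, a student who moves from 40 to 70 on a 100-point test has a normalized gain of 0.5, having closed half of the gap to a perfect score.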

2025-03-05

ICLR.cc/2025/Workshop/Bi-Align (poster)

doi.org

openreview.net

A Generative Approach to LLM Harmfulness Detection with Red Flag Tokens

Sophie Xhonneux

David Dobre

Mehrnaz Mofakhami

Leo Schwinn

Gauthier Gidel

Most safety training methods for large language models (LLMs) based on fine-tuning rely on dramatically changing the output distribution of the model when faced with a harmful request, shifting it from an unsafe answer to a refusal to respond. These methods inherently compromise model capabilities and might make auto-regressive models vulnerable to attacks that make an initial token of an affirmative response likely. To avoid that, we propose to expand the model's vocabulary with a special token we call a *red flag token* (

2025-03-05

ICLR.cc/2025/Workshop/BuildingTrust (accepted)

openreview.net

Grokking Beyond the Euclidean Norm of Model Parameters

Tikeng Notsawo Pascal Junior

Guillaume Dumas

Guillaume Rabusseau

Grokking refers to a delayed generalization following overfitting when optimizing artificial neural networks with gradient-based methods. In this work, we demonstrate that grokking can be induced by regularization, either explicit or implicit. More precisely, we show that when there exists a model with a property

2025-03-05

ICLR.cc/2025/Workshop/SLLM (published)

openreview.net

A Guide to Misinformation Detection Data and Evaluation

Camille Thibault

Jacob-Junqi Tian

Gabrielle Péloquin-Skulski

Taylor Lynn Curtis

James Zhou

Florence Laflamme

Yuxiang Guan

Reihaneh Rabbany

Jean-François Godbout

Kellin Pelrine

Misinformation is a complex societal issue, and mitigating solutions are difficult to create due to data deficiencies. To address this problem, we have curated the largest collection of (mis)information datasets in the literature, totaling 75. From these, we evaluated the quality of all of the 36 datasets that consist of statements or claims, as well as the 9 datasets that consist of data in purely paragraph form. We assess these datasets to identify those with solid foundations for empirical work and those with flaws that could result in misleading and non-generalizable results, such as insufficient label quality or spurious correlations. We further provide state-of-the-art baselines on all these datasets, but show that regardless of label quality, categorical labels may no longer give an accurate evaluation of detection model performance. We discuss alternatives to mitigate this problem. Overall, this guide aims to provide a roadmap for obtaining higher quality data and conducting more effective evaluations, ultimately improving research in misinformation detection. All datasets and other artifacts are available at [anonymized].

2025-03-05

ICLR.cc/2025/Workshop/MLDPR (published)

openreview.net

Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

Zhanke Zhou

Zhaocheng Zhu

Xuan Li

Mikhail Galkin

Xiao Feng

Sanmi Koyejo

Jian Tang

Bo Han

2025-03-05

ICLR.cc/2025/Workshop/LLM_Reason_and_Plan (published)

doi.org

openreview.net

Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

Zhanke Zhou

Xuan Li

Zhaocheng Zhu

Mikhail Galkin

Xiao Feng

Sanmi Koyejo

Jian Tang

Bo Han

Numerous applications of large language models (LLMs) rely on their ability to perform step-by-step reasoning. However, the reasoning behavior of LLMs remains poorly understood, posing challenges to research, development, and safety. To address this gap, we introduce landscape of thoughts, the first visualization tool for users to inspect the reasoning paths of chain-of-thought and its derivatives on any multi-choice dataset. Specifically, we represent the states in a reasoning path as feature vectors that quantify their distances to all answer choices. These features are then visualized in two-dimensional plots using t-SNE. Qualitative analysis shows that the landscape of thoughts effectively distinguishes between strong and weak models, correct and incorrect answers, as well as different reasoning tasks. It also uncovers undesirable reasoning patterns, such as low consistency and high uncertainty. Additionally, users can adapt our tool to a neural model that predicts any property they observe. We showcase this advantage by adapting our tool to a lightweight verifier, which significantly improves reasoning by evaluating the correctness of reasoning paths.

2025-03-05

ICLR.cc/2025/Workshop/LLM_Reason_and_Plan (published)

openreview.net

Learning Decision Trees as Amortized Structure Inference

Mohammed Mahfoud

Ghait Boukachab

Michał Koziarski

Alex Hernandez-Garcia

Stefan Bauer

Yoshua Bengio

Nikolay Malkin

2025-03-05

ICLR.cc/2025/Workshop/FPI (poster)

doi.org

openreview.net

Learning to Defer for Causal Discovery with Imperfect Experts

Oscar Clivio

Divyat Mahajan

Perouz Taslakian

Sara Magliacane

Ioannis Mitliagkas

Valentina Zantedeschi

Alexandre Drouin

Integrating expert knowledge, e.g. from large language models, into causal discovery algorithms can be challenging when the knowledge is not guaranteed to be correct. Expert recommendations may contradict data-driven results, and their reliability can vary significantly depending on the domain or specific query. Existing methods based on soft constraints or inconsistencies in predicted causal relationships fail to account for these variations in expertise. To remedy this, we propose L2D-CD, a method for gauging the correctness of expert recommendations and optimally combining them with data-driven causal discovery results. By adapting learning-to-defer (L2D) algorithms for pairwise causal discovery (CD), we learn a deferral function that selects whether to rely on classical causal discovery methods using numerical data or expert recommendations based on textual meta-data. We evaluate L2D-CD on the canonical Tübingen pairs dataset and demonstrate its superior performance compared to both the causal discovery method and the expert used in isolation. Moreover, our approach identifies domains where the expert's performance is strong or weak. Finally, we outline a strategy for generalizing this approach to causal discovery on graphs with more than two variables, paving the way for further research in this area.

2025-03-05

ICLR.cc/2025/Workshop/LLM_Reason_and_Plan (published)

openreview.net

Learning to Defer for Causal Discovery with Imperfect Experts

Oscar Clivio

Divyat Mahajan

Perouz Taslakian

Sara Magliacane

Ioannis Mitliagkas

Valentina Zantedeschi

Alexandre Drouin

Integrating expert knowledge, e.g. from large language models, into causal discovery algorithms can be challenging when the knowledge is not guaranteed to be correct. Expert recommendations may contradict data-driven results, and their reliability can vary significantly depending on the domain or specific query. Existing methods based on soft constraints or inconsistencies in predicted causal relationships fail to account for these variations in expertise. To remedy this, we propose L2D-CD, a method for gauging the correctness of expert recommendations and optimally combining them with data-driven causal discovery results. By adapting learning-to-defer (L2D) algorithms for pairwise causal discovery (CD), we learn a deferral function that selects whether to rely on classical causal discovery methods using numerical data or expert recommendations based on textual meta-data. We evaluate L2D-CD on the canonical Tübingen pairs dataset and demonstrate its superior performance compared to both the causal discovery method and the expert used in isolation. Moreover, our approach identifies domains where the expert's performance is strong or weak. Finally, we outline a strategy for generalizing this approach to causal discovery on graphs with more than two variables, paving the way for further research in this area.

2025-03-05

ICLR.cc/2025/Workshop/LLM_Reason_and_Plan (published)

doi.org

openreview.net

Mol-MoE: Training Preference-Guided Routers for Molecule Generation

Diego Calanzone

Pierluca D'Oro

Pierre-Luc Bacon

Recent advances in language models have enabled framing molecule generation as sequence modeling. However, existing approaches often rely on single-objective reinforcement learning, limiting their applicability to real-world drug design, where multiple competing properties must be optimized. Traditional multi-objective reinforcement learning (MORL) methods require costly retraining for each new objective combination, making rapid exploration of trade-offs impractical. To overcome these limitations, we introduce Mol-MoE, a mixture-of-experts (MoE) architecture that enables efficient test-time steering of molecule generation without retraining. Central to our approach is a preference-based router training objective that incentivizes the router to combine experts in a way that aligns with user-specified trade-offs. This provides improved flexibility in exploring the chemical property space at test time, facilitating rapid trade-off exploration. Benchmarking against state-of-the-art methods, we show that Mol-MoE achieves superior sample quality and steerability.

2025-03-05

ICLR.cc/2025/Workshop/GEM (published)

doi.org

openreview.net

Preference Optimization for Concept Bottleneck Models

Emiliano Penaloza

Tianyue H. Zhang

Laurent Charlin

Mateo Espinosa Zarlenga

Concept Bottleneck Models (CBMs) propose to enhance the trustworthiness of AI systems by constraining their decisions on a set of human-understandable concepts. However, CBMs typically assume that datasets contain accurate concept labels, an assumption often violated in practice, which we show can significantly degrade performance (by 25% in some cases). To address this, we introduce the Concept Preference Optimization (CPO) objective, a new loss function based on Direct Preference Optimization, which effectively mitigates the negative impact of concept mislabeling on CBM performance. We provide an analysis of some key properties of the CPO objective, showing that it directly optimizes for the concept's posterior distribution, and contrast it against Binary Cross Entropy (BCE), where we show CPO is inherently less sensitive to concept noise. We empirically confirm our analysis, finding that CPO consistently outperforms BCE on three real-world datasets with and without added label noise.

2025-03-05

ICLR.cc/2025/Workshop/Bi-Align (oral)

openreview.net
