Samin Yeasar Arnob

Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts

Minseon Kim

Riyasat Ohib

Lucas Caccia

Merging parameter-efficient task experts has recently gained growing attention as a way to build modular architectures that can be rapidly adapted on the fly for specific downstream tasks, without requiring additional fine-tuning. Typically, LoRA (Low-Rank Adaptation) serves as the foundational building block of such parameter-efficient modular architectures, leveraging low-rank weight structures to reduce the number of trainable parameters. In this paper, we study the properties of sparse adapters, which train only a subset of weights in the base neural network, as potential building blocks of modular architectures. First, we propose a simple method for training highly effective sparse adapters, which is conceptually simpler than existing methods in the literature and surprisingly outperforms both LoRA and full fine-tuning in our setting. Next, we investigate the merging properties of these sparse adapters by merging adapters for up to 20 natural language processing tasks, thus scaling beyond what is usually studied in the literature. Our findings demonstrate that sparse adapters yield superior in-distribution performance post-merging compared to LoRA or full model merging. Achieving strong held-out performance remains a challenge for all methods considered.

2025-07-06

colmweb.org/COLM/2025/Conference (accepted)

doi.org

openreview.net

Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity

Samin Yeasar Arnob

Scott Fujimoto

Doina Precup

In this paper, we investigate the use of small datasets in the context of offline reinforcement learning (RL). While many common offline RL benchmarks employ datasets with over a million data points, many offline RL applications rely on considerably smaller datasets. We show that offline RL algorithms can overfit on small datasets, resulting in poor performance. To address this challenge, we introduce "Sparse-Reg", a regularization technique based on sparsity to mitigate overfitting in offline reinforcement learning, enabling effective learning in limited data settings and outperforming state-of-the-art baselines in continuous control.

2025-06-19

arXiv (preprint)

doi.org

arxiv.org

Efficient Reinforcement Learning by Discovering Neural Pathways

Samin Yeasar Arnob

Riyasat Ohib

Sergey Plis

Amy Zhang

Alessandro Sordoni

Doina Precup

2023-12-31

Advances in Neural Information Processing Systems 37 (published)

doi.org

openreview.net

Offline Policy Optimization in RL with Variance Regularizaton

Riashat Islam

Samarth Sinha

Homanga Bharadhwaj

Samin Yeasar Arnob

Zhuoran Yang

Animesh Garg

Zhaoran Wang

Lihong Li

Doina Precup

2022-12-28

arXiv (preprint)

doi.org

arxiv.org

Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning

Samin Yeasar Arnob

Riashat Islam

Doina Precup

2021-12-30

arXiv (preprint)

arxiv.org

Single-Shot Pruning for Offline Reinforcement Learning

Samin Yeasar Arnob

Riyasat Ohib

Sergey Plis

Doina Precup

2021-12-30

arXiv (preprint)

arxiv.org

Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning

We study the problem of off-policy critic evaluation in several variants of value-based off-policy actor-critic algorithms. Off-policy actor-critic algorithms require an off-policy critic evaluation step to estimate the value of the new policy after every policy gradient update. Despite the enormous success of off-policy policy gradients on control tasks, existing general methods suffer from high variance and instability, partly because the policy improvement depends on the gradient of the estimated value function. In this work, we present a new way of off-policy policy evaluation in actor-critic, based on doubly robust estimators. We extend the doubly robust estimator from off-policy policy evaluation (OPE) to actor-critic algorithms that consist of a reward estimator performance model. We find that doubly robust estimation of the critic can significantly improve performance in continuous control tasks. Furthermore, in cases where the reward function is stochastic, which can lead to high variance, doubly robust critic estimation can improve performance under corrupted, stochastic reward signals, indicating its usefulness for robust and safe reinforcement learning.

2019-12-09

arXiv (preprint)

doi.org

arxiv.org
