Sabyasachi Sahoo

PhD - Université Laval
Supervisor
Research Topics
Computer Vision
Deep Learning
Online Learning
Representation Learning

Publications

A Layer Selection Approach to Test Time Adaptation
Mostafa ElAraby
Yann Batiste Pequignot
Frederic Precioso
Test Time Adaptation (TTA) addresses the problem of distribution shift by adapting a pretrained model to a new domain during inference. When faced with challenging shifts, most methods collapse and perform worse than the original pretrained model. In this paper, we find that not all layers are equally receptive to adaptation, and that the layers with the most misaligned gradients often cause performance degradation. To address this, we propose GALA, a novel layer selection criterion that identifies the most beneficial updates to perform during test time adaptation. The criterion can also filter out unreliable samples with noisy gradients. Its simplicity allows seamless integration with existing TTA loss functions, preventing degradation and focusing adaptation on the most trainable layers. This approach also helps regularize adaptation to preserve the pretrained features, which are crucial for handling unseen domains. Through extensive experiments, we demonstrate that the proposed layer selection framework improves the performance of existing TTA approaches across multiple datasets, domain shifts, model architectures, and TTA losses.
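The abstract does not spell out how GALA scores layers, so the following is only a rough sketch of the general idea it describes (freezing layers whose gradients are misaligned and adapting the rest). The reference gradients, the threshold, and all names here are hypothetical illustrations, not the paper's actual criterion:

```python
import numpy as np

def select_layers(layer_grads, ref_grads, threshold=0.0):
    """Select layers whose current gradient aligns with a reference
    gradient direction (e.g. a running average over past test batches).

    layer_grads / ref_grads: dict mapping layer name -> flat gradient vector.
    Returns names of layers whose cosine alignment exceeds `threshold`;
    misaligned layers would be left frozen during adaptation.
    """
    selected = []
    for name, g in layer_grads.items():
        r = ref_grads[name]
        denom = np.linalg.norm(g) * np.linalg.norm(r)
        cos = float(g @ r / denom) if denom > 0 else 0.0
        if cos > threshold:
            selected.append(name)
    return selected

# Toy example: "conv1" agrees with the reference direction, "fc" opposes it.
grads = {"conv1": np.array([1.0, 0.0]), "fc": np.array([-1.0, 0.0])}
ref = {"conv1": np.array([1.0, 0.1]), "fc": np.array([1.0, 0.0])}
print(select_layers(grads, ref))  # ['conv1']
```

The same alignment score could in principle also gate whole batches, which is how the abstract's "filter out unreliable samples with noisy gradients" would fit into such a scheme.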
Detecting Brittle Decisions for Free: Leveraging Margin Consistency in Deep Robust Classifiers
Jonas Ngnawé
Yann Batiste Pequignot
Frédéric Precioso
Despite extensive research on adversarial training strategies to improve robustness, the decisions of even the most robust deep learning models can still be quite sensitive to imperceptible perturbations, creating serious risks when deploying them for high-stakes real-world applications. While detecting such cases may be critical, evaluating a model's vulnerability at a per-instance level using adversarial attacks is computationally too intensive and unsuitable for real-time deployment scenarios. The input space margin is the exact score for detecting non-robust samples, but it is intractable to compute for deep neural networks. This paper introduces the concept of margin consistency -- a property that links the input space margins and the logit margins in robust models -- for efficient detection of vulnerable samples. First, we establish that margin consistency is a necessary and sufficient condition for using a model's logit margin as a score for identifying non-robust samples. Next, through comprehensive empirical analysis of various robustly trained models on the CIFAR10 and CIFAR100 datasets, we show that they exhibit strong margin consistency, with a strong correlation between their input space margins and logit margins. Then, we show that we can effectively use the logit margin to confidently detect brittle decisions with such models, and accurately estimate robust accuracy on an arbitrarily large test set by estimating the input margins only on a small subset. Finally, we address cases where the model is not sufficiently margin-consistent by learning a pseudo-margin from the feature representation. Our findings highlight the potential of leveraging deep representations to efficiently assess adversarial vulnerability in deployment scenarios.
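The logit margin the abstract relies on is cheap to compute from a single forward pass. As a minimal sketch (the threshold `tau` and function names are illustrative; in the paper's setting the threshold would be calibrated against input-margin estimates on a small subset):

```python
import numpy as np

def logit_margin(logits):
    """Logit margin: largest logit minus second-largest logit, per sample.
    Under margin consistency, this correlates with the (intractable)
    input-space margin and can serve as a per-sample vulnerability score."""
    top2 = np.sort(logits, axis=-1)[..., -2:]
    return top2[..., 1] - top2[..., 0]

def flag_brittle(logits, tau):
    """Flag samples whose logit margin falls below a calibrated threshold
    tau as likely non-robust (brittle) decisions."""
    return logit_margin(logits) < tau

# Two samples: a confident prediction and a near-tie between the top classes.
logits = np.array([[2.0, 0.5, 0.1],
                   [1.0, 0.9, 0.0]])
print(logit_margin(logits))        # [1.5 0.1]
print(flag_brittle(logits, 0.5))   # [False  True]
```

This is what makes the detection "free": no adversarial attack is run per instance, only the logits the model already produces.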
Layerwise Early Stopping for Test Time Adaptation
Mostafa ElAraby
Yann Batiste Pequignot
Frederic Precioso
Hessian Aware Low-Rank Weight Perturbation for Continual Learning
Jiaqi Li
Rui Wang
Yuanhao Lai
Changjian Shui
Charles Ling
Shichun Yang
Boyu Wang
Fan Zhou
Continual learning aims to learn a series of tasks sequentially without forgetting the knowledge acquired from the previous ones. In this work, we propose the Hessian Aware Low-Rank Perturbation algorithm for continual learning. By modeling the parameter transitions along the sequential tasks with a weight matrix transformation, we propose to apply a low-rank approximation to the task-adaptive parameters in each layer of the neural network. Specifically, we theoretically demonstrate the quantitative relationship between the Hessian and the proposed low-rank approximation. The approximation ranks are then globally determined according to the marginal increment of the empirical loss, estimated by the layer-specific gradient and the low-rank approximation error. Furthermore, we control the model capacity by pruning less important parameters to diminish parameter growth. We conduct extensive experiments on various benchmarks, including a dataset with large-scale tasks, and compare our method against recent state-of-the-art methods to demonstrate its effectiveness and scalability. Empirical results show that our method performs better on different benchmarks, especially in achieving task-order robustness and mitigating forgetting. The source code is at https://github.com/lijiaqi/HALRP.
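The Hessian-based rank selection is the paper's contribution and is not reproduced here (see the linked repository); the sketch below only illustrates the underlying parameterization the abstract describes: storing each task's weight update as a low-rank factorization on top of shared base weights. All names are illustrative:

```python
import numpy as np

def low_rank_perturbation(delta_w, rank):
    """Approximate a task-specific weight update delta_w (m x n) by a
    rank-r factorization U_r @ V_r via truncated SVD, so only
    (m + n) * r extra parameters are stored per task instead of m * n."""
    U, s, Vt = np.linalg.svd(delta_w, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]  # absorb singular values into U
    V_r = Vt[:rank, :]
    return U_r, V_r

def task_weights(w_base, U_r, V_r):
    """Reconstruct a task-adapted weight matrix from the shared base
    weights plus that task's stored low-rank perturbation."""
    return w_base + U_r @ V_r
```

In this parameterization the per-layer rank is the knob that trades parameter growth against approximation error, which is exactly where the paper's Hessian-aware, loss-increment-based rank allocation comes in.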
GROOD: Gradient-Aware Out-of-Distribution Detection
Mostafa ElAraby
Yann Batiste Pequignot
Paul Novello
Hessian Aware Low-Rank Perturbation for Order-Robust Continual Learning
Jiaqi Li
Rui Wang
Yuanhao Lai
Changjian Shui
Charles Ling
Shichun Yang
Boyu Wang
Fan Zhou
GROOD: GRadient-aware Out-Of-Distribution detection in interpolated manifolds
Mostafa ElAraby
Yann Batiste Pequignot
Paul Novello