Samira Ebrahimi Kahou

Associate Academic Member
Canada CIFAR AI Chair
Associate Professor, École de technologie supérieure (ÉTS), Department of Computer Engineering and Information Technology
Adjunct Professor, McGill University, School of Computer Science

Biography

I am an associate professor at the École de technologie supérieure (ÉTS) and adjunct professor at McGill University. Before joining ÉTS, I was a postdoctoral fellow with Doina Precup at McGill / Mila – Quebec Artificial Intelligence Institute. Before that, I was a researcher at Microsoft Research Montréal.

During my PhD at Polytechnique Montréal / Mila (2016), supervised by Chris Pal, I worked on computer vision and deep learning for emotion recognition, object tracking, and knowledge distillation.

Current Students

Master's Research - École de technologie supérieure
Professional Master's - Université de Montréal
Principal supervisor:
Master's Research - École de technologie supérieure
PhD - Université de Montréal
Principal supervisor:
PhD - École de technologie supérieure
Principal supervisor:
PhD - École de technologie supérieure
Master's Research - École de technologie supérieure
Master's Research - École de technologie supérieure
Master's Research - McGill University
Principal supervisor:

Publications

Neural semantic tagging for natural language-based search in building information models: Implications for practice
Mehrzad Shahinmoghadam
Ali Motamedi
Spectral Temporal Contrastive Learning
Sacha Morin
Somjit Nath
Learning useful data representations without requiring labels is a cornerstone of modern deep learning. Self-supervised learning methods, particularly contrastive learning (CL), have proven successful by leveraging data augmentations to define positive pairs. This success has prompted a number of theoretical studies to better understand CL and investigate theoretical bounds for downstream linear probing tasks. This work is concerned with the temporal contrastive learning (TCL) setting, where the sequential structure of the data is used instead to define positive pairs, as is more common in RL and robotics contexts. In this paper, we adapt recent work on Spectral CL to formulate Spectral Temporal Contrastive Learning (STCL). We discuss a population loss based on a state graph derived from a time-homogeneous reversible Markov chain with uniform stationary distribution. The STCL loss connects the linear probing performance to the spectral properties of the graph, and can be estimated by considering previously observed data sequences as an ensemble of MCMC chains.
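A minimal sketch of the kind of temporal contrastive objective described above, assuming a spectral-style loss in which positive pairs are temporally adjacent states and arbitrary batch pairs act as negatives; the encoder, dimensions, and batch construction are illustrative, not the paper's implementation.

```python
import torch

def spectral_temporal_loss(encoder, states, next_states):
    """states, next_states: (batch, state_dim) tensors of consecutive observations."""
    z = encoder(states)            # (batch, d) representations
    z_next = encoder(next_states)
    # Attractive term: pull temporally adjacent representations together.
    pos = -2.0 * (z * z_next).sum(dim=1).mean()
    # Repulsive term: penalize squared similarity between arbitrary pairs in the batch.
    sim = z @ z.T                  # (batch, batch) pairwise inner products
    neg = (sim ** 2).mean()
    return pos + neg

# Example usage with a toy encoder (hypothetical dimensions).
encoder = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.ReLU(), torch.nn.Linear(64, 16))
states, next_states = torch.randn(32, 8), torch.randn(32, 8)
loss = spectral_temporal_loss(encoder, states, next_states)
loss.backward()
```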
Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies
Shiva Kanth Sujit
Pedro Braga
Jorg Bornschein
Reinforcement learning (RL) has shown great promise, with algorithms learning in environments with large state and action spaces purely from scalar reward signals. A crucial challenge for current deep RL algorithms is that they require a tremendous number of environment interactions for learning. This can be infeasible in situations where such interactions are expensive, such as in robotics. Offline RL algorithms try to address this issue by bootstrapping the learning process from existing logged data without needing to interact with the environment from the very beginning. While online RL algorithms are typically evaluated as a function of the number of environment interactions, there is no single established protocol for evaluating offline RL methods. In this paper, we propose a sequential approach to evaluate offline RL algorithms as a function of the training set size, and thus by their data efficiency. Sequential evaluation provides valuable insights into the data efficiency of the learning process and the robustness of algorithms to distribution changes in the dataset, while also harmonizing the visualization of the offline and online learning phases. Our approach is generally applicable and easy to implement. We compare several existing offline RL algorithms using this approach and present insights from a variety of tasks and offline datasets.
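To make the protocol concrete, here is a hedged sketch of one way to report offline RL performance as a function of training-set size; `train_offline` and `evaluate_policy` are hypothetical placeholders for an offline RL learner and a rollout-based evaluator, not code from the paper.

```python
def sequential_evaluation(dataset, budgets, train_offline, evaluate_policy):
    """dataset: ordered list of logged transitions; budgets: increasing training-set sizes."""
    curve = []
    for n in budgets:
        policy = train_offline(dataset[:n])   # fit on the first n transitions only
        score = evaluate_policy(policy)       # e.g., average return over evaluation rollouts
        curve.append((n, score))
    return curve  # (training-set size, performance) pairs, read like an online learning curve

# Example call (illustrative budgets and stand-in functions):
# curve = sequential_evaluation(transitions, [1_000, 10_000, 50_000, 100_000],
#                               train_offline=cql_train, evaluate_policy=avg_return)
```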
Empowering Clinicians with MeDT: A Framework for Sepsis Treatment
Aamer Abdul Rahman
Pranav Agarwal
Vincent Michalski
Rita Noumeir
RelationalUNet for Image Segmentation
Ivaxi Sheth
Pedro H. M. Braga
Shiva Kanth Sujit
Sahar Dastani
Auxiliary Losses for Learning Generalizable Concept-based Models
Ivaxi Sheth
Prioritizing Samples in Reinforcement Learning with Reducible Loss
Shiva Kanth Sujit
Somjit Nath
Pedro Braga
Fairness Under Demographic Scarce Regime
Patrik Joslin Kenfack
Most existing works on fairness assume the model has full access to demographic information. However, there exist scenarios where demographic information is only partially available, because a record was not maintained throughout data collection or due to privacy reasons. This setting is known as the demographic scarce regime. Prior research has shown that training an attribute classifier to replace the missing sensitive attributes (proxy) can still improve fairness. However, the use of proxy-sensitive attributes worsens fairness-accuracy trade-offs compared to true sensitive attributes. To address this limitation, we propose a framework to build attribute classifiers that achieve better fairness-accuracy trade-offs. Our method introduces uncertainty awareness in the attribute classifier and enforces fairness on samples whose demographic information is inferred with the lowest uncertainty. We show empirically that enforcing fairness constraints on samples with uncertain sensitive attributes is detrimental to fairness and accuracy. Our experiments on two datasets show that the proposed framework yields models with significantly better fairness-accuracy trade-offs than classic attribute classifiers. Surprisingly, our framework outperforms models trained with constraints on the true sensitive attributes.
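A rough sketch of the underlying idea, assuming a simple scikit-learn attribute classifier, a confidence threshold as the uncertainty filter, and a demographic-parity gap as the fairness measure; the threshold, models, and metric are illustrative stand-ins rather than the paper's actual method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def proxy_attributes_with_uncertainty(X_labeled, s_labeled, X_unlabeled, threshold=0.9):
    """Infer missing sensitive attributes and flag the low-uncertainty predictions."""
    clf = LogisticRegression(max_iter=1000).fit(X_labeled, s_labeled)
    proba = clf.predict_proba(X_unlabeled)
    s_hat = proba.argmax(axis=1)                 # proxy sensitive attribute
    confident = proba.max(axis=1) >= threshold   # keep only confident inferences
    return s_hat, confident

def demographic_parity_gap(y_pred, s, mask):
    """Difference in positive rates between groups, computed on confident samples only."""
    y, g = y_pred[mask], s[mask]
    return abs(y[g == 0].mean() - y[g == 1].mean())
```

The fairness penalty would then be applied only to the `confident` subset during training, which is the abstract's central point.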
Transformers in Reinforcement Learning: A Survey
Pranav Agarwal
Aamer Abdul Rahman
Pierre-Luc St-Charles
Simon J. D. Prince
Transformers have significantly impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural networks. This survey explores how transformers are used in reinforcement learning (RL), where they are seen as a promising solution for addressing challenges such as unstable training, credit assignment, lack of interpretability, and partial observability. We begin by providing a brief domain overview of RL, followed by a discussion on the challenges of classical RL algorithms. Next, we delve into the properties of the transformer and its variants and discuss the characteristics that make them well-suited to address the challenges inherent in RL. We examine the application of transformers to various aspects of RL, including representation learning, transition and reward function modeling, and policy optimization. We also discuss recent research that aims to enhance the interpretability and efficiency of transformers in RL, using visualization techniques and efficient training strategies. Often, the transformer architecture must be tailored to the specific needs of a given application. We present a broad overview of how transformers have been adapted for several applications, including robotics, medicine, language modeling, cloud computing, and combinatorial optimization. We conclude by discussing the limitations of using transformers in RL and assess their potential for catalyzing future breakthroughs in this field.
CAMMARL: Conformal Action Modeling in Multi Agent Reinforcement Learning
Nikunj Gupta
Somjit Nath
Before taking actions in an environment with more than one intelligent agent, an autonomous agent may benefit from reasoning about the other agents and utilizing a notion of a guarantee or confidence about the behavior of the system. In this article, we propose a novel multi-agent reinforcement learning (MARL) algorithm, CAMMARL, which involves modeling the actions of other agents in different situations in the form of confident sets, i.e., sets containing their true actions with a high probability. We then use these estimates to inform an agent's decision-making. To estimate such sets, we use the concept of conformal predictions, by means of which we not only obtain an estimate of the most probable outcome but also quantify the operable uncertainty. For instance, we can predict a set that provably covers the true predictions with high probability (e.g., 95%). Through several experiments in two fully cooperative multi-agent tasks, we show that CAMMARL elevates the capabilities of an autonomous agent in MARL by modeling conformal prediction sets over the behavior of other agents in the environment and utilizing such estimates to enhance its policy learning.
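To make the conformal-set idea concrete, the sketch below shows standard split conformal prediction over another agent's discrete actions; the score function, calibration split, and random data are generic conformal machinery used for illustration, not code from the paper.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_actions, alpha=0.05):
    """cal_probs: (n, A) predicted action probabilities on a calibration split;
    cal_actions: (n,) actions the other agent actually took."""
    n = len(cal_actions)
    scores = 1.0 - cal_probs[np.arange(n), cal_actions]      # nonconformity scores
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)     # finite-sample correction
    return np.quantile(scores, level)

def conformal_action_set(probs, threshold):
    """Actions whose nonconformity stays below the calibrated threshold; the set covers
    the other agent's true action with probability roughly 1 - alpha."""
    return np.where(1.0 - probs <= threshold)[0]

# Example with random calibration data (purely illustrative).
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(4), size=200)
cal_actions = rng.integers(0, 4, size=200)
q = conformal_threshold(cal_probs, cal_actions, alpha=0.05)
action_set = conformal_action_set(rng.dirichlet(np.ones(4)), q)
```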
Overcoming Interpretability and Accuracy Trade-off in Medical Imaging
Ivaxi Sheth
Source-free Domain Adaptation Requires Penalized Diversity
Laya Rafiee Sevyeri
Ivaxi Sheth
Farhood Farahnak
Alexandre See
Thomas Fevens
Mohammad Havaei
While neural networks are capable of achieving human-like performance in many tasks such as image classification, the impressive performance of each model is limited to its own dataset. Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data, thus increasing data privacy. Diversity in representation space can be vital to a model's adaptability in varied and difficult domains. In unsupervised SFDA, diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor. Motivated by the improved predictive performance of ensembles, we propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors with Distinct Backbone Architectures (DBA). Although diversity in feature space is increased, unconstrained mutual information (MI) maximization may amplify weak hypotheses. We therefore introduce the Weak Hypothesis Penalization (WHP) regularizer as a mitigation strategy. Our work proposes Penalized Diversity (PD), where the synergy of DBA and WHP is applied to unsupervised source-free domain adaptation for covariate shift. In addition, PD is augmented with a weighted MI maximization objective for label distribution shift. Empirical results on natural, synthetic, and medical domains demonstrate the effectiveness of PD under different distributional shifts.
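For a sense of the kind of unsupervised objective involved, the sketch below shows a standard information-maximization loss applied to an ensemble of hypotheses with distinct backbones; it is a hedged illustration of the setting under simplified assumptions, and the paper's DBA/WHP components are not reproduced here.

```python
import torch
import torch.nn.functional as F

def info_max_loss(logits, eps=1e-8):
    """Encourage confident per-sample predictions (low conditional entropy) that stay
    diverse across the batch (high marginal entropy); minimizing this maximizes MI."""
    p = F.softmax(logits, dim=1)
    cond_entropy = -(p * torch.log(p + eps)).sum(dim=1).mean()
    marginal = p.mean(dim=0)
    marginal_entropy = -(marginal * torch.log(marginal + eps)).sum()
    return cond_entropy - marginal_entropy

def ensemble_adaptation_loss(models, x_target):
    """Average the unsupervised objective over hypotheses; `models` is a hypothetical list
    of (feature extractor, classifier) pairs with distinct backbone architectures."""
    losses = [info_max_loss(clf(feat(x_target))) for feat, clf in models]
    return torch.stack(losses).mean()
```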