Samira Ebrahimi Kahou

Affiliate Member
Associate Professor, University of Calgary, Department of Electrical and Software Engineering
Adjunct Professor, École de technologie supérieure, School of Computer Science
Adjunct Professor, McGill University, School of Computer Science
Research Topics
Computer Vision
Deep Learning
Medical Machine Learning
Multimodal Learning
Natural Language Processing
Reinforcement Learning
Representation Learning

Biography

Samira is an Associate Professor at the University of Calgary in the Schulich School of Engineering. She is also an Adjunct Professor at École de technologie supérieure (ÉTS) in the Department of Software Engineering and Information Technology and McGill University in the School of Computer Science. She is an academic member of Mila - Québec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair. Samira received her Ph.D. in Computer Engineering from Polytechnique Montréal/Mila with an award for the best thesis in the department. Samira also worked as a Postdoctoral Fellow at the McGill School of Computer Science and as a Researcher at Microsoft Research Montréal.

Samira and her group work on solving fundamental problems in representation learning for decision making, with a broad focus on explainability, generalization and efficient learning. Her work has been published in top-tier venues, such as NeurIPS, ICLR, ICML, ICCV, CVPR, TMLR and CoRL. Samira received the 2024 Early Career Excellence in Research Award from the Schulich School of Engineering. Her work in multimodal learning has twice been recognized with an ACM ICMI Ten-Year Technical Impact Award: as Runner-up in 2023 and as Winner in 2025.

Current Students

PhD - Université de Montréal
Principal supervisor :
PhD - École de technologie supérieure
Principal supervisor :
PhD - École de technologie supérieure
Principal supervisor :
PhD - McGill University
Co-supervisor :
PhD - McGill University
Principal supervisor :

Publications

Survey on AI Ethics: A Socio‐Technical Perspective
Dave Mbiazi
Ivaxi Sheth
Patrik Joslin Kenfack
The past decade has observed a significant advancement in AI, with deep learning‐based models being deployed in diverse scenarios, including safety‐critical applications. As these AI systems become deeply embedded in our societal infrastructure, the repercussions of their decisions and actions have significant consequences, making the ethical implications of AI deployment highly relevant and essential. The ethical concerns associated with AI are multifaceted, including challenging issues of fairness, privacy and data protection, responsibility and accountability, safety and robustness, transparency and explainability, and environmental impact. These principles together form the foundations of ethical AI considerations that concern every stakeholder in the AI system lifecycle. In light of the present ethical and future x‐risk concerns, governments have shown increasing interest in establishing guidelines for the ethical deployment of AI. This work unifies the current and future ethical concerns of deploying AI into society. While we acknowledge and appreciate the technical surveys for each of the ethical principles concerned, in this paper, we aim to provide a comprehensive overview that not only addresses each principle from a technical point of view but also discusses them from a social perspective.
Attention-Based Multi-Agent RL for Multi-Machine Tending Using Mobile Robots
Abdalwhab Bakheet Mohamed Abdalwhab
David St-Onge
Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborative robots can tackle that can also greatly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. We introduce a multi-agent multi-machine-tending learning framework using mobile robots based on multi-agent reinforcement learning (MARL) techniques, with the design of a suitable observation and reward. Moreover, we integrate an attention-based encoding mechanism into the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine-tending scenarios. Our model (AB-MAPPO) outperforms MAPPO in this new challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provide an extensive ablation study to support our design decisions.
Source-free Domain Adaptation Requires Penalized Diversity
While neural networks are capable of achieving human-like performance in many tasks such as image classification, the impressive performance of each model is limited to its own dataset. Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data, thus increasing data privacy. Diversity in representation space can be vital to a model's adaptability in varied and difficult domains. In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor. Motivated by the improved predictive performance of ensembles, we propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors with Distinct Backbone Architectures (DBA). Although diversity in feature space is increased, the unconstrained mutual information (MI) maximization may potentially introduce amplification of weak hypotheses. Thus we introduce the Weak Hypothesis Penalization (WHP) regularizer as a mitigation strategy. Our work proposes Penalized Diversity (PD) where the synergy of DBA and WHP is applied to unsupervised source-free domain adaptation for covariate shift. In addition, PD is augmented with a weighted MI maximization objective for label distribution shift. Empirical results on natural, synthetic, and medical domains demonstrate the effectiveness of PD under different distributional shifts.
Handling Delay in Real-Time Reinforcement Learning
Real-time reinforcement learning (RL) introduces several challenges. First, policies are constrained to a fixed number of actions per second due to hardware limitations. Second, the environment may change while the network is still computing an action, leading to observational delay. The first issue can partly be addressed with pipelining, leading to higher throughput and potentially better policies. However, the second issue remains: if each neuron operates in parallel with an execution time of
GitChameleon 2.0: Evaluating AI Code Generation Against Python Library Version Incompatibilities
The rapid evolution of software libraries poses a considerable hurdle for code generation, necessitating continuous adaptation to frequent version updates while preserving backward compatibility. While existing code evolution benchmarks provide valuable insights, they typically lack execution-based evaluation for generating code compliant with specific library versions. To address this, we introduce GitChameleon 2.0, a novel, meticulously curated dataset comprising 328 Python code completion problems, each conditioned on specific library versions and accompanied by executable unit tests. GitChameleon 2.0 rigorously evaluates the capacity of contemporary large language models (LLMs), LLM-powered agents, code assistants, and RAG systems to perform version-conditioned code generation that demonstrates functional accuracy through execution. Our extensive evaluations indicate that state-of-the-art systems encounter significant challenges with this task: enterprise models achieve baseline success rates in the 48-51% range, underscoring the intricacy of the problem. By offering an execution-based benchmark emphasizing the dynamic nature of code libraries, GitChameleon 2.0 enables a clearer understanding of this challenge and helps guide the development of more adaptable and dependable AI code generation methods. We make the dataset and evaluation code publicly available at https://github.com/mrcabbage972/GitChameleonBenchmark.
Learning to Play Atari in a World of Tokens
Model-based reinforcement learning agents utilizing transformers have shown improved sample efficiency due to their ability to model extended context, resulting in more accurate world models. However, for complex reasoning and planning tasks, these methods primarily rely on continuous representations. This complicates modeling of discrete properties of the real world such as disjoint object classes between which interpolation is not plausible. In this work, we introduce discrete abstract representations for transformer-based learning (DART), a sample-efficient method utilizing discrete representations for modeling both the world and learning behavior. We incorporate a transformer-decoder for auto-regressive world modeling and a transformer-encoder for learning behavior by attending to task-relevant cues in the discrete representation of the world model. For handling partial observability, we aggregate information from past time steps as memory tokens. DART outperforms previous state-of-the-art methods that do not use look-ahead search on the Atari 100k sample efficiency benchmark with a median human-normalized score of 0.790 and beats humans in 9 out of 26 games. We release our code at https://pranaval.github.io/DART/.
Prioritizing Samples in Reinforcement Learning with Reducible Loss
Most reinforcement learning algorithms take advantage of an experience replay buffer to repeatedly train on samples the agent has observed in the past. Not all samples carry the same amount of significance and simply assigning equal importance to each of the samples is a naïve strategy. In this paper, we propose a method to prioritize samples based on how much we can learn from a sample. We define the learn-ability of a sample as the steady decrease of the training loss associated with this sample over time. We develop an algorithm to prioritize samples with high learn-ability, while assigning lower priority to those that are hard-to-learn, typically caused by noise or stochasticity. We empirically show that our method is more robust than random sampling and also better than just prioritizing with respect to the training loss, i.e. the temporal difference loss, which is used in prioritized experience replay.
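The prioritization idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `learnability` proxy (first recorded loss minus latest recorded loss, floored at zero), the `eps` smoothing constant, and all function names are assumptions made for illustration only.

```python
import random


def learnability(loss_history):
    """Hypothetical proxy for learn-ability: how far the loss has dropped
    between the first and the latest recorded value, floored at zero."""
    if len(loss_history) < 2:
        return 0.0
    return max(0.0, loss_history[0] - loss_history[-1])


def sampling_probabilities(histories, eps=1e-3):
    """Turn per-sample loss histories into replay sampling probabilities.
    eps (an assumed constant) keeps every sample reachable."""
    scores = [learnability(h) + eps for h in histories]
    total = sum(scores)
    return [s / total for s in scores]


def sample_indices(histories, k, rng=None):
    """Draw k buffer indices, favouring samples whose loss has decreased."""
    rng = rng or random.Random(0)
    probs = sampling_probabilities(histories)
    return rng.choices(range(len(histories)), weights=probs, k=k)
```

Under this sketch, a sample whose loss fell from 1.0 to 0.2 receives a much higher sampling probability than a noisy sample whose loss fluctuates around 1.0 without decreasing, which mirrors the intent of down-weighting hard-to-learn samples.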
Discovering Object-Centric Generalized Value Functions From Pixels
Deep Reinforcement Learning has shown significant progress in extracting useful representations from high-dimensional inputs albeit using hand-crafted auxiliary tasks and pseudo rewards. Automatically learning such representations in an object-centric manner geared towards control and fast adaptation remains an open research problem. In this paper, we introduce a method that tries to discover meaningful features from objects, translating them to temporally coherent "question" functions and leveraging the subsequent learned general value functions for control. We compare our approach with state-of-the-art techniques alongside other ablations and show competitive performance in both stationary and non-stationary settings. Finally, we also investigate the discovered general value functions and through qualitative analysis show that the learned representations are not only interpretable but also centered around objects that are invariant to changes across tasks, facilitating fast adaptation.
Towards Policy-Guided Conversational Recommendation with Dialogue Acts
Paul Crook
Y-Lan Boureau
J. Weston
Akbar Karimi
Leonardo Rossi
Andrea Prati
Wenqiang Lei
Xiangnan He
Qingyun Yisong Miao
Richang Wu
Min-Yen Hong
Kan Tat-Seng
Raymond Li
Hannes Schulz
Zujie Liang
Huang Hu
Can Xu
Jian Miao
Lizi Liao
Ryuichi Takanobu
Yunshan Ma
Xun Yang
Wenchang Ma
Minlie Huang
Minghao Tu
Iulian Serban
Aaron C. Courville
David Silver
Julian Schrittwieser
K. Simonyan
Ioannis Antonoglou
Aja Huang
A. Guez
Hanlin Zhu
O. Vinyals
Igor Babuschkin
M. Mathieu
Max Jaderberg
Wojciech M. Czarnecki
A. Dudzik
Petko Georgiev
Richard Powell
T. Ewalds
Dan Horgan
M. Kroiss
Ivo Danihelka
J. Agapiou
Junhyuk Oh
Valentin Dalibard
David Choi
L. Sifre
Yury Sulsky
Sasha Vezhnevets
James Molloy
Trevor Cai
D. Budden
T. Paine
Ziyu Wang
Tobias Pfaff
Tobias Pohlen
Accounting for Variance in Machine Learning Benchmarks
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization and hyperparameter choice markedly impacts the results. We analyze the predominant comparison methods used today in the light of this variance. We show a counter-intuitive result that adding more sources of variation to an imperfect estimator approaches better the ideal estimator at a 51 times reduction in compute cost. Building on these results, we study the error rate of detecting improvements, on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.
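As a rough illustration of comparing two algorithms across multiple trials rather than a single run, the sketch below summarizes per-trial scores and estimates how often A beats B. It is a generic example, not necessarily the procedure the paper recommends; the function names and the score values are made up for illustration.

```python
import statistics


def summarize(scores):
    """Mean and sample standard deviation of per-trial scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def prob_outperform(scores_a, scores_b):
    """Fraction of (a, b) trial pairs in which A scores strictly higher than B."""
    wins = sum(1 for a in scores_a for b in scores_b if a > b)
    return wins / (len(scores_a) * len(scores_b))


# Hypothetical scores from four trials each, with seeds, data splits and
# hyperparameters assumed to vary between trials.
scores_a = [0.81, 0.79, 0.83, 0.80]
scores_b = [0.80, 0.78, 0.82, 0.77]

mean_a, std_a = summarize(scores_a)
mean_b, std_b = summarize(scores_b)
p_a_beats_b = prob_outperform(scores_a, scores_b)
```

Reporting the spread alongside the mean, together with an estimate such as `p_a_beats_b`, makes it visible when an apparent improvement is within the noise introduced by the sources of variation listed in the abstract.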
Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction
Alaaeldin El-Nouby
Shikhar Sharma
Hannes Schulz
Layla El Asri
Graham W. Taylor
Conditional text-to-image generation is an active area of research, with many possible applications. Existing research has primarily focused on generating a single image from available conditioning information in one step. One practical extension beyond one-step generation is a system that generates an image iteratively, conditioned on ongoing linguistic input or feedback. This is significantly more challenging than one-step generation tasks, as such a system must understand the contents of its generated images with respect to the feedback history, the current feedback, as well as the interactions among concepts present in the feedback history. In this work, we present a recurrent image generation model which takes into account both the generated output up to the current step as well as all past instructions for generation. We show that our model is able to generate the background, add new objects, and apply simple transformations to existing objects. We believe our approach is an important step toward interactive generation. Code and data are available at: https://www.microsoft.com/en-us/research/project/generative-neural-visual-artist-geneva/ .
Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies
Modelling long-term dependencies is a challenge for recurrent neural networks. This is primarily due to the fact that gradients vanish during training, as the sequence length increases. Gradients can be attenuated by transition operators and are attenuated or dropped by activation functions. Canonical architectures like LSTM alleviate this issue by skipping information through a memory mechanism. We propose a new recurrent architecture (Non-saturating Recurrent Unit; NRU) that relies on a memory mechanism but forgoes both saturating activation functions and saturating gates, in order to further alleviate vanishing gradients. In a series of synthetic and real world tasks, we demonstrate that the proposed model is the only model that performs among the top 2 models across all tasks with and without long-term dependencies, when compared against a range of other architectures.