Portrait of Samira Ebrahimi Kahou

Samira Ebrahimi Kahou

Affiliate Member
Associate Professor, University of Calgary, Department of Electrical and Software Engineering
Adjunct Professor, École de technologie supérieure, Department of Software Engineering and Information Technology
Adjunct Professor, McGill University, School of Computer Science
Research Topics
Medical Machine Learning
Representation Learning
Multimodal Learning
Reinforcement Learning
Deep Learning
Natural Language Processing
Computer Vision

Biography

Samira is an associate professor at the University of Calgary, in the Schulich School of Engineering. She is also an adjunct professor at the École de technologie supérieure (ÉTS), in the Department of Software Engineering and Information Technology, and at McGill University, in the School of Computer Science. She is an academic member of Mila - Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair. Samira received her PhD in computer engineering from Polytechnique Montréal/Mila, with an award for the best thesis in the department. She also worked as a postdoctoral researcher at the School of Computer Science at McGill University and as a researcher at Microsoft Research Montréal.

Samira and her research group work on fundamental problems in representation learning for decision-making, with a particular focus on explainability, generalization, and efficient learning. Her work has been published in leading conferences and journals such as NeurIPS, ICLR, ICML, ICCV, CVPR, TMLR, and CoRL. In 2024, Samira received the Early Career Research Excellence Award from the Schulich School of Engineering. Her notable contributions to multimodal learning have twice been recognized by the ACM ICMI Ten-Year Technical Impact Awards: as a finalist in 2023 and as the winner in 2025.

Current Students

PhD - UdeM
Principal supervisor:
PhD - École de technologie supérieure
Principal supervisor:
PhD - École de technologie supérieure
Principal supervisor:
PhD - McGill
Co-supervisor:
PhD - McGill
Principal supervisor:

Publications

The Invisible Hand of Physics: When Video Diffusion Models Know More Than They Show
Parsa Esmati
Katja Hofmann
Majid Mirmehdi
Modern video diffusion models generate increasingly realistic and temporally coherent videos, motivating their use as candidate world simulators. Yet it remains unclear whether these models internally encode physical structure, or merely reproduce motion patterns seen during training. We study this question by probing video diffusion models along latent trajectories corresponding to real videos with known physical plausibility. To obtain such trajectories, we approximately invert the deterministic sampling process by integrating the learned velocity field backward from a clean video latent to noise, giving access to the model's intermediate states and attention maps. Using these recovered trajectories, we show that physical plausibility is linearly decodable from diffusion transformer states across IntPhys and InfLevel, reaching around 81.27% average accuracy and outperforming dedicated representation-learning baselines such as V-JEPA and VideoMAE. Surprisingly, this signal is absent from the VAE latent input and emerges inside the denoising transformer itself, despite the model not being trained with a self-supervised predictive objective. These findings suggest that physically meaningful representations can arise as a byproduct of generative denoising.
Structured Representation Learning with Locally Linear Embeddings and Adaptive Feature Fusion
Neuroscientific research has revealed that the brain encodes complex behaviors by leveraging structured, low-dimensional manifolds and dynamically fusing multiple sources of information through adaptive gating mechanisms. Inspired by these principles, we propose a novel reinforcement learning (RL) framework that encourages the disentanglement of dynamics-specific and reward-specific features, drawing direct parallels to how neural circuits separate and integrate information for efficient decision-making. Our approach leverages locally linear embeddings (LLEs) to capture the intrinsic, locally linear structure inherent in many environments, mirroring the local smoothness observed in neural population activity, while concurrently deriving reward-specific features through the standard RL objective. An attention mechanism, analogous to cortical gating, adaptively fuses these complementary representations on a per-state basis. Experimental results on benchmark tasks demonstrate that our method, grounded in neuroscientific principles, improves learning efficiency and overall performance compared to conventional RL approaches, highlighting the benefits of explicitly modeling local state structures and adaptive feature selection as observed in biological systems.
Bootstrap Sampling Improves Model Soup Performance via Increased Model Diversity for Pneumonia Classification
Sara Early
Omata I. Ehizokhale
Nils D. Forkert
Model soups combine multiple trained neural network checkpoints through weight averaging, often outperforming individual models and achieving performance comparable to deep ensembles without increasing inference cost. However, their effectiveness depends critically on checkpoint diversity, and when models are trained on the same dataset, optimization trajectories may converge toward similar regions of parameter space, limiting this diversity. In this work, we investigate bootstrap resampling as a simple data-level mechanism for increasing checkpoint diversity. Using a binary pneumonia classification task and 644 radiographs from the National Institutes of Health (NIH) ChestXray14 dataset, we train pools of convolutional neural networks under varying bootstrap ratios and construct greedy model soups. While checkpoint models trained on the full dataset achieve the highest mean individual accuracy, they are highly similar and offer little complementary signal, limiting the effectiveness of greedy selection. Bootstrap sampling introduces variability in the training data, producing more diverse checkpoints that, although individually weaker, enable greedy soup construction to combine complementary representations and achieve superior overall performance. The strongest model soup, obtained with 70% bootstrap sampling, achieves a test accuracy of 0.650, representing a 9.8 percentage point improvement over the mean individual checkpoint accuracy (0.551) under the same condition. While absolute performance is limited by the small cohort size and training-from-scratch setting, this result highlights the substantial gains achievable through diversity-driven weight averaging.
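The two ingredients named in this abstract, bootstrap resampling and greedy soup construction, can be sketched roughly as follows. This is a minimal illustration and not the authors' code: checkpoints are assumed to be plain dictionaries of NumPy arrays, `evaluate` is a hypothetical callback returning a held-out score (higher is better), and the function names are invented for this example.

```python
import numpy as np

def bootstrap_indices(n_samples, ratio, rng):
    """Draw int(ratio * n_samples) training indices with replacement."""
    return rng.integers(0, n_samples, size=int(ratio * n_samples))

def average_weights(checkpoints):
    """Uniformly average checkpoints given as dicts of name -> array."""
    return {k: np.mean([c[k] for c in checkpoints], axis=0)
            for k in checkpoints[0]}

def greedy_soup(checkpoints, evaluate):
    """Greedy soup: visit checkpoints from best to worst held-out score and
    keep a candidate only if the averaged model's score does not drop."""
    ranked = sorted(checkpoints, key=evaluate, reverse=True)
    soup = [ranked[0]]
    best = evaluate(average_weights(soup))
    for candidate in ranked[1:]:
        score = evaluate(average_weights(soup + [candidate]))
        if score >= best:
            soup.append(candidate)
            best = score
    return average_weights(soup), best
```

In this sketch each checkpoint would be trained on its own `bootstrap_indices` draw, so that the pool passed to `greedy_soup` is more diverse than a pool trained on identical data.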
Estimation of head motion in structural MRI and its impact on cortical morphometry
Motion-related artifacts are inevitable in Magnetic Resonance Imaging (MRI) and can bias automated neuroanatomical metrics such as cortical thickness. These biases can interfere with statistical analysis, which is a major concern as motion has been shown to be more prominent in certain populations such as children or individuals with ADHD. Manual review cannot objectively quantify motion in anatomical scans, and existing quantitative automated approaches often require specialized hardware or custom acquisition protocols. Here, we train a 3D convolutional neural network to estimate a summary motion metric in retrospective routine research scans by leveraging a large training dataset of synthetically motion-corrupted volumes. We validate our method with one held-out site from our training cohort and with 14 fully independent datasets, including one with manual ratings, achieving a Spearman rank correlation of 0.71 vs. manual labels. We also tested the correlation of our predicted motion score with morphometric measurements known to be impacted by motion, achieving significant correlation on most datasets. Furthermore, our predicted motion correlates with subject age in line with prior studies. Our approach shows good generalization across scanner brands and protocols, enabling objective, scalable motion assessment in structural MRI studies without prospective motion correction. Finally, we provide empirical evidence that our motion estimator significantly improves model fitness when studying cortical thickness and volume. Our final model is made openly and freely available through "Agitation", a tool usable as a CLI and a Python package, and integrated in Nipoppy and Boutiques. By providing reliable motion estimates, our method offers researchers a tool to assess and account for potential biases in cortical morphometric analyses.
Survey on AI Ethics: A Socio-Technical Perspective
Dave Mbiazi
Ivaxi Sheth
Patrik Joslin Kenfack
The past decade has observed a significant advancement in AI, with deep learning-based models being deployed in diverse scenarios, including safety-critical applications. As these AI systems become deeply embedded in our societal infrastructure, the repercussions of their decisions and actions have significant consequences, making the ethical implications of AI deployment highly relevant and essential. The ethical concerns associated with AI are multifaceted, including challenging issues of fairness, privacy and data protection, responsibility and accountability, safety and robustness, transparency and explainability, and environmental impact. These principles together form the foundations of ethical AI considerations that concern every stakeholder in the AI system lifecycle. In light of the present ethical and future x-risk concerns, governments have shown increasing interest in establishing guidelines for the ethical deployment of AI. This work unifies the current and future ethical concerns of deploying AI into society. While we acknowledge and appreciate the technical surveys for each of the ethical principles concerned, in this paper, we aim to provide a comprehensive overview that not only addresses each principle from a technical point of view but also discusses them from a social perspective.
Attention-Based Multi-Agent RL for Multi-Machine Tending Using Mobile Robots
Abdalwhab Bakheet Mohamed Abdalwhab
David St-Onge
Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborative robots can tackle that can also greatly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. We introduce a multi-agent multi-machine-tending learning framework using mobile robots based on multi-agent reinforcement learning (MARL) techniques, with the design of a suitable observation and reward. Moreover, we integrate an attention-based encoding mechanism into the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine-tending scenarios. Our model (AB-MAPPO) outperforms MAPPO in this new challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provide an extensive ablation study to support our design decisions.
Source-free Domain Adaptation Requires Penalized Diversity
While neural networks are capable of achieving human-like performance in many tasks such as image classification, the impressive performance of each model is limited to its own dataset. Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data, thus increasing data privacy. Diversity in representation space can be vital to a model's adaptability in varied and difficult domains. In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor. Motivated by the improved predictive performance of ensembles, we propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors with Distinct Backbone Architectures (DBA). Although diversity in feature space is increased, the unconstrained mutual information (MI) maximization may potentially introduce amplification of weak hypotheses. Thus we introduce the Weak Hypothesis Penalization (WHP) regularizer as a mitigation strategy. Our work proposes Penalized Diversity (PD) where the synergy of DBA and WHP is applied to unsupervised source-free domain adaptation for covariate shift. In addition, PD is augmented with a weighted MI maximization objective for label distribution shift. Empirical results on natural, synthetic, and medical domains demonstrate the effectiveness of PD under different distributional shifts.
Learning From the Past with Cascading Eligibility Traces
Tokiniaina Raharison Ralambomihanta
Blake A. Richards
Handling Delay in Real-Time Reinforcement Learning
Real-time reinforcement learning (RL) introduces several challenges. First, policies are constrained to a fixed number of actions per second due to hardware limitations. Second, the environment may change while the network is still computing an action, leading to observational delay. The first issue can partly be addressed with pipelining, leading to higher throughput and potentially better policies. However, the second issue remains: if each neuron operates in parallel with an execution time of
GitChameleon 2.0: Evaluating AI Code Generation Against Python Library Version Incompatibilities
The rapid evolution of software libraries poses a considerable hurdle for code generation, necessitating continuous adaptation to frequent version updates while preserving backward compatibility. While existing code evolution benchmarks provide valuable insights, they typically lack execution-based evaluation for generating code compliant with specific library versions. To address this, we introduce GitChameleon 2.0, a novel, meticulously curated dataset comprising 328 Python code completion problems, each conditioned on specific library versions and accompanied by executable unit tests. GitChameleon 2.0 rigorously evaluates the capacity of contemporary large language models (LLMs), LLM-powered agents, code assistants, and RAG systems to perform version-conditioned code generation that demonstrates functional accuracy through execution. Our extensive evaluations indicate that state-of-the-art systems encounter significant challenges with this task, with enterprise models achieving baseline success rates in the 48-51% range, underscoring the intricacy of the problem. By offering an execution-based benchmark emphasizing the dynamic nature of code libraries, GitChameleon 2.0 enables a clearer understanding of this challenge and helps guide the development of more adaptable and dependable AI code generation methods. We make the dataset and evaluation code publicly available at https://github.com/mrcabbage972/GitChameleonBenchmark.
Learning to Play Atari in a World of Tokens
Model-based reinforcement learning agents utilizing transformers have shown improved sample efficiency due to their ability to model extended context, resulting in more accurate world models. However, for complex reasoning and planning tasks, these methods primarily rely on continuous representations. This complicates modeling of discrete properties of the real world such as disjoint object classes between which interpolation is not plausible. In this work, we introduce discrete abstract representations for transformer-based learning (DART), a sample-efficient method utilizing discrete representations for modeling both the world and learning behavior. We incorporate a transformer-decoder for auto-regressive world modeling and a transformer-encoder for learning behavior by attending to task-relevant cues in the discrete representation of the world model. For handling partial observability, we aggregate information from past time steps as memory tokens. DART outperforms previous state-of-the-art methods that do not use look-ahead search on the Atari 100k sample efficiency benchmark with a median human-normalized score of 0.790 and beats humans in 9 out of 26 games. We release our code at https://pranaval.github.io/DART/.
Prioritizing Samples in Reinforcement Learning with Reducible Loss
Most reinforcement learning algorithms take advantage of an experience replay buffer to repeatedly train on samples the agent has observed in the past. Not all samples carry the same amount of significance and simply assigning equal importance to each of the samples is a naïve strategy. In this paper, we propose a method to prioritize samples based on how much we can learn from a sample. We define the learn-ability of a sample as the steady decrease of the training loss associated with this sample over time. We develop an algorithm to prioritize samples with high learn-ability, while assigning lower priority to those that are hard-to-learn, typically caused by noise or stochasticity. We empirically show that our method is more robust than random sampling and also better than just prioritizing with respect to the training loss, i.e. the temporal difference loss, which is used in prioritized experience replay.
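The contrast between prioritizing by raw loss and prioritizing by loss decrease can be illustrated with a toy sketch. This is an assumption-laden simplification, not the paper's algorithm: it measures learn-ability as the drop between two recorded loss values per sample, and the function names and the `eps` floor are hypothetical choices made for this example.

```python
import numpy as np

def learnability_priorities(previous_loss, current_loss, eps=1e-3):
    """Toy priority: how much each sample's loss dropped, clipped at zero.
    eps keeps every sample reachable (an assumed design choice)."""
    drop = np.asarray(previous_loss, dtype=float) - np.asarray(current_loss, dtype=float)
    return np.maximum(drop, 0.0) + eps

def sample_batch(priorities, batch_size, rng):
    """Draw replay indices with probability proportional to priority."""
    probs = priorities / priorities.sum()
    return rng.choice(len(priorities), size=batch_size, p=probs)
```

Under this scheme a noisy sample whose loss stays high without decreasing receives only the `eps` floor, whereas loss-based prioritization would keep replaying it.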