Portrait de Samira Ebrahimi Kahou

Samira Ebrahimi Kahou

Membre affilié
Chaire en IA Canada-CIFAR
Professeure adjointe, University of Calgary, Départment de génie électrique et logiciel
Professeure adjointe, École de technologie suprérieure, Département de génie logiciel et technologies de l'information
Professeure adjointe, McGill University, École d'informatique
Sujets de recherche
Apprentissage automatique médical
Apprentissage de représentations
Apprentissage multimodal
Apprentissage par renforcement
Apprentissage profond
Traitement du langage naturel
Vision par ordinateur

Biographie

Je suis professeure adjointe à l'Université de Calgary, à l'école d'ingénierie Schulich au département de génie électrique et logiciel. Je suis aussi professeure adjointe au département de génie logiciel et technologies de l'information de l'École de technologie supérieure (ÉTS) et professeure adjointe à l'école d'informatique de l’Université McGill. Avant de me joindre à l'ÉTS, j'ai été stagiaire postdoctorale auprès de la professeure Doina Precup à l’Université McGill / Mila – Institut québécois d’intelligence artificielle. Préalablement à mon postdoctorat, j'ai été chercheuse à Microsoft Research, à Montréal. J'ai obtenu mon doctorat à Polytechnique Montréal / Mila en 2016 sous la supervision du professeur Chris Pal. Pendant mes études doctorales, j'ai travaillé sur la vision par ordinateur et l'apprentissage profond appliqués à la reconnaissance des émotions, au suivi d'objets et à la distillation de connaissances.

Étudiants actuels

Maîtrise recherche - École de technologie suprérieure
Doctorat - École de technologie suprérieure
Doctorat - UdeM
Superviseur⋅e principal⋅e :
Collaborateur·rice de recherche - McGill
Co-superviseur⋅e :
Maîtrise professionnelle - UdeM
Doctorat - École de technologie suprérieure
Superviseur⋅e principal⋅e :
Maîtrise recherche - École de technologie suprérieure
Doctorat - École de technologie suprérieure
Superviseur⋅e principal⋅e :
Doctorat - McGill
Co-superviseur⋅e :
Maîtrise recherche - École de technologie suprérieure
Doctorat - McGill
Superviseur⋅e principal⋅e :

Publications

Zero-Shot Anomaly Detection with Dual-Branch Prompt Learning
Zero-shot anomaly detection (ZSAD) enables identifying and localizing defects in unseen categories by relying solely on generalizable featur… (voir plus)es rather than requiring any labeled examples of anomalies. However, existing ZSAD methods, whether using fixed or learned prompts, struggle under domain shifts because their training data are derived from limited training domains and fail to generalize to new distributions. In this paper, we introduce PILOT, a framework designed to overcome these challenges through two key innovations: (1) a novel dual-branch prompt learning mechanism that dynamically integrates a pool of learnable prompts with structured semantic attributes, enabling the model to adaptively weight the most relevant anomaly cues for each input image; and (2) a label-free test-time adaptation strategy that updates the learnable prompt parameters using high-confidence pseudo-labels from unlabeled test data. Extensive experiments on 13 industrial and medical benchmarks demonstrate that PILOT achieves state-of-the-art performance in both anomaly detection and localization under domain shift.
GitChameleon: Evaluating AI Code Generation Against Python Library Version Incompatibilities
The rapid evolution of software libraries poses a considerable hurdle for code generation, necessitating continuous adaptation to frequent v… (voir plus)ersion updates while preserving backward compatibility. While existing code evolution benchmarks provide valuable insights, they typically lack execution-based evaluation for generating code compliant with specific library versions. To address this, we introduce GitChameleon, a novel, meticulously curated dataset comprising 328 Python code completion problems, each conditioned on specific library versions and accompanied by executable unit tests. GitChameleon rigorously evaluates the capacity of contemporary large language models (LLMs), LLM-powered agents, code assistants, and RAG systems to perform version-conditioned code generation that demonstrates functional accuracy through execution. Our extensive evaluations indicate that state-of-the-art systems encounter significant challenges with this task; enterprise models achieving baseline success rates in the 48-51\% range, underscoring the intricacy of the problem. By offering an execution-based benchmark emphasizing the dynamic nature of code libraries, GitChameleon enables a clearer understanding of this challenge and helps guide the development of more adaptable and dependable AI code generation methods. We make the dataset and evaluation code publicly available at https://github.com/mrcabbage972/GitChameleonBenchmark.
GitChameleon: Evaluating AI Code Generation Against Python Library Version Incompatibilities
The rapid evolution of software libraries poses a considerable hurdle for code generation, necessitating continuous adaptation to frequent v… (voir plus)ersion updates while preserving backward compatibility. While existing code evolution benchmarks provide valuable insights, they typically lack execution-based evaluation for generating code compliant with specific library versions. To address this, we introduce GitChameleon, a novel, meticulously curated dataset comprising 328 Python code completion problems, each conditioned on specific library versions and accompanied by executable unit tests. GitChameleon rigorously evaluates the capacity of contemporary large language models (LLMs), LLM-powered agents, code assistants, and RAG systems to perform version-conditioned code generation that demonstrates functional accuracy through execution. Our extensive evaluations indicate that state-of-the-art systems encounter significant challenges with this task; enterprise models achieving baseline success rates in the 48-51\% range, underscoring the intricacy of the problem. By offering an execution-based benchmark emphasizing the dynamic nature of code libraries, GitChameleon enables a clearer understanding of this challenge and helps guide the development of more adaptable and dependable AI code generation methods. We make the dataset and evaluation code publicly available at https://github.com/mrcabbage972/GitChameleonBenchmark.
Revisiting Laplacian Representations for Value Function Approximation in Deep RL
Proto-value functions (PVFs) introduced Laplacian embeddings as an effective feature basis for value-function approximation; however, their … (voir plus)utility remained limited to small, fully known state spaces. Recent work has scaled Laplacian embeddings to high-dimensional inputs, using them for reward shaping and option discovery in goal-directed tasks, yet only as auxiliary signals, rather than directly using them as features for value functions. In this paper, we learn Laplacian eigenvectors online and employ them as features for Q-learning in 23 Atari games. We empirically demonstrate that these online–learned embeddings substantially improve model-free RL in large, high-dimensional domains. We demonstrate that enriching state representations with action embeddings yields additional gains under both behavior-policy and uniform-random policies. Additionally, we introduce the Fusion architecture, which augments the representation with useful inductive bias at the embedding level. To assess the usefulness of each embedding used in the Fusion architecture, we use Shapley values analysis.
Behavioral Suite Analysis of Self-Supervised Learning in Atari
Towards Fair In-Context Learning with Tabular Foundation Models
Patrik Joslin Kenfack
Tabular foundational models have shown promising in-context learning capabilities on structured data by using training examples as context w… (voir plus)ithout further parameter adjustments. This emerging approach positions itself as a competitive alternative to traditional gradient-boosted tree methods. However, while biases in conventional machine learning models are well documented, it remains unclear how these biases manifest in Tabular ICL. The paper investigates the fairness implications of Tabular ICL and explores three preprocessing strategies—correlation removal, group-balanced demonstration selection, and uncertainty-based demonstration selection—to address bias. Comprehensive experiments indicate that uncertainty-based demonstration selection consistently enhances group fairness in the predictions. The source code for reproducing the results of this work can be found at https://anonymous.4open.science/r/Fair-TabICL-DD84.
Towards Fair In-Context Learning with Tabular Foundation Models
Patrik Joslin Kenfack
Tabular foundational models have shown promising in-context learning capabilities on structured data by using training examples as context w… (voir plus)ithout further parameter adjustments. This emerging approach positions itself as a competitive alternative to traditional gradient-boosted tree methods. However, while biases in conventional machine learning models are well documented, it remains unclear how these biases manifest in Tabular ICL. The paper investigates the fairness implications of Tabular ICL and explores three preprocessing strategies—correlation removal, group-balanced demonstration selection, and uncertainty-based demonstration selection—to address bias. Comprehensive experiments indicate that uncertainty-based demonstration selection consistently enhances group fairness in the predictions. The source code for reproducing the results of this work can be found at https://anonymous.4open.science/r/Fair-TabICL-DD84.
GradTune: Last-layer Fine-tuning for Group Robustness Without Group Annotation
Patrik Joslin Kenfack
This work addresses the limitations of deep neural networks (DNNs) in generalizing beyond training data due to spurious correlations. Recent… (voir plus) research has demonstrated that models trained with empirical risk minimization learn both core and spurious features, often upweighting spurious ones in the final classification, which can frequently lead to poor performance on minority groups. Deep Feature Reweighting alleviates this issue by retraining the model's last classification layer using a group-balanced held-out validation set. However, relying on spurious feature labels during training or validation limits practical application, as spurious features are not always known or costly to annotate. Our preliminary experiments reveal that ERM-trained models exhibit higher gradient norms on minority group samples in the hold-out dataset. Leveraging these insights, we propose an alternative approach called GradTune, which fine-tunes the last classification layer using high-gradient norm samples. Our results on four well-established benchmarks demonstrate that the proposed method can achieve competitive performance compared to existing methods without requiring group labels during training or validation.
GradTune: Last-layer Fine-tuning for Group Robustness Without Group Annotation
Patrik Joslin Kenfack
This work addresses the limitations of deep neural networks (DNNs) in generalizing beyond training data due to spurious correlations. Recent… (voir plus) research has demonstrated that models trained with empirical risk minimization learn both core and spurious features, often upweighting spurious ones in the final classification, which can frequently lead to poor performance on minority groups. Deep Feature Reweighting alleviates this issue by retraining the model's last classification layer using a group-balanced held-out validation set. However, relying on spurious feature labels during training or validation limits practical application, as spurious features are not always known or costly to annotate. Our preliminary experiments reveal that ERM-trained models exhibit higher gradient norms on minority group samples in the hold-out dataset. Leveraging these insights, we propose an alternative approach called GradTune, which fine-tunes the last classification layer using high-gradient norm samples. Our results on four well-established benchmarks demonstrate that the proposed method can achieve competitive performance compared to existing methods without requiring group labels during training or validation.
Towards personalized healthcare without harm via bias modulation
Frank Ngaha
Patrik Joslin Kenfack
Clinical prediction models are often personalized to target heterogeneous sub-groups by using demographic attributes such as race and gender… (voir plus) to train the model. Traditional personalization approaches involve using demographic attributes in input features or training multiple sub-models for different population subgroups (decoupling model). However, these methods often harm the performance at the subgroup level compared to non-personalized models. This paper presents a novel personalization method to improve model performance at the sub-group level. Our method involves a two-step process: first, we train a model to predict group attributes, and then we use this model to learn data-dependent biases to modulate a second model for diagnosis prediction. Our results demonstrate that this joint architecture achieves consistent performance gains across all sub-groups in the Heart dataset. Furthermore, in the mortality dataset, it improves performance in two of the four sub-groups. A comparison of our method with the traditional decoupled personalization method demonstrated a greater performance gain in the sub-groups with less harm. This approach offers a more effective and scalable solution for personalized models, which could have a positive impact in healthcare and other areas that require predictive models that take sub-group information into account.
Towards personalized healthcare without harm via bias modulation
Frank Ngaha
Patrik Joslin Kenfack
Personalized machine learning models have gained significant importance in various domains, including healthcare. However, designing efficie… (voir plus)nt personalized models remains a challenge. Traditional approaches often involve training multiple sub-models for different population sub-groups, which can be costly and does not always guarantee improved performance across all sub-groups. This paper presents a novel approach to improving model performance at the sub-group level by leveraging bias and training a joint model. Our method involves a two-step process: first, we train a model to predict group attributes, and then we use this model to learn data-dependent biases to modulate a second model for diagnosis prediction. Our results demonstrate that this joint architecture achieves consistent performance gains across all sub-groups in the Heart dataset. Furthermore, in the mortality dataset, it improves performance in two of the four sub-groups. A comparison of our method with the traditional decoupled personalization method demonstrated a greater performance gain in the sub-groups with less harm. This approach offers a more effective and scalable solution for personalization of models, which could have positive impact in healthcare and other areas that require predictive models which take sub-group information into account.
Learning Multi-agent Multi-machine Tending by Mobile Robots
Abdalwhab Abdalwhab
David St-Onge
Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborat… (voir plus)ive robots can tackle that can also highly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. In this work, we introduce a multi-agent multi-machine tending learning framework by mobile robots based on Multi-agent Reinforcement Learning (MARL) techniques with the design of a suitable observation and reward. Moreover, an attention-based encoding mechanism is developed and integrated into Multi-agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine tending scenarios. Our model (AB-MAPPO) outperformed MAPPO in this new challenging scenario in terms of task success, safety, and resources utilization. Furthermore, we provided an extensive ablation study to support our various design decisions.