Samira Ebrahimi Kahou

Affiliate Member
Associate Professor, University of Calgary, Department of Electrical and Software Engineering
Adjunct Professor, École de technologie supérieure, Department of Software and IT Engineering
Adjunct Professor, McGill University, School of Computer Science
Research Topics
Medical Machine Learning
Representation Learning
Multimodal Learning
Reinforcement Learning
Deep Learning
Natural Language Processing
Computer Vision

Biography

Samira is an associate professor at the University of Calgary's Schulich School of Engineering. She is also an adjunct professor at École de technologie supérieure (ÉTS), in the Department of Software and IT Engineering, and at McGill University, in the School of Computer Science. She is an academic member of Mila - Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair. Samira received her PhD in computer engineering from Polytechnique Montréal/Mila, along with the department's best thesis award. She also worked as a postdoctoral fellow at McGill University's School of Computer Science and as a researcher at Microsoft Research Montréal.

Samira and her research group work on fundamental problems in representation learning for decision-making, with a particular focus on explainability, generalization, and efficient learning. Her work has been published in leading conferences and journals such as NeurIPS, ICLR, ICML, ICCV, CVPR, TMLR, and CoRL. In 2024, Samira received the Schulich School of Engineering's Early Career Research Excellence Award. Her notable contributions to multimodal learning have twice been recognized by the ACM ICMI Ten-Year Technical Impact Awards: as a finalist in 2023 and as the winner in 2025.

Current Students

PhD - UdeM
Principal supervisor:
PhD - École de technologie supérieure
Principal supervisor:
PhD - École de technologie supérieure
Principal supervisor:
PhD - McGill
Co-supervisor:
PhD - McGill
Principal supervisor:

Publications

Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies
Modelling long-term dependencies is a challenge for recurrent neural networks. This is primarily due to the fact that gradients vanish during training, as the sequence length increases. Gradients can be attenuated by transition operators and are attenuated or dropped by activation functions. Canonical architectures like LSTM alleviate this issue by skipping information through a memory mechanism. We propose a new recurrent architecture (Non-saturating Recurrent Unit; NRU) that relies on a memory mechanism but forgoes both saturating activation functions and saturating gates, in order to further alleviate vanishing gradients. In a series of synthetic and real world tasks, we demonstrate that the proposed model is the only model that performs among the top 2 models across all tasks with and without long-term dependencies, when compared against a range of other architectures.
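The attenuation effect described in this abstract can be illustrated with a toy example. The sketch below is not the NRU architecture; it only assumes a hypothetical scalar recurrence h_t = act(w * h_{t-1}) and compares how much gradient survives 50 steps with a saturating activation (tanh) versus a non-saturating one (ReLU). The function name and the values of `w`, `h0`, and `steps` are illustrative choices.

```python
import numpy as np

def gradient_through_time(activation, activation_grad, w=1.0, h0=2.0, steps=50):
    """Return |d h_T / d h_0| for the scalar recurrence h_t = act(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        pre = w * h
        # Chain rule for one step: d h_t / d h_{t-1} = w * act'(pre).
        grad *= w * activation_grad(pre)
        h = activation(pre)
    return abs(grad)

# Saturating activation: every step multiplies the gradient by a factor below 1.
tanh_grad = gradient_through_time(np.tanh, lambda z: 1.0 - np.tanh(z) ** 2)

# Non-saturating activation: for positive inputs the per-step factor is exactly 1.
relu_grad = gradient_through_time(lambda z: max(z, 0.0), lambda z: float(z > 0))

print(f"tanh: {tanh_grad:.2e}, relu: {relu_grad:.2e}")
```

In this toy setting the ReLU path preserves the gradient while the tanh path shrinks it at every step, which is the kind of attenuation that motivates avoiding saturating functions.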
An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation
Batch normalization has been widely used to improve optimization in deep neural networks. While the uncertainty in batch statistics can act as a regularizer, using these dataset statistics specific to the training set impairs generalization in certain tasks. Recently, alternative methods for normalizing feature activations in neural networks have been proposed. Among them, group normalization has been shown to yield similar, in some domains even superior performance to batch normalization. All these methods utilize a learned affine transformation after the normalization operation to increase representational power. Methods used in conditional computation define the parameters of these transformations as learnable functions of conditioning information. In this work, we study whether and where the conditional formulation of group normalization can improve generalization compared to conditional batch normalization. We evaluate performances on the tasks of visual question answering, few-shot learning, and conditional image generation.
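A minimal NumPy sketch of the idea in this abstract: normalize activations within channel groups, then apply an affine transformation whose scale and shift depend on conditioning information. The linear maps `w_gamma` and `w_beta`, the `1.0 +` offset on the scale, and the function name are assumptions made for illustration; the paper may parameterize these functions differently.

```python
import numpy as np

def conditional_group_norm(x, cond, w_gamma, w_beta, num_groups, eps=1e-5):
    """Group-normalize x, then scale and shift it per channel from cond.

    x: (N, C, H, W) activations; cond: (N, D) conditioning vectors;
    w_gamma, w_beta: (D, C) weights of the (assumed linear) conditioning maps.
    """
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    # Statistics are computed per sample and per group, not across the batch.
    grouped = x.reshape(n, num_groups, -1)
    mean = grouped.mean(axis=2, keepdims=True)
    var = grouped.var(axis=2, keepdims=True)
    normed = ((grouped - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)
    # Conditional affine parameters, one scale and one shift per channel.
    gamma = 1.0 + cond @ w_gamma  # (N, C); zero weights give the identity scale
    beta = cond @ w_beta          # (N, C)
    return gamma[:, :, None, None] * normed + beta[:, :, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 3, 3))
cond = rng.normal(size=(2, 5))
# With zero weights the layer reduces to plain group normalization.
out = conditional_group_norm(x, cond, np.zeros((5, 4)), np.zeros((5, 4)), num_groups=2)
print(out.shape)
```

Because the statistics are computed per sample, the output for one example does not depend on which other examples share its batch, in contrast to batch normalization.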
FigureQA: An Annotated Figure Dataset for Visual Reasoning
Adam Atkinson
Ákos Kádár
Adam Trischler
We introduce FigureQA, a visual reasoning corpus of over one million question-answer pairs grounded in over 100,000 images. The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts. We formulate our reasoning task by generating questions from 15 templates; questions concern various relationships between plot elements and examine characteristics like the maximum, the minimum, area-under-the-curve, smoothness, and intersection. To resolve, such questions often require reference to multiple plot elements and synthesis of information distributed spatially throughout a figure. To facilitate the training of machine learning systems, the corpus also includes side data that can be used to formulate auxiliary objectives. In particular, we provide the numerical data used to generate each figure as well as bounding-box annotations for all plot elements. We study the proposed visual reasoning task by training several models, including the recently proposed Relation Network as a strong baseline. Preliminary results indicate that the task poses a significant machine learning challenge. We envision FigureQA as a first step towards developing models that can intuitively recognize patterns from visual representations of data.
Towards Deep Conversational Recommendations
There has been growing interest in using neural networks and deep learning techniques to create dialogue systems. Conversational recommendation is an interesting setting for the scientific exploration of dialogue with natural language as the associated discourse involves goal-driven dialogue that often transforms naturally into more free-form chat. This paper provides two contributions. First, until now there has been no publicly available large-scale dataset consisting of real-world dialogues centered around recommendations. To address this issue and to facilitate our exploration here, we have collected ReDial, a dataset consisting of over 10,000 conversations centered around the theme of providing movie recommendations. We make this data available to the community for further research. Second, we use this dataset to explore multiple facets of conversational recommendations. In particular we explore new neural architectures, mechanisms, and methods suitable for composing conversational recommendation systems. Our dataset allows us to systematically probe model sub-components addressing different parts of the overall problem domain ranging from: sentiment analysis and cold-start recommendation generation to detailed aspects of how natural language is used in this setting in the real world. We combine such sub-components into a full-blown dialogue system and examine its behavior.
RATM: Recurrent Attentive Tracking Model
We present an attention-based modular neural framework for computer vision. The framework uses a soft attention mechanism allowing models to be trained with gradient descent. It consists of three modules: a recurrent attention module controlling where to look in an image or video frame, a feature-extraction module providing a representation of what is seen, and an objective module formalizing why the model learns its attentive behavior. The attention module allows the model to focus computation on task-related information in the input. We apply the framework to several object tracking tasks and explore various design choices. We experiment with three data sets, bouncing ball, moving digits and the real-world KTH data set. The proposed Recurrent Attentive Tracking Model performs well on all three tasks and can generalize to related but previously unseen sequences from a challenging tracking data set.
Theano: A Python framework for fast computation of mathematical expressions
Rami Al-Rfou
Amjad Almahairi
Christof Angermueller
Frédéric Bastien
Justin Bayer
Anatoly Belikov
Alexander Belopolsky
Josh Bleecher Snyder
Pierre-Luc Carrier
Paul Christiano
Myriam Côté
Yann N. Dauphin
Julien Demouth
Sander Dieleman
Ziye Fan
Mathieu Germain
Matt Graham
Balázs Hidasi
Arjun Jain
Kai Jia
Mikhail Korobov
Vivek Kulkarni
Pascal Lamblin
Eric Larsen
Sean Lee
Simon Lefrancois
Jesse A. Livezey
Cory Lorenz
Jeremiah Lowin
Qianli Ma
Robert T. McGibbon
Mehdi Mirza
Alberto Orlandi
Christopher Pal
Colin Raffel
Daniel Renshaw
Matthew Rocklin
Adriana Romero
Markus Roth
Peter Sadowski
John Salvatier
Jan Schlüter
John Schulman
Gabriel Schwartz
Iulian Vlad Serban
Samira Shabanian
Sigurd Spieckermann
S. Ramana Subramanyam
Gijs van Tulder
Sebastian Urban
Dustin J. Webb
Matthew Willson
Lijun Xue
Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, multiple frameworks have been built on top of it and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.