Irina Rish

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research

Biography

Irina Rish is a full professor at the Université de Montréal (UdeM), where she leads the Autonomous AI Lab. A faculty member of Mila (Quebec Artificial Intelligence Institute), she holds a Canada Excellence Research Chair (CERC) and a Canada CIFAR AI Chair. Irina leads the U.S. Department of Energy's INCITE project on scalable foundation models on the Summit and Frontier supercomputers at the Oak Ridge Leadership Computing Facility (OLCF). She is co-founder and chief scientific officer of Nolano.ai.

Her current research focuses on neural scaling laws and emergent behaviors (capabilities and alignment) in foundation models, as well as continual learning, out-of-distribution generalization, and robustness. Before joining UdeM in 2019, Irina was a researcher at the IBM Thomas J. Watson Research Center, where she worked on various projects at the intersection of neuroscience and AI and led the NeuroAI challenge. She has received several IBM awards: the Excellence and Outstanding Innovation awards (2018), the Outstanding Technical Achievement award (2017), and the Research Accomplishment award (2009). She holds 64 patents and has written more than 120 research papers, several book chapters, three published books, and a monograph on sparse modeling.

Current Students

PhD - Université de Montréal
Principal supervisor:
Research Master's - Université de Montréal
PhD - Université de Montréal
Independent visiting researcher
Research Master's - Université de Montréal
Research Master's - Université de Montréal
PhD - Université de Montréal
Co-supervisor:
Research collaborator
PhD - Université de Montréal
Co-supervisor:
Research collaborator - Université de Montréal
Research intern - Technical University of Munich
Research Master's - Université de Montréal
Research Master's - Université de Montréal
PhD - McGill University
Principal supervisor:
Independent visiting researcher - Université de Montréal
Co-supervisor:
PhD - Concordia University
Principal supervisor:
PhD - Université de Montréal
Co-supervisor:
Alumni collaborator - Université de Montréal
Co-supervisor:
Research Master's - Université de Montréal
Co-supervisor:
PhD - Université de Montréal
PhD - Université de Montréal
Research collaborator
PhD - Université de Montréal
PhD - McGill University
Principal supervisor:
Research intern - Université de Montréal
Professional Master's - Université de Montréal
PhD - Université de Montréal
Principal supervisor:
Research intern - Université de Montréal
Research collaborator - Politecnico di Milano
PhD - Université de Montréal
Co-supervisor:
Research Master's - Université de Montréal
Research Master's - Université de Montréal
Co-supervisor:
Research Master's - Université de Montréal
Research collaborator - Université de Montréal
PhD - Université de Montréal
Research Master's - Université de Montréal
Research Master's - Université de Montréal
PhD - Université de Montréal
Co-supervisor:
PhD - Concordia University
Principal supervisor:
Postdoctorate - Université de Montréal
Principal supervisor:

Publications

Understanding Continual Learning Settings with Data Distribution Drift Analysis
Timothée Lesort
Massimo Caccia
Classical machine learning algorithms often assume that the data are drawn i.i.d. from a stationary probability distribution. Recently, continual learning emerged as a rapidly growing area of machine learning where this assumption is relaxed, i.e. where the data distribution is non-stationary and changes over time. This paper represents the state of data distribution by a context variable
Predicting Infectiousness for Proactive Contact Tracing
Prateek Gupta
Nasim Rahaman
Martin Weiss
Tristan Deleu
Meng Qu
Victor Schmidt
Pierre-Luc St-Charles
Hannah Alsdurf
Olexa Bilaniuk
Gaetan Caron
Pierre Luc Carrier
Joumana Ghosn
Satya Ortiz Gagne
Bernhard Schölkopf
Abhinav Sharma
Andrew Williams
The COVID-19 pandemic has spread rapidly worldwide, overwhelming manual contact tracing in many countries and resulting in widespread lockdowns for emergency containment. Large-scale digital contact tracing (DCT) has emerged as a potential solution to resume economic and social activity while minimizing spread of the virus. Various DCT methods have been proposed, each making trade-offs between privacy, mobility restrictions, and public health. The most common approach, binary contact tracing (BCT), models infection as a binary event, informed only by an individual's test results, with corresponding binary recommendations that either all or none of the individual's contacts quarantine. BCT ignores the inherent uncertainty in contacts and the infection process, which could be used to tailor messaging to high-risk individuals, and prompt proactive testing or earlier warnings. It also does not make use of observations such as symptoms or pre-existing medical conditions, which could be used to make more accurate infectiousness predictions. In this paper, we use a recently-proposed COVID-19 epidemiological simulator to develop and test methods that can be deployed to a smartphone to locally and proactively predict an individual's infectiousness (risk of infecting others) based on their contact history and other information, while respecting strong privacy constraints. Predictions are used to provide personalized recommendations to the individual via an app, as well as to send anonymized messages to the individual's contacts, who use this information to better predict their own infectiousness, an approach we call proactive contact tracing (PCT). We find a deep-learning based PCT method which improves over BCT for equivalent average mobility, suggesting PCT could help in safe re-opening and second-wave prevention.
Adversarial Feature Desensitization
Reza Bayat
Adam Ibrahim
Kartik Ahuja
Mojtaba Faramarzi
Touraj Laleh
Neural networks are known to be vulnerable to adversarial attacks -- slight but carefully constructed perturbations of the inputs which can drastically impair the network's performance. Many defense methods have been proposed for improving robustness of deep networks by training them on adversarially perturbed inputs. However, these models often remain vulnerable to new types of attacks not seen during training, and even to slightly stronger versions of previously seen attacks. In this work, we propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field. Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant towards adversarial perturbations of the inputs. This is achieved through a game where we learn features that are both predictive and robust (insensitive to adversarial attacks), i.e. cannot be used to discriminate between natural and adversarial data. Empirical results on several benchmarks demonstrate the effectiveness of the proposed approach against a wide range of attack types and attack strengths. Our code is available at https://github.com/BashivanLab/afd.
Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization
Kartik Ahuja
Ethan Caballero
Dinghuai Zhang
Jean-Christophe Gagnon-Audet
The invariance principle from causality is at the heart of notable approaches such as invariant risk minimization (IRM) that seek to address out-of-distribution (OOD) generalization failures. Despite the promising theory, invariance principle-based approaches fail in common classification tasks, where invariant (causal) features capture all the information about the label. Are these failures due to the methods failing to capture the invariance? Or is the invariance principle itself insufficient? To answer these questions, we revisit the fundamental assumptions in linear regression tasks, where invariance-based approaches were shown to provably generalize OOD. In contrast to the linear regression tasks, we show that for linear classification tasks we need much stronger restrictions on the distribution shifts, or otherwise OOD generalization is impossible. Furthermore, even with appropriate restrictions on distribution shifts in place, we show that the invariance principle alone is insufficient. We prove that a form of the information bottleneck constraint along with invariance helps address key failures when invariant features capture all the information about the label and also retains the existing success when they do not. We propose an approach that incorporates both of these principles and demonstrate its effectiveness in several experiments.
COVI-AgentSim: an Agent-based Model for Evaluating Methods of Digital Contact Tracing
Prateek Gupta
Martin Weiss
Nasim Rahaman
Hannah Alsdurf
Abhinav Sharma
Nanor Minoyan
Soren Harnois-Leblanc
Victor Schmidt
Pierre-Luc St-Charles
Tristan Deleu
Andrew Williams
Akshay Patel
Meng Qu
Olexa Bilaniuk
Gaetan Caron
Pierre Luc Carrier
Satya Ortiz Gagne
Marc-Andre Rousseau
Joumana Ghosn
Yang Zhang
Bernhard Schölkopf
Joanna Merckx
Survey on Applications of Multi-Armed and Contextual Bandits
Djallel Bouneffouf
Charu Aggarwal
In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in various applications, from recommender systems and information retrieval to healthcare and finance. This success is due to its stellar performance combined with attractive properties, such as learning from less feedback. The multi-armed bandit field is currently experiencing a renaissance, as novel problem settings and algorithms motivated by various practical applications are being introduced, building on top of the classical bandit problem. This article aims to provide a comprehensive review of top recent developments in multiple real-life applications of the multi-armed bandit. Specifically, we introduce a taxonomy of common MAB-based applications and summarize the state-of-the-art for each of those domains. Furthermore, we identify important current trends and provide new perspectives pertaining to the future of this burgeoning field.
Chaotic Continual Learning
Touraj Laleh
Mojtaba Faramarzi
Training a deep neural network requires the model to go over training data for several epochs and update network parameters. In continual learning, this process results in catastrophic forgetting, which is one of the core issues of this domain. Most proposed approaches for this issue try to compensate for the effects of parameter updates in the batch incremental setup in which the training model visits a lot of samples for several epochs. However, it is not realistic to expect training data will always be fed to the model in a batch incremental setup. This paper proposes a chaotic stream learner that mimics the chaotic behavior of biological neurons and does not update network parameters. In addition, it can work with fewer samples compared to deep learning models on stream learning setup. Our experiments on MNIST, CIFAR10, and Omniglot show that the chaotic stream learner has less catastrophic forgetting by its nature in comparison to a CNN model in continual learning.
COVI White Paper
Hannah Alsdurf
Tristan Deleu
Prateek Gupta
Daphne Ippolito
Richard Janda
Max Jarvie
Tyler J. Kolody
Sekoul Krastev
Robert Obryk
Dan Pilat
Valerie Pisano
Benjamin Prud'homme
Meng Qu
Nasim Rahaman
Jean-François Rousseau
Abhinav Sharma
Brooke Struck
Martin Weiss
Yun William Yu
An Empirical Study of Human Behavioral Agents in Bandits, Contextual Bandits and Reinforcement Learning.
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
Jenna Reinen
Artificial behavioral agents are often evaluated based on their consistent behaviors and performance to take sequential actions in an environment to maximize some notion of cumulative reward. However, human decision making in real life usually involves different strategies and behavioral trajectories that lead to the same empirical outcome. Motivated by clinical literature of a wide range of neurological and psychiatric disorders, we propose here a more general and flexible parametric framework for sequential decision making that involves a two-stream reward processing mechanism. We demonstrated that this framework is flexible and unified enough to incorporate a family of problems spanning multi-armed bandits (MAB), contextual bandits (CB) and reinforcement learning (RL), which decompose the sequential decision making process in different levels. Inspired by the known reward processing abnormalities of many mental disorders, our clinically-inspired agents demonstrated interesting behavioral trajectories and comparable performance on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the PacMan game across different reward stationarities in a lifelong learning setting.
Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
Jenna Reinen
Modeling Dialogues with Hashcode Representations: A Nonparametric Approach
Sahil Garg
Guillermo Cecchi
Palash Goyal
Shuyang Gao
Sarik Ghazarian
Greg Ver Steeg
Aram Galstyan
We propose a novel dialogue modeling framework, the first-ever nonparametric kernel functions based approach for dialogue modeling, which learns hashcodes as text representations; unlike traditional deep learning models, it handles well relatively small datasets, while also scaling to large ones. We also derive a novel lower bound on mutual information, used as a model-selection criterion favoring representations with better alignment between the utterances of participants in a collaborative dialogue setting, as well as higher predictability of the generated responses. As demonstrated on three real-life datasets, including prominently psychotherapy sessions, the proposed approach significantly outperforms several state-of-art neural network based dialogue systems, both in terms of computational efficiency, reducing training time from days or weeks to hours, and the response quality, achieving an order of magnitude improvement over competitors in frequency of being chosen as the best model by human evaluators.
Towards Lifelong Self-Supervision For Unpaired Image-to-Image Translation
Victor Schmidt
Makesh Narsimhan Sreedhar
Mostafa ElAraby
Unpaired Image-to-Image Translation (I2IT) tasks often suffer from lack of data, a problem which self-supervised learning (SSL) has recently been very popular and successful at tackling. Leveraging auxiliary tasks such as rotation prediction or generative colorization, SSL can produce better and more robust representations in a low data regime. Training such tasks along an I2IT task is however computationally intractable as model size and the number of tasks grow. On the other hand, learning sequentially could incur catastrophic forgetting of previously learned tasks. To alleviate this, we introduce Lifelong Self-Supervision (LiSS) as a way to pre-train an I2IT model (e.g., CycleGAN) on a set of self-supervised auxiliary tasks. By keeping an exponential moving average of past encoders and distilling the accumulated knowledge, we are able to maintain the network's validation performance on a number of tasks without any form of replay, parameter isolation or retraining techniques typically used in continual learning. We show that models trained with LiSS perform better on past tasks, while also being more robust than the CycleGAN baseline to color bias and entity entanglement (when two entities are very close).