Portrait of Irina Rish

Irina Rish

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Université de Montréal, Department of Computer Science and Operations Research Department
Research Topics
Computational Neuroscience
Deep Learning
Generative Models
Multimodal Learning
Natural Language Processing
Online Learning
Reinforcement Learning

Biography

Irina Rish is a full professor at the Université de Montréal (UdeM), where she leads the Autonomous AI Lab, and a core academic member of Mila – Quebec Artificial Intelligence Institute.

In addition to holding a Canada Excellence Research Chair (CERC) and a CIFAR Chair, she leads the U.S. Department of Energy’s INCITE project on Scalable Foundation Models on Summit & Frontier supercomputers at the Oak Ridge Leadership Computing Facility. She co-founded and serves as CSO of Nolano.ai.

Rish’s current research interests include neural scaling laws and emergent behaviors (capabilities and alignment) in foundation models, as well as continual learning, out-of-distribution generalization and robustness.

Before joining UdeM in 2019, she was a research scientist at the IBM T.J. Watson Research Center, where she worked on various projects at the intersection of neuroscience and AI, and led the Neuro-AI challenge. She was awarded the IBM Eminence & Excellence Award and IBM Outstanding Innovation Award (2018), IBM Outstanding Technical Achievement Award (2017) and IBM Research Accomplishment Award (2009).

She holds 64 patents and has published 120 research papers, several book chapters, three edited books and a monograph on sparse modeling.

Current Students

Research Intern
PhD - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
Master's Research - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
PhD - Concordia University
Principal supervisor :
PhD - Université de Montréal
Collaborating researcher - Université de Montréal
Master's Research - Concordia University
Principal supervisor :
PhD - Université de Montréal
Collaborating researcher
Co-supervisor :
Independent visiting researcher - -
Collaborating researcher - Université de Montréal
Collaborating Alumni - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
Principal supervisor :
PhD - Université de Montréal
PhD - Université de Montréal
Master's Research - Concordia University
Principal supervisor :
Collaborating researcher - Université de Montréal
Collaborating Alumni - Université de Montréal
PhD - Concordia University
Principal supervisor :
Master's Research - Université de Montréal
Collaborating Alumni - Université de Montréal
Collaborating researcher
PhD - Université de Montréal
Collaborating researcher - Université de Montréal
Collaborating researcher - McGill University
Principal supervisor :
PhD - Université de Montréal
Co-supervisor :
Collaborating researcher
Co-supervisor :
Collaborating researcher - Polytechnique Montréal
PhD - McGill University
Principal supervisor :
Master's Research - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
PhD - McGill University
Principal supervisor :
Collaborating researcher
PhD - Concordia University
Principal supervisor :
PhD - Université de Montréal
PhD - Université de Montréal
Co-supervisor :
Collaborating Alumni - Université de Montréal
Undergraduate - McGill University
PhD - Université de Montréal
Co-supervisor :
Master's Research - Université de Montréal
PhD - McGill University

Publications

Survey on Applications of Multi-Armed and Contextual Bandits
Djallel Bouneffouf
Charu Aggarwal
In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in various applications, from recommender systems a… (see more)nd information retrieval to healthcare and finance. This success is due to its stellar performance combined with attractive properties, such as learning from less feedback. The multiarmed bandit field is currently experiencing a renaissance, as novel problem settings and algorithms motivated by various practical applications are being introduced, building on top of the classical bandit problem. This article aims to provide a comprehensive review of top recent developments in multiple real-life applications of the multi-armed bandit. Specifically, we introduce a taxonomy of common MAB-based applications and summarize the state-of-the-art for each of those domains. Furthermore, we identify important current trends and provide new perspectives pertaining to the future of this burgeoning field.
Chaotic Continual Learning
Training a deep neural network requires the model to go over training data for several epochs and update network parameters. In continual le… (see more)arning, this process results in catastrophic forgetting which is one of the core issues of this domain. Most proposed approaches for this issue try to compensate for the effects of parameter updates in the batch incremental setup in which the training model visits a lot of samples for several epochs. However, it is not realistic to expect training data will always be fed to model in a batch incremental setup. This paper proposes a chaotic stream learner that mimics the chaotic behavior of biological neurons and does not updates network parameters. In addition, it can work with fewer samples compared to deep learning models on stream learning setup. Our experiments on MNIST, CIFAR10, and Omniglot show that the chaotic stream learner has less catastrophic forgetting by its nature in comparison to a CNN model in continual learning.
An Empirical Study of Human Behavioral Agents in Bandits, Contextual Bandits and Reinforcement Learning.
Guillermo Cecchi
Djallel Bouneffouf
Jenna Reinen
Artificial behavioral agents are often evaluated based on their consistent behaviors and performance to take sequential actions in an enviro… (see more)nment to maximize some notion of cumulative reward. However, human decision making in real life usually involves different strategies and behavioral trajectories that lead to the same empirical outcome. Motivated by clinical literature of a wide range of neurological and psychiatric disorders, we propose here a more general and flexible parametric framework for sequential decision making that involves a two-stream reward processing mechanism. We demonstrated that this framework is flexible and unified enough to incorporate a family of problems spanning multi-armed bandits (MAB), contextual bandits (CB) and reinforcement learning (RL), which decompose the sequential decision making process in different levels. Inspired by the known reward processing abnormalities of many mental disorders, our clinically-inspired agents demonstrated interesting behavioral trajectories and comparable performance on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the PacMan game across different reward stationarities in a lifelong learning setting.
Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL
Guillermo Cecchi
Djallel Bouneffouf
Jenna Reinen
Modeling Dialogues with Hashcode Representations: A Nonparametric Approach
Sahil Garg
Guillermo Cecchi
Palash Goyal
Shuyang Gao
Sarik Ghazarian
Greg Ver Steeg
Aram Galstyan
We propose a novel dialogue modeling framework, the first-ever nonparametric kernel functions based approach for dialogue modeling, which le… (see more)arns hashcodes as text representations; unlike traditional deep learning models, it handles well relatively small datasets, while also scaling to large ones. We also derive a novel lower bound on mutual information, used as a model-selection criterion favoring representations with better alignment between the utterances of participants in a collaborative dialogue setting, as well as higher predictability of the generated responses. As demonstrated on three real-life datasets, including prominently psychotherapy sessions, the proposed approach significantly outperforms several state-of-art neural network based dialogue systems, both in terms of computational efficiency, reducing training time from days or weeks to hours, and the response quality, achieving an order of magnitude improvement over competitors in frequency of being chosen as the best model by human evaluators.
Towards Lifelong Self-Supervision For Unpaired Image-to-Image Translation
Unpaired Image-to-Image Translation (I2IT) tasks often suffer from lack of data, a problem which self-supervised learning (SSL) has recently… (see more) been very popular and successful at tackling. Leveraging auxiliary tasks such as rotation prediction or generative colorization, SSL can produce better and more robust representations in a low data regime. Training such tasks along an I2IT task is however computationally intractable as model size and the number of task grow. On the other hand, learning sequentially could incur catastrophic forgetting of previously learned tasks. To alleviate this, we introduce Lifelong Self-Supervision (LiSS) as a way to pre-train an I2IT model (e.g., CycleGAN) on a set of self-supervised auxiliary tasks. By keeping an exponential moving average of past encoders and distilling the accumulated knowledge, we are able to maintain the network's validation performance on a number of tasks without any form of replay, parameter isolation or retraining techniques typically used in continual learning. We show that models trained with LiSS perform better on past tasks, while also being more robust than the CycleGAN baseline to color bias and entity entanglement (when two entities are very close).
Beyond Backprop: Different Approaches to Credit Assignment in Neural Nets
Resting-state connectivity stratifies premanifest Huntington’s disease by longitudinal cognitive decline rate
Pablo Polosecki
Eduardo Castro
Dorian Pustina
John H. Warner
Andrew Wood
Cristina Sampaio
Guillermo Cecchi
COVI White Paper - Version 1.1
Hannah Alsdurf
Prateek Gupta
Daphne Ippolito
Richard Janda
Max Jarvies
Tyler Kolody
Sekoul Krastev
Robert Obryk
Dan Pilat
Nasim Rahaman
Jean-François Rousseau
Abhinav Sharma
Brooke Struck … (see 3 more)
Yun William Yu
The SARS-CoV-2 (Covid-19) pandemic has caused significant strain on public health institutions around the world. Contact tracing is an essen… (see more)tial tool to change the course of the Covid-19 pandemic. Manual contact tracing of Covid-19 cases has significant challenges that limit the ability of public health authorities to minimize community infections. Personalized peer-to-peer contact tracing through the use of mobile apps has the potential to shift the paradigm. Some countries have deployed centralized tracking systems, but more privacy-protecting decentralized systems offer much of the same benefit without concentrating data in the hands of a state authority or for-profit corporations. Machine learning methods can circumvent some of the limitations of standard digital tracing by incorporating many clues and their uncertainty into a more graded and precise estimation of infection risk. The estimated risk can provide early risk awareness, personalized recommendations and relevant information to the user. Finally, non-identifying risk data can inform epidemiological models trained jointly with the machine learning predictor. These models can provide statistical evidence for the importance of factors involved in disease transmission. They can also be used to monitor, evaluate and optimize health policy and (de)confinement scenarios according to medical and economic productivity indicators. However, such a strategy based on mobile apps and machine learning should proactively mitigate potential ethical and privacy risks, which could have substantial impacts on society (not only impacts on health but also impacts such as stigmatization and abuse of personal data). Here, we present an overview of the rationale, design, ethical considerations and privacy strategy of `COVI,' a Covid-19 public peer-to-peer contact tracing and risk awareness mobile application developed in Canada.
Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL
Guillermo Cecchi
Djallel Bouneffouf
Jenna Reinen
Online Fast Adaptation and Knowledge Accumulation (OSAKA): a New Approach to Continual Learning
Massimo Caccia
Pau Rodríguez
Lucas Caccia
Alexandre Lacoste
Continual learning studies agents that learn from streams of tasks without forgetting previous ones while adapting to new ones. Two recent c… (see more)ontinual-learning scenarios have opened new avenues of research. In meta-continual learning, the model is pre-trained to minimize catastrophic forgetting of previous tasks. In continual-meta learning, the aim is to train agents for faster remembering of previous tasks through adaptation. In their original formulations, both methods have limitations. We stand on their shoulders to propose a more general scenario, OSAKA, where an agent must quickly solve new (out-of-distribution) tasks, while also requiring fast remembering. We show that current continual learning, meta-learning, meta-continual learning, and continual-meta learning techniques fail in this new scenario. We propose Continual-MAML, an online extension of the popular MAML algorithm as a strong baseline for this scenario. We empirically show that Continual-MAML is better suited to the new scenario than the aforementioned methodologies, as well as standard continual learning and meta-learning approaches.
A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry
Guillermo Cecchi
Djallel Bouneffouf
Jenna Reinen
Drawing an inspiration from behavioral studies of human decision making, we propose here a more general and flexible parametric framework fo… (see more)r reinforcement learning that extends standard Q-learning to a two-stream model for processing positive and negative rewards, and allows to incorporate a wide range of reward-processing biases -- an important component of human decision making which can help us better understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems, as well as various neuropsychiatric conditions associated with disruptions in normal reward processing. From the computational perspective, we observe that the proposed Split-QL model and its clinically inspired variants consistently outperform standard Q-Learning and SARSA methods, as well as recently proposed Double Q-Learning approaches, on simulated tasks with particular reward distributions, a real-world dataset capturing human decision-making in gambling tasks, and the Pac-Man game in a lifelong learning setting across different reward stationarities.