Portrait of Chris Pal

Chris Pal

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Assistant Professor, Université de Montréal, Department of Computer Science and Operations Research

Biography

Christopher Pal is a Canada CIFAR AI Chair, full professor at Polytechnique Montréal and adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. He is also a Distinguished Scientist at ServiceNow Research.

Pal has been involved in AI and machine learning research for over twenty-five years and has published extensively on large-scale language modelling methods and generative modelling techniques. He has a PhD in computer science from the University of Waterloo.

Current Students

PhD - Université de Montréal
PhD - Polytechnique Montréal
PhD - Université de Montréal
Master's Research - Polytechnique Montréal
PhD - McGill University
Principal supervisor :
Master's Research - Université de Montréal
Co-supervisor :
PhD - Polytechnique Montréal
PhD - Université de Montréal
Master's Research - Université de Montréal
Collaborating researcher - Université de Montréal
Principal supervisor :
PhD - École de technologie suprérieure
PhD - Polytechnique Montréal
Co-supervisor :
PhD - Université de Montréal
Principal supervisor :
Postdoctorate - Université de Montréal
Master's Research - Polytechnique Montréal
PhD - McGill University
Principal supervisor :
PhD - Polytechnique Montréal
Postdoctorate - HEC Montréal
Principal supervisor :
Master's Research - Polytechnique Montréal
PhD - Université de Montréal
PhD - Polytechnique Montréal

Publications

Proactive Contact Tracing
Prateek Gupta
Martin Weiss
Nasim Rahaman
Hannah Alsdurf
Nanor Minoyan
Soren Harnois-Leblanc
Joanna Merckx
andrew williams
Victor Schmidt
Pierre-Luc St-Charles
Akshay Patel
Yang Zhang
Bernhard Schölkopf
Score-based Diffusion Models in Function Space
Jae Hyun Lim
Nikola B. Kovachki
R. Baptista
Christopher Beckham
Kamyar Azizzadenesheli
Jean Kossaifi
Vikram Voleti
Jiaming Song
Karsten Kreis
J. Kautz
Arash Vahdat
Animashree Anandkumar
Language Decision Transformers with Exponential Tilt for Interactive Text Environments
Nicolas Gontier
Pau Rodriguez
Issam Hadj Laradji
David Vazquez
ArK: Augmented Reality with Knowledge Emergent Infrastructure
Qiuyuan Huang
J. Park
Abhinav Gupta
Pan Lu
Paul N. Bennett
Ran Gong
Subhojit Som
Baolin Peng
Owais Khan Mohammed
Yejin Choi
Jianfeng Gao
Despite the growing adoption of mixed reality and interactive AI, it remains challenging to generate high-quality 2D/3D scenes in unseen env… (see more)ironments. Typically, an AI agent requires collecting extensive training data for every new task, which can be costly or impossible for many domains. In this study, we develop an infinite agent that learns to transfer knowledge memory from general foundation models (e.g., GPT4, DALLE) to novel domains or scenarios for scene understanding and generation in physical or virtual worlds. Central to our approach is the interactive emerging mechanism, dubbed Augmented Reality with Knowledge Emergent Infrastructure (ArK) , which leverages knowledge-memory to generate scenes in unseen physical worlds and virtual reality environments. The knowledge interactive emergent ability (Figure 1) is demonstrated through i) micro-action of cross-modality : in multi-modality models to collect a large amount of relevant knowledge-memory data for each interaction task (e.g., unseen scene understanding) from the physical reality; and ii) macro-behavior of reality-agnostic : in mix-reality environments to improve interactions that tailor to different characterized roles, target variables, collaborative information, and so on. We validate ArK’s effectiveness in scene generation and editing tasks and show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes, highlighting its potential in applications such as metaverse and gaming simulation.
Block-State Transformers
Mahan Fathi
Jonathan Pilault
Orhan Firat
Ross Goroshin
Exploring validation metrics for offline model-based optimisation
Christopher Beckham
Alexandre Piché
David Vazquez
In offline model-based optimisation (MBO) we are interested in using machine learning to de-sign candidates that maximise some measure of d… (see more)esirability through an expensive but real-world scoring process. Offline MBO tries to approximate this expensive scoring function and use that to evaluate generated designs, however evaluation is non-exact because one approximation is being evaluated with another. Instead, we ask ourselves: if we did have the real world scoring function at hand, what cheap-to-compute validation metrics would correlate best with this? Since the real-world scoring function is available for simulated MBO datasets, insights obtained from this can be transferred over to real-world offline MBO tasks where the real-world scoring function is expensive to compute. To address this, we propose a conceptual evaluation framework that is amenable to measuring extrapolation, and apply this to conditional denoising diffusion models. Empirically, we find that two validation metrics – agreement and Frechet distance – correlate quite well with the ground truth. When there is high variability in conditional generation, feedback is required in the form of an approximated version of the real-world scoring function. Furthermore, we find that generating high-scoring samples may require heavily weighting the generative model in favour of sample quality, potentially at the cost of sample diversity.
Towards Learning to Imitate from a Single Video Demonstration
Florian Golemo
Agents that can learn to imitate given video observation -- \emph{without direct access to state or action information} are more applicable … (see more)to learning in the natural world. However, formulating a reinforcement learning (RL) agent that facilitates this goal remains a significant challenge. We approach this challenge using contrastive training to learn a reward function comparing an agent's behaviour with a single demonstration. We use a Siamese recurrent neural network architecture to learn rewards in space and time between motion clips while training an RL policy to minimize this distance. Through experimentation, we also find that the inclusion of multi-task data and additional image encoding losses improve the temporal consistency of the learned rewards and, as a result, significantly improves policy learning. We demonstrate our approach on simulated humanoid, dog, and raptor agents in 2D and a quadruped and a humanoid in 3D. We show that our method outperforms current state-of-the-art techniques in these environments and can learn to imitate from a single video demonstration.
Workflow Discovery from Dialogues in the Low Data Regime
Amine El hattami
Issam Hadj Laradji
Stefania Raimondo
David Vazquez
Pau Rodriguez
Text-based dialogues are now widely used to solve real-world problems. In cases where solution strategies are already known, they can someti… (see more)mes be codified into workflows and used to guide humans or artificial agents through the task of helping clients. We introduce a new problem formulation that we call Workflow Discovery (WD) in which we are interested in the situation where a formal workflow may not yet exist. Still, we wish to discover the set of actions that have been taken to resolve a particular problem. We also examine a sequence-to-sequence (Seq2Seq) approach for this novel task. We present experiments where we extract workflows from dialogues in the Action-Based Conversations Dataset (ABCD). Since the ABCD dialogues follow known workflows to guide agents, we can evaluate our ability to extract such workflows using ground truth sequences of actions. We propose and evaluate an approach that conditions models on the set of possible actions, and we show that using this strategy, we can improve WD performance. Our conditioning approach also improves zero-shot and few-shot WD performance when transferring learned models to unseen domains within and across datasets. Further, on ABCD a modified variant of our Seq2Seq method achieves state-of-the-art performance on related but different problems of Action State Tracking (AST) and Cascading Dialogue Success (CDS) across many evaluation metrics.
Implicit Offline Reinforcement Learning via Supervised Learning
Alexandre Piché
Rafael Pardinas
David Vazquez
Igor Mordatch
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset of varied b… (see more)ehaviors. It is as simple as supervised learning and Behavior Cloning (BC) but takes advantage of the return information. On BC tasks, implicit models have been shown to match or outperform explicit ones. Despite the benefits of using implicit models to learn robotic skills via BC, Offline RL via Supervised Learning algorithms have been limited to explicit models. We show how implicit models leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets. Furthermore, we show how closely related our implicit methods are to other popular RL via Supervised Learning algorithms.
SMPL-IK: Learned Morphology-Aware Inverse Kinematics for AI Driven Artistic Workflows
Vikram Voleti
Boris Oreshkin
Florent Bocquelet
Félix Harvey
Louis-Simon Ménard
Does Entity Abstraction Help Generative Transformers Reason?
Nicolas Gontier
We study the utility of incorporating entity type abstractions into pre-trained Transformers and test these methods on four NLP tasks requir… (see more)ing different forms of logical reasoning: (1) compositional language understanding with text-based relational reasoning (CLUTRR), (2) abductive reasoning (ProofWriter), (3) multi-hop question answering (HotpotQA), and (4) conversational question answering (CoQA). We propose and empirically explore three ways to add such abstraction: (i) as additional input embeddings, (ii) as a separate sequence to encode, and (iii) as an auxiliary prediction task for the model. Overall, our analysis demonstrates that models with abstract entity knowledge performs better than without it. The best abstraction aware models achieved an overall accuracy of 88.8% and 91.8% compared to the baseline model achieving 62.9% and 89.8% on CLUTRR and ProofWriter respectively. However, for HotpotQA and CoQA, we find that F1 scores improve by only 0.5% on average. Our results suggest that the benefit of explicit abstraction is significant in formally defined logical reasoning settings requiring many reasoning hops, but point to the notion that it is less beneficial for NLP tasks having less formal logical structure.
MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation
Vikram Voleti
Alexia Jolicoeur-Martineau
Video prediction is a challenging task. The quality of video frames from current state-of-the-art (SOTA) generative models tends to be poor … (see more)and generalization beyond the training data is difficult. Furthermore, existing prediction frameworks are typically not capable of simultaneously handling other video-related tasks such as unconditional generation or interpolation. In this work, we devise a general-purpose framework called Masked Conditional Video Diffusion (MCVD) for all of these video synthesis tasks using a probabilistic conditional score-based denoising diffusion model, conditioned on past and/or future frames. We train the model in a manner where we randomly and independently mask all the past frames or all the future frames. This novel but straightforward setup allows us to train a single model that is capable of executing a broad range of video tasks, specifically: future/past prediction -- when only future/past frames are masked; unconditional generation -- when both past and future frames are masked; and interpolation -- when neither past nor future frames are masked. Our experiments show that this approach can generate high-quality frames for diverse types of videos. Our MCVD models are built from simple non-recurrent 2D-convolutional architectures, conditioning on blocks of frames and generating blocks of frames. We generate videos of arbitrary lengths autoregressively in a block-wise manner. Our approach yields SOTA results across standard video prediction and interpolation benchmarks, with computation times for training models measured in 1-12 days using