Chris Pal

Biography

Christopher Pal is a Canada CIFAR AI Chair, full professor at Polytechnique Montréal and adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. He is also a Distinguished Scientist at ServiceNow Research.

Pal has been involved in AI and machine learning research for over twenty-five years and has published extensively on large-scale language modelling methods and generative modelling techniques. He has a PhD in computer science from the University of Waterloo.

Current Students

Mai Ababneh

Collaborating researcher - Formerly McGill University (but ending)

Paul Barde

Collaborating researcher - McGill University

Principal supervisor :

Master's Research - Université de Montréal

Can (Sam) Chen

Collaborating Alumni - McGill University

Principal supervisor :

Xue (Steve) Liu

Léa Demeule

PhD - Université de Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Chris Emezue

Master's Research - Université de Montréal

Co-supervisor :

PhD - Polytechnique Montréal

Simon Guiroy

PhD - Université de Montréal

Co-supervisor :

Yousef Kotp

Master's Research - Concordia University

Co-supervisor :

PhD - Polytechnique Montréal

Co-supervisor :

Master's Research - Université de Montréal

Olga Luo

PhD - Université de Montréal

Aristides Milios

PhD - Université de Montréal

Joel Moniz

PhD - Polytechnique Montréal

Jonathan Pilault

PhD - Polytechnique Montréal

Juan Rodriguez

PhD - École de technologie supérieure

Luke Rowe

PhD - Université de Montréal

Principal supervisor :

Gaurav Sahu

Postdoctorate - HEC Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Principal supervisor :

Collaborating researcher - McGill University

Principal supervisor :

Postdoctorate - Polytechnique Montréal

Co-supervisor :

PhD - Université de Montréal

Direct Behavior Specification via Constrained Reinforcement Learning

Joanna Wolski

Collaborating researcher

Blog Posts

August 31, 2022

Julien Roy

Roger Girgis

Joshua Romoff

Pierre-Luc Bacon

Chris Pal

Publications

Medical Imaging with Deep Learning: MIDL 2020 - Short Paper Track

Tal Arbel

Ismail Ben Ayed

Marleen de Bruijne

Maxime Descoteaux

Hervé Lombaert

This compendium gathers all the accepted extended abstracts from the Third International Conference on Medical Imaging with Deep Learning (MIDL 2020), held in Montreal, Canada, 6-9 July 2020. Note that only accepted extended abstracts are listed here; the Proceedings of the MIDL 2020 Full Paper Track are published in the Proceedings of Machine Learning Research (PMLR).

2020-06-29

ArXiv (preprint)

Active Domain Randomization

Bhairav Mehta

Manfred Diaz

Florian Golemo

Liam Paull

Domain randomization is a popular technique for improving domain transfer, often used in a zero-shot setting when the target domain is unknown or cannot easily be used for training. In this work, we empirically examine the effects of domain randomization on agent generalization. Our experiments show that domain randomization may lead to suboptimal, high-variance policies, which we attribute to the uniform sampling of environment parameters. We propose Active Domain Randomization, a novel algorithm that learns a parameter sampling strategy. Our method looks for the most informative environment variations within the given randomization ranges by leveraging the discrepancies of policy rollouts in randomized and reference environment instances. We find that training more frequently on these instances leads to better overall agent generalization. In addition, when domain randomization and policy transfer fail, Active Domain Randomization offers more insight into the deficiencies of both the chosen parameter ranges and the learned policy, allowing for more focused debugging. Our experiments across various physics-based simulated tasks and a real-robot task show that this enhancement leads to more robust, consistent policies.

2020-05-12

Proceedings of the Conference on Robot Learning (published)

Leveraging cluster backbones for improving MAP inference in statistical relational models

Mohamed Hamza Ibrahim

Gilles Pesant

2020-05-07

Annals of Mathematics and Artificial Intelligence (published)

Role-Wise Data Augmentation for Knowledge Distillation

Jie Fu

Xue Geng

Zhijian Duan

Bohan Zhuang

Xingdi Yuan

Adam Trischler

Jie Lin

Vijay Chandrasekhar

Hao Dong

2020-04-19

ArXiv (preprint)

Revision in Continuous Space: Unsupervised Text Style Transfer without Adversarial Learning

Dayiheng Liu

Jie Fu

Yidan Zhang

Jiancheng Lv

Typical methods for unsupervised text style transfer often rely on two key ingredients: 1) seeking the explicit disentanglement of the content and the attributes, and 2) troublesome adversarial learning. In this paper, we show that neither of these components is indispensable. We propose a new framework that utilizes the gradients to revise the sentence in a continuous space during inference to achieve text style transfer. Our method consists of three key components: a variational auto-encoder (VAE), some attribute predictors (one for each attribute), and a content predictor. The VAE and the two types of predictors enable us to perform gradient-based optimization in the continuous space, which is mapped from sentences in a discrete space, to find the representation of a target sentence with the desired attributes and preserved content. Moreover, the proposed method naturally has the ability to simultaneously manipulate multiple fine-grained attributes, such as sentence length and the presence of specific words, when performing text style transfer tasks. Compared with previous adversarial learning based methods, the proposed method is more interpretable, controllable and easier to train. Extensive experimental studies on three popular text style transfer tasks show that the proposed method significantly outperforms five state-of-the-art methods.

2020-04-03

Proceedings of the AAAI Conference on Artificial Intelligence (published)

Curriculum in Gradient-Based Meta-Reinforcement Learning

Bhairav Mehta

Tristan Deleu

Sharath Chandra Raparthy

Liam Paull

Gradient-based meta-learners such as Model-Agnostic Meta-Learning (MAML) have shown strong few-shot performance in supervised and reinforcement learning settings. However, specifically in the case of meta-reinforcement learning (meta-RL), we can show that gradient-based meta-learners are sensitive to task distributions. With the wrong curriculum, agents suffer the effects of meta-overfitting, shallow adaptation, and adaptation instability. In this work, we begin by highlighting intriguing failure cases of gradient-based meta-RL and show that task distributions can wildly affect algorithmic outputs, stability, and performance. To address this problem, we leverage insights from recent literature on domain randomization and propose meta Active Domain Randomization (meta-ADR), which learns a curriculum of tasks for gradient-based meta-RL in a similar way as ADR does for sim2real transfer. We show that this approach induces more stable policies on a variety of simulated locomotion and navigation tasks. We assess in- and out-of-distribution generalization and find that the learned task distributions, even in an unstructured task space, greatly improve the adaptation performance of MAML. Finally, we motivate the need for better benchmarking in meta-RL that prioritizes generalization over single-task adaptation performance.

2020-02-19

ArXiv (preprint)

Exploring Structural Inductive Biases in Emergent Communication

Agnieszka M Slowik

Abhinav Gupta

William L. Hamilton

M. Jamnik

S. Holden

Human language and thought are characterized by the ability to systematically generate a potentially infinite number of complex structures (e.g., sentences) from a finite set of familiar components (e.g., words). Recent works in emergent communication have discussed the propensity of artificial agents to develop a systematically compositional language through playing co-operative referential games. The degree of structure in the input data was found to affect the compositionality of the emerged communication protocols. Thus, we explore various structural priors in multi-agent communication and propose a novel graph referential game. We compare the effect of structural inductive bias (bag-of-words, sequences and graphs) on the emergence of compositional understanding of the input concepts measured by topographic similarity and generalization to unseen combinations of familiar properties. We empirically show that graph neural networks induce a better compositional language prior and a stronger generalization to out-of-domain data. We further perform ablation studies that show the robustness of the emerged protocol in graph referential games.

2020-02-04

arXiv.org (preprint)

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Adversarial imitation learning alternates between learning a discriminator -- which tells apart expert's demonstrations from generated ones -- and a generator's policy to produce trajectories that can fool this discriminator. This alternated optimization is known to be delicate in practice since it compounds unstable adversarial training with brittle and sample-inefficient reinforcement learning. We propose to remove the burden of the policy optimization steps by leveraging a novel discriminator formulation. Specifically, our discriminator is explicitly conditioned on two policies: the one from the previous generator's iteration and a learnable policy. When optimized, this discriminator directly learns the optimal generator's policy. Consequently, our discriminator's update solves the generator's optimization problem for free: learning a policy that imitates the expert does not require an additional optimization loop. This formulation effectively cuts by half the implementation and computational burden of adversarial imitation learning algorithms by removing the reinforcement learning phase altogether. We show on a variety of tasks that our simpler approach is competitive with prevalent imitation learning methods.

Finding and Visualizing Weaknesses of Deep Reinforcement Learning Agents

Christian Rupprecht

Cyril Ibrahim

As deep reinforcement learning driven by visual perception becomes more widely used, there is a growing need to better understand and probe the learned agents. Understanding the decision-making process and its relationship to visual inputs can be very valuable to identify problems in learned behavior. However, this topic has been relatively under-explored in the research community. In this work we present a method for synthesizing visual inputs of interest for a trained agent. Such inputs or states could be situations in which specific actions are necessary. Further, critical states in which a very high or a very low reward can be achieved are often interesting to understand the situational awareness of the system as they can correspond to risky states. To this end, we learn a generative model over the state space of the environment and use its latent space to optimize a target function for the state of interest. In our experiments we show that this method can generate insights for a variety of environments and reinforcement learning methods. We explore results in the standard Atari benchmark games as well as in an autonomous driving simulator. Based on the efficiency with which we have been able to identify behavioural weaknesses with this technique, we believe this general approach could serve as an important tool for AI safety applications.

2020-01-01

ICLR.cc/2020/Conference (poster)

Measuring Systematic Generalization in Neural Proof Generation with Transformers

Nicolas Gontier

Koustuv Sinha

Siva Reddy

We are interested in understanding how well Transformer language models (TLMs) can perform reasoning tasks when trained on knowledge encoded in the form of natural language. We investigate their systematic generalization abilities on a logical reasoning task in natural language, which involves reasoning over relationships between entities grounded in first-order logical proofs. Specifically, we perform soft theorem-proving by leveraging TLMs to generate natural language proofs. We test the generated proofs for logical consistency, along with the accuracy of the final inference. We observe length-generalization issues when evaluated on longer-than-trained sequences. However, we observe TLMs improve their generalization performance after being exposed to longer, exhaustive proofs. In addition, we discover that TLMs are able to generalize better using backward-chaining proofs compared to their forward-chaining counterparts, while they find it easier to generate forward-chaining proofs. We observe that models that are not trained to generate proofs are better at generalizing to problems based on longer proofs. This suggests that Transformers have efficient internal reasoning strategies that are harder to interpret. These results highlight the systematic generalization behavior of TLMs in the context of logical reasoning, and we believe this work motivates deeper inspection of their underlying reasoning strategies.

A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms

Yoshua Bengio

Tristan Deleu

Nasim Rahaman

Nan Rosemary Ke

We propose to meta-learn causal structures based on how fast a learner adapts to new distributions arising from sparse distributional changes, e.g. due to interventions, actions of agents and other sources of non-stationarities. We show that under this assumption, the correct causal structural choices lead to faster adaptation to modified distributions because the changes are concentrated in one or just a few mechanisms when the learned knowledge is modularized appropriately. This leads to sparse expected gradients and a lower effective number of degrees of freedom needing to be relearned while adapting to the change. It motivates using the speed of adaptation to a modified distribution as a meta-learning objective. We demonstrate how this can be used to determine the cause-effect relationship between two observed variables. The distributional changes do not need to correspond to standard interventions (clamping a variable), and the learner has no direct knowledge of these interventions. We show that causal structures can be parameterized via continuous variables and learned end-to-end. We then explore how these ideas could be used to also learn an encoder that would map low-level observed variables to unobserved causal variables leading to faster adaptation out-of-distribution, learning a representation space where one can satisfy the assumptions of independent mechanisms and of small and sparse changes in these mechanisms due to actions and non-stationarities.

2020-01-01

ICLR (published)

Reinforced active learning for image segmentation

Arantxa Casanova

Pedro O. Pinheiro

Negar Rostamzadeh

Learning-based approaches for semantic segmentation have two inherent challenges. First, acquiring pixel-wise labels is expensive and time-consuming. Second, realistic segmentation datasets are highly unbalanced: some categories are much more abundant than others, biasing the performance to the most represented ones. In this paper, we are interested in focusing human labelling effort on a small subset of a larger pool of data, minimizing this effort while maximizing performance of a segmentation model on a hold-out set. We present a new active learning strategy for semantic segmentation based on deep reinforcement learning (RL). An agent learns a policy to select a subset of small informative image regions -- as opposed to entire images -- to be labeled, from a pool of unlabeled data. The region selection decision is made based on predictions and uncertainties of the segmentation model being trained. Our method proposes a new modification of the deep Q-network (DQN) formulation for active learning, adapting it to the large-scale nature of semantic segmentation problems. We test the proof of concept in CamVid and provide results in the large-scale dataset Cityscapes. On Cityscapes, our deep RL region-based DQN approach requires roughly 30% less additional labeled data than our most competitive baseline to reach the same performance. Moreover, we find that our method asks for more labels of under-represented categories compared to the baselines, improving their performance and helping to mitigate class imbalance.

2020-01-01

ICLR (published)