Chris Pal

Biography

Christopher Pal is a Canada CIFAR AI Chair, full professor at Polytechnique Montréal and adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. He is also a Distinguished Scientist at ServiceNow Research.

Pal has been involved in AI and machine learning research for over twenty-five years and has published extensively on large-scale language modelling methods and generative modelling techniques. He has a PhD in computer science from the University of Waterloo.

Current Students

Mai Ababneh

Collaborating researcher - Formerly McGill University (but ending)

Paul Barde

Collaborating researcher - McGill University

Principal supervisor :

Master's Research - Université de Montréal

Can (Sam) Chen

Collaborating Alumni - McGill University

Principal supervisor :

Xue (Steve) Liu

Léa Demeule

PhD - Université de Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Chris Emezue

Master's Research - Université de Montréal

Co-supervisor :

PhD - Polytechnique Montréal

Simon Guiroy

PhD - Université de Montréal

Co-supervisor :

Yousef Kotp

Master's Research - Concordia University

Co-supervisor :

PhD - Polytechnique Montréal

Co-supervisor :

Master's Research - Université de Montréal

Olga Luo

PhD - Université de Montréal

Aristides Milios

PhD - Université de Montréal

Joel Moniz

PhD - Polytechnique Montréal

Jonathan Pilault

PhD - Polytechnique Montréal

Juan Rodriguez

PhD - École de technologie supérieure

Luke Rowe

PhD - Université de Montréal

Principal supervisor :

Gaurav Sahu

Postdoctorate - HEC Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Principal supervisor :

Collaborating researcher - McGill University

Principal supervisor :

Postdoctorate - Polytechnique Montréal

Co-supervisor :

PhD - Université de Montréal

Joanna Wolski

Collaborating researcher

Blog Posts

Direct Behavior Specification via Constrained Reinforcement Learning

August 31, 2022

Julien Roy

Roger Girgis

Joshua Romoff

Pierre-Luc Bacon

Chris Pal

Publications

Neural Multisensory Scene Inference

Pedro O. Pinheiro

For embodied agents to infer representations of the underlying 3D physical world they inhabit, they should efficiently combine multisensory cues from numerous trials, e.g., by looking at and touching objects. Despite its importance, multisensory 3D scene representation learning has received less attention compared to the unimodal setting. In this paper, we propose the Generative Multisensory Network (GMN) for learning latent representations of 3D scenes which are partially observable through multiple sensory modalities. We also introduce a novel method, called the Amortized Product-of-Experts, to improve the computational efficiency and the robustness to unseen combinations of modalities at test time. Experimental results demonstrate that the proposed model can efficiently infer robust modality-invariant 3D-scene representations from arbitrary combinations of modalities and perform accurate cross-modal generation. To perform this exploration we have also developed a novel multi-sensory simulation environment for embodied agents.

Real-Time Reinforcement Learning

Simon Ramstedt

Recurrent transition networks for character locomotion

Félix Harvey

We present a novel approach, based on deep recurrent neural networks, to automatically generate transition animations given a past context of a few frames, a target character state and optionally local terrain information. The proposed Recurrent Transition Network (RTN) is trained without any gait, phase, contact or action labels. Our system produces realistic and fluid transitions that rival the quality of Motion Capture-based animations, even without any inverse-kinematics post-process. Our system could accelerate the creation of transition variations for large coverage or even replace transition nodes in a game's animation graph. The RTN also shows impressive results on a temporal super-resolution task.

2018-12-04

SIGGRAPH Asia 2018 Technical Briefs (published)

Deep Learning recognizes weather and climate patterns

Karthik Kashinath

M. Prabhat

Mayur Mudigonda

Ankur Mahesh

Sookyung Kim

Yunjie Liu

Samira Ebrahimi Kahou

B. Toms

Evan Racah

Christopher Beckham

Tegan Maharaj

Jim Biard

K. Kunkel

Dean Nesbit Williams

Travis O'Brien

M. Wehner

W. Collins

A Survey of Mobile Computing for the Visually Impaired

Margaux Luck

The number of visually impaired or blind (VIB) people in the world is estimated at several hundred million. Based on a series of interviews with the VIB and developers of assistive technology, this paper provides a survey of machine-learning based mobile applications and identifies the most relevant applications. We discuss the functionality of these apps, how they align with the needs and requirements of the VIB users, and how they can be improved with techniques such as federated learning and model compression. As a result of this study we identify promising future directions of research in mobile perception, micro-navigation, and content-summarization.

2018-11-25

ArXiv (preprint)

Probabilistic Planning with Sequential Monte Carlo methods

Alexandre Piché

Valentin Thomas

Cyril Ibrahim

Yoshua Bengio

2018-09-27

International Conference on Learning Representations (published)

Focused Hierarchical RNNs for Conditional Sequence Processing

Nan Rosemary Ke

Adam Trischler

Recurrent Neural Networks (RNNs) with attention mechanisms have obtained state-of-the-art results for many sequence processing tasks. Most of these models use a simple form of encoder with attention that looks over the entire sequence and assigns a weight to each token independently. We present a mechanism for focusing RNN encoders for sequence modelling tasks which allows them to attend to key parts of the input as needed. We formulate this using a multi-layer conditional sequence encoder that reads in one token at a time and makes a discrete decision on whether the token is relevant to the context or question being asked. The discrete gating mechanism takes in the context embedding and the current hidden state as inputs and controls information flow into the layer above. We train it using policy gradient methods. We evaluate this method on several types of tasks with different attributes. First, we evaluate the method on synthetic tasks which allow us to evaluate the model for its generalization ability and probe the behavior of the gates in more controlled settings. We then evaluate this approach on large scale Question Answering tasks including the challenging MS MARCO and SearchQA tasks. Our model shows consistent improvements for both tasks over prior work and our baselines. It has also been shown to generalize significantly better on synthetic tasks as compared to the baselines.

2018-07-03

Proceedings of the 35th International Conference on Machine Learning (published)

Fashion-Gen: The Generative Fashion Dataset and Challenge

Negar Rostamzadeh

Seyedarian Hosseini

Thomas Boquet

Wojciech Stokowiec

Ying Zhang

Christian Jauvin

We introduce a new dataset of 293,008 high definition (1360 x 1360 pixels) fashion images paired with item descriptions provided by professional stylists. Each item is photographed from a variety of angles. We provide baseline results on 1) high-resolution image generation, and 2) image generation conditioned on the given text descriptions. We invite the community to improve upon these baselines. In this paper, we also outline the details of a challenge that we are launching based upon this dataset.

2018-06-21

ArXiv (preprint)

Inferring Identity Factors for Grouped Examples

Shawn Tan

Aaron Courville

We propose a method for modelling groups of face images from the same identity. The model is trained to infer a distribution over the latent space for identity given a small set of “training data”. One can then sample images using that latent representation to produce images of the same identity. We demonstrate that the model extracts disentangled factors for identity factors and image-specific vectors. We also perform generative classification over identities to assess its feasibility for few-shot face recognition.

2018-02-12

(published)

Deep Complex Networks

At present, the vast majority of building blocks, techniques, and architectures for deep learning are based on real-valued operations and representations. However, recent work on recurrent neural networks and older fundamental theoretical analysis suggests that complex numbers could have a richer representational capacity and could also facilitate noise-robust memory retrieval mechanisms. Despite their attractive properties and potential for opening up entirely new neural architectures, complex-valued deep neural networks have been marginalized due to the absence of the building blocks required to design such models. In this work, we provide the key atomic components for complex-valued deep neural networks and apply them to convolutional feed-forward networks. More precisely, we rely on complex convolutions and present algorithms for complex batch-normalization, complex weight initialization strategies for complex-valued neural nets and we use them in experiments with end-to-end training schemes. We demonstrate that such complex-valued models are competitive with their real-valued counterparts. We test deep complex models on several computer vision tasks, on music transcription using the MusicNet dataset and on Speech spectrum prediction using TIMIT. We achieve state-of-the-art performance on these audio-related tasks.

2018-01-01

ICLR.cc/2018/Conference (poster)

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

Sandeep Subramanian

Adam Trischler

Yoshua Bengio

A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general purpose features for words across a range of NLP problems. However, extending this success to learning representations of sequences of words, such as sentences, remains an open problem. Recent work has explored unsupervised as well as supervised learning techniques with different training objectives to learn general purpose fixed-length sentence representations. In this work, we present a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model. We train this model on several data sources with multiple training objectives on over 100 million sentences. Extensive experiments demonstrate that sharing a single recurrent sentence encoder across weakly related tasks leads to consistent improvements over previous methods. We present substantial improvements in the context of transfer learning and low-resource settings using our learned general-purpose representations.

2018-01-01

ICLR (Poster) (published)

Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding

Nan Rosemary Ke

Michael Curtis Mozer

Learning long-term dependencies in extended temporal sequences requires credit assignment to events far back in the past. The most common method for training recurrent neural networks, back-propagation through time (BPTT), requires credit information to be propagated backwards through every single step of the forward computation, potentially over thousands or millions of time steps. This becomes computationally expensive or even infeasible when used with long sequences. Importantly, biological brains are unlikely to perform such detailed reverse replay over very long sequences of internal states (consider days, months, or years). However, humans are often reminded of past memories or mental states which are associated with the current mental state. We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state. Based on this principle, we study a novel algorithm which only back-propagates through a few of these temporal skip connections, realized by a learned attention mechanism that associates current states with relevant past states. We demonstrate in experiments that our method matches or outperforms regular BPTT and truncated BPTT in tasks involving particularly long-term dependencies, but without requiring the biologically implausible backward replay through the whole history of states. Additionally, we demonstrate that the proposed method transfers to longer sequences significantly better than LSTMs trained with BPTT and LSTMs trained with full self-attention.