Chris Pal

Core Academic Member
Canada CIFAR AI Chair
Full Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Topics
Deep Learning

Biography

Christopher Pal is a Canada CIFAR AI Chair, a full professor at Polytechnique Montréal, and an adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. He is also a Distinguished Scientist at ServiceNow Research.

Pal has been involved in AI and machine learning research for over twenty-five years and has published extensively on large-scale language modelling methods and generative modelling techniques. He has a PhD in computer science from the University of Waterloo.

Current Students

Research Intern - McGill University
Postdoctorate - HEC Montréal
Collaborating Researcher - McGill University
Master's Research - Université de Montréal
PhD - Polytechnique Montréal
PhD - McGill University
PhD - Université de Montréal
PhD - Polytechnique Montréal
Master's Research - Université de Montréal
Collaborating Alumni - Polytechnique Montréal
PhD - Polytechnique Montréal
Postdoctorate - McGill University
Master's Research - Polytechnique Montréal
PhD - Université de Montréal
Master's Research - Concordia University
Collaborating Researcher - Université de Montréal
Master's Research - Université de Montréal
PhD - Université de Montréal
PhD - Polytechnique Montréal
PhD - Polytechnique Montréal
PhD - École de technologie supérieure
PhD - Université de Montréal
Postdoctorate - HEC Montréal
PhD - Polytechnique Montréal
PhD - McGill University
PhD - Polytechnique Montréal

Publications

Adversarial Mixup Resynthesizers
Christopher Beckham
Sina Honari
Alex Lamb
Vikas Verma
Farnoosh Ghadiri
In this paper, we explore new approaches to combining information encoded within the learned representations of autoencoders. We explore models that are capable of combining the attributes of multiple inputs such that a resynthesised output is trained to fool an adversarial discriminator for real versus synthesised data. Furthermore, we explore the use of such an architecture in the context of semi-supervised learning, where we learn a mixing function whose objective is to produce interpolations of hidden states, or masked combinations of latent representations that are consistent with a conditioned class label. We show quantitative and qualitative evidence that such a formulation is an interesting avenue of research.
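The core mixing operations are easy to sketch. Below is a toy numpy illustration of interpolating and mask-mixing two latent codes; the linear encoder/decoder and all shapes are placeholder assumptions, not the paper's trained networks.

```python
# Toy sketch of latent mixing: the real model uses trained encoder/decoder
# networks and an adversarial discriminator; the linear maps and dimensions
# below are illustrative placeholders only.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_z = 32, 8
W_enc = rng.normal(size=(d_z, d_in)) * 0.1   # stand-in encoder
W_dec = rng.normal(size=(d_in, d_z)) * 0.1   # stand-in decoder

x1, x2 = rng.normal(size=d_in), rng.normal(size=d_in)
z1, z2 = W_enc @ x1, W_enc @ x2

# Mixing function 1: convex interpolation of the two hidden states.
alpha = rng.uniform()
z_interp = alpha * z1 + (1.0 - alpha) * z2

# Mixing function 2: a random binary mask over latent dimensions.
mask = rng.integers(0, 2, size=d_z)
z_masked = mask * z1 + (1 - mask) * z2

# Decoded resyntheses; training would push these to fool a
# real-versus-synthesised discriminator.
x_interp, x_masked = W_dec @ z_interp, W_dec @ z_masked
```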
Neural Multisensory Scene Inference
Jae Hyun Lim
Pedro O. Pinheiro
Sungjin Ahn
For embodied agents to infer representations of the underlying 3D physical world they inhabit, they should efficiently combine multisensory cues from numerous trials, e.g., by looking at and touching objects. Despite its importance, multisensory 3D scene representation learning has received less attention compared to the unimodal setting. In this paper, we propose the Generative Multisensory Network (GMN) for learning latent representations of 3D scenes which are partially observable through multiple sensory modalities. We also introduce a novel method, called the Amortized Product-of-Experts, to improve the computational efficiency and the robustness to unseen combinations of modalities at test time. Experimental results demonstrate that the proposed model can efficiently infer robust modality-invariant 3D-scene representations from arbitrary combinations of modalities and perform accurate cross-modal generation. To perform this exploration we have also developed a novel multi-sensory simulation environment for embodied agents.
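A product-of-experts combines per-modality beliefs multiplicatively, and for diagonal Gaussians the product has a closed form: precisions add, and the mean is precision-weighted. The sketch below shows that standard combination rule with fixed toy parameters; in the paper the per-modality posteriors come from learned, amortized inference networks rather than hand-set values.

```python
# Standard product of diagonal-Gaussian experts; the per-modality
# means/variances here are invented toy values, not model outputs.
import numpy as np

def product_of_gaussian_experts(means, variances):
    """Precisions add; the combined mean is precision-weighted."""
    precisions = [1.0 / v for v in variances]
    combined_var = 1.0 / sum(precisions)
    combined_mean = combined_var * sum(p * m for p, m in zip(precisions, means))
    return combined_mean, combined_var

# Hypothetical posteriors over a 4-d latent scene code from two modalities,
# e.g. one from vision and one from touch.
mu_vision, var_vision = np.array([0.5, -1.0, 0.2, 0.0]), np.full(4, 0.5)
mu_touch,  var_touch  = np.array([0.3, -0.8, 0.1, 0.4]), np.full(4, 2.0)

mu, var = product_of_gaussian_experts([mu_vision, mu_touch],
                                      [var_vision, var_touch])
print(mu, var)  # the fused estimate is dominated by the lower-variance expert
```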
Real-Time Reinforcement Learning
Simon Ramstedt
Recurrent transition networks for character locomotion
Félix Harvey
We present a novel approach, based on deep recurrent neural networks, to automatically generate transition animations given a past context of a few frames, a target character state and optionally local terrain information. The proposed Recurrent Transition Network (RTN) is trained without any gait, phase, contact or action labels. Our system produces realistic and fluid transitions that rival the quality of Motion Capture-based animations, even without any inverse-kinematics post-process. Our system could accelerate the creation of transition variations for large coverage or even replace transition nodes in a game's animation graph. The RTN also shows impressive results on a temporal super-resolution task.
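As a rough illustration of the data flow only (not the RTN's actual architecture), the sketch below warms a recurrent state on a few past frames plus the target state, then unrolls autoregressively to emit in-between frames; all weights and dimensions are random stand-ins.

```python
# Schematic rollout pattern for a transition generator: condition on past
# context and a target pose, then unroll. Plain tanh RNN with random
# weights, purely to illustrate the conditioning and the autoregression.
import numpy as np

rng = np.random.default_rng(1)
d_pose, d_h = 16, 32
W_in = rng.normal(size=(d_h, d_pose * 2)) * 0.1   # takes [current; target]
W_h  = rng.normal(size=(d_h, d_h)) * 0.1
W_out = rng.normal(size=(d_pose, d_h)) * 0.1

past_frames = rng.normal(size=(4, d_pose))   # short past context
target_pose = rng.normal(size=d_pose)        # where the transition must land

# Warm up the hidden state on the past context.
h = np.zeros(d_h)
for frame in past_frames:
    h = np.tanh(W_in @ np.concatenate([frame, target_pose]) + W_h @ h)

# Autoregressively generate the transition frames.
frame = past_frames[-1]
transition = []
for _ in range(30):
    h = np.tanh(W_in @ np.concatenate([frame, target_pose]) + W_h @ h)
    frame = W_out @ h
    transition.append(frame)
```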
Deep Learning recognizes weather and climate patterns
Karthik Kashinath
M. Prabhat
Mayur Mudigonda
Ankur Mahesh
Sookyung Kim
Yunjie Liu
B. Toms
Evan Racah
Christopher Beckham
Jim Biard
K. Kunkel
Dean Nesbit Williams
Travis O'Brien
M. Wehner
W. Collins
A Survey of Mobile Computing for the Visually Impaired
Martin Weiss
Margaux Luck
Roger Girgis
Joseph Paul Cohen
The number of visually impaired or blind (VIB) people in the world is estimated at several hundred million. Based on a series of interviews with the VIB and developers of assistive technology, this paper provides a survey of machine-learning based mobile applications and identifies the most relevant applications. We discuss the functionality of these apps, how they align with the needs and requirements of the VIB users, and how they can be improved with techniques such as federated learning and model compression. As a result of this study we identify promising future directions of research in mobile perception, micro-navigation, and content-summarization.
Probabilistic Planning with Sequential Monte Carlo methods
Alexandre Piché
Valentin Thomas
Cyril Ibrahim
Focused Hierarchical RNNs for Conditional Sequence Processing
Nan Rosemary Ke
Konrad Żołna
Zhouhan Lin
Adam Trischler
Recurrent Neural Networks (RNNs) with attention mechanisms have obtained state-of-the-art results for many sequence processing tasks. Most of these models use a simple form of encoder with attention that looks over the entire sequence and assigns a weight to each token independently. We present a mechanism for focusing RNN encoders for sequence modelling tasks which allows them to attend to key parts of the input as needed. We formulate this using a multi-layer conditional sequence encoder that reads in one token at a time and makes a discrete decision on whether the token is relevant to the context or question being asked. The discrete gating mechanism takes in the context embedding and the current hidden state as inputs and controls information flow into the layer above. We train it using policy gradient methods. We evaluate this method on several types of tasks with different attributes. First, we evaluate the method on synthetic tasks which allow us to evaluate the model for its generalization ability and probe the behavior of the gates in more controlled settings. We then evaluate this approach on large scale Question Answering tasks including the challenging MS MARCO and SearchQA tasks. Our model shows consistent improvements for both tasks over prior work and our baselines. It has also been shown to generalize significantly better on synthetic tasks compared to the baselines.
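The gating idea can be sketched compactly: a Bernoulli gate, conditioned on the context embedding and the running hidden state, decides per token whether information flows to the layer above. The numpy toy below uses random weights and a stand-in state update, purely to illustrate the hard stochastic decision that makes policy-gradient training necessary.

```python
# Toy sketch of a discrete per-token gate; weights, sizes and the state
# update are illustrative stand-ins, not the paper's trained model.
import numpy as np

rng = np.random.default_rng(2)
d_tok, d_h, d_ctx = 8, 16, 8
w_gate = rng.normal(size=(d_tok + d_h + d_ctx,)) * 0.1
W_xh = rng.normal(size=(d_h, d_tok)) * 0.1
W_hh = rng.normal(size=(d_h, d_h)) * 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

context = rng.normal(size=d_ctx)          # e.g. an encoded question
h = np.zeros(d_h)
passed_up = []

for token in rng.normal(size=(20, d_tok)):            # token embeddings
    p_open = sigmoid(w_gate @ np.concatenate([token, h, context]))
    if rng.random() < p_open:                         # hard Bernoulli gate
        passed_up.append(token)                       # flows to layer above
    # the discrete choice is non-differentiable, hence policy gradients
    h = np.tanh(W_xh @ token + W_hh @ h)              # lower-layer update
```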
Fashion-Gen: The Generative Fashion Dataset and Challenge
Seyedarian Hosseini
Thomas Boquet
Wojciech Stokowiec
Ying Zhang
Christian Jauvin
We introduce a new dataset of 293,008 high definition (1360 x 1360 pixels) fashion images paired with item descriptions provided by professional stylists. Each item is photographed from a variety of angles. We provide baseline results on 1) high-resolution image generation, and 2) image generation conditioned on the given text descriptions. We invite the community to improve upon these baselines. In this paper, we also outline the details of a challenge that we are launching based upon this dataset.
Inferring Identity Factors for Grouped Examples
We propose a method for modelling groups of face images from the same identity. The model is trained to infer a distribution over the latent space for identity given a small set of "training data". One can then sample images using that latent representation to produce images of the same identity. We demonstrate that the model extracts disentangled identity factors and image-specific vectors. We also perform generative classification over identities to assess its feasibility for few-shot face recognition.
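Schematically, the grouped inference amounts to encoding each image of one identity, pooling the per-image statistics into a distribution over a shared identity latent, and sampling that latent alongside an image-specific vector. The numpy sketch below uses a random linear encoder and mean pooling as placeholder assumptions; the paper's actual networks and pooling rule may differ.

```python
# Illustrative grouped inference: pool per-image encodings into a Gaussian
# over the shared identity latent, then sample it. All weights are random
# placeholders, not the paper's trained model.
import numpy as np

rng = np.random.default_rng(3)
d_img, d_id = 64, 16
W_enc = rng.normal(size=(d_id * 2, d_img)) * 0.1  # outputs (mu, log-var)

group = rng.normal(size=(5, d_img))    # a few images of the same identity

# Per-image sufficient statistics, mean-pooled over the group.
stats = (W_enc @ group.T).T            # shape (5, 2 * d_id)
pooled = stats.mean(axis=0)
mu, log_var = pooled[:d_id], pooled[d_id:]

# Sample the shared identity latent; pair it with a fresh image-specific
# vector (capturing e.g. pose and lighting) to drive generation.
z_id = mu + np.exp(0.5 * log_var) * rng.normal(size=d_id)
z_img = rng.normal(size=d_id)
```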
Deep Complex Networks
Chiheb Trabelsi
Olexa Bilaniuk
Ying Zhang
Dmitriy Serdyuk
Sandeep Subramanian
Joao Felipe Santos
Soroush Mehri
At present, the vast majority of building blocks, techniques, and architectures for deep learning are based on real-valued operations and representations. However, recent work on recurrent neural networks and older fundamental theoretical analysis suggests that complex numbers could have a richer representational capacity and could also facilitate noise-robust memory retrieval mechanisms. Despite their attractive properties and potential for opening up entirely new neural architectures, complex-valued deep neural networks have been marginalized due to the absence of the building blocks required to design such models. In this work, we provide the key atomic components for complex-valued deep neural networks and apply them to convolutional feed-forward networks. More precisely, we rely on complex convolutions and present algorithms for complex batch-normalization, complex weight initialization strategies for complex-valued neural nets, and we use them in experiments with end-to-end training schemes. We demonstrate that such complex-valued models are competitive with their real-valued counterparts. We test deep complex models on several computer vision tasks, on music transcription using the MusicNet dataset and on speech spectrum prediction using TIMIT. We achieve state-of-the-art performance on these audio-related tasks.
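The atomic operation is a complex convolution built from four real ones, via the identity (a + ib) * (x + iy) = (ax - by) + i(ay + bx). The 1-D numpy sketch below, with a random signal and kernel as illustrative inputs, checks the construction against native complex arithmetic; the paper builds 2-D convolutional layers, complex batch normalization and complex weight initialization around this core.

```python
# Complex convolution from four real convolutions; the signal and kernel
# are random 1-D placeholders for illustration only.
import numpy as np

rng = np.random.default_rng(4)
x_re, x_im = rng.normal(size=100), rng.normal(size=100)  # complex signal
k_re, k_im = rng.normal(size=5), rng.normal(size=5)      # complex kernel

y_re = np.convolve(x_re, k_re, "valid") - np.convolve(x_im, k_im, "valid")
y_im = np.convolve(x_re, k_im, "valid") + np.convolve(x_im, k_re, "valid")

# Sanity check against numpy's native complex convolution.
y_ref = np.convolve(x_re + 1j * x_im, k_re + 1j * k_im, "valid")
assert np.allclose(y_re + 1j * y_im, y_ref)
```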
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
Sandeep Subramanian
Adam Trischler
A lot of the recent success in natural language processing (NLP) has been driven by distributed vector representations of words trained on large amounts of text in an unsupervised manner. These representations are typically used as general purpose features for words across a range of NLP problems. However, extending this success to learning representations of sequences of words, such as sentences, remains an open problem. Recent work has explored unsupervised as well as supervised learning techniques with different training objectives to learn general purpose fixed-length sentence representations. In this work, we present a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model. We train this model on several data sources with multiple training objectives on over 100 million sentences. Extensive experiments demonstrate that sharing a single recurrent sentence encoder across weakly related tasks leads to consistent improvements over previous methods. We present substantial improvements in the context of transfer learning and low-resource settings using our learned general-purpose representations.
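The architectural pattern is a single shared sentence encoder with one lightweight head per training objective. The numpy sketch below is schematic only: a plain tanh RNN with random weights stands in for the paper's trained recurrent encoder, and the task names and output sizes are invented for illustration.

```python
# Shared-encoder multi-task pattern: one recurrent encoder produces a
# fixed-length sentence vector; each task adds its own head. All weights
# and task heads here are random, illustrative placeholders.
import numpy as np

rng = np.random.default_rng(5)
d_emb, d_h = 32, 64
W_x = rng.normal(size=(d_h, d_emb)) * 0.1
W_h = rng.normal(size=(d_h, d_h)) * 0.1

def encode(tokens):
    """Shared RNN encoder: the final hidden state is the sentence vector."""
    h = np.zeros(d_h)
    for x in tokens:
        h = np.tanh(W_x @ x + W_h @ h)
    return h

# One linear head per training objective (labels and dims are made up).
heads = {
    "nli":         rng.normal(size=(3, d_h)) * 0.1,   # entailment classes
    "translation": rng.normal(size=(d_emb, d_h)) * 0.1,
    "parsing":     rng.normal(size=(10, d_h)) * 0.1,
}

sentence = rng.normal(size=(12, d_emb))  # 12 token embeddings
rep = encode(sentence)                   # general-purpose representation
logits = {task: W @ rep for task, W in heads.items()}
```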