Chris Pal

Biography

Christopher Pal is a Canada CIFAR AI Chair, full professor at Polytechnique Montréal and adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. He is also a Distinguished Scientist at ServiceNow Research.

Pal has been involved in AI and machine learning research for over twenty-five years and has published extensively on large-scale language modelling methods and generative modelling techniques. He has a PhD in computer science from the University of Waterloo.

Current Students

Mai Ababneh

Research Intern - McGill University

ababneh.mai@gmail.com

Shubham Agarwal

Postdoctorate - HEC Montréal

Principal supervisor :

Paul Barde

Collaborating researcher - McGill University

Principal supervisor :

Derek Nowrouzezahrai

paul.b.barde@gmail.com

Master's Research - Université de Montréal

Chris Beckham

PhD - Polytechnique Montréal

Can (Sam) Chen

PhD - McGill University

Principal supervisor :

PhD - Université de Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Chris Emezue

Master's Research - Université de Montréal

Co-supervisor :

Collaborating Alumni - Polytechnique Montréal

Roger Girgis

PhD - Polytechnique Montréal

Florian Golemo

Postdoctorate - McGill University

Co-supervisor :

Master's Research - Polytechnique Montréal

PhD - Université de Montréal

Co-supervisor :

Yousef Kotp

Master's Research - Concordia University

Co-supervisor :

Collaborating researcher - Université de Montréal

Master's Research - Université de Montréal

Olga Luo

PhD - Université de Montréal

Joel Moniz

PhD - Polytechnique Montréal

Jonathan Pilault

PhD - Polytechnique Montréal

Juan Rodriguez

PhD - École de technologie supérieure

Luke Rowe

PhD - Université de Montréal

Principal supervisor :

Gaurav Sahu

Postdoctorate - HEC Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Principal supervisor :

PhD - McGill University

Principal supervisor :

PhD - Polytechnique Montréal

Direct Behavior Specification via Constrained Reinforcement Learning

Blog Posts

August 31, 2022

Julien Roy

Roger Girgis

Joshua Romoff

Pierre-Luc Bacon

Chris Pal

Publications

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Juan Rodriguez

Xiangru Jian

Siba Smarak Panigrahi

Tianyu Zhang

Aarash Feizi

Abhay Puri

Akshay Kalkunte

François Savard

Ahmed Masry

Shravan Nayak

Rabiul Awal

Mahsa Massoud

Amirhossein Abaskohi

Zichao Li

Suyuchen Wang

Pierre-Andre Noel

Mats Leon Richter

Saverio Vadacchino

Shubham Agarwal

Sanket Biswas

Sara Shanian

Ying Zhang

Noah Bolger

Kurt MacDonald

Simon Fauvel

Sathwik Tejaswi

Srinivas Sunkara

Joao Monteiro

Krishnamurthy Dj Dvijotham

Torsten Scholak

Sepideh Kharaghani

Sean Hughes

M. Özsu

Issam Hadj Laradji

Spandana Gella

Perouz Taslakian

David Vazquez

Sai Rajeswar

Multimodal AI has the potential to significantly enhance document-understanding tasks, such as processing receipts, understanding workflows, extracting data from documents, and summarizing reports. Code generation tasks that require long-structured outputs can also be enhanced by multimodality. Despite this, their use in commercial applications is often limited due to limited access to training data and restrictive licensing, which hinders open access. To address these limitations, we introduce BigDocs-7.5M, a high-quality, open-access dataset comprising 7.5 million multimodal documents across 30 tasks. We use an efficient data curation process to ensure our data is high-quality and license-permissive. Our process emphasizes accountability, responsibility, and transparency through filtering rules, traceable metadata, and careful content analysis. Additionally, we introduce BigDocs-Bench, a benchmark suite with 10 novel tasks where we create datasets that reflect real-world use cases involving reasoning over Graphical User Interfaces (GUI) and code generation from images. Our experiments show that training with BigDocs-Bench improves average performance up to 25.8% over closed-source GPT-4o in document reasoning and structured output tasks such as Screenshot2HTML or Image2Latex generation. Finally, human evaluations showed a preference for outputs from models trained on BigDocs over GPT-4o. This suggests that BigDocs can help both academics and the open-source community utilize and improve AI tools to enhance multimodal capabilities and document reasoning. The project is hosted at https://bigdocs.github.io .

2024-12-05

ArXiv (preprint)

ParetoFlow: Guided Flows in Multi-Objective Optimization

Ye Yuan

Can Chen

Xue (Steve) Liu

In offline multi-objective optimization (MOO), we leverage an offline dataset of designs and their associated labels to simultaneously minimize multiple objectives. This setting more closely mirrors complex real-world problems compared to single-objective optimization. Recent works mainly employ evolutionary algorithms and Bayesian optimization, with limited attention given to the generative modeling capabilities inherent in such data. In this study, we explore generative modeling in offline MOO through flow matching, noted for its effectiveness and efficiency. We introduce ParetoFlow, specifically designed to guide flow sampling to approximate the Pareto front. Traditional predictor (classifier) guidance is inadequate for this purpose because it models only a single objective. In response, we propose a multi-objective predictor guidance module that assigns each sample a weight vector, representing a weighted distribution across multiple objective predictions. A local filtering scheme is introduced to address non-convex Pareto fronts. These weights uniformly cover the entire objective space, effectively directing sample generation towards the Pareto front. Since distributions with similar weights tend to generate similar samples, we introduce a neighboring evolution module to foster knowledge sharing among neighboring distributions. This module generates offspring from these distributions, and selects the most promising one for the next iteration. Our method achieves state-of-the-art performance across various tasks.

2024-12-04

ArXiv (preprint)
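
For readers unfamiliar with the term, the Pareto front mentioned in the ParetoFlow abstract is the set of designs that no other design beats on every objective at once. The sketch below only illustrates that definition for a minimization problem; it is not the ParetoFlow method, and the function names and the toy two-objective designs are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better
    on at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective values (objective 1, objective 2) for five designs.
designs = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
print(pareto_front(designs))  # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```

Here (3.0, 4.0) is dominated by (2.0, 3.0) and (5.0, 5.0) is dominated by (1.0, 5.0), so both drop out of the front.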

IntentGPT: Few-shot Intent Discovery with Large Language Models

Juan A. Rodriguez

Nicholas Botzer

David Vazquez

Marco Pedersoli

Issam Hadj Laradji

In today's digitally driven world, dialogue systems play a pivotal role in enhancing user interactions, from customer service to virtual assistants. In these dialogues, it is important to identify users' goals automatically to resolve their needs promptly. This has necessitated the integration of models that perform Intent Detection. However, users' intents are diverse and dynamic, making it challenging to maintain a fixed set of predefined intents. As a result, a more practical approach is to develop a model capable of identifying new intents as they emerge. We address the challenge of Intent Discovery, an area that has drawn significant attention in recent research efforts. Existing methods need to train on a substantial amount of data for correctly identifying new intents, demanding significant human effort. To overcome this, we introduce IntentGPT, a novel training-free method that effectively prompts Large Language Models (LLMs) such as GPT-4 to discover new intents with minimal labeled data. IntentGPT comprises an In-Context Prompt Generator, which generates informative prompts for In-Context Learning, an Intent Predictor for classifying and discovering user intents from utterances, and a Semantic Few-Shot Sampler that selects relevant few-shot examples and a set of known intents to be injected into the prompt. Our experiments show that IntentGPT outperforms previous methods that require extensive domain-specific data and fine-tuning, in popular benchmarks, including CLINC and BANKING, among others.

2024-11-16

ArXiv (preprint)

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Juan A. Rodriguez

Xiangru Jian

Siba Smarak Panigrahi

Tianyu Zhang

Aarash Feizi

Abhay Puri

Akshay Kalkunte Suresh

François Savard

Ahmed Masry

Shravan Nayak

Rabiul Awal

Mahsa Massoud

Amirhossein Abaskohi

Zichao Li

Suyuchen Wang

Pierre-Andre Noel

Mats Leon Richter

Saverio Vadacchino

Shubham Agarwal

Sanket Biswas

Sara Shanian

Ying Zhang

Noah Bolger

Kurt MacDonald

Simon Fauvel

Sathwik Tejaswi Madhusudhan

Srinivas Sunkara

Joao Monteiro

Krishnamurthy Dj Dvijotham

Torsten Scholak

Sepideh Kharaghani

Sean Hughes

M. Özsu

Issam Hadj Laradji

Spandana Gella

Perouz Taslakian

David Vazquez

Sai Rajeswar

2024-10-10

NeurIPS.cc/2024/Workshop/RBFM (poster)

openreview.net

Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality

Ge Ya Luo

Gian Mario Favero

Zhi Hao Luo

Alexia Jolicoeur-Martineau

The Fréchet Video Distance (FVD) is a widely adopted metric for evaluating video generation distribution quality. However, its effectiveness relies on critical assumptions. Our analysis reveals three significant limitations: (1) the non-Gaussianity of the Inflated 3D Convnet (I3D) feature space; (2) the insensitivity of I3D features to temporal distortions; (3) the impractical sample sizes required for reliable estimation. These findings undermine FVD's reliability and show that FVD falls short as a standalone metric for video generation evaluation. After extensive analysis of a wide range of metrics and backbone architectures, we propose JEDi, the JEPA Embedding Distance, based on features derived from a Joint Embedding Predictive Architecture, measured using Maximum Mean Discrepancy with polynomial kernel. Our experiments on multiple open-source datasets show clear evidence that it is a superior alternative to the widely used FVD metric, requiring only 16% of the samples to reach its steady value, while increasing alignment with human evaluation by 34%, on average.

2024-10-07

ArXiv (preprint)
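
As a rough illustration of the kind of distance named in the Beyond FVD abstract, the sketch below computes an unbiased estimate of the squared Maximum Mean Discrepancy with a polynomial kernel between two sets of feature vectors. It is not the authors' implementation: the kernel degree, `gamma`, `coef0`, the function names, and the Gaussian vectors standing in for JEPA embeddings are all assumptions chosen for illustration.

```python
import random


def poly_kernel(a, b, degree=2, gamma=None, coef0=1.0):
    """Polynomial kernel k(a, b) = (gamma * <a, b> + coef0) ** degree."""
    if gamma is None:
        gamma = 1.0 / len(a)  # assumed default: one over the feature dimension
    dot = sum(ai * bi for ai, bi in zip(a, b))
    return (gamma * dot + coef0) ** degree


def mmd2_unbiased(xs, ys, kernel=poly_kernel):
    """Unbiased estimate of squared MMD between two lists of feature vectors."""
    m, n = len(xs), len(ys)
    # Within-set averages use distinct pairs only (i != j).
    k_xx = sum(kernel(xs[i], xs[j]) for i in range(m) for j in range(m) if i != j)
    k_yy = sum(kernel(ys[i], ys[j]) for i in range(n) for j in range(n) if i != j)
    k_xy = sum(kernel(x, y) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def sample(rng, count, dim, mean):
    """Toy stand-in for video embeddings: i.i.d. Gaussian feature vectors."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
reference = sample(rng, 100, 8, 0.0)
same_dist = sample(rng, 100, 8, 0.0)
shifted = sample(rng, 100, 8, 0.5)

print("same distribution:   ", mmd2_unbiased(reference, same_dist))
print("shifted distribution:", mmd2_unbiased(reference, shifted))
```

On this toy data the estimate for the shifted set comes out clearly larger than for the set drawn from the same distribution, which is the behaviour one expects from a distribution-level distance.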

Robust Guided Diffusion for Offline Black-Box Optimization

Can Chen

Christopher Beckham

Zixuan Liu

Xue (Steve) Liu

Offline black-box optimization aims to maximize a black-box function using an offline dataset of designs and their measured properties. Two main approaches have emerged: the forward approach, which learns a mapping from input to its value, thereby acting as a proxy to guide optimization, and the inverse approach, which learns a mapping from value to input for conditional generation. (a) Although proxy-free (classifier-free) diffusion shows promise in robustly modeling the inverse mapping, it lacks explicit guidance from proxies, essential for generating high-performance samples beyond the training distribution. Therefore, we propose proxy-enhanced sampling, which utilizes the explicit guidance from a trained proxy to bolster proxy-free diffusion with enhanced sampling control. (b) Yet, the trained proxy is susceptible to out-of-distribution issues. To address this, we devise the module diffusion-based proxy refinement, which seamlessly integrates insights from proxy-free diffusion back into the proxy for refinement. To sum up, we propose Robust Guided Diffusion for Offline Black-box Optimization (RGD), combining the advantages of proxy (explicit guidance) and proxy-free diffusion (robustness) for effective conditional generation. RGD achieves state-of-the-art results on various design-bench tasks, underscoring its efficacy. Our code is at https://github.com/GGchen1997/RGD.

2024-10-01

ArXiv (preprint)