Chris Pal

Biography

Christopher Pal is a Canada CIFAR AI Chair, full professor at Polytechnique Montréal and adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. He is also a Distinguished Scientist at ServiceNow Research.

Pal has been involved in AI and machine learning research for over twenty-five years and has published extensively on large-scale language modelling methods and generative modelling techniques. He has a PhD in computer science from the University of Waterloo.

Current Students

Mai Ababneh

Research Intern - McGill University

ababneh.mai@gmail.com

Shubham Agarwal

Postdoctorate - HEC Montréal

Principal supervisor :

Paul Barde

Collaborating researcher - McGill University

Principal supervisor :

Derek Nowrouzezahrai

paul.b.barde@gmail.com

Master's Research - Université de Montréal

Chris Beckham

PhD - Polytechnique Montréal

Can (Sam) Chen

PhD - McGill University

Principal supervisor :

PhD - Université de Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Chris Emezue

Master's Research - Université de Montréal

Co-supervisor :

Collaborating Alumni - Polytechnique Montréal

Roger Girgis

PhD - Polytechnique Montréal

Florian Golemo

Postdoctorate - McGill University

Co-supervisor :

Master's Research - Polytechnique Montréal

PhD - Université de Montréal

Co-supervisor :

Yousef Kotp

Master's Research - Concordia University

Co-supervisor :

Collaborating researcher - Université de Montréal

Master's Research - Université de Montréal

Olga Luo

PhD - Université de Montréal

Joel Moniz

PhD - Polytechnique Montréal

Jonathan Pilault

PhD - Polytechnique Montréal

Juan Rodriguez

PhD - École de technologie supérieure

Luke Rowe

PhD - Université de Montréal

Principal supervisor :

Gaurav Sahu

Postdoctorate - HEC Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Principal supervisor :

PhD - McGill University

Principal supervisor :

PhD - Polytechnique Montréal

Blog Posts

Direct Behavior Specification via Constrained Reinforcement Learning

August 31, 2022

Julien Roy

Roger Girgis

Joshua Romoff

Pierre-Luc Bacon

Chris Pal

Publications

InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation

Gaurav Sahu

Abhay Puri

Juan A. Rodriguez

Alexandre Drouin

Perouz Taslakian

Valentina Zantedeschi

Alexandre Lacoste

David Vazquez

Sai Rajeswar

Issam Hadj Laradji

Data analytics is essential for extracting valuable insights from data that can assist organizations in making effective decisions. We introduce InsightBench, a benchmark dataset with three key features. First, it consists of 100 datasets representing diverse business use cases such as finance and incident management, each accompanied by a carefully curated set of insights planted in the datasets. Second, unlike existing benchmarks focusing on answering single queries, InsightBench evaluates agents based on their ability to perform end-to-end data analytics, including formulating questions, interpreting answers, and generating a summary of insights and actionable steps. Third, we conducted comprehensive quality assurance to ensure that each dataset in the benchmark had clear goals and included relevant and meaningful questions and analysis. Furthermore, we implement a two-way evaluation mechanism using LLaMA-3 as an effective, open-source evaluator to assess agents' ability to extract insights. We also propose AgentPoirot, our baseline data analysis agent capable of performing end-to-end data analytics. Our evaluation on InsightBench shows that AgentPoirot outperforms existing approaches (such as Pandas Agent) that focus on resolving single queries. We also compare the performance of open- and closed-source LLMs and various evaluation strategies. Overall, this benchmark serves as a testbed to motivate further development in comprehensive automated data analytics.

2024-07-08

ArXiv (preprint)

Learning Action and Reasoning-Centric Image Editing from Videos and Simulations

Benno Krojer

Dheeraj Vattikonda

Luis Lara

Varun Jampani

Eva Portelance

Siva Reddy

An image editing model should be able to perform diverse edits, ranging from object replacement, changing attributes or style, to performing actions or movement, which require many forms of reasoning. Current general instruction-guided editing models have significant shortcomings with action and reasoning-centric edits. Object, attribute or stylistic changes can be learned from visually static datasets. On the other hand, high-quality data for action and reasoning-centric edits is scarce and has to come from entirely different sources that cover e.g. physical dynamics, temporality and spatial reasoning. To this end, we meticulously curate the AURORA Dataset (Action-Reasoning-Object-Attribute), a collection of high-quality training data, human-annotated and curated from videos and simulation engines. We focus on a key aspect of quality training data: triplets (source image, prompt, target image) contain a single meaningful visual change described by the prompt, i.e., truly minimal changes between source and target images. To demonstrate the value of our dataset, we evaluate an AURORA-finetuned model on a new expert-curated benchmark (AURORA-Bench) covering 8 diverse editing tasks. Our model significantly outperforms previous editing models as judged by human raters. For automatic evaluations, we find important flaws in previous metrics and caution their use for semantically hard editing tasks. Instead, we propose a new automatic metric that focuses on discriminative understanding. We hope that our efforts: (1) curating a quality training dataset and an evaluation benchmark, (2) developing critical evaluations, and (3) releasing a state-of-the-art model, will fuel further progress on general image editing.

2024-07-03

ArXiv (preprint)

RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content

Joao Monteiro

Pierre-Andre Noel

Étienne Marcotte

Sai Rajeswar

Valentina Zantedeschi

David Vazquez

Perouz Taslakian

Large Language Models (LLMs) are trained on vast amounts of data, most of which is automatically scraped from the internet. This data includes encyclopedic documents that harbor a vast amount of general knowledge (e.g., Wikipedia) but also potentially overlap with benchmark datasets used for evaluating LLMs. Consequently, evaluating models on test splits that might have leaked into the training set is prone to misleading conclusions. To foster sound evaluation of language models, we introduce a new test dataset named RepLiQA, suited for question-answering and topic retrieval tasks. RepLiQA is a collection of five splits of test sets, four of which have not been released to the internet or exposed to LLM APIs prior to this publication. Each sample in RepLiQA comprises (1) a reference document crafted by a human annotator and depicting an imaginary scenario (e.g., a news article) absent from the internet; (2) a question about the document's topic; (3) a ground-truth answer derived directly from the information in the document; and (4) the paragraph extracted from the reference document containing the answer. As such, accurate answers can only be generated if a model can find relevant content within the provided document. We run a large-scale benchmark comprising several state-of-the-art LLMs to uncover differences in performance across models of various types and sizes in a context-conditional language modeling setting. Released splits of RepLiQA can be found here: https://huggingface.co/datasets/ServiceNow/repliqa.

2024-06-17

ArXiv (preprint)

Exploring validation metrics for offline model-based optimisation with diffusion models

Christopher Beckham

Alexandre Piché

David Vazquez

2024-06-13

TMLR (accepted)

openreview.net

Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion

Ge Ya Luo

Zhi Hao Luo

Anthony Gosselin

Alexia Jolicoeur-Martineau

With recent advances in video prediction, controllable video generation has been attracting more attention. Generating high fidelity videos according to simple and flexible conditioning is of particular interest. To this end, we propose a controllable video generation model using pixel level renderings of 2D or 3D bounding boxes as conditioning. In addition, we also create a bounding box predictor that, given the initial and ending frames' bounding boxes, can predict up to 15 bounding boxes per frame for all the frames in a 25-frame clip. We perform experiments across 3 well-known AV video datasets: KITTI, Virtual-KITTI 2 and BDD100k.

2024-06-09

ArXiv (preprint)

Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion

Ge Ya Luo

Zhi Hao Luo

Anthony Gosselin

Alexia Jolicoeur-Martineau

Controllable video generation has attracted significant attention, largely due to advances in video diffusion models. In domains such as autonomous driving, it is essential to develop highly accurate predictions for object motions. This paper tackles a crucial challenge of how to exert precise control over object motion for realistic video synthesis. To accomplish this, we 1) control object movements using bounding boxes and extend this control to the renderings of 2D or 3D boxes in pixel space, 2) employ a distinct, specialized model to forecast the trajectories of object bounding boxes based on their previous and, if desired, future positions, and 3) adapt and enhance a separate video diffusion network to create video content based on these high quality trajectory forecasts. Our method, Ctrl-V, leverages modified and fine-tuned Stable Video Diffusion (SVD) models to solve both trajectory and video generation. Extensive experiments conducted on the KITTI, Virtual-KITTI 2, BDD100k, and nuScenes datasets validate the effectiveness of our approach in producing realistic and controllable video generation.

2024-06-09

ArXiv (preprint)

LLMs can learn self-restraint through iterative self-reflection

Alexandre Piché

Aristides Milios

Dzmitry Bahdanau

In order to be deployed safely, Large Language Models (LLMs) must be capable of dynamically adapting their behavior based on their level of knowledge and uncertainty associated with specific topics. This adaptive behavior, which we refer to as self-restraint, is non-trivial to teach since it depends on the internal knowledge of an LLM. By default, LLMs are trained to maximize the next token likelihood, which does not teach the model to modulate its answer based on its level of uncertainty. In order to learn self-restraint, we devise a utility function that can encourage the model to produce responses only when it is confident in them. This utility function can be used to score generation of different length and abstention. To optimize this function, we introduce ReSearch, a process of "self-reflection" consisting of iterative self-prompting and self-evaluation. We use the ReSearch algorithm to generate synthetic data on which we finetune our models. Compared to their original versions, our resulting models generate fewer hallucinations overall at no additional inference cost, for both known and unknown topics, as the model learns to selectively restrain itself. In addition, our method elegantly incorporates the ability to abstain by augmenting the samples generated by the model during the search procedure with an answer expressing abstention.

2024-05-15

ArXiv (preprint)

XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference

João Monteiro

Étienne Marcotte

Pierre-Andre Noel

Valentina Zantedeschi

David Vazquez

Perouz Taslakian

In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference information. Just-in-time processing of a context is inefficient due to the quadratic cost of self-attention operations, and caching is desirable. However, caching transformer states can easily require almost as much space as the model parameters. When the right context isn't known in advance, caching ICL can be challenging. This work addresses these limitations by introducing models that, inspired by the encoder-decoder architecture, use cross-attention to condition generation on reference text without the prompt. More precisely, we leverage pre-trained decoder-only models and only train a small number of added layers. We use Question-Answering (QA) as a testbed to evaluate the ability of our models to perform conditional generation and observe that they outperform ICL, are comparable to fine-tuned prompted LLMs, and drastically reduce the space footprint relative to standard KV caching by two orders of magnitude.

2024-04-23

ArXiv (preprint)

CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning

Luke Rowe

Roger Girgis

Anthony Gosselin

Bruno Carrez

Florian Golemo

Felix Heide

Liam Paull

Evaluating autonomous vehicle stacks (AVs) in simulation typically involves replaying driving logs from real-world recorded traffic. However, agents replayed from offline data are not reactive and hard to intuitively control. Existing approaches address these challenges by proposing methods that rely on heuristics or generative models of real-world data but these approaches either lack realism or necessitate costly iterative sampling procedures to control the generated behaviours. In this work, we take an alternative approach and propose CtRL-Sim, a method that leverages return-conditioned offline reinforcement learning (RL) to efficiently generate reactive and controllable traffic agents. Specifically, we process real-world driving data through a physics-enhanced Nocturne simulator to generate a diverse offline RL dataset, annotated with various rewards. With this dataset, we train a return-conditioned multi-agent behaviour model that allows for fine-grained manipulation of agent behaviours by modifying the desired returns for the various reward components. This capability enables the generation of a wide range of driving behaviours beyond the scope of the initial dataset, including adversarial behaviours. We show that CtRL-Sim can generate realistic safety-critical scenarios while providing fine-grained control over agent behaviours.

2024-03-29

ArXiv (preprint)

Language Models Can Reduce Asymmetry in Information Markets

Nasim Rahaman

Martin Weiss

Manuel Wüthrich

Yoshua Bengio

Erran L. Li

Bernhard Schölkopf

This work addresses the buyer's inspection paradox for information markets. The paradox is that buyers need to access information to determine its value, while sellers need to limit access to prevent theft. To study this, we introduce an open-source simulated digital marketplace where intelligent agents, powered by language models, buy and sell information on behalf of external participants. The central mechanism enabling this marketplace is the agents' dual capabilities: they not only have the capacity to assess the quality of privileged information but also come equipped with the ability to forget. This ability to induce amnesia allows vendors to grant temporary access to proprietary information, significantly reducing the risk of unauthorized retention while enabling agents to accurately gauge the information's relevance to specific queries or tasks. To perform well, agents must make rational decisions, strategically explore the marketplace through generated sub-queries, and synthesize answers from purchased information. Concretely, our experiments (a) uncover biases in language models leading to irrational behavior and evaluate techniques to mitigate these biases, (b) investigate how price affects demand in the context of informational goods, and (c) show that inspection and higher budgets both lead to higher quality outcomes.

2024-03-21

ArXiv (preprint)