Chris Pal

shubham.agarwal@mila.quebec

Biography

Christopher Pal is a Canada CIFAR AI Chair, a full professor at Polytechnique Montréal and an adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. He is also a Distinguished Scientist at ServiceNow Research. He has been engaged in artificial intelligence and machine learning research for more than 25 years, often publishing work on large-scale language modelling methods and generative modelling techniques. He holds a PhD in computer science from the University of Waterloo.

Current Students

Shubham Agarwal

Postdoctorate - HEC

Principal supervisor:

Laurent Charlin

Website

christopher.beckham@mila.quebec

Paul Barde

PhD - McGill

Principal supervisor:

PhD - Polytechnique

Georges Belanger Albarran

Research Master's - UdeM

georges.belangeralbarran@mila.quebec

Simon Chamorro

Research Master's - Polytechnique

chamorrs@mila.quebec

matthew.fortier@mila.quebec

Can (Sam) Chen

PhD - McGill

Principal supervisor:

PhD - Polytechnique

elhattaa@mila.quebec

Chris Emezue

Research Master's - UdeM

Co-supervisor:

Derek Nowrouzezahrai

chris.emezue@mila.quebec

Research Master's - Polytechnique

anthony.gosselin@mila.quebec

Roger Girgis

PhD - Polytechnique

girgisro@mila.quebec

Anthony Gosselin

Research Master's - Polytechnique

PhD - UdeM

Co-supervisor:

Sarath Chandar Anbil Parthipan

PhD - UdeM

Hugo Baudchon

Research Collaborator - Université de Montréal

Principal supervisor:

Étienne Laliberté

hugo.baudchon@mila.quebec

PhD - UdeM

Research Collaborator

michelle.lin@mila.quebec

PhD - UdeM

Z Luo

PhD - Polytechnique

PhD - UdeM

PhD - UdeM

PhD - Polytechnique

Postdoctorate - UdeM

mats-leon.richter@mila.quebec

juan.rodriguez@mila.quebec

Juan Rodriguez

PhD - École de technologie supérieure

Website

Luke Rowe

PhD - UdeM

Principal supervisor:

Liam Paull

luke.rowe@mila.quebec

aditya.sharma@mila.quebec

Julien Roy

PhD - Polytechnique

Co-supervisor:

PhD

Principal supervisor:

Amal Zouaq

PhD - McGill

Principal supervisor:

Derek Nowrouzezahrai

mattie.tesfaldet@mila.quebec

PhD - UdeM

PhD - Polytechnique

Blog Posts

Direct Behavior Specification via Constrained Reinforcement Learning

August 31, 2022

by

Julien Roy

Roger Girgis

Joshua Romoff

Pierre-Luc Bacon

Chris Pal

Publications

CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning

Luke Rowe

Roger Girgis

Anthony Gosselin

Bruno Carrez

Florian Golemo

Felix Heide

Liam Paull

Evaluating autonomous vehicle stacks (AVs) in simulation typically involves replaying driving logs from real-world recorded traffic. However, agents replayed from offline data do not react to the actions of the AV, and their behaviour cannot be easily controlled to simulate counterfactual scenarios. Existing approaches have attempted to address these shortcomings by proposing methods that rely on heuristics or learned generative models of real-world data, but these approaches either lack realism or necessitate costly iterative sampling procedures to control the generated behaviours. In this work, we take an alternative approach and propose CtRL-Sim, a method that leverages return-conditioned offline reinforcement learning within a physics-enhanced Nocturne simulator to efficiently generate reactive and controllable traffic agents. Specifically, we process real-world driving data through the Nocturne simulator to generate a diverse offline reinforcement learning dataset, annotated with various reward terms. With this dataset, we train a return-conditioned multi-agent behaviour model that allows for fine-grained manipulation of agent behaviours by modifying the desired returns for the various reward components. This capability enables the generation of a wide range of driving behaviours beyond the scope of the initial dataset, including those representing adversarial behaviours. We demonstrate that CtRL-Sim can efficiently generate diverse and realistic safety-critical scenarios while providing fine-grained control over agent behaviours. Further, we show that fine-tuning our model on simulated safety-critical scenarios generated by our model enhances this controllability.

2024-03-29

arXiv (preprint)

Language Models Can Reduce Asymmetry in Information Markets

Nasim Rahaman

Martin Weiss

Manuel Wüthrich

Yoshua Bengio

Erran L. Li

Bernhard Schölkopf

2024-03-21

arXiv (preprint)

Multi-Resolution Continuous Normalizing Flows

Vikram Voleti

Chris Finlay

Adam M. Oberman

2024-03-21

Annals of Mathematics and Artificial Intelligence (published)

IntentGPT: Few-shot Intent Discovery with Large Language Models

Juan A. Rodriguez

Nicholas Botzer

David Vazquez

Marco Pedersoli

Issam Hadj Laradji

2024-03-11

ICLR.cc/2024/Workshop/LLMAgents (poster)

Self-evaluation and self-prompting to improve the reliability of LLMs

Alexandre Piché

Aristides Milios

Dzmitry Bahdanau

In order to safely deploy Large Language Models (LLMs), they must be capable of dynamically adapting their behavior based on their level of knowledge and uncertainty associated with specific topics. This adaptive behavior, which we refer to as self-restraint, is non-trivial to teach since it depends on the internal knowledge of an LLM. By default, LLMs are trained to maximize the next token likelihood, which does not teach the model to modulate its answer based on its level of uncertainty. In order to learn self-restraint, we devise a simple objective that can encourage the model to produce generations that the model is confident in. To optimize this objective, we introduce ReSearch, an iterative search algorithm based on self-evaluation and self-prompting. Our method results in fewer hallucinations overall, both for known and unknown topics, as the model learns to selectively restrain itself. In addition, our method elegantly incorporates the ability to decline when the model assesses that it cannot provide a response without a high proportion of hallucination.

2024-03-04

ICLR.cc/2024/Workshop/SeT_LLM (published)

Reinforcement Learning for Blind Stair Climbing with Legged and Wheeled-Legged Robots

Simon Chamorro

Victor Klemm

Miguel I. Valls

Roland Siegwart

2024-02-09

arXiv (preprint)

LitLLM: A Toolkit for Scientific Literature Review

Shubham Agarwal

Issam Hadj Laradji

Laurent Charlin

Conducting literature reviews for scientific papers is essential for understanding research, its limitations, and building on existing work. It is a tedious task, which makes an automatic literature review generator appealing. Unfortunately, many existing works that generate such reviews using Large Language Models (LLMs) have significant limitations. They tend to hallucinate (generate non-actual information) and ignore the latest research they have not been trained on. To address these limitations, we propose a toolkit that operates on Retrieval Augmented Generation (RAG) principles, specialized prompting and instructing techniques with the help of LLMs. Our system first initiates a web search to retrieve relevant papers by summarizing user-provided abstracts into keywords using an off-the-shelf LLM. Authors can enhance the search by supplementing it with relevant papers or keywords, contributing to a tailored retrieval process. Second, the system re-ranks the retrieved papers based on the user-provided abstract. Finally, the related work section is generated based on the re-ranked results and the abstract. There is a substantial reduction in time and effort for literature review compared to traditional methods, establishing our toolkit as an efficient alternative. Our open-source toolkit is accessible at https://github.com/shubhamagarwal92/LitLLM and Huggingface space (https://huggingface.co/spaces/shubhamagarwal92/LitLLM) with the video demo at https://youtu.be/E2ggOZBAFw0.

2024-02-02

arXiv (preprint)
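
The three stages described in the LitLLM abstract (keyword summarization, retrieval, re-ranking, then generation) can be illustrated with a toy sketch. This is not the toolkit's code: every function name here is hypothetical, word frequency stands in for the LLM keyword summarizer, an in-memory list stands in for web search, and Jaccard word overlap stands in for LLM-based re-ranking. No model is called; the last step only assembles the prompt a generator might receive.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "to", "in", "on",
             "with", "is", "are", "we", "that", "by"}

def tokenize(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def extract_keywords(abstract, k=3):
    """Stand-in for the LLM keyword summarizer: most frequent content words."""
    counts = Counter(tokenize(abstract))
    # Sort by descending count, then alphabetically, so results are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]

def retrieve(keywords, corpus):
    """Stand-in for web search: keep papers that mention any keyword."""
    wanted = set(keywords)
    return [p for p in corpus
            if wanted & set(tokenize(p["title"] + " " + p["abstract"]))]

def rerank(abstract, papers):
    """Stand-in for LLM re-ranking: sort by Jaccard overlap with the abstract."""
    query = set(tokenize(abstract))

    def score(paper):
        doc = set(tokenize(paper["title"] + " " + paper["abstract"]))
        union = query | doc
        return len(query & doc) / len(union) if union else 0.0

    return sorted(papers, key=score, reverse=True)

def build_prompt(abstract, ranked, top_n=2):
    """Assemble the text a generator LLM would receive (no model is called)."""
    cited = "\n".join(f"[{i}] {p['title']}: {p['abstract']}"
                      for i, p in enumerate(ranked[:top_n], start=1))
    return (f"Write a related-work section for:\n{abstract}\n\n"
            f"Cite these papers:\n{cited}")

# Hypothetical inputs for illustration only.
user_abstract = ("We study retrieval augmented generation for literature "
                 "review generation with language models.")
corpus = [
    {"title": "Language models survey", "abstract": "A survey."},
    {"title": "Image segmentation with convolutional networks",
     "abstract": "Pixels and masks."},
    {"title": "Retrieval augmented generation for review writing",
     "abstract": "Language models generate literature review text."},
]

keywords = extract_keywords(user_abstract)
candidates = retrieve(keywords, corpus)
ranked = rerank(user_abstract, candidates)
prompt = build_prompt(user_abstract, ranked)
```

In this sketch the unrelated segmentation paper is filtered out at retrieval, and the paper sharing the most vocabulary with the user abstract is listed first in the prompt.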

Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models

Pablo Pernias

Dominic Rampas

Mats Leon Richter

Marc Aubreville

2024-01-16

ICLR.cc/2024/Conference (oral presentation)

Capture the Flag: Uncovering Data Insights with Large Language Models

Issam Hadj Laradji

Perouz Taslakian

Sai Rajeswar

Valentina Zantedeschi

Alexandre Lacoste

Nicolas Chapados

David Vazquez

Alexandre Drouin

The extraction of a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. However, accomplishing this task requires considerable technical skills, domain expertise, and human labor. This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data, leveraging recent advances in reasoning and code generation techniques. We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset. We further propose two proof-of-concept agents, with different inner workings, and compare their ability to capture such flags in a real-world sales dataset. While the work reported here is preliminary, our results are sufficiently interesting to mandate future exploration by the community.

2023-12-21

arXiv (preprint)

StarVector: Generating Scalable Vector Graphics Code from Images

Juan A. Rodriguez

Shubham Agarwal

Issam Hadj Laradji

Pau Rodriguez

David Vazquez

Marco Pedersoli

Scalable Vector Graphics (SVGs) have become integral in modern image rendering applications due to their infinite scalability in resolution, versatile usability, and editing capabilities. SVGs are particularly popular in the fields of web development and graphic design. Existing approaches for SVG modeling using deep learning often struggle with generating complex SVGs and are restricted to simpler ones that require extensive processing and simplification. This paper introduces StarVector, a multimodal SVG generation model that effectively integrates Code Generation Large Language Models (CodeLLMs) and vision models. Our approach utilizes a CLIP image encoder to extract visual representations from pixel-based images, which are then transformed into visual tokens via an adapter module. These visual tokens are pre-pended to the SVG token embeddings, and the sequence is modeled by the StarCoder model using next-token prediction, effectively learning to align the visual and code tokens. This enables StarVector to generate unrestricted SVGs that accurately represent pixel images. To evaluate StarVector's performance, we present SVG-Bench, a comprehensive benchmark for evaluating SVG methods across multiple datasets and relevant metrics. Within this benchmark, we introduce novel datasets including SVG-Stack, a large-scale dataset of real-world SVG examples, and use it to pre-train StarVector as a large foundation model for SVGs. Our results demonstrate significant enhancements in visual quality and complexity handling over current methods, marking a notable advancement in SVG generation technology. Code and models: https://github.com/joanrod/star-vector

2023-12-17

arXiv (preprint)
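
The sequence construction mentioned in the StarVector abstract (visual features passed through an adapter, then prepended to the SVG token embeddings for next-token prediction) can be sketched in plain Python. This is only an illustration under stated assumptions: the adapter is reduced to a single linear projection, all names and the tiny dimensions are hypothetical, and restricting the prediction targets to SVG tokens is an assumption of this sketch rather than something the abstract states.

```python
def linear_adapter(features, weight):
    """Project each visual feature (length d_in) to d_model using a
    d_in x d_model weight matrix. A hypothetical stand-in for the adapter."""
    d_model = len(weight[0])
    return [
        [sum(f[i] * weight[i][j] for i in range(len(f))) for j in range(d_model)]
        for f in features
    ]

def build_sequence(visual_features, weight, svg_token_ids, embedding_table):
    """Prepend the projected visual tokens to the SVG token embeddings."""
    visual_tokens = linear_adapter(visual_features, weight)
    svg_embeddings = [embedding_table[t] for t in svg_token_ids]
    return visual_tokens + svg_embeddings

def next_token_targets(num_visual, svg_token_ids):
    """Target for each input position except the last: position p predicts the
    token at p + 1. Positions whose next element is a visual token get None
    (assumption: only SVG tokens are predicted)."""
    return [None] * (num_visual - 1) + list(svg_token_ids)

# Toy example: 2 visual features of size 2, model width 3, vocabulary of 2 tokens.
visual_features = [[1, 2], [3, 4]]
adapter_weight = [[1, 0, 1],
                  [0, 1, 1]]
embedding_table = [[0, 0, 0],   # embedding of token id 0
                   [1, 1, 1]]   # embedding of token id 1
svg_token_ids = [1, 0]

sequence = build_sequence(visual_features, adapter_weight,
                          svg_token_ids, embedding_table)
targets = next_token_targets(len(visual_features), svg_token_ids)
```

The resulting sequence has one row per visual token followed by one row per SVG token, and the targets list is one element shorter because the final position has nothing left to predict.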

Capture the Flag: Uncovering Data Insights with Large Language Models

Issam Hadj Laradji

Perouz Taslakian

Sai Rajeswar

Valentina Zantedeschi

Alexandre Lacoste

Nicolas Chapados

David Vazquez

Alexandre Drouin

2023-11-07

NeurIPS.cc/2023/Workshop/FMDM (published)