Portrait de Chris Pal

Chris Pal

Membre académique principal
Chaire en IA Canada-CIFAR
Professeur titulaire, Polytechnique Montréal, Département de génie informatique et de génie logiciel
Professeur associé, Université de Montréal, Département d'informatique et de recherche opérationnelle
Sujets de recherche
Apprentissage profond

Biographie

Christopher Pal est titulaire d'une chaire en IA Canada-CIFAR, professeur titulaire à Polytechnique Montréal et professeur adjoint au Département d'informatique et de recherche opérationnelle (DIRO) de l'Université de Montréal. Il est également chercheur émérite à ServiceNow Research. Il est engagé dans la recherche sur l'intelligence artificielle et l'apprentissage automatique depuis plus de 25 ans, publiant souvent des travaux sur les méthodes de modélisation du langage à grande échelle et les techniques de modélisation générative. Il a obtenu un doctorat en informatique à l'Université de Waterloo.

Étudiants actuels

Stagiaire de recherche - McGill
Postdoctorat - HEC
Superviseur⋅e principal⋅e :
Collaborateur·rice de recherche - McGill
Superviseur⋅e principal⋅e :
Maîtrise recherche - UdeM
Doctorat - Polytechnique
Doctorat - McGill
Superviseur⋅e principal⋅e :
Doctorat - UdeM
Superviseur⋅e principal⋅e :
Doctorat - Polytechnique
Maîtrise recherche - UdeM
Co-superviseur⋅e :
Collaborateur·rice alumni - Polytechnique
Doctorat - Polytechnique
Postdoctorat - McGill
Co-superviseur⋅e :
Maîtrise recherche - Polytechnique
Doctorat - UdeM
Co-superviseur⋅e :
Collaborateur·rice de recherche - UdeM
Maîtrise recherche - UdeM
Doctorat - UdeM
Doctorat - Polytechnique
Doctorat - Polytechnique
Doctorat - École de technologie suprérieure
Doctorat - UdeM
Superviseur⋅e principal⋅e :
Doctorat - Polytechnique
Superviseur⋅e principal⋅e :
Doctorat - McGill
Superviseur⋅e principal⋅e :
Doctorat - Polytechnique

Publications

Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models
Pablo Pernias
Dominic Rampas
Mats Leon Richter
Marc Aubreville
Würstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models
Pablo Pernias
Dominic Rampas
Mats Leon Richter
Marc Aubreville
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
Jo˜ao Monteiro
Étienne Marcotte
Pierre-Andre Noel
Valentina Zantedeschi
David Vazquez
Perouz Taslakian
In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference informati… (voir plus)on. Just-in-time processing of a context is inefficient due to the quadratic cost of self-attention operations, and caching is desirable. However, caching transformer states can easily require almost as much space as the model parameters. When the right context isn't known in advance, caching ICL can be challenging. This work addresses these limitations by introducing models that, inspired by the encoder-decoder architecture, use cross-attention to condition generation on reference text without the prompt. More precisely, we leverage pre-trained decoder-only models and only train a small number of added layers. We use Question-Answering (QA) as a testbed to evaluate the ability of our models to perform conditional generation and observe that they outperform ICL, are comparable to fine-tuned prompted LLMs, and drastically reduce the space footprint relative to standard KV caching by two orders of magnitude.
Capture the Flag: Uncovering Data Insights with Large Language Models
Issam Hadj Laradji
Perouz Taslakian
Sai Rajeswar
Valentina Zantedeschi
Alexandre Lacoste
David Vazquez
The extraction of a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. Howev… (voir plus)er, accomplishing this task requires considerable technical skills, domain expertise, and human labor. This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data, leveraging recent advances in reasoning and code generation techniques. We propose a new evaluation methodology based on a"capture the flag"principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset. We further propose two proof-of-concept agents, with different inner workings, and compare their ability to capture such flags in a real-world sales dataset. While the work reported here is preliminary, our results are sufficiently interesting to mandate future exploration by the community.
Capture the Flag: Uncovering Data Insights with Large Language Models
Issam Hadj Laradji
Perouz Taslakian
Sai Rajeswar
Valentina Zantedeschi
Alexandre Lacoste
David Vazquez
The extraction of a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. Howev… (voir plus)er, accomplishing this task requires considerable technical skills, domain expertise, and human labor. This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data, leveraging recent advances in reasoning and code generation techniques. We propose a new evaluation methodology based on a"capture the flag"principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset. We further propose two proof-of-concept agents, with different inner workings, and compare their ability to capture such flags in a real-world sales dataset. While the work reported here is preliminary, our results are sufficiently interesting to mandate future exploration by the community.
StarVector: Generating Scalable Vector Graphics Code from Images and Text
Juan A. Rodriguez
Abhay Puri
Shubham Agarwal
Issam Hadj Laradji
Pau Rodriguez
Sai Rajeswar
David Vazquez
StarVector: Generating Scalable Vector Graphics Code from Images and Text
Juan A. Rodriguez
Abhay Puri
Shubham Agarwal
Issam Hadj Laradji
Pau Rodriguez
Sai Rajeswar
David Vazquez
Scalable Vector Graphics (SVGs) are vital for modern image rendering due to their scalability and versatility. Previous SVG generation metho… (voir plus)ds have focused on curve-based vectorization, lacking semantic understanding, often producing artifacts, and struggling with SVG primitives beyond path curves. To address these issues, we introduce StarVector, a multimodal large language model for SVG generation. It performs image vectorization by understanding image semantics and using SVG primitives for compact, precise outputs. Unlike traditional methods, StarVector works directly in the SVG code space, leveraging visual understanding to apply accurate SVG primitives. To train StarVector, we create SVG-Stack, a diverse dataset of 2M samples that enables generalization across vectorization tasks and precise use of primitives like ellipses, polygons, and text. We address challenges in SVG evaluation, showing that pixel-based metrics like MSE fail to capture the unique qualities of vector graphics. We introduce SVG-Bench, a benchmark across 10 datasets, and 3 tasks: Image-to-SVG, Text-to-SVG generation, and diagram generation. Using this setup, StarVector achieves state-of-the-art performance, producing more compact and semantically rich SVGs.
StarVector: Generating Scalable Vector Graphics Code from Images and Text
Juan A. Rodriguez
Abhay Puri
Shubham Agarwal
Issam Hadj Laradji
Pau Rodriguez
Sai Rajeswar
David Vazquez
StarVector: Generating Scalable Vector Graphics Code from Images
Juan A. Rodriguez
Shubham Agarwal
Abhay Puri
Issam Hadj Laradji
Pau Rodriguez
David Vazquez
Sai Rajeswar
Scalable Vector Graphics (SVGs) have become integral in modern image rendering applications due to their infinite scalability in resolution,… (voir plus) versatile usability, and editing capabilities. SVGs are particularly popular in the fields of web development and graphic design. Existing approaches for SVG modeling using deep learning often struggle with generating complex SVGs and are restricted to simpler ones that require extensive processing and simplification. This paper introduces StarVector, a multimodal SVG generation model that effectively integrates Code Generation Large Language Models (CodeLLMs) and vision models. Our approach utilizes a CLIP image encoder to extract visual representations from pixel-based images, which are then transformed into visual tokens via an adapter module. These visual tokens are pre-pended to the SVG token embeddings, and the sequence is modeled by the StarCoder model using next-token prediction, effectively learning to align the visual and code tokens. This enables StarVector to generate unrestricted SVGs that accurately represent pixel images. To evaluate StarVector's performance, we present SVG-Bench, a comprehensive benchmark for evaluating SVG methods across multiple datasets and relevant metrics. Within this benchmark, we introduce novel datasets including SVG-Stack, a large-scale dataset of real-world SVG examples, and use it to pre-train StarVector as a large foundation model for SVGs. Our results demonstrate significant enhancements in visual quality and complexity handling over current methods, marking a notable advancement in SVG generation technology. Code and models: https://github.com/joanrod/star-vector
StarVector: Generating Scalable Vector Graphics Code from Images and Text
Juan A. Rodriguez
Abhay Puri
Shubham Agarwal
Issam Hadj Laradji
Pau Rodriguez
Sai Rajeswar
David Vazquez
Scalable Vector Graphics (SVGs) are vital for modern image rendering due to their scalability and versatility. Previous SVG generation metho… (voir plus)ds have focused on curve-based vectorization, lacking semantic understanding, often producing artifacts, and struggling with SVG primitives beyond path curves. To address these issues, we introduce StarVector, a multimodal large language model for SVG generation. It performs image vectorization by understanding image semantics and using SVG primitives for compact, precise outputs. Unlike traditional methods, StarVector works directly in the SVG code space, leveraging visual understanding to apply accurate SVG primitives. To train StarVector, we create SVG-Stack, a diverse dataset of 2M samples that enables generalization across vectorization tasks and precise use of primitives like ellipses, polygons, and text. We address challenges in SVG evaluation, showing that pixel-based metrics like MSE fail to capture the unique qualities of vector graphics. We introduce SVG-Bench, a benchmark across 10 datasets, and 3 tasks: Image-to-SVG, Text-to-SVG generation, and diagram generation. Using this setup, StarVector achieves state-of-the-art performance, producing more compact and semantically rich SVGs.
Capture the Flag: Uncovering Data Insights with Large Language Models
Issam Hadj Laradji
Perouz Taslakian
Sai Rajeswar
Valentina Zantedeschi
Alexandre Lacoste
David Vazquez
Are Diffusion Models Vision-And-Language Reasoners?
Benno Krojer
Elinor Poole-Dayan
Vikram Voleti
Text-conditioned image generation models have recently shown immense qualitative success using denoising diffusion processes. However, unlik… (voir plus)e discriminative vision-and-language models, it is a non-trivial task to subject these diffusion-based generative models to automatic fine-grained quantitative evaluation of high-level phenomena such as compositionality. Towards this goal, we perform two innovations. First, we transform diffusion-based models (in our case, Stable Diffusion) for any image-text matching (ITM) task using a novel method called DiffusionITM. Second, we introduce the Generative-Discriminative Evaluation Benchmark (GDBench) benchmark with 7 complex vision-and-language tasks, bias evaluation and detailed analysis. We find that Stable Diffusion + DiffusionITM is competitive on many tasks and outperforms CLIP on compositional tasks like like CLEVR and Winoground. We further boost its compositional performance with a transfer setup by fine-tuning on MS-COCO while retaining generative capabilities. We also measure the stereotypical bias in diffusion models, and find that Stable Diffusion 2.1 is, for the most part, less biased than Stable Diffusion 1.5. Overall, our results point in an exciting direction bringing discriminative and generative model evaluation closer. We will release code and benchmark setup soon.