Laurent Charlin

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, HEC Montréal, Department of Decision Sciences
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Interim Scientific Director, Leadership Team
Research Topics
Representation Learning
Reinforcement Learning
Deep Learning
Data Mining
AI for Science
Generative Models
Probabilistic Models
Information Retrieval
Graph Neural Networks
Recommender Systems
Natural Language Processing

Biography

Laurent Charlin is Interim Scientific Director at Mila - Quebec Artificial Intelligence Institute, holds a Canada CIFAR AI Chair, and is an associate professor at HEC Montréal. He is also a Core Academic Member at Mila.

His research focuses on developing new machine learning models to support decision-making. His recent work concerns learning from data that evolves over time. He also works on applications in areas such as recommender systems and optimization.

He is the author of highly cited publications on dialogue systems (chatbots). Laurent Charlin co-developed the Toronto Paper Matching System (TPMS), which has been widely used at computer science conferences to match reviewers to papers. He has also contributed to several recent MOOCs, and has given introductory talks and media interviews to support knowledge transfer and improve AI literacy.

Current Students

Research Master's - HEC
Postdoctorate - HEC
Co-supervisor:
Research Master's - HEC
PhD - UdeM
PhD - UdeM
Co-supervisor:
PhD - UdeM
Research Master's - HEC
PhD - Université Laval
Principal supervisor:
PhD - UdeM
Co-supervisor:
PhD - Concordia
Principal supervisor:
Collaborating Alumni - UdeM
PhD - UdeM
PhD - UdeM

Publications

Preference Optimization for Concept Bottleneck Models
Emiliano Penaloza
Tianyue H. Zhang
Mateo Espinosa Zarlenga
Concept Bottleneck Models (CBMs) propose to enhance the trustworthiness of AI systems by constraining their decisions on a set of human-understandable concepts. However, CBMs typically assume that datasets contain accurate concept labels, an assumption often violated in practice, which we show can significantly degrade performance (by 25% in some cases). To address this, we introduce the Concept Preference Optimization (CPO) objective, a new loss function based on Direct Preference Optimization, which effectively mitigates the negative impact of concept mislabeling on CBM performance. We provide an analysis of some key properties of the CPO objective showing it directly optimizes for the concept’s posterior distribution, and contrast it against Binary Cross Entropy (BCE) where we show CPO is inherently less sensitive to concept noise. We empirically confirm our analysis, finding that CPO consistently outperforms BCE in three real-world datasets with and without added label noise.
Integrating Present and Past in Unsupervised Continual Learning
Yipeng Zhang
Richard Zemel
Mengye Ren
We formulate a unifying framework for *unsupervised continual learning (UCL)*, which disentangles learning objectives that are specific to the present and the past data, encompassing *stability*, *plasticity*, and *cross-task consolidation*. The framework reveals that many existing UCL approaches overlook cross-task consolidation and try to balance plasticity and stability in a shared embedding space. This results in worse performance due to a lack of within-task data diversity and reduced effectiveness in learning the current task. Our method, *Osiris*, which explicitly optimizes all three objectives on separate embedding spaces, achieves state-of-the-art performance on all benchmarks, including two novel ones proposed in this paper featuring semantically structured task sequences. Finally, we show some preliminary evidence that continual models can benefit from such more realistic learning scenarios.
TEARS: Text Representations for Scrutable Recommendations
Emiliano Penaloza
Olivier Gouvert
Haolun Wu
Traditional recommender systems rely on high-dimensional (latent) embeddings for modeling user-item interactions, often resulting in opaque representations that lack interpretability. Moreover, these systems offer limited control to users over their recommendations. Inspired by recent work, we introduce TExtuAl Representations for Scrutable recommendations (TEARS) to address these challenges. Instead of representing a user’s interests through latent embeddings, TEARS encodes them in natural text, providing transparency and allowing users to edit them. To encode such preferences, we use modern LLMs to generate high-quality user summaries which we find uniquely capture user preferences. Using these summaries we take a hybrid approach where we use an optimal transport procedure to align the summaries’ representations with the representation of a standard VAE for collaborative filtering. We find this approach can surpass the performance of the three popular VAE models while providing user-controllable recommendations. We further analyze the controllability of TEARS through three simulated user tasks to evaluate the effectiveness of user edits on their summaries. Our code and all user-summaries can be seen in an anonymized repository.
LitLLMs, LLMs for Literature Review: Are we there yet?
Shubham Agarwal
Gaurav Sahu
Abhay Puri
Issam Hadj Laradji
Krishnamurthy Dj Dvijotham
Jason Stanley
LLMs for Literature Review: Are we there yet?
Shubham Agarwal
Gaurav Sahu
Abhay Puri
Issam Hadj Laradji
Krishnamurthy Dj Dvijotham
Jason Stanley
Literature reviews are an essential component of scientific research, but they remain time-intensive and challenging to write, especially due to the recent influx of research papers. This paper explores the zero-shot abilities of recent Large Language Models (LLMs) in assisting with the writing of literature reviews based on an abstract. We decompose the task into two components: 1. Retrieving related works given a query abstract, and 2. Writing a literature review based on the retrieved results. We analyze how effective LLMs are for both components. For retrieval, we introduce a novel two-step search strategy that first uses an LLM to extract meaningful keywords from the abstract of a paper and then retrieves potentially relevant papers by querying an external knowledge base. Additionally, we study a prompting-based re-ranking mechanism with attribution and show that re-ranking doubles the normalized recall compared to naive search methods, while providing insights into the LLM's decision-making process. In the generation phase, we propose a two-step approach that first outlines a plan for the review and then executes steps in the plan to generate the actual review. To evaluate different LLM-based literature review methods, we create test sets from arXiv papers using a protocol designed for rolling use with newly released LLMs to avoid test set contamination in zero-shot evaluations. We release this evaluation protocol to promote additional research and development in this regard. Our empirical results suggest that LLMs show promising potential for writing literature reviews when the task is decomposed into smaller components of retrieval and planning. Further, we demonstrate that our planning-based approach achieves higher-quality reviews by minimizing hallucinated references in the generated review by 18-26% compared to existing simpler LLM-based generation methods.
Learning to Design Data-structures: A Case Study of Nearest Neighbor Search
Omar Salemohamed
Vatsal Sharan
Shivam Garg
Gregory Valiant
We propose a general framework for automating data-structure design and apply it to the problem of nearest neighbor search. Our model adapts to the underlying data distribution and provides fine-grained control over query and space complexity, enabling the discovery of solutions tailored to problem-specific constraints. We are able to reverse-engineer learned algorithms in several settings. In 1D, the model discovers optimal distribution (in)dependent algorithms such as binary search and variants of interpolation search. In higher dimensions, the model learns solutions that resemble K-d trees in some regimes, while in others, have elements of locality-sensitive hashing.
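For readers unfamiliar with the two 1D algorithms named in this abstract, the following is a minimal sketch of the classical, hand-written versions of binary search and interpolation search over a sorted list. It is not the learned model from the paper, and the function names are illustrative assumptions.

```python
from typing import Optional, Sequence


def binary_search(keys: Sequence[float], target: float) -> Optional[int]:
    """Return an index of target in sorted keys, or None if absent.

    Distribution-independent: halves the search range at every step.
    """
    lo, hi = 0, len(keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if keys[mid] == target:
            return mid
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


def interpolation_search(keys: Sequence[float], target: float) -> Optional[int]:
    """Return an index of target in sorted keys, or None if absent.

    Distribution-dependent: probes where the target would sit if the keys
    were evenly spaced between keys[lo] and keys[hi].
    """
    lo, hi = 0, len(keys) - 1
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        if keys[hi] == keys[lo]:
            # All remaining keys are equal, so the target equals keys[lo].
            return lo
        pos = lo + int((target - keys[lo]) * (hi - lo) / (keys[hi] - keys[lo]))
        if keys[pos] == target:
            return pos
        if keys[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return None
```

On roughly uniform keys, interpolation search tends to need fewer probes than binary search, which is why it counts as a distribution-dependent strategy.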
Towards Modular LLMs by Building and Reusing a Library of LoRAs
Oleksiy Ostapenko
Zhan Su
Edoardo Ponti
Matheus Pereira
Lucas Caccia
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such library. We benchmark existing approaches to build this library and introduce model-based clustering, MBC, a method that groups tasks based on the similarity of their adapter parameters, indirectly optimizing for transfer across the multi-task dataset. To re-use the library, we present a novel zero-shot routing mechanism, Arrow, which enables dynamic selection of the most relevant adapters for new inputs without the need for retraining. We experiment with several LLMs, such as Phi-2 and Mistral, on a wide array of held-out tasks, verifying that MBC-based adapters and Arrow routing lead to superior generalization to new tasks. We make steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training.
Applying Recurrent Neural Networks and Blocked Cross-Validation to Model Conventional Drinking Water Treatment Processes
Aleksandar Jakovljevic
Benoit Barbeau
The jar test is the current standard method for predicting the performance of a conventional drinking water treatment (DWT) process and optimizing the coagulant dose. This test is time-consuming and requires human intervention, meaning it is infeasible for making continuous process predictions. As a potential alternative, we developed a machine learning (ML) model from historical DWT plant data that can operate continuously using real-time sensor data without human intervention for predicting clarified water turbidity 15 min in advance. We evaluated three types of models: multilayer perceptron (MLP), the long short-term memory (LSTM) recurrent neural network (RNN), and the gated recurrent unit (GRU) RNN. We also employed two training methodologies: the commonly used holdout method and the theoretically correct blocked cross-validation (BCV) method. We found that the RNN with GRU was the best model type overall and achieved a mean absolute error on an independent production set of as low as 0.044 NTU. We further found that models trained using BCV typically achieve errors equal to or lower than their counterparts trained using holdout. These results suggest that RNNs trained using BCV are superior for the development of ML models for DWT processes compared to those reported in earlier literature.
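As an illustration of the blocked cross-validation idea mentioned in this abstract, here is a minimal sketch that splits a time-ordered series into contiguous blocks and uses each block once for validation. The abstract does not state the paper's exact protocol, so the function name, the `gap` parameter, and the block-sizing rule are assumptions made for this example.

```python
from typing import Iterator, List, Tuple


def blocked_cv_splits(
    n_samples: int, n_blocks: int, gap: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for blocked cross-validation.

    The time-ordered series is cut into n_blocks contiguous blocks without
    shuffling. Each block serves once as the validation set; the remaining
    samples form the training set, minus `gap` samples on each side of the
    validation block (hypothetical option to limit temporal leakage).
    """
    if not 1 < n_blocks <= n_samples:
        raise ValueError("need 1 < n_blocks <= n_samples")
    # The first (n_samples % n_blocks) blocks receive one extra sample.
    base, extra = divmod(n_samples, n_blocks)
    start = 0
    for b in range(n_blocks):
        stop = start + base + (1 if b < extra else 0)
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start - gap or i >= stop + gap]
        yield train, val
        start = stop
```

Unlike a single holdout split, every sample is used for validation exactly once, while the contiguous blocks preserve the temporal ordering within each fold.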
Price Forecasting in the Ontario Electricity Market via TriConvGRU Hybrid Model: Univariate vs. Multivariate Frameworks
Behdad Ehsani
Pierre-Olivier Pineau
Electricity price forecasting is a challenging task for decision-makers in deregulated power markets due to the inherent characteristics of electricity prices, e.g., high frequency and volatility. Therefore, accurate forecasting of electricity prices can assist market participants in maximizing their profit. Accordingly, we proposed a novel hybrid Deep Learning model to forecast one-step, two-step, and three-step ahead Ontario electricity prices based on a Convolutional Neural Network (CNN) and Gated Recurrent Unit (GRU). Our model consists of three consecutive CNN-GRU models combined in parallel with different input data. We downsampled input data via pooling layers at the beginning of two streams of the model to capture different frequencies of price patterns concurrently. Also, a set of external variables, including previous prices, electricity load, generation, import and export, and weather data, were considered in our forecasting models to test whether these features improve the efficiency of the models. Finally, three experiments in various weeks of 2022 were carried out in the Ontario electricity market to assess the model. The results indicate that the proposed model reduced the forecasting error significantly by 63.3% in the first experiment, 41.8% in the second, and 28.2% in the third, on average, with respect to a Root Mean Square Error (RMSE). Also, the proposed model was compared with, and outperformed, several baseline models, including statistical time-series, Machine Learning, and Deep Learning models. Furthermore, the comparison of results in univariate and multivariate settings indicated that adding variables to forecasting models did not help reduce forecasting errors.
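To make the error figures in this abstract concrete, the sketch below shows how an RMSE and a percentage reduction relative to a baseline are typically computed. The helper names are assumptions for this example, and no values from the paper are reproduced.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Square Error between two equally long, non-empty series."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def relative_reduction(baseline_error: float, model_error: float) -> float:
    """Percentage by which model_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - model_error) / baseline_error
```

For example, a model whose RMSE is half that of a baseline corresponds to a 50% reduction under this definition.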
LitLLM: A Toolkit for Scientific Literature Review
Shubham Agarwal
Issam Hadj Laradji
Conducting literature reviews for scientific papers is essential for understanding research, its limitations, and building on existing work. It is a tedious task which makes an automatic literature review generator appealing. Unfortunately, many existing works that generate such reviews using Large Language Models (LLMs) have significant limitations. They tend to hallucinate (generate non-actual information) and ignore the latest research they have not been trained on. To address these limitations, we propose a toolkit that operates on Retrieval Augmented Generation (RAG) principles, specialized prompting and instructing techniques with the help of LLMs. Our system first initiates a web search to retrieve relevant papers by summarizing user-provided abstracts into keywords using an off-the-shelf LLM. Authors can enhance the search by supplementing it with relevant papers or keywords, contributing to a tailored retrieval process. Second, the system re-ranks the retrieved papers based on the user-provided abstract. Finally, the related work section is generated based on the re-ranked results and the abstract. There is a substantial reduction in time and effort for literature review compared to traditional methods, establishing our toolkit as an efficient alternative. Our open-source toolkit is accessible at https://github.com/shubhamagarwal92/LitLLM and Huggingface space (https://huggingface.co/spaces/shubhamagarwal92/LitLLM) with the video demo at https://youtu.be/E2ggOZBAFw0.