Portrait of Laurent Charlin

Laurent Charlin

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, HEC Montréal, Department of Decision Sciences
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Topics
Representation Learning
Reinforcement Learning
Deep Learning
Causality
Data Mining
Generative Models
Probabilistic Models
Information Retrieval
Graph Neural Networks
Recommender Systems
Natural Language Processing

Biography

Laurent Charlin holds a Canada CIFAR AI Chair at Mila – Quebec AI Institute and is an associate professor at HEC Montréal. He is also a Core Academic Member of Mila. His research focuses on developing new machine learning models to support decision-making. His recent work concerns learning from data that evolves over time. He also works on applications in areas such as recommender systems and optimization. He has authored highly cited publications on dialogue systems (chatbots). Laurent Charlin co-developed the Toronto Paper Matching System (TPMS), which has been widely used by computer science conferences to match reviewers to papers. He has also contributed to several recent MOOCs, and has given introductory lectures and media interviews to support knowledge transfer and improve AI literacy.

Current Students

Postdoctorate - HEC
Co-supervisor:
Research Master's - HEC
PhD - UdeM
PhD - UdeM
Co-supervisor:
PhD - UdeM
Postdoctorate
PhD - Université Laval
Principal supervisor:
PhD - UdeM
Co-supervisor:
PhD - Concordia
Principal supervisor:
PhD - UdeM
PhD - UdeM

Publications

Learning to Design Data-structures: A Case Study of Nearest Neighbor Search
Omar Salemohamed
Vatsal Sharan
Shivam Garg
Gregory Valiant
We propose a general framework for automating data-structure design and apply it to the problem of nearest neighbor search. Our model adapts to the underlying data distribution and provides fine-grained control over query and space complexity, enabling the discovery of solutions tailored to problem-specific constraints. We are able to reverse-engineer learned algorithms in several settings. In 1D, the model discovers optimal distribution-(in)dependent algorithms such as binary search and variants of interpolation search. In higher dimensions, the model learns solutions that resemble K-d trees in some regimes, while in others they have elements of locality-sensitive hashing.
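For illustration, the two classical 1D baselines the model rediscovers can be stated concretely. The following is a minimal Python sketch, not code from the paper: binary search is distribution independent, while interpolation search exploits the (roughly uniform) data distribution to guess where the target lies.

```python
def binary_search(a, target):
    """Distribution-independent search: O(log n) comparisons."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def interpolation_search(a, target):
    """Distribution-dependent search: interpolates the likely index of
    the target; ~O(log log n) expected probes on uniform integer data."""
    lo, hi = 0, len(a) - 1
    while lo <= hi and a[lo] <= target <= a[hi]:
        if a[lo] == a[hi]:
            return lo if a[lo] == target else -1
        # Linear interpolation of the target's position between lo and hi.
        pos = lo + (hi - lo) * (target - a[lo]) // (a[hi] - a[lo])
        if a[pos] == target:
            return pos
        if a[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1


# Toy check on uniformly spaced integers.
data = sorted(range(0, 1000, 7))
assert binary_search(data, 371) == interpolation_search(data, 371)
```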
Towards Modular LLMs by Building and Reusing a Library of LoRAs
Oleksiy Ostapenko
Zhan Su
Edoardo Ponti
Matheus Pereira
Lucas Caccia
The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trai… (voir plus)ned adapters to improve performance for new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such library. We benchmark existing approaches to build this library and introduce model-based clustering, MBC, a method that groups tasks based on the similarity of their adapter parameters, indirectly optimizing for transfer across the multi-task dataset. To re-use the library, we present a novel zero-shot routing mechanism, Arrow, which enables dynamic selection of the most relevant adapters for new inputs without the need for retraining. We experiment with several LLMs, such as Phi-2 and Mistral, on a wide array of held-out tasks, verifying that MBC-based adapters and Arrow routing lead to superior generalization to new tasks. We make steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training.
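As a rough illustration of the two ideas in this abstract, the sketch below assumes each adapter is summarized by a flat parameter vector: an MBC-style step clusters adapters by parameter similarity, and an Arrow-style router scores adapters against the current input representation and keeps the top-scoring ones. The exact parameterization and scoring rule in the paper differ; this is not the released implementation.

```python
import numpy as np


def cluster_adapters_by_params(adapter_vectors, n_clusters, n_iters=50, seed=0):
    """MBC-style grouping: k-means on flattened adapter parameters, so tasks
    whose adapters look alike end up in the same cluster."""
    rng = np.random.default_rng(seed)
    X = np.stack(adapter_vectors)                              # (n_adapters, d)
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)
    return labels


def arrow_style_route(hidden_state, adapter_prototypes, top_k=2):
    """Zero-shot routing sketch: score each adapter by how strongly the input
    representation aligns with the adapter's prototype direction, then keep
    the top-k adapters (no retraining needed)."""
    scores = np.array([abs(hidden_state @ p) for p in adapter_prototypes])
    chosen = scores.argsort()[::-1][:top_k]
    weights = np.exp(scores[chosen]) / np.exp(scores[chosen]).sum()
    return chosen, weights


# Toy usage with random vectors standing in for real adapters/prototypes.
adapters = [np.random.randn(16) for _ in range(8)]
labels = cluster_adapters_by_params(adapters, n_clusters=3)
chosen, w = arrow_style_route(np.random.randn(16), adapters, top_k=2)
```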
Applying Recurrent Neural Networks and Blocked Cross-Validation to Model Conventional Drinking Water Treatment Processes
Aleksandar Jakovljevic
Benoit Barbeau
The jar test is the current standard method for predicting the performance of a conventional drinking water treatment (DWT) process and optimizing the coagulant dose. This test is time-consuming and requires human intervention, meaning it is infeasible for making continuous process predictions. As a potential alternative, we developed a machine learning (ML) model from historical DWT plant data that can operate continuously using real-time sensor data, without human intervention, to predict clarified water turbidity 15 min in advance. We evaluated three types of models: the multilayer perceptron (MLP), the long short-term memory (LSTM) recurrent neural network (RNN), and the gated recurrent unit (GRU) RNN. We also employed two training methodologies: the commonly used holdout method and the theoretically correct blocked cross-validation (BCV) method. We found that the RNN with GRU was the best model type overall, achieving a mean absolute error as low as 0.044 NTU on an independent production set. We further found that models trained using BCV typically achieve errors equal to or lower than their counterparts trained using holdout. These results suggest that, for developing ML models of DWT processes, RNNs trained using BCV are superior to the approaches reported in earlier literature.
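Blocked cross-validation, mentioned in the abstract, keeps time-ordered sensor data in contiguous folds so that a model is never validated on observations interleaved with its training data. A minimal illustrative sketch follows; the fold count and gap size are arbitrary choices, not the paper's settings.

```python
import numpy as np


def blocked_cv_splits(n_samples, n_folds=5, gap=0):
    """Yield (train_idx, val_idx) pairs where each validation fold is a
    contiguous block of time steps; an optional `gap` of samples is dropped
    around the block to limit leakage from autocorrelation."""
    fold_edges = np.linspace(0, n_samples, n_folds + 1, dtype=int)
    for k in range(n_folds):
        start, stop = fold_edges[k], fold_edges[k + 1]
        val_idx = np.arange(start, stop)
        train_idx = np.concatenate([
            np.arange(0, max(start - gap, 0)),
            np.arange(min(stop + gap, n_samples), n_samples),
        ])
        yield train_idx, val_idx


# Example: 5 blocked folds over 1,000 time-ordered sensor readings.
for train_idx, val_idx in blocked_cv_splits(1000, n_folds=5, gap=10):
    pass  # fit the RNN on train_idx, evaluate on val_idx
```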
LitLLM: A Toolkit for Scientific Literature Review
Shubham Agarwal
Issam Hadj Laradji
Conducting literature reviews for scientific papers is essential for understanding research, its limitations, and building on existing work. It is a tedious task, which makes an automatic literature review generator appealing. Unfortunately, many existing works that generate such reviews using Large Language Models (LLMs) have significant limitations. They tend to hallucinate (generate information not grounded in real sources) and to ignore the latest research they have not been trained on. To address these limitations, we propose a toolkit built on Retrieval-Augmented Generation (RAG) principles together with specialized prompting and instruction techniques. Our system first initiates a web search to retrieve relevant papers by summarizing user-provided abstracts into keywords using an off-the-shelf LLM. Authors can enhance the search by supplementing it with relevant papers or keywords, contributing to a tailored retrieval process. Second, the system re-ranks the retrieved papers based on the user-provided abstract. Finally, the related-work section is generated from the re-ranked results and the abstract. Compared to traditional methods, this substantially reduces the time and effort required for a literature review, establishing our toolkit as an efficient alternative. Our open-source toolkit is accessible at https://github.com/shubhamagarwal92/LitLLM and on a Hugging Face space (https://huggingface.co/spaces/shubhamagarwal92/LitLLM), with a video demo at https://youtu.be/E2ggOZBAFw0.
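The three-stage pipeline described above (keyword extraction from the abstract, retrieval plus re-ranking, then grounded generation of the related-work text) can be summarized as a short Python sketch. The helpers `call_llm` and `search_papers`, and the prompts, are placeholders for whatever LLM and paper-search backend one plugs in; they are not LitLLM's actual API.

```python
def call_llm(prompt):
    """Placeholder for any chat/completion model call."""
    raise NotImplementedError("plug in your LLM client here")


def search_papers(query):
    """Placeholder for a paper-search backend (e.g., an academic search API)."""
    raise NotImplementedError("plug in your retrieval backend here")


def generate_related_work(user_abstract, extra_keywords=None, top_k=10):
    # 1) Summarize the abstract into search keywords with an off-the-shelf LLM.
    keywords = call_llm(f"Summarize this abstract into search keywords:\n{user_abstract}")
    if extra_keywords:
        keywords += " " + " ".join(extra_keywords)

    # 2) Retrieve candidate papers with the keyword query (optionally augmented by the author).
    candidates = search_papers(keywords)

    # 3) Re-rank candidates against the user's abstract (assumes the LLM replies with a number).
    scored = []
    for paper in candidates:
        answer = call_llm(
            "On a scale of 0-10, how relevant is this paper to the abstract?\n"
            f"Abstract: {user_abstract}\nPaper: {paper}"
        )
        scored.append((float(answer), paper))
    top_papers = [p for _, p in sorted(scored, key=lambda t: t[0], reverse=True)[:top_k]]

    # 4) Generate the related-work section grounded only in the retrieved papers.
    return call_llm(
        "Write a related-work section for the abstract below, citing only the papers provided.\n"
        f"Abstract: {user_abstract}\nPapers: {top_papers}"
    )
```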
Improving the Generalizability and Robustness of Large-Scale Traffic Signal Control
Tianyu Shi
François-Xavier Devailly
Denis Larocque
A number of deep reinforcement-learning (RL) approaches propose to control traffic signals. Compared to traditional approaches, RL approaches can learn from higher-dimensional road and vehicle sensor inputs and better adapt to varying traffic conditions, resulting in reduced travel times (in simulation). However, these RL methods require training on massive amounts of traffic sensor data. To offset this relative inefficiency, some recent RL methods can first learn from small-scale networks and then generalize to unseen city-scale networks without additional retraining (zero-shot transfer). In this work, we study the robustness of such methods along two axes. First, sensor failures and GPS occlusions create missing-data challenges, and we show that recent methods remain brittle in the face of these missing data. Second, we provide a more systematic study of the generalization ability of RL methods to new networks with different traffic regimes. Again, we identify the limitations of recent approaches. We then propose using a combination of distributional and vanilla reinforcement learning through a policy ensemble. Building upon the state-of-the-art previous model, which uses a decentralized approach for large-scale traffic signal control with graph convolutional networks (GCNs), we first learn models using a distributional reinforcement learning (DisRL) approach. In particular, we use implicit quantile networks (IQN) to model the state-action return distribution with quantile regression. For traffic signal control problems, an ensemble of standard RL and DisRL yields superior performance across different scenarios, including different levels of missing sensor data and traffic flow patterns. Furthermore, the learning scheme of the resulting model improves zero-shot transferability to different road network structures, including both synthetic networks and real-world networks (e.g., Luxembourg, Manhattan). We conduct extensive experiments to compare our approach to multi-agent reinforcement learning and traditional transportation approaches. Results show that the proposed method improves robustness and generalizability in the face of missing data, varying road networks, and traffic flows.
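The distributional component mentioned above replaces a single expected return with an estimate of its quantiles. Below is a compact NumPy sketch of the quantile (Huber) regression loss at the core of IQN-style methods; the actual work trains this inside a GCN-based policy, which is omitted here.

```python
import numpy as np


def quantile_huber_loss(pred_quantiles, target_samples, taus, kappa=1.0):
    """Quantile regression with a Huber penalty. `pred_quantiles[i]` estimates
    the taus[i]-quantile of the return; `target_samples` are bootstrapped
    target returns."""
    # Pairwise TD errors between every target sample and every predicted quantile.
    td = target_samples[None, :] - pred_quantiles[:, None]          # (n_tau, n_target)
    huber = np.where(np.abs(td) <= kappa,
                     0.5 * td ** 2,
                     kappa * (np.abs(td) - 0.5 * kappa))
    # Asymmetric weighting: under-estimates are penalized by tau, over-estimates by (1 - tau).
    weight = np.abs(taus[:, None] - (td < 0).astype(float))
    return (weight * huber / kappa).mean()


# Toy check: fitting three quantiles of a sampled return distribution.
taus = np.array([0.1, 0.5, 0.9])
loss = quantile_huber_loss(np.zeros(3), np.random.randn(32), taus)
```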
Model-based graph reinforcement learning for inductive traffic signal control
François-Xavier Devailly
Denis Larocque
Most reinforcement learning methods for adaptive traffic-signal control require training from scratch to be applied to any new intersection or after any modification to the road network, traffic distribution, or behavioral constraints experienced during training. Considering 1) the massive amount of experience required to train such methods, and 2) that this experience must be gathered by interacting in an exploratory fashion with real road-network users, such a lack of transferability limits experimentation and applicability. Recent approaches enable learning policies that generalize to unseen road-network topologies and traffic distributions, partially tackling this challenge. However, the literature remains divided between the learning of cyclic policies (the evolution of connectivity at an intersection must respect a cycle) and acyclic (less constrained) policies, and these transferable methods 1) are only compatible with cyclic constraints and 2) do not enable coordination. We introduce a new model-based method, MuJAM, which, on top of enabling explicit coordination at scale for the first time, pushes generalization further by allowing generalization to the controllers' constraints. In a zero-shot transfer setting involving both road networks and traffic settings never experienced during training, and in a larger transfer experiment involving the control of 3,971 traffic signal controllers in Manhattan, we show that MuJAM, using both cyclic and acyclic constraints, outperforms domain-specific baselines as well as another transferable approach.
Operational Research: methods and applications
Fotios Petropoulos
Gilbert Laporte
Emel Aktas
Sibel A. Alumur
Claudia Archetti
Hayriye Ayhan
Maria Battarra
Julia A. Bennell
Jean-Marie Bourjolly
John E. Boylan
Michèle Breton
David Canca
Bo Chen
Cihan Tugrul Cicek
Louis Anthony Cox
Christine S.M. Currie
Erik Demeulemeester
Li Ding
Stephen M. Disney
Matthias Ehrgott
Martin J. Eppler
Güneş Erdoğan
Bernard Fortz
L. Alberto Franco
Jens Frische
Salvatore Greco
Amanda J. Gregory
Raimo P. Hämäläinen
Willy Herroelen
Mike Hewitt
Jan Holmström
John N. Hooker
Tuğçe Işık
Jill Johnes
Bahar Y. Kara
Özlem Karsu
Katherine Kent
Charlotte Köhler
Martin Kunc
Yong-Hong Kuo
Judit Lienert
Adam N. Letchford
Janny Leung
Dong Li
Haitao Li
Ivana Ljubić
Sebastián Lozano
Virginie Lurkin
Silvano Martello
Ian G. McHale
Gerald Midgley
John D.W. Morecroft
Akshay Mutha
Ceyda Oğuz
Sanja Petrovic
Ulrich Pferschy
Harilaos N. Psaraftis
Sam Rose
Lauri Saarinen
Said Salhi
Jing-Sheng Song
Dimitrios Sotiros
Kathryn E. Stecke
Arne K. Strauss
İstenç Tarhan
Clemens Thielen
Paolo Toth
Greet Vanden Berghe
Christos Vasilakis
Vikrant Vaze
Daniele Vigo
Kai Virtanen
Xun Wang
Rafał Weron
Leroy White
Tom Van Woensel
Mike Yearworth
E. Alper Yıldırım
Georges Zaccour
Xuying Zhao
Throughout its history, Operational Research has evolved to include a variety of methods, models and algorithms that have been applied to a diverse and wide range of contexts. This encyclopedic article consists of two main sections: methods and applications. The first aims to summarise the up-to-date knowledge and provide an overview of the state-of-the-art methods and key developments in the various subdomains of the field. The second offers a wide-ranging list of areas where Operational Research has been applied. The article is meant to be read in a nonlinear fashion. It should be used as a point of reference or first port of call for a diverse pool of readers: academics, researchers, students, and practitioners. The entries within the methods and applications sections are presented in alphabetical order. The authors dedicate this paper to the 2023 Turkey/Syria earthquake victims. We sincerely hope that advances in OR will play a role towards minimising the pain and suffering caused by this and future catastrophes.
Predictive inference for travel time on transportation networks
Mohamad Elmasri
Aurélie Labbe
Denis Larocque
Challenging Common Assumptions about Catastrophic Forgetting and Knowledge Accumulation
Timothée Lesort
Oleksiy Ostapenko
Pau Rodriguez
Diganta Misra
Md Rifat Arefin
Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges
Massimo Caccia
Jonas Mueller
Taesup Kim
Rasool Fakoor
A Case Study of Instruction Tuning with Mixture of Parameter-Efficient Experts
Oleksiy Ostapenko
Lucas Caccia
Zhan Su
We study the applicability of mixture of parameter-efficient experts (MoPEs) for instruction-tuning large decoder-only language models. Recent literature indicates that MoPEs might enhance performance on specific multi-task instruction-following datasets. In this paper, we extend such previous results and study the applicability of MoPEs in settings previously overlooked: a) with open-domain instruction-following datasets; b) with recent decoder-only models; and c) with downstream out-of-distribution test sets. We build on top of LLaMA1-13B/-7B and LLaMA2-13B. We study different variants of learned routing, namely per-example routing ([PE]) and a more expensive per-token ([PT]) routing. Overall, we are unable to substantiate the strong performance gains observed in related studies in our setting. We observe occasional enhancements of LLaMA2 fine-tuned on the Open Platypus dataset in 0-shot SNI evaluation, and in TruthfulQA evaluation after fine-tuning on a subset of Flan. We shed some light on the inner workings of MoPEs by comparing different routing strategies. We find that [PE] routing tends to collapse at downstream evaluation time, reducing the importance of the router. We plan to publicly release our code.
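The difference between the two routing granularities compared above, per-example ([PE]) and per-token ([PT]), comes down to where the router's softmax weights are computed. A schematic PyTorch-style sketch with a simple linear router follows; the shapes and the mean-pooled example representation are illustrative assumptions, not the paper's exact architecture.

```python
import torch


def route_experts(hidden, router_weights, expert_outputs, per_token=False):
    """hidden: (batch, seq, d); router_weights: (d, n_experts);
    expert_outputs: (n_experts, batch, seq, d).
    Returns a mixture of expert outputs under [PE] or [PT] routing."""
    if per_token:
        # [PT]: one distribution over experts for every token.
        logits = hidden @ router_weights                       # (batch, seq, n_experts)
    else:
        # [PE]: pool the sequence first, so the whole example shares one routing decision.
        logits = hidden.mean(dim=1) @ router_weights           # (batch, n_experts)
        logits = logits.unsqueeze(1).expand(-1, hidden.size(1), -1)
    probs = torch.softmax(logits, dim=-1)                      # (batch, seq, n_experts)
    # Weighted sum of expert outputs over the expert dimension.
    return torch.einsum("bse,ebsd->bsd", probs, expert_outputs)


# Toy shapes: 2 examples, 5 tokens, hidden size 8, 4 experts.
h = torch.randn(2, 5, 8)
w = torch.randn(8, 4)
outs = torch.randn(4, 2, 5, 8)
mixed_pe = route_experts(h, w, outs, per_token=False)
mixed_pt = route_experts(h, w, outs, per_token=True)
```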