Laurent Charlin

sara.ebrahim-elkafrawy@mila.quebec

Dieng Awa

Master's Research - HEC Montréal

awa.dieng@mila.quebec

David Berger

PhD - Université de Montréal

bergerda@mila.quebec

Sal Elkafrawy

PhD - Université de Montréal

olivier.gouvert@mila.quebec

Olivier Gouvert

Postdoctorate

Shubham Gupta

PhD - Université Laval

Principal supervisor :

Cem (Yusuf) Subakan

shubham.gupta@mila.quebec

Ben Hudson

PhD - Université de Montréal

Co-supervisor :

Emma Frejinger

ben.hudson@mila.quebec

mizu.nishikawa-toomey@mila.quebec

Mizu Nishikawa-Toomey

PhD - Université de Montréal

Co-supervisor :

Dhanya Sridhar

PhD - Concordia University

Principal supervisor :

Mirco Ravanelli

firat.oncel@mila.quebec

PhD - Université de Montréal

ostapeno@mila.quebec

emiliano.penaloza@mila.quebec

Emiliano Penaloza

PhD - Université de Montréal

omar.salemohamed@mila.quebec

Omar Salemohamed

Master's Research - Université de Montréal

Yipeng Zhang

PhD - Université de Montréal

yipeng.zhang@mila.quebec

Publications

Towards Modular LLMs by Building and Reusing a Library of LoRAs

Oleksiy Ostapenko

Zhan Su

Edoardo Ponti

Nicolas Le Roux

Matheus Pereira

Lucas Caccia

Alessandro Sordoni

The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such a library. We benchmark existing approaches to build this library and introduce model-based clustering, MBC, a method that groups tasks based on the similarity of their adapter parameters, indirectly optimizing for transfer across the multi-task dataset. To re-use the library, we present a novel zero-shot routing mechanism, Arrow, which enables dynamic selection of the most relevant adapters for new inputs without the need for retraining. We experiment with several LLMs, such as Phi-2 and Mistral, on a wide array of held-out tasks, verifying that MBC-based adapters and Arrow routing lead to superior generalization to new tasks. We make steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training.
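The idea of grouping tasks by the similarity of their adapter parameters can be illustrated with a small sketch. This is not the authors' MBC implementation: it assumes each task's adapter has already been flattened into a vector, uses cosine similarity with a plain k-means-style loop, and the function name `cluster_adapters` and the toy vectors are hypothetical.

```python
import numpy as np

def cluster_adapters(adapter_params, n_clusters, n_iters=20, seed=0):
    """Group tasks by cosine similarity of their flattened adapter parameters.

    adapter_params: dict mapping task name -> 1-D NumPy array (hypothetical
    flattened adapter weights, all of the same length).
    Returns a dict mapping cluster index -> list of task names.
    """
    tasks = sorted(adapter_params)
    X = np.stack([adapter_params[t] for t in tasks]).astype(float)
    # Unit-normalize rows so that dot products equal cosine similarities.
    X /= np.linalg.norm(X, axis=1, keepdims=True) + 1e-12

    rng = np.random.default_rng(seed)
    # Initialize centers from randomly chosen tasks (fancy indexing copies).
    centers = X[rng.choice(len(tasks), size=n_clusters, replace=False)]
    for _ in range(n_iters):
        # Assign each task to its most similar center.
        labels = np.argmax(X @ centers.T, axis=1)
        # Move each non-empty center to the normalized mean of its members.
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) > 0:
                c = members.mean(axis=0)
                centers[k] = c / (np.linalg.norm(c) + 1e-12)

    labels = np.argmax(X @ centers.T, axis=1)
    clusters = {}
    for task, k in zip(tasks, labels):
        clusters.setdefault(int(k), []).append(task)
    return clusters

# Toy example: two pairs of nearly parallel "adapters".
toy = {
    "a1": np.array([1.0, 0.0]), "a2": np.array([1.0, 0.01]),
    "b1": np.array([0.0, 1.0]), "b2": np.array([0.01, 1.0]),
}
print(cluster_adapters(toy, n_clusters=2))
```

In a setting like the one described, one adapter would then be trained per resulting cluster rather than per task; that step is omitted here.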

2024-05-18

ArXiv (preprint)

Towards Modular LLMs by Building and Reusing a Library of LoRAs

Oleksiy Ostapenko

Zhan Su

Edoardo Ponti

Nicolas Le Roux

Matheus Pereira

Lucas Caccia

Alessandro Sordoni

2024-05-01

ICML.cc/2024/Conference (poster)

Applying Recurrent Neural Networks and Blocked Cross-Validation to Model Conventional Drinking Water Treatment Processes

Aleksandar Jakovljevic

Benoit Barbeau

The jar test is the current standard method for predicting the performance of a conventional drinking water treatment (DWT) process and optimizing the coagulant dose. This test is time-consuming and requires human intervention, meaning it is infeasible for making continuous process predictions. As a potential alternative, we developed a machine learning (ML) model from historical DWT plant data that can operate continuously using real-time sensor data without human intervention for predicting clarified water turbidity 15 min in advance. We evaluated three types of models: multilayer perceptron (MLP), the long short-term memory (LSTM) recurrent neural network (RNN), and the gated recurrent unit (GRU) RNN. We also employed two training methodologies: the commonly used holdout method and the theoretically correct blocked cross-validation (BCV) method. We found that the RNN with GRU was the best model type overall and achieved a mean absolute error on an independent production set of as low as 0.044 NTU. We further found that models trained using BCV typically achieve errors equal to or lower than their counterparts trained using holdout. These results suggest that RNNs trained using BCV are superior for the development of ML models for DWT processes compared to those reported in earlier literature.
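For readers unfamiliar with blocked cross-validation, the following sketch shows one common formulation for time-ordered data. It is an illustration only and may differ from the protocol used in the paper; the function name `blocked_cv_splits` and the `gap` parameter are assumptions introduced here.

```python
def blocked_cv_splits(n_samples, n_blocks, gap=0):
    """Yield (train_idx, val_idx) pairs for blocked cross-validation.

    The time-ordered samples 0..n_samples-1 are cut into n_blocks contiguous
    blocks. Each block serves once as the validation set. The remaining
    samples form the training set, minus `gap` samples on each side of the
    validation block to limit leakage between neighbouring time steps.
    """
    if not 1 < n_blocks <= n_samples:
        raise ValueError("n_blocks must be between 2 and n_samples")
    # Block boundaries, computed with integer arithmetic.
    bounds = [i * n_samples // n_blocks for i in range(n_blocks + 1)]
    for start, stop in zip(bounds[:-1], bounds[1:]):
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples)
                     if i < start - gap or i >= stop + gap]
        yield train_idx, val_idx

# Example: 10 time steps, 5 blocks, 1-sample gap around each validation block.
for train_idx, val_idx in blocked_cv_splits(10, 5, gap=1):
    print("train:", train_idx, "val:", val_idx)
```

Unlike a random holdout split, every validation set here is a contiguous stretch of time, so a model is never validated on samples interleaved with its training data.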

2024-04-04

Water (published)

LitLLM: A Toolkit for Scientific Literature Review

Shubham Agarwal

Issam Hadj Laradji

Chris Pal

Conducting literature reviews for scientific papers is essential for understanding research, its limitations, and building on existing work. It is a tedious task which makes an automatic literature review generator appealing. Unfortunately, many existing works that generate such reviews using Large Language Models (LLMs) have significant limitations. They tend to hallucinate (generate non-actual information) and ignore the latest research they have not been trained on. To address these limitations, we propose a toolkit that operates on Retrieval Augmented Generation (RAG) principles, specialized prompting and instructing techniques with the help of LLMs. Our system first initiates a web search to retrieve relevant papers by summarizing user-provided abstracts into keywords using an off-the-shelf LLM. Authors can enhance the search by supplementing it with relevant papers or keywords, contributing to a tailored retrieval process. Second, the system re-ranks the retrieved papers based on the user-provided abstract. Finally, the related work section is generated based on the re-ranked results and the abstract. There is a substantial reduction in time and effort for literature review compared to traditional methods, establishing our toolkit as an efficient alternative. Our open-source toolkit is accessible at https://github.com/shubhamagarwal92/LitLLM and Huggingface space (https://huggingface.co/spaces/shubhamagarwal92/LitLLM) with the video demo at https://youtu.be/E2ggOZBAFw0.

2024-02-02

ArXiv (preprint)

Improving the Generalizability and Robustness of Large-Scale Traffic Signal Control

Tianyu Shi

François-Xavier Devailly

Denis Larocque

A number of deep reinforcement-learning (RL) approaches propose to control traffic signals. Compared to traditional approaches, RL approaches can learn from higher-dimensionality input road and vehicle sensors and better adapt to varying traffic conditions resulting in reduced travel times (in simulation). However, these RL methods require training from massive traffic sensor data. To offset this relative inefficiency, some recent RL methods have the ability to first learn from small-scale networks and then generalize to unseen city-scale networks without additional retraining (zero-shot transfer). In this work, we study the robustness of such methods along two axes. First, sensor failures and GPS occlusions create missing-data challenges and we show that recent methods remain brittle in the face of these missing data. Second, we provide a more systematic study of the generalization ability of RL methods to new networks with different traffic regimes. Again, we identify the limitations of recent approaches. We then propose using a combination of distributional and vanilla reinforcement learning through a policy ensemble. Building upon the state-of-the-art previous model which uses a decentralized approach for large-scale traffic signal control with graph convolutional networks (GCNs), we first learn models using a distributional reinforcement learning (DisRL) approach. In particular, we use implicit quantile networks (IQN) to model the state-action return distribution with quantile regression. For traffic signal control problems, an ensemble of standard RL and DisRL yields superior performance across different scenarios, including different levels of missing sensor data and traffic flow patterns. Furthermore, the learning scheme of the resulting model can improve zero-shot transferability to different road network structures, including both synthetic networks and real-world networks (e.g., Luxembourg, Manhattan).
We conduct extensive experiments to compare our approach to multi-agent reinforcement learning and traditional transportation approaches. Results show that the proposed method improves robustness and generalizability in the face of missing data, varying road networks, and traffic flows.
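The quantile regression mentioned in this abstract rests on the quantile (pinball) loss, which can be sketched as follows. This is the plain, unsmoothed form of the loss and is not necessarily the exact variant used in the paper; the function name `quantile_loss` and the toy data are assumptions for illustration.

```python
import numpy as np

def quantile_loss(predictions, targets, tau):
    """Mean pinball loss for a quantile level tau in (0, 1).

    Under-predictions (target above prediction) are weighted by tau and
    over-predictions by (1 - tau), so the minimizer is the tau-quantile.
    """
    errors = np.asarray(targets, dtype=float) - np.asarray(predictions, dtype=float)
    return float(np.mean(np.maximum(tau * errors, (tau - 1.0) * errors)))

# Toy check: the constant that minimizes the loss is the tau-quantile.
returns = np.arange(1, 101, dtype=float)  # hypothetical sampled returns
candidates = np.arange(1, 101, dtype=float)
losses = [quantile_loss(np.full_like(returns, c), returns, tau=0.9)
          for c in candidates]
print("best constant for tau=0.9:", candidates[int(np.argmin(losses))])
```

Fitting several quantile levels at once gives an approximation of the whole return distribution rather than only its mean, which is the motivation for distributional RL.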

2024-01-01

IEEE Open Journal of Intelligent Transportation Systems (published)

Model-based graph reinforcement learning for inductive traffic signal control

François-Xavier Devailly

Denis Larocque

Most reinforcement learning methods for adaptive-traffic-signal-control require training from scratch to be applied on any new intersection or after any modification to the road network, traffic distribution, or behavioral constraints experienced during training. Considering 1) the massive amount of experience required to train such methods, and 2) that experience must be gathered by interacting in an exploratory fashion with real road-network-users, such a lack of transferability limits experimentation and applicability. Recent approaches enable learning policies that generalize for unseen road-network topologies and traffic distributions, partially tackling this challenge. However, the literature remains divided between the learning of cyclic (the evolution of connectivity at an intersection must respect a cycle) and acyclic (less constrained) policies, and these transferable methods 1) are only compatible with cyclic constraints and 2) do not enable coordination. We introduce a new model-based method, MuJAM, which, on top of enabling explicit coordination at scale for the first time, pushes generalization further by allowing a generalization to the controllers' constraints. In a zero-shot transfer setting involving both road networks and traffic settings never experienced during training, and in a larger transfer experiment involving the control of 3,971 traffic signal controllers in Manhattan, we show that MuJAM, using both cyclic and acyclic constraints, outperforms domain-specific baselines as well as another transferable approach.

2024-01-01

IEEE Open Journal of Intelligent Transportation Systems (published)

Operational Research: methods and applications

Fotios Petropoulos

Gilbert Laporte

Emel Aktas

Sibel A. Alumur

Claudia Archetti

Hayriye Ayhan

Maria Battarra

Julia A. Bennell

Jean-Marie Bourjolly

John E. Boylan

Michèle Breton

David Canca

Bo Chen

Cihan Tugrul Cicek

Louis Anthony Cox

Christine S.M. Currie

Erik Demeulemeester

Li Ding

Stephen M. Disney

Matthias Ehrgott

Martin J. Eppler

Güneş Erdoğan

Bernard Fortz

L. Alberto Franco

Jens Frische

Salvatore Greco

Amanda J. Gregory

Raimo P. Hämäläinen

Willy Herroelen

Mike Hewitt

Jan Holmström

John N. Hooker

Tuğçe Işık

Jill Johnes

Bahar Y. Kara

Özlem Karsu

Katherine Kent

Charlotte Köhler

Martin Kunc

Yong-Hong Kuo

Judit Lienert

Adam N. Letchford

Janny Leung

Dong Li

Haitao Li

Ivana Ljubić

Andrea Lodi

Sebastián Lozano

Virginie Lurkin

Silvano Martello

Ian G. McHale

Gerald Midgley

John D.W. Morecroft

Akshay Mutha

Ceyda Oğuz

Sanja Petrovic

Ulrich Pferschy

Harilaos N. Psaraftis

Sam Rose

Lauri Saarinen

Said Salhi

Jing-Sheng Song

Dimitrios Sotiros

Kathryn E. Stecke

Arne K. Strauss

İstenç Tarhan

Clemens Thielen

Paolo Toth

Greet Vanden Berghe

Christos Vasilakis

Vikrant Vaze

Daniele Vigo

Kai Virtanen

Xun Wang

Rafał Weron

Leroy White

Tom Van Woensel

Mike Yearworth

E. Alper Yıldırım

Georges Zaccour

Xuying Zhao

Throughout its history, Operational Research has evolved to include a variety of methods, models and algorithms that have been applied to a diverse and wide range of contexts. This encyclopedic article consists of two main sections: methods and applications. The first aims to summarise the up-to-date knowledge and provide an overview of the state-of-the-art methods and key developments in the various subdomains of the field. The second offers a wide-ranging list of areas where Operational Research has been applied. The article is meant to be read in a nonlinear fashion. It should be used as a point of reference or first-port-of-call for a diverse pool of readers: academics, researchers, students, and practitioners. The entries within the methods and applications sections are presented in alphabetical order. The authors dedicate this paper to the 2023 Turkey/Syria earthquake victims. We sincerely hope that advances in OR will play a role towards minimising the pain and suffering caused by this and future catastrophes.

2024-01-01

J. Oper. Res. Soc. (published)

Predictive inference for travel time on transportation networks

Mohamad Elmasri

Aurélie Labbe

Denis Larocque

2023-12-01

The Annals of Applied Statistics (published)

Challenging Common Assumptions about Catastrophic Forgetting and Knowledge Accumulation

Timothee Lesort

Oleksiy Ostapenko

Pau Rodriguez

Diganta Misra

Md Rifat Arefin

Irina Rish

2023-11-20

Proceedings of The 2nd Conference on Lifelong Learning Agents (published)

proceedings.mlr.press

Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges

Massimo Caccia

Jonas Mueller

Taesup Kim

Rasool Fakoor

2023-11-20

Proceedings of The 2nd Conference on Lifelong Learning Agents (published)

proceedings.mlr.press

A Case Study of Instruction Tuning with Mixture of Parameter-Efficient Experts

Oleksiy Ostapenko

Lucas Caccia

Zhan Su

Nicolas Le Roux

Alessandro Sordoni

We study the applicability of mixture of parameter-efficient experts (MoPEs) for instruction-tuning large decoder-only language models. Recent literature indicates that MoPEs might enhance performance in specific multi-task instruction-following datasets. In this paper, we extend such previous results and study the applicability of MoPEs in settings previously overlooked: a) with open-domain instruction-following datasets; b) with recent decoder-only models and c) with downstream out-of-distribution test sets. We build on top of LLaMA1-13B/-7B and LLaMA2-13B. We study different variants of learned routing, namely per-example routing ([PE]), and a more expensive per-token ([PT]) routing. Overall, we are unable to substantiate strong performance gains observed in related studies in our setting. We observe occasional enhancements of LLaMA2 fine-tuned on the Open Platypus dataset in 0-shot SNI evaluation and TruthfulQA evaluation after fine-tuning on a subset of Flan. We shed some light on the inner workings of MoPEs by comparing different routing strategies. We find that [PE] routing tends to collapse at downstream evaluation time, reducing the importance of the router's application. We plan to publicly release our code.

2023-10-28

NeurIPS.cc/2023/Workshop/Instruction (published)

Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network

Tristan Deleu

Mizu Nishikawa-Toomey

Jithendaraa Subramanian

Nikolay Malkin

Yoshua Bengio

Generative Flow Networks (GFlowNets), a class of generative models over discrete and structured sample spaces, have been previously applied to the problem of inferring the marginal posterior distribution over the directed acyclic graph (DAG) of a Bayesian Network, given a dataset of observations. Based on recent advances extending this framework to non-discrete sample spaces, we propose in this paper to approximate the joint posterior over not only the structure of a Bayesian Network, but also the parameters of its conditional probability distributions. We use a single GFlowNet whose sampling policy follows a two-phase process: the DAG is first generated sequentially one edge at a time, and then the corresponding parameters are picked once the full structure is known. Since the parameters are included in the posterior distribution, this leaves more flexibility for the local probability models of the Bayesian Network, making our approach applicable even to non-linear models parametrized by neural networks. We show that our method, called JSP-GFN, offers an accurate approximation of the joint posterior, while comparing favorably against existing methods on both simulated and real data.