Laurent Charlin

Core Academic Member

lcharlin@mila.quebec

Canada CIFAR AI Chair

Associate Professor, HEC Montréal, Department of Decision Sciences

Associate Professor, Université de Montréal, Department of Computer Science and Operations Research

Research Topics

AI for Science

Data Mining

Deep Learning

Generative Models

Graph Neural Networks

Information Retrieval

Natural Language Processing

Probabilistic Models

Recommender Systems

Reinforcement Learning

Representation Learning

Google Scholar

Biography

Laurent Charlin is a Canada CIFAR AI Chair at Mila and an associate professor at HEC Montréal, the business school affiliated with Université de Montréal. He is also a core member of Mila, the Quebec Institute for Artificial Intelligence.

Charlin’s research focuses on developing novel machine learning models to aid in decision-making. Recent work has focused on learning from data that changes over time, and on applications in fields such as recommender systems and optimization.

He has a number of highly cited publications on dialogue systems (chatbots). He co-developed the Toronto Paper Matching System (TPMS), which has been widely used by computer science conferences for matching reviewers to papers. He has also taught MOOCs and given introductory talks and media interviews to contribute to knowledge transfer and improve AI literacy.

Current Students

Neda Adl

Master's Research - HEC Montréal

Anirudh Buvanesh

PhD - Université de Montréal

Co-supervisor:

Aaron Courville

Github

Félix Gauthier

Master's Research - HEC Montréal

Soraya Ghassemlou

Master's Research - McGill University

Website

Github

Nicolas Goulet

PhD - HEC Montréal

Principal supervisor:

Eva Portelance

Google Scholar

Shubham Gupta

PhD - Université Laval

Principal supervisor:

Cem Subakan

Ben Hudson

PhD - Université de Montréal

Co-supervisor:

Mizu Nishikawa-Toomey

PhD - Université de Montréal

Co-supervisor:

PhD - Concordia University

Principal supervisor:

Collaborating Alumni - Université de Montréal

Google Scholar

Emiliano Penaloza

PhD - Université de Montréal

Website

Github

Gaurav Sahu

Postdoctorate - HEC Montréal

Co-supervisor:

PhD - Université de Montréal

Yipeng Zhang

PhD - Université de Montréal

Google Scholar

Publications

Focused Hierarchical RNNs for Conditional Sequence Processing

Nan Rosemary Ke

Adam Trischler

Recurrent Neural Networks (RNNs) with attention mechanisms have obtained state-of-the-art results for many sequence processing tasks. Most of these models use a simple form of encoder with attention that looks over the entire sequence and assigns a weight to each token independently. We present a mechanism for focusing RNN encoders for sequence modelling tasks which allows them to attend to key parts of the input as needed. We formulate this using a multi-layer conditional sequence encoder that reads in one token at a time and makes a discrete decision on whether the token is relevant to the context or question being asked. The discrete gating mechanism takes in the context embedding and the current hidden state as inputs and controls information flow into the layer above. We train it using policy gradient methods. We evaluate this method on several types of tasks with different attributes. First, we evaluate the method on synthetic tasks which allow us to evaluate the model for its generalization ability and probe the behavior of the gates in more controlled settings. We then evaluate this approach on large scale Question Answering tasks including the challenging MS MARCO and SearchQA tasks. Our models show consistent improvements for both tasks over prior work and our baselines. They have also been shown to generalize significantly better on synthetic tasks as compared to the baselines.

2018-07-02

Proceedings of the 35th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

Towards Deep Conversational Recommendations

Raymond Li

Samira Ebrahimi Kahou

Hannes Schulz

Vincent Michalski

Laurent Charlin

Chris Pal

There has been growing interest in using neural networks and deep learning techniques to create dialogue systems. Conversational recommendation is an interesting setting for the scientific exploration of dialogue with natural language as the associated discourse involves goal-driven dialogue that often transforms naturally into more free-form chat. This paper provides two contributions. First, until now there has been no publicly available large-scale dataset consisting of real-world dialogues centered around recommendations. To address this issue and to facilitate our exploration here, we have collected ReDial, a dataset consisting of over 10,000 conversations centered around the theme of providing movie recommendations. We make this data available to the community for further research. Second, we use this dataset to explore multiple facets of conversational recommendations. In particular we explore new neural architectures, mechanisms, and methods suitable for composing conversational recommendation systems. Our dataset allows us to systematically probe model sub-components addressing different parts of the overall problem domain ranging from: sentiment analysis and cold-start recommendation generation to detailed aspects of how natural language is used in this setting in the real world. We combine such sub-components into a full-blown dialogue system and examine its behavior.

2017-12-31

Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (published)

doi.org

arxiv.org

Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks

Nan Rosemary Ke

A major drawback of backpropagation through time (BPTT) is the difficulty of learning long-term dependencies, coming from having to propagate credit information backwards through every single step of the forward computation. This makes BPTT both computationally impractical and biologically implausible. For this reason, full backpropagation through time is rarely used on long sequences, and truncated backpropagation through time is used as a heuristic. However, this usually leads to biased estimates of the gradient in which longer term dependencies are ignored. Addressing this issue, we propose an alternative algorithm, Sparse Attentive Backtracking, which might also be related to principles used by brains to learn long-term dependencies. Sparse Attentive Backtracking learns an attention mechanism over the hidden states of the past and selectively backpropagates through paths with high attention weights. This allows the model to learn long term dependencies while only backtracking for a small number of time steps, not just from the recent past but also from attended relevant past states.

2017-11-06

ArXiv (preprint)

doi.org

openreview.net

Learnable Explicit Density for Continuous Latent Space and Variational Inference

In this paper, we study two aspects of the variational autoencoder (VAE): the prior distribution over the latent variables and its corresponding posterior. First, we decompose the learning of VAEs into layerwise density estimation, and argue that having a flexible prior is beneficial to both sample generation and inference. Second, we analyze the family of inverse autoregressive flows (inverse AF) and show that with further improvement, inverse AF could be used as universal approximation to any complicated posterior. Our analysis results in a unified approach to parameterizing a VAE, without the need to restrict ourselves to use factorial Gaussians in the latent real space.

2017-10-05

ArXiv (preprint)

arxiv.org

A Sparse Probabilistic Model of User Preference Data

Matthew J. A. Smith

Laurent Charlin

Joelle Pineau

2017-04-10

Advances in Artificial Intelligence (published)

doi.org

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues

Iulian V. Serban

Sequential data often possesses a hierarchical structure with complex dependencies between subsequences, such as found between the utterances in a dialogue. In an effort to model this kind of generative process, we propose a neural network-based generative architecture, with latent stochastic variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with recent neural network architectures. We evaluate the model performance through automatic evaluation metrics and by carrying out a human evaluation. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate the generation of long outputs and maintain the context.

2017-02-11

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Training End-to-End Dialogue Systems with the Ubuntu Dialogue Corpus

Iulian Vlad Serban

Chia-Wei Liu

In this paper, we construct and train end-to-end neural network-based dialogue systems using an updated version of the recent Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This dataset is interesting because of its size, long context lengths, and technical nature; thus, it can be used to train large models directly from data with minimal feature engineering, which can be both time consuming and expensive. We provide baselines in two different environments: one where models are trained to maximize the log-likelihood of a generated utterance conditioned on the context of the conversation, and one where models are trained to select the correct next response from a list of candidate responses. These are both evaluated on a recall task that we call Next Utterance Classification (NUC), as well as other generation-specific metrics. Finally, we provide a qualitative error analysis to help determine the most promising directions for future research on the Ubuntu Dialogue Corpus, and for end-to-end dialogue systems in general.

2017-01-19

Dialogue & Discourse (published)

doi.org

