Alessandro Sordoni

Core Industry Member

Adjunct professor, Université de Montréal, Department of Computer Science and Operations Research

Research Scientist, Microsoft Research Montréal

Research Topics

Large Language Models (LLM)

Natural Language Processing

Reasoning

Biography

I am a principal researcher at Microsoft Research Montréal.

For my PhD at Université de Montréal under the direction of Jian-Yun Nie, I investigated how to effectively represent documents and queries for information retrieval.

Recently, I have been motivated to study the efficiency of learning and systematic generalization in current large deep learning models. My interests span the fields of unsupervised learning and few-shot learning, especially in NLP.

Current Students

Zhan Su

Collaborating Alumni - University of Copenhagen

Publications

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues

Iulian V. Serban

Sequential data often possesses hierarchical structures with complex dependencies between sub-sequences, such as found between the utterances in a dialogue. To model these dependencies in a generative framework, we propose a neural network-based generative architecture, with stochastic latent variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with other recent neural-network architectures. We evaluate the model performance through a human evaluation study. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate both the generation of meaningful, long and diverse responses and maintaining dialogue state.

2016-05-01

ArXiv (preprint)

Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models

Iulian V. Serban

We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how its performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.

2016-03-05

Proceedings of the AAAI Conference on Artificial Intelligence (published)
