Portrait of Alessandro Sordoni

Alessandro Sordoni

Core Industry Member
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Scientist, Microsoft Research Montréal
Research Topics
Large Language Models (LLM)
Reasoning
Natural Language Processing

Biography

I am a principal researcher at Microsoft Research Montréal. I earned my PhD at the Université de Montréal under the supervision of Jian-Yun Nie, studying how to efficiently represent documents and queries for information retrieval. Currently, I am interested in learning efficiency and systematic generalization in today's large deep learning models. My interests extend to unsupervised learning and few-shot learning, particularly in the natural language domain.

Current Students

Alumni collaborator - University of Copenhagen

Publications

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
Sequential data often possesses hierarchical structures with complex dependencies between sub-sequences, such as found between the utterances in a dialogue. To model these dependencies in a generative framework, we propose a neural network-based generative architecture, with stochastic latent variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with other recent neural-network architectures. We evaluate the model performance through a human evaluation study. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate both the generation of meaningful, long and diverse responses and maintaining dialogue state.
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models
We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how its performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.