Portrait of Dzmitry Bahdanau

Dzmitry Bahdanau

Core Industry Member
Canada CIFAR AI Chair
Adjunct professor, McGill University, School of Computer Science
AI Research Scientist, Periodic
Research Topics
Deep Learning
Natural Language Processing

Biography

Dzmitry Bahdanau is an adjunct professor at McGill University and a research scientist at ServiceNow.

He completed his PhD at Mila – Quebec Artificial Intelligence Institute and Université de Montréal under the direction of Yoshua Bengio.

Bahdanau is interested in fundamental and applied questions concerning natural language understanding. His main research areas include semantic parsing, language user interfaces, systematic generalization and hybrid neural-symbolic systems.

Current Students

Master's Research - McGill University
Principal supervisor :
Master's Research - McGill University
Principal supervisor :
PhD - McGill University
Co-supervisor :

Publications

Compositional Generalization in Dependency Parsing
Compositionality— the ability to combine familiar units like words into novel phrases and sentences— has been the focus of intense inter… (see more)est in artificial intelligence in recent years. To test compositional generalization in semantic parsing, Keysers et al. (2020) introduced Compositional Freebase Queries (CFQ). This dataset maximizes the similarity between the test and train distributions over primitive units, like words, while maximizing the compound divergence: the dissimilarity between test and train distributions over larger structures, like phrases. Dependency parsing, however, lacks a compositional generalization benchmark. In this work, we introduce a gold-standard set of dependency parses for CFQ, and use this to analyze the behaviour of a state-of-the art dependency parser (Qi et al., 2020) on the CFQ dataset. We find that increasing compound divergence degrades dependency parsing performance, although not as dramatically as semantic parsing performance. Additionally, we find the performance of the dependency parser does not uniformly degrade relative to compound divergence, and the parser performs differently on different splits with the same compound divergence. We explore a number of hypotheses for what causes the non-uniform degradation in dependency parsing performance, and identify a number of syntactic structures that drive the dependency parser’s lower performance on the most challenging splits.
Combating False Negatives in Adversarial Imitation Learning
Léonard Boussioux
David Y. T. Hui
Maxime Chevalier-Boisvert
In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the de… (see more)sired behavior. However, as the trained policy learns to be more successful, the negative examples (the ones produced by the agent) become increasingly similar to expert ones. Despite the fact that the task is successfully accomplished in some of the agent's trajectories, the discriminator is trained to output low values for them. We hypothesize that this inconsistent training signal for the discriminator can impede its learning, and consequently leads to worse overall performance of the agent. We show experimental evidence for this hypothesis and that the ‘False Negatives’ (i.e. successful agent episodes) significantly hinder adversarial imitation learning, which is the first contribution of this paper. Then, we propose a method to alleviate the impact of false negatives and test it on the BabyAI environment. This method consistently improves sample efficiency over the baselines by at least an order of magnitude.
Understanding by Understanding Not: Modeling Negation in Language Models
Negation is a core construction in natural language. Despite being very successful on many tasks, state-of-the-art pre-trained language mode… (see more)ls often handle negation incorrectly. To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus. By training BERT with the resulting combined objective we reduce the mean top 1 error rate to 4% on the negated LAMA dataset. We also see some improvements on the negated NLI benchmarks.
BabyAI 1.1
David Y. T. Hui
Maxime Chevalier-Boisvert
BabyAI 1.1
David Y. T. Hui
Maxime Chevalier-Boisvert
The BabyAI platform is designed to measure the sample efficiency of training an agent to follow grounded-language instructions. BabyAI 1.0 … (see more)presents baseline results of an agent trained by deep imitation or reinforcement learning. BabyAI 1.1 improves the agent’s architecture in three minor ways. This increases reinforcement learning sample efficiency by up to 3 × and improves imitation learning performance on the hardest level from 77% to 90 . 4% . We hope that these improvements increase the computational efficiency of BabyAI experiments and help users design better agents.
BabyAI 1.1
David Y. T. Hui
Maxime Chevalier-Boisvert
BabyAI 1.1
David Y. T. Hui
Maxime Chevalier-Boisvert
The BabyAI platform is designed to measure the sample efficiency of training an agent to follow grounded-language instructions. BabyAI 1.0 p… (see more)resents baseline results of an agent trained by deep imitation or reinforcement learning. BabyAI 1.1 improves the agent's architecture in three minor ways. This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning performance on the hardest level from 77 % to 90.4 %. We hope that these improvements increase the computational efficiency of BabyAI experiments and help users design better agents.
BabyAI 1.1
David Y. T. Hui
Maxime Chevalier-Boisvert
The BabyAI platform is designed to measure the sample efficiency of training an agent to follow grounded-language instructions. BabyAI 1.0 p… (see more)resents baseline results of an agent trained by deep imitation or reinforcement learning. BabyAI 1.1 improves the agent's architecture in three minor ways. This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning performance on the hardest level from 77 % to 90.4 %. We hope that these improvements increase the computational efficiency of BabyAI experiments and help users design better agents.
BabyAI 1.1
David Y. T. Hui
Maxime Chevalier-Boisvert
The BabyAI platform is designed to measure the sample efficiency of training an agent to follow grounded-language instructions. BabyAI 1.0 p… (see more)resents baseline results of an agent trained by deep imitation or reinforcement learning. BabyAI 1.1 improves the agent's architecture in three minor ways. This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning performance on the hardest level from 77 % to 90.4 %. We hope that these improvements increase the computational efficiency of BabyAI experiments and help users design better agents.
Combating False Negatives in Adversarial Imitation Learning (Student Abstract)
Léonard Boussioux
David Y. T. Hui
Maxime Chevalier-Boisvert
CLOSURE: Assessing Systematic Generalization of CLEVR Models
Automated curriculum generation for Policy Gradients from Demonstrations