Publications

Batch Weight for Domain Adaptation With Mass Shift

R Devon Hjelm

Unsupervised domain transfer is the task of transferring or translating samples from a source distribution to a different target distributio… (see more)n. Current solutions unsupervised domain transfer often operate on data on which the modes of the distribution are well-matched, for instance have the same frequencies of classes between source and target distributions. However, these models do not perform well when the modes are not well-matched, as would be the case when samples are drawn independently from two different, but related, domains. This mode imbalance is problematic as generative adversarial networks (GANs), a successful approach in this setting, are sensitive to mode frequency, which results in a mismatch of semantics between source samples and generated samples of the target distribution. We propose a principled method of re-weighting training samples to correct for such mass shift between the transferred distributions, which we call batch weight. We also provide rigorous probabilistic setting for domain transfer and new simplified objective for training transfer networks, an alternative to complex, multi-component loss functions used in the current state-of-the art image-to-image translation models. The new objective stems from the discrimination of joint distributions and enforces cycle-consistency in an abstract, high-level, rather than pixel-wise, sense. Lastly, we experimentally show the effectiveness of the proposed methods in several image-to-image translation tasks.

2019-11-01

2019 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

Improved Conditional VRNNs for Video Prediction

Lluis Castrejon

Nicolas Ballas

Aaron Courville

Predicting future frames for a video sequence is a challenging generative modeling task. Promising approaches include probabilistic latent v… (see more)ariable models such as the Variational Auto-Encoder. While VAEs can handle uncertainty and model multiple possible future outcomes, they have a tendency to produce blurry predictions. In this work we argue that this is a sign of underfitting. To address this issue, we propose to increase the expressiveness of the latent distributions and to use higher capacity likelihood models. Our approach relies on a hierarchy of latent variables, which defines a family of flexible prior and posterior distributions in order to better model the probability of future sequences. We validate our proposal through a series of ablation experiments and compare our approach to current state-of-the-art latent variable models. Our method performs favorably under several metrics in three different datasets.

2019-11-01

2019 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction

Alaaeldin El-Nouby

Shikhar Sharma

Hannes Schulz

Devon Hjelm

Layla El Asri

Samira Ebrahimi Kahou

Yoshua Bengio

Graham W. Taylor

Conditional text-to-image generation is an active area of research, with many possible applications. Existing research has primarily focused… (see more) on generating a single image from available conditioning information in one step. One practical extension beyond one-step generation is a system that generates an image iteratively, conditioned on ongoing linguistic input or feedback. This is significantly more challenging than one-step generation tasks, as such a system must understand the contents of its generated images with respect to the feedback history, the current feedback, as well as the interactions among concepts present in the feedback history. In this work, we present a recurrent image generation model which takes into account both the generated output up to the current step as well as all past instructions for generation. We show that our model is able to generate the background, add new objects, and apply simple transformations to existing objects. We believe our approach is an important step toward interactive generation. Code and data is available at: https://www.microsoft.com/en-us/research/project/generative-neural-visual-artist-geneva/ .

2019-11-01

2019 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

Can a Gorilla Ride a Camel? Learning Semantic Plausibility from Text

Ian Porada

Kaheer Suleman

Jackie CK Cheung

2019-10-31

Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing (published)

doi.org

arxiv.org

CoQA: A Conversational Question Answering Challenge

Siva Reddy

Danqi Chen

Christopher D Manning

Humans gather information through conversations involving a series of interconnected questions and answers. For machines to assist in inform… (see more)ation gathering, it is therefore essential to enable them to answer conversational questions. We introduce CoQA, a novel dataset for building Conversational Question Answering systems. Our dataset contains 127k questions with answers, obtained from 8k conversations about text passages from seven diverse domains. The questions are conversational, and the answers are free-form text with their corresponding evidence highlighted in the passage. We analyze CoQA in depth and show that conversational questions have challenging phenomena not present in existing reading comprehension datasets (e.g., coreference and pragmatic reasoning). We evaluate strong dialogue and reading comprehension models on CoQA. The best system obtains an F1 score of 65.4%, which is 23.4 points behind human performance (88.8%), indicating that there is ample room for improvement. We present CoQA as a challenge to the community at https://stanfordnlp.github.io/coqa.

2019-10-31

Transactions of the Association for Computational Linguistics (published)

doi.org

arxiv.org

Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses

Matt Grenander

Yue Dong

Jackie CK Cheung

Annie Priyadarshini Louis

Sentence position is a strong feature for news summarization, since the lead often (but not always) summarizes the key points of the article… (see more). In this paper, we show that recent neural systems excessively exploit this trend, which although powerful for many inputs, is also detrimental when summarizing documents where important content should be extracted from later parts of the article. We propose two techniques to make systems sensitive to the importance of content in different parts of the article. The first technique employs ‘unbiased’ data; i.e., randomly shuffled sentences of the source document, to pretrain the model. The second technique uses an auxiliary ROUGE-based loss that encourages the model to distribute importance scores throughout a document by mimicking sentence-level ROUGE scores on the training data. We show that these techniques significantly improve the performance of a competitive reinforcement learning based extractive system, with the auxiliary loss being more powerful than pretraining.

2019-10-31

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (published)

doi.org

arxiv.org

Fear in Hebrew

Tal Arbel

2019-10-31

Historical Studies in the Natural Sciences (published)

doi.org

How Reasonable are Common-Sense Reasoning Tasks: A Case-Study on the Winograd Schema Challenge and SWAG

Paul Trichelair

Ali Emami

Adam Trischler

Kaheer Suleman

Jackie CK Cheung

Recent studies have significantly improved the state-of-the-art on common-sense reasoning (CSR) benchmarks like the Winograd Schema Challeng… (see more)e (WSC) and SWAG. The question we ask in this paper is whether improved performance on these benchmarks represents genuine progress towards common-sense-enabled systems. We make case studies of both benchmarks and design protocols that clarify and qualify the results of previous work by analyzing threats to the validity of previous experimental designs. Our protocols account for several properties prevalent in common-sense benchmarks including size limitations, structural regularities, and variable instance difficulty.

2019-10-31

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (published)

doi.org

arxiv.org

Interactive Language Learning by Question Answering

Xingdi Yuan

Christopher Pal

Adam Trischler

Humans observe and interact with the world to acquire knowledge. However, most existing machine reading comprehension (MRC) tasks miss the i… (see more)nteractive, information-seeking component of comprehension. Such tasks present models with static documents that contain all necessary information, usually concentrated in a single short substring. Thus, models can achieve strong performance through simple word- and phrase-based pattern matching. We address this problem by formulating a novel text-based question answering task: Question Answering with Interactive Text (QAit). In QAit, an agent must interact with a partially observable text-based environment to gather information required to answer questions. QAit poses questions about the existence, location, and attributes of objects found in the environment. The data is built using a text-based game generator that defines the underlying dynamics of interaction with the environment. We propose and evaluate a set of baseline models for the QAit task that includes deep reinforcement learning agents. Experiments show that the task presents a major challenge for machine reading systems, while humans solve it with relative ease.

2019-10-31

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (published)

doi.org

arxiv.org

Referring Expression Generation Using Entity Profiles

Meng Cao

Jackie Chi Kit Cheung

Meng Cao, Jackie Chi Kit Cheung. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th Internat… (see more)ional Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

2019-10-31

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (published)

doi.org

arxiv.org

Structure Learning for Neural Module Networks

Vardaan Pahuja

Jie Fu

A. Chandar

Christopher Pal

Neural Module Networks, originally proposed for the task of visual question answering, are a class of neural network architectures that invo… (see more)lve human-specified neural modules, each designed for a specific form of reasoning. In current formulations of such networks only the parameters of the neural modules and/or the order of their execution is learned. In this work, we further expand this approach and also learn the underlying internal structure of modules in terms of the ordering and combination of simple and elementary arithmetic operators. We utilize a minimum amount of prior knowledge from the human-specified neural modules in the form of different input types and arithmetic operators used in these modules. Our results show that one is indeed able to simultaneously learn both internal module structure and module sequencing without extra supervisory signals for module execution sequencing. With this approach, we report performance comparable to models using hand-designed modules. In addition, we do a analysis of sensitivity of the learned modules w.r.t. the arithmetic operations and infer the analytical expressions of the learned modules.

2019-10-31

Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN) (published)

doi.org

arxiv.org

Fluoroquinolone Use and Seasonal Patterns of Ciprofloxacin Resistance in Community-Acquired Urinary Escherichia coli Infection in a Large Urban Center

Jean-Paul R Soucy

Alexandra M. Schmidt

Caroline Quach

David L Buckeridge

2019-10-28

American Journal of Epidemiology (published)

doi.org

Mila Techaide 2026

Venture Scientist Bootcamp

AI Advantage: Productivity in Public Service

Publications

Mila Techaide 2026

Venture Scientist Bootcamp

AI Advantage: Productivity in Public Service

Popular keywords:

Publications