Publications

What Do Compressed Deep Neural Networks Forget

Sara Hooker

Gregory Clark

Andrea Frome

Deep neural network pruning and quantization techniques have demonstrated it is possible to achieve high levels of compression with surprisingly little degradation to test set accuracy. However, this measure of performance conceals significant differences in how different classes and images are impacted by model compression techniques. We find that models with radically different numbers of weights have comparable top-line performance metrics but diverge considerably in behavior on a narrow subset of the dataset. This small subset of data points, which we term Pruning Identified Exemplars (PIEs), are systematically more impacted by the introduction of sparsity. Compression disproportionately impacts model performance on the underrepresented long-tail of the data distribution. PIEs over-index on atypical or noisy images that are far more challenging for both humans and algorithms to classify. Our work provides intuition into the role of capacity in deep neural networks and the trade-offs incurred by compression. An understanding of this disparate impact is critical given the widespread deployment of compressed models in the wild.

2019-11-13

ArXiv (preprint)

arxiv.org

Fractal impedance for passive controllers: a framework for interaction robotics

Keyhan Kouhkiloui Babarahmati

Carlo Tiseo

Joshua Smith

Hsiu‐chin Lin

M. S. Erden

Michael Nalin Mistry

2019-11-12

ArXiv (preprint)

doi.org

arxiv.org

Defining ‘actionable’ high-cost health care use: results using the Canadian Institute for Health Information population grouping methodology

Maureen Anderson

Crawford W. Revie

Henrik Stryhn

Cordell Neudorf

Yvonne Rosehart

Wenbin Li

Meriç Osman

David Buckeridge

Laura C. Rosella

Walter P. Wodchis

2019-11-10

International Journal for Equity in Health (published)

doi.org

Preventing Posterior Collapse in Sequence VAEs with Pooling

Teng Long

Yanshuai Cao

Jackie Cheung

Variational Autoencoders (VAEs) hold great potential for modelling text, as they could in theory separate high-level semantic and syntactic properties from local regularities of natural language. Practically, however, VAEs with autoregressive decoders often suffer from posterior collapse, a phenomenon where the model learns to ignore the latent variables, causing the sequence VAE to degenerate into a language model. Previous works attempt to solve this problem with complex architectural changes or costly optimization schemes. In this paper, we argue that posterior collapse is caused in part by the encoder network failing to capture the input variabilities. We verify this hypothesis empirically and propose a straightforward fix using pooling. This simple technique effectively prevents posterior collapse, allowing the model to achieve significantly better data log-likelihood than standard sequence VAEs. Compared to the previous SOTA on preventing posterior collapse, we are able to achieve comparable performances while being significantly faster.

2019-11-10

ArXiv (preprint)

arxiv.org

Adversarial target-invariant representation learning for domain generalization

Isabela Albuquerque

Joao Monteiro

Tiago Falk

Ioannis Mitliagkas

In many applications of machine learning, the training and test set data come from different distributions, or domains. A number of domain generalization strategies have been introduced with the goal of achieving good performance on out-of-distribution data. In this paper, we propose an adversarial approach to the problem. We propose a process that enforces pair-wise domain invariance while training a feature extractor over a diverse set of domains. We show that this process ensures invariance to any distribution that can be expressed as a mixture of the training domains. Following this insight, we then introduce an adversarial approach in which pair-wise divergences are estimated and minimized. Experiments on two domain generalization benchmarks for object recognition (i.e., PACS and VLCS) show that the proposed method yields higher average accuracy on the target domains in comparison to previously introduced adversarial strategies, as well as recently proposed methods based on learning invariant representations.

2019-11-03

arXiv.org (preprint)

dblp.uni-trier.de

Deep Generative Modeling of LiDAR Data

Lucas Caccia

Herke van Hoof

Aaron Courville

Joelle Pineau

Building models capable of generating structured output is a key challenge for AI and robotics. While generative models have been explored on many types of data, little work has been done on synthesizing lidar scans, which play a key role in robot mapping and localization. In this work, we show that one can adapt deep generative models for this task by unravelling lidar scans into a 2D point map. Our approach can generate high quality samples, while simultaneously learning a meaningful latent representation of the data. We demonstrate significant improvements against state-of-the-art point cloud generation methods. Furthermore, we propose a novel data representation that augments the 2D signal with absolute positional information. We show that this helps robustness to noisy and imputed input; the learned model can recover the underlying lidar scan from seemingly uninformative data.

2019-11-03

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

doi.org

arxiv.org

Generalizing to unseen domains via distribution matching

Isabela Albuquerque

Joao Monteiro

Mohammad Javad Darvishi Bayazi

Tiago Falk

Ioannis Mitliagkas

Supervised learning results typically rely on assumptions of i.i.d. data. Unfortunately, those assumptions are commonly violated in practice. In this work, we tackle this problem by focusing on domain generalization: a formalization where the data generating process at test time may yield samples from never-before-seen domains (distributions). Our work relies on a simple lemma: by minimizing a notion of discrepancy between all pairs from a set of given domains, we also minimize the discrepancy between any pairs of mixtures of domains. Using this result, we derive a generalization bound for our setting. We then show that low risk over unseen domains can be achieved by representing the data in a space where (i) the training distributions are indistinguishable, and (ii) relevant information for the task at hand is preserved. Minimizing the terms in our bound yields an adversarial formulation which estimates and minimizes pairwise discrepancies. We validate our proposed strategy on standard domain generalization benchmarks, outperforming a number of recently introduced methods. Notably, we tackle a real-world application where the underlying data corresponds to multi-channel electroencephalography time series from different subjects, each considered as a distinct domain.

2019-11-03

ArXiv (preprint)

arxiv.org

Batch Weight for Domain Adaptation With Mass Shift

Mikolaj Binkowski

(Rex) Devon Hjelm

Aaron Courville

Unsupervised domain transfer is the task of transferring or translating samples from a source distribution to a different target distribution. Current solutions to unsupervised domain transfer often operate on data on which the modes of the distribution are well-matched, for instance data that have the same frequencies of classes between source and target distributions. However, these models do not perform well when the modes are not well-matched, as would be the case when samples are drawn independently from two different, but related, domains. This mode imbalance is problematic as generative adversarial networks (GANs), a successful approach in this setting, are sensitive to mode frequency, which results in a mismatch of semantics between source samples and generated samples of the target distribution. We propose a principled method of re-weighting training samples to correct for such mass shift between the transferred distributions, which we call batch weight. We also provide a rigorous probabilistic setting for domain transfer and a new simplified objective for training transfer networks, an alternative to complex, multi-component loss functions used in the current state-of-the-art image-to-image translation models. The new objective stems from the discrimination of joint distributions and enforces cycle-consistency in an abstract, high-level, rather than pixel-wise, sense. Lastly, we experimentally show the effectiveness of the proposed methods in several image-to-image translation tasks.

2019-11-02

2019 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

Improved Conditional VRNNs for Video Prediction

Lluis Castrejon

Nicolas Ballas

Aaron Courville

Predicting future frames for a video sequence is a challenging generative modeling task. Promising approaches include probabilistic latent variable models such as the Variational Auto-Encoder. While VAEs can handle uncertainty and model multiple possible future outcomes, they have a tendency to produce blurry predictions. In this work we argue that this is a sign of underfitting. To address this issue, we propose to increase the expressiveness of the latent distributions and to use higher capacity likelihood models. Our approach relies on a hierarchy of latent variables, which defines a family of flexible prior and posterior distributions in order to better model the probability of future sequences. We validate our proposal through a series of ablation experiments and compare our approach to current state-of-the-art latent variable models. Our method performs favorably under several metrics in three different datasets.

2019-11-02

2019 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

Learning Fixed Points in Generative Adversarial Networks: From Image-to-Image Translation to Disease Detection and Localization

Md Mahfuzur Rahman Siddiquee

Zongwei Zhou

Nima Tajbakhsh

Ruibin Feng

Michael Gotway

Yoshua Bengio

Jianming Liang

Generative adversarial networks (GANs) have ushered in a revolution in image-to-image translation. The development and proliferation of GANs raise an interesting question: can we train a GAN to remove an object, if present, from an image while otherwise preserving the image? Specifically, can a GAN “virtually heal” anyone by turning his medical image, with an unknown health status (diseased or healthy), into a healthy one, so that diseased regions could be revealed by subtracting those two images? Such a task requires a GAN to identify a minimal subset of target pixels for domain translation, an ability that we call fixed-point translation, which no GAN is equipped with yet. Therefore, we propose a new GAN, called Fixed-Point GAN, trained by (1) supervising same-domain translation through a conditional identity loss, and (2) regularizing cross-domain translation through revised adversarial, domain classification, and cycle consistency loss. Based on fixed-point translation, we further derive a novel framework for disease detection and localization using only image-level annotation. Qualitative and quantitative evaluations demonstrate that the proposed method outperforms the state of the art in multi-domain image-to-image translation and that it surpasses predominant weakly-supervised localization methods in both disease detection and localization. Implementation is available at https://github.com/jlianglab/Fixed-Point-GAN.

2019-11-02

2019 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction

Alaaeldin El-Nouby

Shikhar Sharma

Hannes Schulz

(Rex) Devon Hjelm

Layla El Asri

Samira Ebrahimi Kahou

Yoshua Bengio

Graham W. Taylor

Conditional text-to-image generation is an active area of research, with many possible applications. Existing research has primarily focused on generating a single image from available conditioning information in one step. One practical extension beyond one-step generation is a system that generates an image iteratively, conditioned on ongoing linguistic input or feedback. This is significantly more challenging than one-step generation tasks, as such a system must understand the contents of its generated images with respect to the feedback history, the current feedback, as well as the interactions among concepts present in the feedback history. In this work, we present a recurrent image generation model which takes into account both the generated output up to the current step and all past instructions for generation. We show that our model is able to generate the background, add new objects, and apply simple transformations to existing objects. We believe our approach is an important step toward interactive generation. Code and data are available at: https://www.microsoft.com/en-us/research/project/generative-neural-visual-artist-geneva/.

2019-11-02

2019 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

Can a Gorilla Ride a Camel? Learning Semantic Plausibility from Text

Ian Porada

Kaheer Suleman

Jackie Cheung

2019-11-01

Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing (published)

doi.org

arxiv.org