Publications

BOUNDS LEAD TO IMPROVED CLASSIFIERS

Nicolas Roux

The standard approach to supervised classification involves the minimization of a log-loss as an upper bound to the classification error. While this is a tight bound early on in the optimization, it overemphasizes the influence of incorrectly classified examples far from the decision boundary. Updating the upper bound during the optimization leads to improved classification rates while transforming the learning into a sequence of minimization problems. In addition, in the context where the classifier is part of a larger system, this modification makes it possible to link the performance of the classifier to that of the whole system, allowing the seamless introduction of external constraints.

2016-12-31

(published)

www.semanticscholar.org

Calibrating Energy-based Generative Adversarial Networks

Zihang Dai

Amjad Almahairi

Philip Bachman

Eduard Hovy

Aaron Courville

In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimum. We derive the analytic form of the induced solution, and analyze the properties. In order to make the proposed framework trainable in practice, we introduce two effective approximation techniques. Empirically, the experimental results closely match our theoretical analysis, verifying the discriminator is able to recover the energy of data distribution.

2016-12-31

ICLR.cc/2017/conference (poster)

openreview.net

Computer Assisted and Robotic Endoscopy and Clinical Image-Based Procedures

M. Jorge Cardoso

Tal Arbel

Xiongbiao Luo

Stefan Wesarg

Tobias Reichl

M. Ballester

Jonathan Mcleod

Klaus Drechsler

T. Peters

Marius Erdt

Kensaku Mori

M. Linguraru

Andreas Uhl

Cristina Oyarzun Laura

R. Shekhar

2016-12-31

Lecture Notes in Computer Science (published)

doi.org

Computer-Assisted Conceptual Analysis of Textual Data as Applied to Philosophical Corpuses

Jean Guy Meunier

L. Chartrand

Jackie CK Cheung

Mathieu Valette

Marie-noëlle Bayle

2016-12-31

DH (published)

dblp.uni-trier.de

Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support

M. Jorge Cardoso

Tal Arbel

G. Carneiro

T. Syeda-Mahmood

J. Tavares

Mehdi Moradi

Andrew P. Bradley

Hayit Greenspan

J. Papa

Anant Madabhushi

Jacinto C Nascimento

Jaime S. Cardoso

Vasileios Belagiannis

Zhi Lu

Faculdade Engenharia

2016-12-31

Lecture Notes in Computer Science (published)

doi.org

arxiv.org

Diet Networks: Thin Parameters for Fat Genomics

Adriana Romero

Marie-Pierre Dubé

Julie G. Hussin

Yoshua Bengio

Learning tasks such as those involving genomic data often pose a serious challenge: the number of input features can be orders of magnitude larger than the number of training examples, making it difficult to avoid overfitting, even when using the known regularization techniques. We focus here on tasks in which the input is a description of the genetic variation specific to a patient, the single nucleotide polymorphisms (SNPs), yielding millions of ternary inputs. Improving the ability of deep learning to handle such datasets could have an important impact in precision medicine, where high-dimensional data regarding a particular patient is used to make predictions of interest. Even though the amount of data for such tasks is increasing, this mismatch between the number of examples and the number of inputs remains a concern. Naive implementations of classifier neural networks involve a huge number of free parameters in their first layer: each input feature is associated with as many parameters as there are hidden units. We propose a novel neural network parametrization which considerably reduces the number of free parameters. It is based on the idea that we can first learn or provide a distributed representation for each input feature (e.g. for each position in the genome where variations are observed), and then learn (with another neural network called the parameter prediction network) how to map a feature's distributed representation to the vector of parameters specific to that feature in the classifier neural network (the weights which link the value of the feature to each of the hidden units). We show experimentally on a population stratification task of interest to medical studies that the proposed approach can significantly reduce both the number of parameters and the error rate of the classifier.

2016-12-31

ICLR.cc/2017/conference (poster)

doi.org

openreview.net

Facilitating Multimodality in Normalizing Flows

Chin-wei Huang

David M. Krueger

Aaron Courville

The true Bayesian posterior of a model such as a neural network may be highly multimodal. In principle, normalizing flows can represent such a distribution via compositions of invertible transformations of random noise. In practice, however, existing normalizing flows may fail to capture most of the modes of a distribution. We argue that the conditionally affine structure of the transformations used in [Dinh et al., 2014, 2016, Kingma et al., 2016] is inefficient, and show that flows which instead use (conditional) invertible non-linear transformations naturally enable multimodality in their output distributions. With just two layers of our proposed deep sigmoidal flow, we are able to model complicated 2d energy functions with much higher fidelity than six layers of deep affine flows.

2016-12-31

(published)

www.semanticscholar.org

Fetal, Infant and Ophthalmic Medical Image Analysis

M. Jorge Cardoso

Tal Arbel

Andrew Melbourne

Hrvoje Bogunovic

Pim Moeskops

Xinjian Chen

Ernst Schwartz

M. Garvin

E. Robinson

E. Trucco

Michael Ebner

Yanwu Xu

Antonios Makropoulos

Adrien Desjardin

Tom Kamiel Magda Vercauteren

2016-12-31

Lecture Notes in Computer Science (published)

doi.org

Generalizable Features From Unsupervised Learning

Mehdi Mirza

Aaron Courville

Yoshua Bengio

Humans learn a predictive model of the world and use this model to reason about future events and the consequences of actions. In contrast to most machine predictors, we exhibit an impressive ability to generalize to unseen scenarios and reason intelligently in these settings. One important aspect of this ability is physical intuition (Lake et al., 2016). In this work, we explore the potential of unsupervised learning to find features that promote better generalization to settings outside the supervised training distribution. Our task is predicting the stability of towers of square blocks. We demonstrate that an unsupervised model, trained to predict future frames of a video sequence of stable and unstable block configurations, can yield features that support extrapolating stability prediction to block configurations outside the training set distribution.

2016-12-31

ICLR.cc/2017/conference (Invite to Workshop Track)

doi.org

openreview.net

GibbsNet: Iterative Adversarial Inference for Deep Graphical Models

R Devon Hjelm

Directed latent variable models that formulate the joint distribution as …

2016-12-31

Advances in Neural Information Processing Systems 30 (NIPS 2017) (published)

doi.org

arxiv.org

Graphs in Biomedical Image Analysis, Computational Anatomy and Imaging Genetics

M. Jorge Cardoso

Tal Arbel

Enzo Ferrante

Xavier Pennec

Adrian Dalca

Sarah Parisot

S. Joshi

Nematollah Batmanghelich

Aristeidis Sotiras

Mads Lenstrup Nielsen

Mert R. Sabuncu

Tom Fletcher

Li Shen

Stanley Durrleman

Stefan H. Sommer

2016-12-31

GRAIL/MFCA/MICGen@MICCAI (published)

doi.org

Hierarchical Methods of Moments

Matteo Ruffini

Guillaume Rabusseau

Borja Balle

Spectral methods of moments provide a powerful tool for learning the parameters of latent variable models. Despite their theoretical appeal, the applicability of these methods to real data is still limited due to a lack of robustness to model misspecification. In this paper we present a hierarchical approach to methods of moments to circumvent such limitations. Our method is based on replacing the tensor decomposition step used in previous algorithms with approximate joint diagonalization. Experiments on topic modeling show that our method outperforms previous tensor decomposition methods in terms of speed and model quality.

2016-12-31

Advances in Neural Information Processing Systems 30 (NIPS 2017) (published)

arxiv.org
