Publications

BOUNDS LEAD TO IMPROVED CLASSIFIERS
Nicolas Roux
The standard approach to supervised classification involves the minimization of a log-loss as an upper bound to the classification error. Wh… (see more)ile this is a tight bound early on in the optimization, it overemphasizes the influence of incorrectly classified examples far from the decision boundary. Updating the upper bound during the optimization leads to improved classification rates while transforming the learning into a sequence of minimization problems. In addition, in the context where the classifier is part of a larger system, this modification makes it possible to link the performance of the classifier to that of the whole system, allowing the seamless introduction of external constraints.
Calibrating Energy-based Generative Adversarial Networks
Amjad Almahairi
Philip Bachman
Eduard Hovy
In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specific… (see more)ally, we propose a flexible adversarial training framework, and prove this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimal. We derive the analytic form of the induced solution, and analyze the properties. In order to make the proposed framework trainable in practice, we introduce two effective approximation techniques. Empirically, the experiment results closely match our theoretical analysis, verifying the discriminator is able to recover the energy of data distribution.
Computer Assisted and Robotic Endoscopy and Clinical Image-Based Procedures
M. Jorge Cardoso
Xiongbiao Luo
Stefan Wesarg
Tobias Reichl
M. Ballester
Jonathan Mcleod
Klaus Dr. Drechsler
T. Peters
Marius Erdt
Kensaku Mori
M. Linguraru
Andreas Uhl
Cristina Oyarzun Laura
R. Shekhar
Computer-Assisted Conceptual Analysis of Textual Data as Applied to Philosophical Corpuses
Jean Guy Meunier
L. Chartrand
Jackie CK Cheung
Mathieu Valette
Marie-noëlle Bayle
Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support
M. Jorge Cardoso
G. Carneiro
T. Syeda-Mahmood
J. Tavares
Mehdi Moradi
Andrew P. Bradley
Hayit Greenspan
J. Papa
Anant. Madabhushi
Jacinto C Nascimento
Jaime S. Cardoso
Vasileios Belagiannis
Zhi Lu
Faculdade Engenharia
Diet Networks: Thin Parameters for Fat Genomics
Learning tasks such as those involving genomic data often poses a serious challenge: the number of input features can be orders of magnitude… (see more) larger than the number of training examples, making it difficult to avoid overfitting, even when using the known regularization techniques. We focus here on tasks in which the input is a description of the genetic variation specific to a patient, the single nucleotide polymorphisms (SNPs), yielding millions of ternary inputs. Improving the ability of deep learning to handle such datasets could have an important impact in precision medicine, where high-dimensional data regarding a particular patient is used to make predictions of interest. Even though the amount of data for such tasks is increasing, this mismatch between the number of examples and the number of inputs remains a concern. Naive implementations of classifier neural networks involve a huge number of free parameters in their first layer: each input feature is associated with as many parameters as there are hidden units. We propose a novel neural network parametrization which considerably reduces the number of free parameters. It is based on the idea that we can first learn or provide a distributed representation for each input feature (e.g. for each position in the genome where variations are observed), and then learn (with another neural network called the parameter prediction network) how to map a feature's distributed representation to the vector of parameters specific to that feature in the classifier neural network (the weights which link the value of the feature to each of the hidden units). We show experimentally on a population stratification task of interest to medical studies that the proposed approach can significantly reduce both the number of parameters and the error rate of the classifier.
Facilitating Multimodality in Normalizing Flows
David M. Krueger
The true Bayesian posterior of a model such as a neural network may be highly multimodal. In principle, normalizing flows can represent such… (see more) a distribution via compositions of invertible transformations of random noise. In practice, however, existing normalizing flows may fail to capture most of the modes of a distribution. We argue that the conditionally affine structure of the transformations used in [Dinh et al., 2014, 2016, Kingma et al., 2016] is inefficient, and show that flows which instead use (conditional) invertible non-linear transformations naturally enable multimodality in their output distributions. With just two layers of our proposed deep sigmoidal flow, we are able to model complicated 2d energy functions with much higher fidelity than six layers of deep affine flows.
Fetal, Infant and Ophthalmic Medical Image Analysis
M. Jorge Cardoso
Andrew Melbourne
Hrvoje Bogunovic
Pim Moeskops
Xinjian Chen
Ernst Schwartz
M. Garvin
E. Robinson
E. Trucco
Michael Ebner
Yanwu Xu
Antonios Makropoulos
Adrien Desjardin
Tom Kamiel Magda Vercauteren
Generalizable Features From Unsupervised Learning
Humans learn a predictive model of the world and use this model to reason about future events and the consequences of actions. In contrast t… (see more)o most machine predictors, we exhibit an impressive ability to generalize to unseen scenarios and reason intelligently in these settings. One important aspect of this ability is physical intuition(Lake et al., 2016). In this work, we explore the potential of unsupervised learning to find features that promote better generalization to settings outside the supervised training distribution. Our task is predicting the stability of towers of square blocks. We demonstrate that an unsupervised model, trained to predict future frames of a video sequence of stable and unstable block configurations, can yield features that support extrapolating stability prediction to blocks configurations outside the training set distribution
GibbsNet: Iterative Adversarial Inference for Deep Graphical Models
Directed latent variable models that formulate the joint distribution as …
Graphs in Biomedical Image Analysis, Computational Anatomy and Imaging Genetics
M. Jorge Cardoso
Enzo Ferrante
Xavier Pennec
Adrian Dalca
S. Joshi
Nematollah Batmanghelich
Aristeidis Sotiras
Mads Lenstrup Nielsen
Mert R. Sabuncu
Tom Fletcher
Li Shen
Stanley Durrleman
Stefan H. Sommer
Hierarchical Methods of Moments
Matteo Ruffini
Borja Balle
Spectral methods of moments provide a powerful tool for learning the parameters of latent variable models. Despite their theoretical appeal,… (see more) the applicability of these methods to real data is still limited due to a lack of robustness to model misspecification. In this paper we present a hierarchical approach to methods of moments to circumvent such limitations. Our method is based on replacing the tensor decomposition step used in previous algorithms with approximate joint diagonalization. Experiments on topic modeling show that our method outperforms previous tensor decomposition methods in terms of speed and model quality.