Publications

Facilitating Multimodality in Normalizing Flows
David M. Krueger
The true Bayesian posterior of a model such as a neural network may be highly multimodal. In principle, normalizing flows can represent such a distribution via compositions of invertible transformations of random noise. In practice, however, existing normalizing flows may fail to capture most of the modes of a distribution. We argue that the conditionally affine structure of the transformations used in [Dinh et al., 2014, 2016, Kingma et al., 2016] is inefficient, and show that flows which instead use (conditional) invertible non-linear transformations naturally enable multimodality in their output distributions. With just two layers of our proposed deep sigmoidal flow, we are able to model complicated 2d energy functions with much higher fidelity than six layers of deep affine flows.
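To illustrate what an invertible non-linear transformation of the kind mentioned in this abstract can look like, here is a minimal scalar sketch. The functional form (a logit applied to a convex combination of sigmoids), the name `sigmoidal_flow`, and all parameter values are assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p) - math.log(1.0 - p)


def sigmoidal_flow(x, a, b, w):
    """Hypothetical scalar layer: y = logit(sum_j w_j * sigmoid(a_j * x + b_j)).

    With every a_j > 0 and positive weights w_j that sum to 1, the map is
    strictly increasing in x, hence invertible, while not being affine.
    """
    assert all(aj > 0 for aj in a)
    assert all(wj > 0 for wj in w) and abs(sum(w) - 1.0) < 1e-9
    mix = sum(wj * sigmoid(aj * x + bj) for aj, bj, wj in zip(a, b, w))
    return logit(mix)


# Illustrative parameters (assumed, not from the source).
A = [4.0, 4.0]
B = [-8.0, 8.0]
W = [0.5, 0.5]

xs = [-3.0 + 0.5 * i for i in range(13)]
ys = [sigmoidal_flow(x, A, B, W) for x in xs]
```

Because the slope of this map varies with `x`, probability mass is stretched or compressed unevenly, which an affine map with a single scale and shift cannot do.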
Fetal, Infant and Ophthalmic Medical Image Analysis
M. Jorge Cardoso
Andrew Melbourne
Hrvoje Bogunovic
Pim Moeskops
Xinjian Chen
Ernst Schwartz
M. Garvin
E. Robinson
E. Trucco
Michael Ebner
Yanwu Xu
Antonios Makropoulos
Adrien Desjardin
Tom Kamiel Magda Vercauteren
Generalizable Features From Unsupervised Learning
Humans learn a predictive model of the world and use this model to reason about future events and the consequences of actions. In contrast to most machine predictors, we exhibit an impressive ability to generalize to unseen scenarios and reason intelligently in these settings. One important aspect of this ability is physical intuition (Lake et al., 2016). In this work, we explore the potential of unsupervised learning to find features that promote better generalization to settings outside the supervised training distribution. Our task is predicting the stability of towers of square blocks. We demonstrate that an unsupervised model, trained to predict future frames of a video sequence of stable and unstable block configurations, can yield features that support extrapolating stability prediction to block configurations outside the training set distribution.
GibbsNet: Iterative Adversarial Inference for Deep Graphical Models
Directed latent variable models that formulate the joint distribution as …
Graphs in Biomedical Image Analysis, Computational Anatomy and Imaging Genetics
M. Jorge Cardoso
Enzo Ferrante
Xavier Pennec
Adrian Dalca
S. Joshi
Nematollah Batmanghelich
Aristeidis Sotiras
Mads Lenstrup Nielsen
Mert R. Sabuncu
Tom Fletcher
Li Shen
Stanley Durrleman
Stefan H. Sommer
Hierarchical Methods of Moments
Matteo Ruffini
Borja Balle
Spectral methods of moments provide a powerful tool for learning the parameters of latent variable models. Despite their theoretical appeal, the applicability of these methods to real data is still limited due to a lack of robustness to model misspecification. In this paper we present a hierarchical approach to methods of moments to circumvent such limitations. Our method is based on replacing the tensor decomposition step used in previous algorithms with approximate joint diagonalization. Experiments on topic modeling show that our method outperforms previous tensor decomposition methods in terms of speed and model quality.
Improved Training of Wasserstein GANs
Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability. The recently proposed Wasserstein GAN (WGAN) makes progress toward stable training of GANs, but sometimes can still generate only low-quality samples or fail to converge. We find that these problems are often due to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the critic, which can lead to undesired behavior. We propose an alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input. Our proposed method performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models over discrete data. We also achieve high quality generations on CIFAR-10 and LSUN bedrooms.
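The idea of penalizing the norm of the critic's gradient with respect to its input can be sketched in a few lines of plain Python. The details here are assumptions not stated in the abstract: the penalty pulls the gradient norm toward 1, it is evaluated at random interpolates between real and generated points, the coefficient `lam` defaults to 10, and the critic is a toy linear function with a finite-difference gradient instead of a neural network.

```python
import math
import random


def linear_critic(w):
    """Return a toy critic f(x) = w . x (hypothetical stand-in for a network)."""
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))


def grad_wrt_input(f, x, eps=1e-6):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad


def gradient_penalty(f, real, fake, lam=10.0, rng=random):
    """Average of lam * (||grad_x f(x_hat)||_2 - 1)^2 over the batch."""
    total = 0.0
    for x_real, x_fake in zip(real, fake):
        t = rng.random()  # random mixing weight in [0, 1)
        x_hat = [t * a + (1.0 - t) * b for a, b in zip(x_real, x_fake)]
        norm = math.sqrt(sum(g * g for g in grad_wrt_input(f, x_hat)))
        total += (norm - 1.0) ** 2
    return lam * total / len(real)


# Illustrative data (assumed): two "real" and two "generated" 2-d points.
real = [[1.0, 2.0], [0.5, -1.0]]
fake = [[0.0, 0.0], [2.0, 1.0]]
penalty_unit = gradient_penalty(linear_critic([0.6, 0.8]), real, fake)   # gradient norm 1
penalty_steep = gradient_penalty(linear_critic([3.0, 4.0]), real, fake)  # gradient norm 5
```

For a linear critic the input gradient is `w` everywhere, so the penalty is essentially zero when `w` has unit norm and approximately 10 * (5 - 1)^2 = 160 for `w = (3, 4)`. In training, a term like this would be added to the critic loss in place of weight clipping.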
Independently Controllable Factors
Valentin Thomas
Philippe Beaudoin
Marie-Jean Meurs
It has been postulated that a good representation is one that disentangles the underlying explanatory factors of variation. However, it remains an open question what kind of training framework could potentially achieve that. Whereas most previous work focuses on the static setting (e.g., with images), we postulate that some of the causal factors could be discovered if the learner is allowed to interact with its environment. The agent can experiment with different actions and observe their effects. More specifically, we hypothesize that some of these factors correspond to aspects of the environment which are independently controllable, i.e., that there exists a policy and a learnable feature for each such aspect of the environment, such that this policy can yield changes in that feature with minimal changes to other features that explain the statistical variations in the observed data. We propose a specific objective function to find such factors and verify experimentally that it can indeed disentangle independently controllable aspects of the environment without any extrinsic reward signal.
Independently Controllable Features
Intravascular Imaging and Computer Assisted Stenting and Large-Scale Annotation of Biomedical Data and Expert Label Synthesis
M. Jorge Cardoso
Su-Lin Lee
Veronika Cheplygina
Simone Balocco
Diana Mateus
Guillaume Zahnd
Lena Maier-Hein
Stefanie Demirci
Eric Granger
Luc Duong
M. Carbonneau
Shadi N. Albarqouni
G. Carneiro
Modulating early visual processing by language
It is commonly assumed that language refers to high-level visual concepts while leaving low-level visual processing unaffected. This view dominates the current literature in computational models for language-vision tasks, where visual and linguistic input are mostly processed independently before being fused into a single representation. In this paper, we deviate from this classic pipeline and propose to modulate the entire visual processing by linguistic input. Specifically, we condition the batch normalization parameters of a pretrained residual network (ResNet) on a language embedding. This approach, which we call MOdulated RESnet (MODERN), significantly improves strong baselines on two visual question answering tasks. Our ablation study shows that modulating from the early stages of the visual processing is beneficial.
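Conditioning batch normalization parameters on a language embedding can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the per-example offsets to the pretrained scale and shift are produced by a single linear map (the helper `linear` and the weight names are hypothetical), and the features are plain vectors, not convolutional feature maps.

```python
import math


def linear(vec, weights, bias):
    """Plain matrix-vector product plus bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def conditional_batch_norm(batch, gamma, beta, lang_embs,
                           w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """Batch-normalize each feature, then scale and shift per example.

    gamma, beta: pretrained per-feature scale and shift.
    lang_embs:   one language embedding per example in the batch.
    The scale used for example i is gamma + d_gamma_i, where d_gamma_i is
    predicted from lang_embs[i]; the shift is handled the same way.
    """
    n, dim = len(batch), len(gamma)
    d_gammas = [linear(e, w_gamma, b_gamma) for e in lang_embs]
    d_betas = [linear(e, w_beta, b_beta) for e in lang_embs]
    out = [[0.0] * dim for _ in range(n)]
    for j in range(dim):
        col = [x[j] for x in batch]
        mean = sum(col) / n
        var = sum((c - mean) ** 2 for c in col) / n
        for i, c in enumerate(col):
            normed = (c - mean) / math.sqrt(var + eps)
            out[i][j] = (gamma[j] + d_gammas[i][j]) * normed + beta[j] + d_betas[i][j]
    return out


# Illustrative values (assumed): 2 examples, 2 features, 1-d embeddings.
batch = [[1.0, 2.0], [3.0, 6.0]]
lang_embs = [[1.0], [-1.0]]
out = conditional_batch_norm(
    batch, gamma=[1.0, 1.0], beta=[0.0, 0.0], lang_embs=lang_embs,
    w_gamma=[[0.0], [0.0]], b_gamma=[0.0, 0.0],
    w_beta=[[1.0], [0.0]], b_beta=[0.0, 0.0],
)
```

With all offset weights set to zero the function reduces to ordinary batch normalization with the pretrained `gamma` and `beta`, so the language input only changes the visual features through the predicted offsets.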
Molecular Imaging, Reconstruction and Analysis of Moving Body Organs, and Stroke Imaging and Treatment
M. Jorge Cardoso
Fei Gao
Bernhard Kainz
T. Walsum
Kuangyu Shi
Kanwal K. Bhatia
R. Peter
Tom Kamiel Magda Vercauteren
Mauricio Reyes
Adrian Dalca
Roland Wiest
Wiro Niessen
B. Emmer