Publications

Bayesian Model-Agnostic Meta-Learning

Taesup Kim

Jaesik Yoon

Ousmane Dia

Sungwoong Kim

Sungjin Ahn

Learning to infer Bayesian posterior from a few-shot dataset is an important step towards robust meta-learning due to the model uncertainty inherent in the problem. In this paper, we propose a novel Bayesian model-agnostic meta-learning method. The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework. During fast adaptation, the method is capable of learning complex uncertainty structure beyond a point estimate or a simple Gaussian approximation. In addition, a robust Bayesian meta-update mechanism with a new meta-loss prevents overfitting during meta-update. Remaining an efficient gradient-based meta-learner, the method is also model-agnostic and simple to implement. Experiment results show the accuracy and robustness of the proposed method in various tasks: sinusoidal regression, image classification, active learning, and reinforcement learning.

2017-12-31

Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (published)

BigBrain: 1D convolutional neural networks for automated segmentation of cortical layers

Konrad Wagstyl

Claude Lepage

Karl Zilles

Sebastian Bludau

G. Cucurul

Alan C. Evans

Paul C Fletcher

Adriana Romero

Joseph Paul Cohen

Stéphanie Larocque

Thomas Funck

Katrin Amunts

2017-12-31

(published)

www.semanticscholar.org

Challenging Conventional Segmentation Evaluation Metrics Focal Pathology (Lesion and Tumour) Segmentation from Patient Images

Tal Arbel

2017-12-31

(published)

www.semanticscholar.org

ChatPainter: Improving Text to Image Generation using Dialogue

Shikhar Sharma

Dendi Suhubdy

Vincent Michalski

S Ebrahimi Kahou

Synthesizing realistic images from text descriptions on a dataset like Microsoft Common Objects in Context (MS COCO), where each image can contain several objects, is a challenging task. Prior work has used text captions to generate images. However, captions might not be informative enough to capture the entire image, and might be insufficient for the model to understand which objects in the images correspond to which words in the captions. We show that adding a dialogue that further describes the scene leads to significant improvement in the inception score and in the quality of generated images on the MS COCO dataset.

2017-12-31

ICLR (Workshop) (published)

openreview.net

Convolutional neural networks for mesh-based parcellation of the cerebral cortex

Guillem Cucurull

Konrad Wagstyl

Arantxa Casanova

Petar Veličković

Estrid Jakobsen

Michal Drozdzal

Adriana Romero

Alan C. Evans

In order to understand the organization of the cerebral cortex, it is necessary to create a map or parcellation of cortical areas. Reconstructions of the cortical surface created from structural MRI scans are frequently used in neuroimaging as a common coordinate space for representing multimodal neuroimaging data. These meshes are used to investigate healthy brain organization as well as abnormalities in neurological and psychiatric conditions. We frame cerebral cortex parcellation as a mesh segmentation task, and address it by taking advantage of recent advances in generalizing convolutions to the graph domain. In particular, we propose to assess graph convolutional networks and graph attention networks, which, in contrast to previous mesh parcellation models, exploit the underlying structure of the data to make predictions. We show experimentally on the Human Connectome Project dataset that the proposed graph convolutional models outperform current state-of-the-art and baselines, highlighting the potential and applicability of these methods to tackle neuroimaging challenges, paving the road towards a better characterization of brain diseases.

2017-12-31

MIDL.amsterdam/2018/Conference (oral)

openreview.net

A Dataset of Topic-Oriented Human-to-Chatbot Dialogues

Varvara Logacheva

Mikhail Burtsev

Valentin Malykh

Vadim Poluliakh

Alexander Rudnicky

Iulian V. Serban

Ryan Thomas Lowe

Shrimai Prabhumoye

Alan W. Black

This document contains a description of the dataset collected during the first round of the Conversational Intelligence Challenge (ConvAI), which took place in July 2017. During this evaluation round we collected over 2,500 dialogues from 10 chatbots and 500 volunteers. Here we provide an analysis of the dataset statistics and outline some possible improvements for future data collection experiments.

2017-12-31

(published)

www.semanticscholar.org

A Deep Reinforcement Learning Chatbot (Short Version)

Iulian V. Serban

Mathieu Germain

Michael Pieper

Nan Rosemary Ke

Sai Rajeswar

Alexandre De Brébisson

Jose M. R. Sotelo

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including neural network and template-based models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than other systems. The results highlight the potential of coupling ensemble systems with deep reinforcement learning as a fruitful path for developing real-world, open-domain conversational agents.

2017-12-31

arXiv (preprint)

Dendritic cortical microcircuits approximate the backpropagation algorithm

João Sacramento

Rui Ponte Costa

Walter Senn

Deep learning has seen remarkable developments over the last years, many of them inspired by neuroscience. However, the main learning mechanism behind these advances - error backpropagation - appears to be at odds with neurobiology. Here, we introduce a multilayer neuronal network model with simplified dendritic compartments in which error-driven synaptic plasticity adapts the network towards a global desired output. In contrast to previous work, our model does not require separate phases, and synaptic learning is driven by local dendritic prediction errors continuously in time. Such errors originate at apical dendrites and occur due to a mismatch between predictive input from lateral interneurons and activity from actual top-down feedback. Through the use of simple dendritic compartments and different cell types, our model can represent both error and normal activity within a pyramidal neuron. We demonstrate the learning capabilities of the model in regression and classification tasks, and show analytically that it approximates the error backpropagation algorithm. Moreover, our framework is consistent with recent observations of learning between brain areas and the architecture of cortical microcircuits. Overall, we introduce a novel view of learning on dendritic cortical circuits and on how the brain may solve the long-standing synaptic credit assignment problem.

2017-12-31

Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (published)

Disentangling the independently controllable factors of variation by interacting with the world

Valentin Thomas

Philippe Beaudoin

William Fedus

It has been postulated that a good representation is one that disentangles the underlying explanatory factors of variation. However, it remains an open question what kind of training framework could potentially achieve that. Whereas most previous work focuses on the static setting (e.g., with images), we postulate that some of the causal factors could be discovered if the learner is allowed to interact with its environment. The agent can experiment with different actions and observe their effects. More specifically, we hypothesize that some of these factors correspond to aspects of the environment which are independently controllable, i.e., that there exists a policy and a learnable feature for each such aspect of the environment, such that this policy can yield changes in that feature with minimal changes to other features that explain the statistical variations in the observed data. We propose a specific objective function to find such factors, and verify experimentally that it can indeed disentangle independently controllable aspects of the environment without any extrinsic reward signal.

2017-12-31

arXiv (preprint)

FigureQA: An Annotated Figure Dataset for Visual Reasoning

Samira Ebrahimi Kahou

Adam Atkinson

Vincent Michalski

Ákos Kádár

Adam Trischler

We introduce FigureQA, a visual reasoning corpus of over one million question-answer pairs grounded in over 100,000 images. The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts. We formulate our reasoning task by generating questions from 15 templates; questions concern various relationships between plot elements and examine characteristics like the maximum, the minimum, area-under-the-curve, smoothness, and intersection. Resolving such questions often requires reference to multiple plot elements and synthesis of information distributed spatially throughout a figure. To facilitate the training of machine learning systems, the corpus also includes side data that can be used to formulate auxiliary objectives. In particular, we provide the numerical data used to generate each figure as well as bounding-box annotations for all plot elements. We study the proposed visual reasoning task by training several models, including the recently proposed Relation Network as a strong baseline. Preliminary results indicate that the task poses a significant machine learning challenge. We envision FigureQA as a first step towards developing models that can intuitively recognize patterns from visual representations of data.

2017-12-31

ICLR.cc/2018/Conference (Invite to Workshop Track)

openreview.net

Frank-Wolfe Splitting via Augmented Lagrangian Method

Gauthier Gidel

Fabian Pedregosa

Simon Lacoste-Julien

Minimizing a function over an intersection of convex sets is an important task in optimization that is often much more challenging than minimizing it over each individual constraint set. While traditional methods such as Frank-Wolfe (FW) or proximal gradient descent assume access to a linear or quadratic oracle on the intersection, splitting techniques take advantage of the structure of each set, and only require access to the oracle on the individual constraints. In this work, we develop and analyze the Frank-Wolfe Augmented Lagrangian (FW-AL) algorithm, a method for minimizing a smooth function over convex compact sets related by a "linear consistency" constraint that only requires access to a linear minimization oracle over the individual constraints. It is based on the Augmented Lagrangian Method (ALM), also known as Method of Multipliers, but unlike most existing splitting methods, it only requires access to linear (instead of quadratic) minimization oracles. We use recent advances in the analysis of Frank-Wolfe and the alternating direction method of multipliers algorithms to prove a sublinear convergence rate for FW-AL over general convex compact sets and a linear convergence rate for polytopes.

2017-12-31

AISTATS (published)