Advances in Artificial Intelligence
Ebrahim Bagheri
Automatic differentiation in ML: Where we are and where we should be going
Bart van Merriënboer
Olivier Breuleux
Arnaud Bergeron
Pascal Lamblin
We review the current state of automatic differentiation (AD) for array programming in machine learning (ML), including the different approaches such as operator overloading (OO) and source transformation (ST) used for AD, graph-based intermediate representations for programs, and source languages. Based on these insights, we introduce a new graph-based intermediate representation (IR) which specifically aims to efficiently support fully-general AD for array programming. Unlike existing dataflow programming representations in ML frameworks, our IR naturally supports function calls, higher-order functions and recursion, making ML models easier to implement. The ability to represent closures allows us to perform AD using ST without a tape, making the resulting derivative (adjoint) program amenable to ahead-of-time optimization using tools from functional language compilers, and enabling higher-order derivatives. Lastly, we introduce a proof of concept compiler toolchain called Myia which uses a subset of Python as a front end.
Bayesian Model-Agnostic Meta-Learning
Taesup Kim
Jaesik Yoon
Ousmane Dia
Sungwoong Kim
Sungjin Ahn
Learning to infer Bayesian posterior from a few-shot dataset is an important step towards robust meta-learning due to the model uncertainty inherent in the problem. In this paper, we propose a novel Bayesian model-agnostic meta-learning method. The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework. During fast adaptation, the method is capable of learning complex uncertainty structure beyond a point estimate or a simple Gaussian approximation. In addition, a robust Bayesian meta-update mechanism with a new meta-loss prevents overfitting during meta-update. Remaining an efficient gradient-based meta-learner, the method is also model-agnostic and simple to implement. Experimental results show the accuracy and robustness of the proposed method in various tasks: sinusoidal regression, image classification, active learning, and reinforcement learning.
BigBrain: 1D convolutional neural networks for automated segmentation of cortical layers
Konrad Wagstyl
Claude Lepage
Karl Zilles
Sebastian Bludau
G. Cucurull
Alan C. Evans
Paul C Fletcher
Joseph Paul Cohen
Stéphanie Larocque
Thomas Funck
Katrin Amunts
Boundary Seeking GANs
Athul Jacob
Adam Trischler
Gerry Che
Kyunghyun Cho
Generative adversarial networks are a learning framework that relies on training a discriminator to estimate a measure of difference between a target and generated distributions. GANs, as normally formulated, rely on the generated samples being completely differentiable w.r.t. the generative parameters, and thus do not work for discrete data. We introduce a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator. The importance weights have a strong connection to the decision boundary of the discriminator, and we call our method boundary-seeking GANs (BGANs). We demonstrate the effectiveness of the proposed algorithm with discrete image and character-based natural language generation. In addition, the boundary-seeking objective extends to continuous data, which can be used to improve stability of training, and we demonstrate this on CelebA, Large-scale Scene Understanding (LSUN) bedrooms, and ImageNet without conditioning.
Challenging Conventional Segmentation Evaluation Metrics Focal Pathology (Lesion and Tumour) Segmentation from Patient Images
ChatPainter: Improving Text to Image Generation using Dialogue
Shikhar Sharma
Dendi Suhubdy
Vincent Michalski
Synthesizing realistic images from text descriptions on a dataset like Microsoft Common Objects in Context (MS COCO), where each image can contain several objects, is a challenging task. Prior work has used text captions to generate images. However, captions might not be informative enough to capture the entire image and insufficient for the model to be able to understand which objects in the images correspond to which words in the captions. We show that adding a dialogue that further describes the scene leads to significant improvement in the inception score and in the quality of generated images on the MS COCO dataset.
Convolutional neural networks for mesh-based parcellation of the cerebral cortex
Guillem Cucurull
Konrad Wagstyl
Arantxa Casanova
Petar Veličković
Estrid Jakobsen
Michal Drozdzal
Alan C. Evans
In order to understand the organization of the cerebral cortex, it is necessary to create a map or parcellation of cortical areas. Reconstructions of the cortical surface created from structural MRI scans are frequently used in neuroimaging as a common coordinate space for representing multimodal neuroimaging data. These meshes are used to investigate healthy brain organization as well as abnormalities in neurological and psychiatric conditions. We frame cerebral cortex parcellation as a mesh segmentation task, and address it by taking advantage of recent advances in generalizing convolutions to the graph domain. In particular, we propose to assess graph convolutional networks and graph attention networks, which, in contrast to previous mesh parcellation models, exploit the underlying structure of the data to make predictions. We show experimentally on the Human Connectome Project dataset that the proposed graph convolutional models outperform current state-of-the-art and baselines, highlighting the potential and applicability of these methods to tackle neuroimaging challenges, paving the road towards a better characterization of brain diseases.
A Dataset of Topic-Oriented Human-to-Chatbot Dialogues
Varvara Logacheva
Mikhail Burtsev
Valentin Malykh
Vadim Poluliakh
Alexander Rudnicky
Iulian V. Serban
Ryan Thomas Lowe
Shrimai Prabhumoye
Alan W. Black
This document contains a description of the dataset collected during the first round of the Conversational Intelligence Challenge (ConvAI), which took place in July 2017. During this evaluation round we collected over 2,500 dialogues from 10 chatbots and 500 volunteers. Here we provide an analysis of the dataset statistics and outline some possible improvements for future data collection experiments.
Deep Complex Networks
Chiheb Trabelsi
Olexa Bilaniuk
Ying Zhang
Dmitriy Serdyuk
Sandeep Subramanian
Joao Felipe Santos
Soroush Mehri
At present, the vast majority of building blocks, techniques, and architectures for deep learning are based on real-valued operations and representations. However, recent work on recurrent neural networks and older fundamental theoretical analysis suggests that complex numbers could have a richer representational capacity and could also facilitate noise-robust memory retrieval mechanisms. Despite their attractive properties and potential for opening up entirely new neural architectures, complex-valued deep neural networks have been marginalized due to the absence of the building blocks required to design such models. In this work, we provide the key atomic components for complex-valued deep neural networks and apply them to convolutional feed-forward networks. More precisely, we rely on complex convolutions and present algorithms for complex batch-normalization, complex weight initialization strategies for complex-valued neural nets and we use them in experiments with end-to-end training schemes. We demonstrate that such complex-valued models are competitive with their real-valued counterparts. We test deep complex models on several computer vision tasks, on music transcription using the MusicNet dataset and on Speech spectrum prediction using TIMIT. We achieve state-of-the-art performance on these audio-related tasks.
Dendritic cortical microcircuits approximate the backpropagation algorithm
João Sacramento
Rui Ponte Costa
Walter Senn
Deep learning has seen remarkable developments over the last years, many of them inspired by neuroscience. However, the main learning mechanism behind these advances - error backpropagation - appears to be at odds with neurobiology. Here, we introduce a multilayer neuronal network model with simplified dendritic compartments in which error-driven synaptic plasticity adapts the network towards a global desired output. In contrast to previous work our model does not require separate phases and synaptic learning is driven by local dendritic prediction errors continuously in time. Such errors originate at apical dendrites and occur due to a mismatch between predictive input from lateral interneurons and activity from actual top-down feedback. Through the use of simple dendritic compartments and different cell-types our model can represent both error and normal activity within a pyramidal neuron. We demonstrate the learning capabilities of the model in regression and classification tasks, and show analytically that it approximates the error backpropagation algorithm. Moreover, our framework is consistent with recent observations of learning between brain areas and the architecture of cortical microcircuits. Overall, we introduce a novel view of learning on dendritic cortical circuits and on how the brain may solve the long-standing synaptic credit assignment problem.
FigureQA: An Annotated Figure Dataset for Visual Reasoning
Adam Atkinson
Vincent Michalski
Ákos Kádár
Adam Trischler
We introduce FigureQA, a visual reasoning corpus of over one million question-answer pairs grounded in over 100,000 images. The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts. We formulate our reasoning task by generating questions from 15 templates; questions concern various relationships between plot elements and examine characteristics like the maximum, the minimum, area-under-the-curve, smoothness, and intersection. To resolve, such questions often require reference to multiple plot elements and synthesis of information distributed spatially throughout a figure. To facilitate the training of machine learning systems, the corpus also includes side data that can be used to formulate auxiliary objectives. In particular, we provide the numerical data used to generate each figure as well as bounding-box annotations for all plot elements. We study the proposed visual reasoning task by training several models, including the recently proposed Relation Network as a strong baseline. Preliminary results indicate that the task poses a significant machine learning challenge. We envision FigureQA as a first step towards developing models that can intuitively recognize patterns from visual representations of data.