Publications

Challenging Conventional Segmentation Evaluation Metrics Focal Pathology (Lesion and Tumour) Segmentation from Patient Images

Tal Arbel

ChatPainter: Improving Text to Image Generation using Dialogue

Shikhar Sharma

Dendi Suhubdy

Vincent Michalski

Samira Ebrahimi Kahou

Yoshua Bengio

Synthesizing realistic images from text descriptions on a dataset like Microsoft Common Objects in Context (MS COCO), where each image can contain several objects, is a challenging task. Prior work has used text captions to generate images. However, captions might not be informative enough to capture the entire image, and may be insufficient for the model to understand which objects in the images correspond to which words in the captions. We show that adding a dialogue that further describes the scene leads to significant improvement in the inception score and in the quality of generated images on the MS COCO dataset.

2018-01-01

ICLR (Workshop) (published)

openreview.net

Convolutional neural networks for mesh-based parcellation of the cerebral cortex

Guillem Cucurull

Konrad Wagstyl

Arantxa Casanova

Petar Veličković

Estrid Jakobsen

Michal Drozdzal

Adriana Romero Soriano

Alan C. Evans

Yoshua Bengio

In order to understand the organization of the cerebral cortex, it is necessary to create a map or parcellation of cortical areas. Reconstructions of the cortical surface created from structural MRI scans are frequently used in neuroimaging as a common coordinate space for representing multimodal neuroimaging data. These meshes are used to investigate healthy brain organization as well as abnormalities in neurological and psychiatric conditions. We frame cerebral cortex parcellation as a mesh segmentation task, and address it by taking advantage of recent advances in generalizing convolutions to the graph domain. In particular, we propose to assess graph convolutional networks and graph attention networks, which, in contrast to previous mesh parcellation models, exploit the underlying structure of the data to make predictions. We show experimentally on the Human Connectome Project dataset that the proposed graph convolutional models outperform the current state of the art and baselines, highlighting the potential and applicability of these methods to tackle neuroimaging challenges and paving the road towards a better characterization of brain diseases.

2018-01-01

MIDL.amsterdam/2018/Conference (oral presentation)

openreview.net

A Dataset of Topic-Oriented Human-to-Chatbot Dialogues

Varvara Logacheva

Mikhail Burtsev

Valentin Malykh

Vadim Poluliakh

Alexander Rudnicky

Iulian V. Serban

Ryan Thomas Lowe

Shrimai Prabhumoye

Alan W. Black

Yoshua Bengio

This document describes the dataset collected during the first round of the Conversational Intelligence Challenge (ConvAI), which took place in July 2017. During this evaluation round we collected over 2,500 dialogues from 10 chatbots and 500 volunteers. Here we provide an analysis of the dataset statistics and outline some possible improvements for future data collection experiments.

Deep Complex Networks

Chiheb Trabelsi

Olexa Bilaniuk

Ying Zhang

Dmitriy Serdyuk

Sandeep Subramanian

Joao Felipe Santos

Soroush Mehri

Negar Rostamzadeh

Yoshua Bengio

Chris Pal

At present, the vast majority of building blocks, techniques, and architectures for deep learning are based on real-valued operations and representations. However, recent work on recurrent neural networks and older fundamental theoretical analysis suggests that complex numbers could have a richer representational capacity and could also facilitate noise-robust memory retrieval mechanisms. Despite their attractive properties and potential for opening up entirely new neural architectures, complex-valued deep neural networks have been marginalized due to the absence of the building blocks required to design such models. In this work, we provide the key atomic components for complex-valued deep neural networks and apply them to convolutional feed-forward networks. More precisely, we rely on complex convolutions and present algorithms for complex batch-normalization, complex weight initialization strategies for complex-valued neural nets and we use them in experiments with end-to-end training schemes. We demonstrate that such complex-valued models are competitive with their real-valued counterparts. We test deep complex models on several computer vision tasks, on music transcription using the MusicNet dataset and on speech spectrum prediction using TIMIT. We achieve state-of-the-art performance on these audio-related tasks.

2018-01-01

ICLR.cc/2018/Conference (poster)

openreview.net
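
The abstract above relies on complex convolutions. As a rough illustration only, a complex convolution can be written with four real-valued convolutions using the identity (A + iB) * (x + iy) = (A * x - B * y) + i(B * x + A * y). The function names below are hypothetical, and this pure-Python sketch is not taken from the paper's code; it assumes "valid" 1D cross-correlation with no padding or stride.

```python
def real_conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation on real-valued lists (no padding)."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(n)]


def complex_conv1d(x_re, x_im, w_re, w_im):
    """Complex convolution assembled from four real-valued convolutions.

    Inputs are the real and imaginary parts of the signal (x) and of the
    kernel (w). Returns the real and imaginary parts of the output.
    """
    # Real part: w_re * x_re - w_im * x_im
    re = [p - q for p, q in zip(real_conv1d(x_re, w_re),
                                real_conv1d(x_im, w_im))]
    # Imaginary part: w_re * x_im + w_im * x_re
    im = [p + q for p, q in zip(real_conv1d(x_im, w_re),
                                real_conv1d(x_re, w_im))]
    return re, im
```

Keeping real and imaginary parts as separate real arrays is what lets such a layer reuse ordinary real-valued convolution routines.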

Dendritic cortical microcircuits approximate the backpropagation algorithm

João Sacramento

Rui Ponte Costa

Yoshua Bengio

Walter Senn

Deep learning has seen remarkable developments over the last years, many of them inspired by neuroscience. However, the main learning mechanism behind these advances - error backpropagation - appears to be at odds with neurobiology. Here, we introduce a multilayer neuronal network model with simplified dendritic compartments in which error-driven synaptic plasticity adapts the network towards a global desired output. In contrast to previous work, our model does not require separate phases and synaptic learning is driven by local dendritic prediction errors continuously in time. Such errors originate at apical dendrites and occur due to a mismatch between predictive input from lateral interneurons and activity from actual top-down feedback. Through the use of simple dendritic compartments and different cell-types our model can represent both error and normal activity within a pyramidal neuron. We demonstrate the learning capabilities of the model in regression and classification tasks, and show analytically that it approximates the error backpropagation algorithm. Moreover, our framework is consistent with recent observations of learning between brain areas and the architecture of cortical microcircuits. Overall, we introduce a novel view of learning on dendritic cortical circuits and on how the brain may solve the long-standing synaptic credit assignment problem.

arxiv.org

FigureQA: An Annotated Figure Dataset for Visual Reasoning

Samira Ebrahimi Kahou

Adam Atkinson

Vincent Michalski

Ákos Kádár

Adam Trischler

Yoshua Bengio

We introduce FigureQA, a visual reasoning corpus of over one million question-answer pairs grounded in over 100,000 images. The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts. We formulate our reasoning task by generating questions from 15 templates; questions concern various relationships between plot elements and examine characteristics like the maximum, the minimum, area-under-the-curve, smoothness, and intersection. Resolving such questions often requires reference to multiple plot elements and synthesis of information distributed spatially throughout a figure. To facilitate the training of machine learning systems, the corpus also includes side data that can be used to formulate auxiliary objectives. In particular, we provide the numerical data used to generate each figure as well as bounding-box annotations for all plot elements. We study the proposed visual reasoning task by training several models, including the recently proposed Relation Network as a strong baseline. Preliminary results indicate that the task poses a significant machine learning challenge. We envision FigureQA as a first step towards developing models that can intuitively recognize patterns from visual representations of data.

2018-01-01

ICLR.cc/2018/Conference (Invite to Workshop Track)

openreview.net

Frank-Wolfe Splitting via Augmented Lagrangian Method

Gauthier Gidel

Fabian Pedregosa

Simon Lacoste-Julien

Minimizing a function over an intersection of convex sets is an important task in optimization that is often much more challenging than minimizing it over each individual constraint set. While traditional methods such as Frank-Wolfe (FW) or proximal gradient descent assume access to a linear or quadratic oracle on the intersection, splitting techniques take advantage of the structure of each set, and only require access to the oracle on the individual constraints. In this work, we develop and analyze the Frank-Wolfe Augmented Lagrangian (FW-AL) algorithm, a method for minimizing a smooth function over convex compact sets related by a "linear consistency" constraint that only requires access to a linear minimization oracle over the individual constraints. It is based on the Augmented Lagrangian Method (ALM), also known as the Method of Multipliers, but unlike most existing splitting methods, it only requires access to linear (instead of quadratic) minimization oracles. We use recent advances in the analysis of Frank-Wolfe and the alternating direction method of multipliers algorithms to prove a sublinear convergence rate for FW-AL over general convex compact sets and a linear convergence rate for polytopes.

2018-01-01

AISTATS (published)

proceedings.mlr.press

arxiv.org
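
To make the notion of a linear minimization oracle in the abstract above concrete, here is a minimal sketch of the plain Frank-Wolfe method on a single set, the probability simplex. It is not the FW-AL algorithm and does not handle intersections of sets. The function names, the choice of the simplex, and the 2 / (t + 2) step size are assumptions made for illustration.

```python
def lmo_simplex(gradient):
    """Linear minimization oracle for the probability simplex.

    A linear function is minimized over the simplex at a vertex, so the
    oracle returns the one-hot vector at the smallest gradient entry.
    """
    best = min(range(len(gradient)), key=lambda i: gradient[i])
    return [1.0 if i == best else 0.0 for i in range(len(gradient))]


def frank_wolfe(grad_fn, x0, steps):
    """Plain Frank-Wolfe with the standard 2 / (t + 2) step size."""
    x = list(x0)
    for t in range(steps):
        s = lmo_simplex(grad_fn(x))   # vertex minimizing the linearization
        gamma = 2.0 / (t + 2.0)       # open-loop step size
        # Convex combination keeps the iterate inside the simplex.
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


# Usage: project a point of the simplex onto the simplex (a toy problem).
target = [0.2, 0.3, 0.5]
grad = lambda x: [2.0 * (xi - ti) for xi, ti in zip(x, target)]
solution = frank_wolfe(grad, [1.0, 0.0, 0.0], steps=2000)
```

Each iteration only needs the oracle, never a projection, which is the property the abstract contrasts with quadratic (proximal) oracles.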

Fraternal Dropout

Konrad Żołna

Devansh Arpit

Dendi Suhubdy

Yoshua Bengio

Recurrent neural networks (RNNs) are an important class of architectures among neural networks useful for language modeling and sequential prediction. However, optimizing RNNs is known to be harder compared to feed-forward neural networks. A number of techniques have been proposed in the literature to address this problem. In this paper we propose a simple technique called fraternal dropout that takes advantage of dropout to achieve this goal. Specifically, we propose to train two identical copies of an RNN (that share parameters) with different dropout masks while minimizing the difference between their (pre-softmax) predictions. In this way our regularization encourages the representations of RNNs to be invariant to the dropout mask, thus being robust. We show that our regularization term is upper bounded by the expectation-linear dropout objective, which has been shown to address the gap due to the difference between the train and inference phases of dropout. We evaluate our model and achieve state-of-the-art results in sequence modeling tasks on two benchmark datasets - Penn Treebank and Wikitext-2. We also show that our approach leads to performance improvement by a significant margin in image captioning (Microsoft COCO) and semi-supervised (CIFAR-10) tasks.

2018-01-01

ICLR (Poster) (published)

arxiv.org
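
The regularizer described in the abstract above can be sketched as follows: run the same weights twice with two independent dropout masks and penalize the gap between the pre-softmax outputs. This is a toy under stated assumptions, not the authors' implementation: a single linear layer stands in for the RNN, the gap is measured as a mean squared difference, and the names `kappa`, `p_drop`, and all function names are hypothetical.

```python
import random


def dropout_mask(n, p_drop, rng):
    """Inverted-dropout mask: 0 with probability p_drop, else 1 / (1 - p_drop)."""
    keep = 1.0 - p_drop
    return [0.0 if rng.random() < p_drop else 1.0 / keep for _ in range(n)]


def logits(weights, x, mask):
    """Pre-softmax outputs of a one-layer model with dropout on its input."""
    dropped = [xi * mi for xi, mi in zip(x, mask)]
    return [sum(w * d for w, d in zip(row, dropped)) for row in weights]


def fraternal_penalty(weights, x, p_drop, kappa, rng):
    """kappa times the mean squared gap between two dropout passes.

    Both passes use the same weights (shared parameters) but draw
    independent dropout masks.
    """
    z1 = logits(weights, x, dropout_mask(len(x), p_drop, rng))
    z2 = logits(weights, x, dropout_mask(len(x), p_drop, rng))
    return kappa * sum((a - b) ** 2 for a, b in zip(z1, z2)) / len(z1)
```

In training, a term like this would be added to the usual prediction loss; with dropout disabled the two passes coincide and the penalty vanishes.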

Graph Attention Networks

Petar Veličković

Guillem Cucurull

Arantxa Casanova

Adriana Romero Soriano

Pietro Lio

Yoshua Bengio

2018-01-01

ICLR.cc/2018/Conference (poster)

openreview.net

A Hierarchical Neural Attention-based Text Classifier

Koustuv Sinha

Yue Dong

Jackie Cheung

Derek Ruths

Deep neural networks have been displaying superior performance over traditional supervised classifiers in text classification. They learn to extract useful features automatically when a sufficient amount of data is presented. However, along with the growth in the number of documents comes the increase in the number of categories, which often results in poor performance of the multiclass classifiers. In this work, we use external knowledge in the form of topic category taxonomies to aid the classification by introducing a deep hierarchical neural attention-based classifier. Our model performs better than or comparable to state-of-the-art hierarchical models at significantly lower computational cost while maintaining high interpretability.

2018-01-01

Conference on Empirical Methods in Natural Language Processing (published)

doi.org
