Publications

Handling Black Swan Events in Deep Learning with Diversely Extrapolated Neural Networks

Maxime Wabartha

Vincent Francois-Lavet

By virtue of their expressive power, neural networks (NNs) are well suited to fitting large, complex datasets, yet they are also known to produce similar predictions for points outside the training distribution. As such, they are, like humans, under the influence of the Black Swan theory: models tend to be extremely "surprised" by rare events, leading to potentially disastrous consequences, while justifying these same events in hindsight. To avoid this pitfall, we introduce DENN, an ensemble approach building a set of Diversely Extrapolated Neural Networks that fits the training data and is able to generalize more diversely when extrapolating to novel data points. This leads DENN to output highly uncertain predictions for unexpected inputs. We achieve this by adding a diversity term in the loss function used to train the model, computed at specific inputs. We first illustrate the usefulness of the method on a low-dimensional regression problem. Then, we show how the loss can be adapted to tackle anomaly detection during classification, as well as safe imitation learning problems.

2020-07-01

International Joint Conference on Artificial Intelligence (published)

doi.org

On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)

Vincent Francois-Lavet

Guillaume Rabusseau

Joelle Pineau

Damien Ernst

Raphael Fonteneau

When an agent has limited information on its environment, the suboptimality of an RL algorithm can be decomposed into the sum of two terms: a term related to an asymptotic bias (suboptimality with unlimited data) and a term due to overfitting (additional suboptimality due to limited data). In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources. In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.

2020-07-01

International Joint Conference on Artificial Intelligence (published)

doi.org

Words Aren’t Enough, Their Order Matters: On the Robustness of Grounding Visual Referring Expressions

Arjun Reddy Akula

Spandana Gella

Yaser Al-Onaizan

Song-Chun Zhu

Siva Reddy

Visual referring expression recognition is a challenging task that requires natural language understanding in the context of an image. We critically examine RefCOCOg, a standard benchmark for this task, using a human study and show that 83.7% of test instances do not require reasoning on linguistic structure, i.e., words are enough to identify the target object, the word order doesn’t matter. To measure the true progress of existing models, we split the test set into two sets, one which requires reasoning on linguistic structure and the other which doesn’t. Additionally, we create an out-of-distribution dataset Ref-Adv by asking crowdworkers to perturb in-domain examples such that the target object changes. Using these datasets, we empirically show that existing methods fail to exploit linguistic structure and are 12% to 23% lower in performance than the established progress for this task. We also propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT, the current state-of-the-art model for this task. Our datasets are publicly available at https://github.com/aws/aws-refcocog-adv.

2020-07-01

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (published)

doi.org

arxiv.org

Medical Imaging with Deep Learning: MIDL 2020 - Short Paper Track

Tal Arbel

Ismail Ben Ayed

Marleen de Bruijne

Maxime Descoteaux

Hervé Lombaert

Chris Pal

This compendium gathers all the accepted extended abstracts from the Third International Conference on Medical Imaging with Deep Learning (MIDL 2020), held in Montreal, Canada, 6-9 July 2020. Note that only accepted extended abstracts are listed here; the Proceedings of the MIDL 2020 Full Paper Track are published in the Proceedings of Machine Learning Research (PMLR).

2020-06-29

ArXiv (preprint)

arxiv.org

Object Files and Schemata: Factorizing Declarative and Procedural Knowledge in Dynamical Systems

Anirudh Goyal

Alex Lamb

Phanideep Gampa

Philippe Beaudoin

Sergey Levine

Charles Blundell

Yoshua Bengio

Michael Curtis Mozer

2020-06-29

ArXiv (preprint)

arxiv.org

Inherent privacy limitations of decentralized contact tracing apps

Yoshua Bengio

Daphne Ippolito

Richard Janda

Max Jarvie

Benjamin Prud'homme

Jean-François Rousseau

Abhinav Sharma

Yun William Yu

2020-06-25

J. Am. Medical Informatics Assoc. (published)

doi.org

Image-to-image Mapping with Many Domains by Sparse Attribute Transfer

Matthew Amodio

Rim Assouel

Victor Schmidt

Tristan Sylvain

Smita Krishnaswamy

Yoshua Bengio

2020-06-23

ArXiv (preprint)

arxiv.org

Rethinking Distributional Matching Based Domain Adaptation

Bo Li

Yezhen Wang

Tong Che

Shanghang Zhang

Sicheng Zhao

Pengfei Xu

Wei Zhou

Yoshua Bengio

Kurt W. Keutzer

Domain adaptation (DA) is a technique that transfers predictive models trained on a labeled source domain to an unlabeled target domain, with the core difficulty of resolving distributional shift between domains. Currently, most popular DA algorithms are based on distributional matching (DM). However, in practice, realistic domain shifts (RDS) may violate their basic assumptions and, as a result, these methods will fail. In this paper, in order to devise robust DA algorithms, we first systematically analyze the limitations of DM-based methods, and then build new benchmarks with more realistic domain shifts to evaluate the well-accepted DM methods. We further propose InstaPBM, a novel Instance-based Predictive Behavior Matching method for robust DA. Extensive experiments on both conventional and RDS benchmarks demonstrate both the limitations of DM methods and the efficacy of InstaPBM: Compared with the best baselines, InstaPBM improves the classification accuracy respectively by

2020-06-23

ArXiv (preprint)

arxiv.org

HNHN: Hypergraph Networks with Hyperedge Neurons

Yihe Dong

W. Sawin

Yoshua Bengio

2020-06-22

ArXiv (preprint)

arxiv.org

Individual differences in interpersonal coordination

Julia Ayache

Guillaume Dumas

A. Sumich

D. Kuss

Darren Rhodes

Nadja Heym

2020-06-18

(published)

doi.org

Special Issue on Novel Informatics Approaches to COVID-19 Research

Hua Xu

David Buckeridge

Fei Wang (Guest Editors)

2020-06-17

Journal of Biomedical Informatics (published)

doi.org

Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks

Ghouthi Boukli Hacene

Vincent Gripon

Matthieu Arzel

Nicolas Farrugia

Yoshua Bengio

Deep Neural Networks (DNNs) in general and Convolutional Neural Networks (CNNs) in particular are state-of-the-art in numerous computer vision tasks such as object classification and detection. However, the large number of parameters they contain leads to a high computational complexity and strongly limits their usability in budget-constrained devices such as embedded devices. In this paper, we propose a combination of a pruning technique and a quantization scheme that effectively reduces the complexity and memory usage of convolutional layers of CNNs, by replacing the complex convolutional operation with a low-cost multiplexer. We perform experiments on CIFAR10, CIFAR100 and SVHN datasets and show that the proposed method achieves almost state-of-the-art accuracy, while drastically reducing the computational and memory footprints compared to the baselines. We also propose an efficient hardware architecture, implemented on Field Programmable Gate Arrays (FPGAs), to accelerate inference, which works as a pipeline and accommodates multiple layers working at the same time to speed up the inference process. In contrast with most proposed approaches, which have used external memory or software-defined memory controllers, our work is based on algorithmic optimization and full-hardware design, enabling a direct, on-chip memory implementation of a DNN while keeping close to state-of-the-art accuracy.

2020-06-16

2020 18th IEEE International New Circuits and Systems Conference (NEWCAS) (published)

doi.org
