Publications

Neural Transfer Learning for Cry-based Diagnosis of Perinatal Asphyxia

Charles C. Onu

William L. Hamilton

Despite continuing medical advances, the rate of newborn morbidity and mortality globally remains high, with over 6 million casualties every year. The prediction of pathologies affecting newborns based on their cry is thus of significant clinical interest, as it would facilitate the development of accessible, low-cost diagnostic tools. However, the inadequacy of clinically annotated datasets of infant cries limits progress on this task. This study explores a neural transfer learning approach to developing accurate and robust models for identifying infants that have suffered from perinatal asphyxia. In particular, we explore the hypothesis that representations learned from adult speech could inform and improve performance of models developed on infant speech. Our experiments show that models based on such representation transfer are resilient to different types and degrees of noise, as well as to signal loss in time and frequency domains.

2019-09-12

Interspeech 2019 (published)

doi.org

arxiv.org

Speech Model Pre-training for End-to-End Spoken Language Understanding

Loren Lugosch

Mirco Ravanelli

Patrick Ignoto

Vikrant Singh Tomar

Yoshua Bengio

Whereas conventional spoken language understanding (SLU) systems map speech to text, and then text to intent, end-to-end SLU systems map speech directly to intent through a single trainable model. Achieving high accuracy with these end-to-end models without a large amount of training data is difficult. We propose a method to reduce the data requirements of end-to-end SLU in which the model is first pre-trained to predict words and phonemes, thus learning good features for SLU. We introduce a new SLU dataset, Fluent Speech Commands, and show that our method improves performance both when the full dataset is used for training and when only a small subset is used. We also describe preliminary experiments to gauge the model's ability to generalize to new phrases not heard during training.

2019-09-12

Interspeech 2019 (published)

doi.org

arxiv.org

The effect of task and training on intermediate representations in convolutional neural networks revealed with modified RV similarity analysis

Jessica A.F. Thompson

Yoshua Bengio

Marc Schoenwiesner

2019-09-12

2019 Conference on Cognitive Computational Neuroscience (published)

doi.org

arxiv.org

Exact Combinatorial Optimization with Graph Convolutional Neural Networks

Maxime Gasse

Didier Chételat

Nicola Ferroni

Laurent Charlin

Andrea Lodi

Combinatorial optimization problems are typically tackled by the branch-and-bound paradigm. We propose a new graph convolutional neural network model for learning branch-and-bound variable selection policies, which leverages the natural variable-constraint bipartite graph representation of mixed-integer linear programs. We train our model via imitation learning from the strong branching expert rule, and demonstrate on a series of hard problems that our approach produces policies that improve upon state-of-the-art machine-learning methods for branching and generalize to instances significantly larger than seen during training. Moreover, we improve for the first time over expert-designed branching rules implemented in a state-of-the-art solver on large problems. Code for reproducing all the experiments can be found at this https URL.

2019-09-05

NeurIPS.cc/2019/Reproducibility_Challenge (unknown)

doi.org

openreview.net
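The variable-constraint bipartite graph mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' code: the function name `milp_to_bipartite` and the node features (right-hand side for constraints, objective coefficient for variables) are assumptions chosen for illustration. The only idea taken from the abstract is that a mixed-integer linear program has one node per constraint, one node per variable, and an edge wherever a variable appears in a constraint.

```python
def milp_to_bipartite(A, b, c):
    """Build a bipartite graph from a MILP given as dense lists.

    A: constraint matrix (list of rows), b: right-hand sides,
    c: objective coefficients. Feature choices here are hypothetical.
    """
    # One node per constraint, carrying its right-hand side as a feature.
    constraint_nodes = [{"rhs": b_i} for b_i in b]
    # One node per variable, carrying its objective coefficient as a feature.
    variable_nodes = [{"obj": c_j} for c_j in c]
    # One edge (constraint index, variable index, coefficient) per nonzero entry of A.
    edges = [
        (i, j, a_ij)
        for i, row in enumerate(A)
        for j, a_ij in enumerate(row)
        if a_ij != 0
    ]
    return constraint_nodes, variable_nodes, edges


# Toy problem: two constraints over three variables.
A = [[1, 0, 2],
     [0, 3, 0]]
b = [4, 5]
c = [1, 1, 1]
constraint_nodes, variable_nodes, edges = milp_to_bipartite(A, b, c)
```

A graph neural network would then pass messages along `edges` between the two node sets; that part is omitted here.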

Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

Giancarlo Kerg

Kyle Goyette

Maximilian Puelma Touzel

A recent strategy to circumvent the exploding and vanishing gradient problem in RNNs, and to allow the stable propagation of signals over long time scales, is to constrain recurrent connectivity matrices to be orthogonal or unitary. This ensures eigenvalues with unit norm and thus stable dynamics and training. However, this comes at the cost of reduced expressivity due to the limited variety of orthogonal transformations. We propose a novel connectivity structure based on the Schur decomposition and a splitting of the Schur form into normal and non-normal parts. This allows matrices with unit-norm eigenspectra to be parametrized without orthogonality constraints on eigenbases. The resulting architecture ensures access to a larger space of spectrally constrained matrices, of which orthogonal matrices are a subset. This crucial difference retains the stability advantages and training speed of orthogonal RNNs while enhancing expressivity, especially on tasks that require computations over ongoing input sequences.

2019-09-05

NeurIPS.cc/2019/Reproducibility_Challenge (unknown)

doi.org

openreview.net

Data-Driven Approach to Encoding and De-Coding 3-D Crystal Structures

Jean Michel Sellier

Generative models have achieved impressive results in many domains including image and text generation. In the natural sciences, generative models have led to rapid progress in automated drug discovery. Many of the current methods focus on either 1-D or 2-D representations of typically small, drug-like molecules. However, many molecules require 3-D descriptors and exceed the chemical complexity of commonly used datasets. We present a method to encode and decode the position of atoms in 3-D molecules from a dataset of nearly 50,000 stable crystal unit cells that vary from containing 1 to over 100 atoms. We construct a smooth and continuous 3-D density representation of each crystal based on the positions of different atoms. Two different neural networks were trained on a dataset of over 120,000 three-dimensional samples of single and repeating crystal structures, made by rotating the single unit cells. The first, an Encoder-Decoder pair, constructs a compressed latent space representation of each molecule and then decodes this description into an accurate reconstruction of the input. The second network segments the resulting output into atoms and assigns each atom an atomic number. By generating compressed, continuous latent space representations of molecules, we are able to decode random samples, interpolate between two molecules, and alter known molecules.

2019-09-02

ArXiv (preprint)

doi.org

openreview.net
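One way to picture the "smooth and continuous 3-D density representation" from the crystal-structure abstract above is to place a smooth bump at each atom position and sample the sum on a voxel grid. The sketch below assumes isotropic Gaussian bumps, fractional coordinates in a unit cube, and no periodic boundary handling; the abstract does not state which kernel the authors use, so the kernel, the function name `density_grid`, and the parameters `n` and `sigma` are all illustrative assumptions.

```python
import math


def density_grid(atoms, n=4, sigma=0.15):
    """Sample a sum of Gaussian bumps on an n x n x n grid.

    atoms: list of (x, y, z) fractional coordinates in [0, 1).
    The Gaussian kernel and its width are assumptions, not the paper's choice.
    """
    grid = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # Centre of voxel (i, j, k) in fractional coordinates.
                p = ((i + 0.5) / n, (j + 0.5) / n, (k + 0.5) / n)
                total = 0.0
                for atom in atoms:
                    sq_dist = sum((a - q) ** 2 for a, q in zip(p, atom))
                    total += math.exp(-sq_dist / (2 * sigma ** 2))
                grid[i][j][k] = total
    return grid


# A single atom at the centre of the cell.
grid = density_grid([(0.5, 0.5, 0.5)])
```

Voxels near the atom receive higher density than voxels in the corners, which gives an encoder a smooth input volume rather than a sparse list of coordinates.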

Recognizable series on graphs and hypergraphs

Raphaël Bailly

Guillaume Rabusseau

François Denis

2019-08-31

Journal of Computer and System Sciences (published)

doi.org

Teaching Modelling Literacy: An Artificial Intelligence Approach

Rijul Saini

Gunter Mussbacher

Jin L.C. Guo

Jörg Kienzle

In Model-Driven Engineering (MDE), models are used to build and analyze complex systems. In the last decades, different modelling formalisms have been proposed for supporting software development. However, their adoption and practice strongly rely on mastering essential modelling skills to develop a complete and coherent model-based system. Moreover, it is often difficult for novice modellers to get direct and timely feedback and recommendations on their modelling strategies and decisions, particularly in large classroom settings, which hinders their learning. Certainly, there is an opportunity to apply Artificial Intelligence (AI) techniques to an MDE learning environment to empower the provisioning of automated and intelligent modelling advocacy. In this paper, we propose a framework called ModBud (a modelling buddy) to educate novice modellers about the art of abstraction. ModBud uses natural language processing (NLP) and machine learning (ML) to create modelling bots with the aim of improving the modelling skills of novice modellers and assisting other practitioners, too. These bots could be used to support teaching with automatic creation or grading of models and enhance learning beyond the traditional classroom-based MDE education with timely feedback and personalized tutoring. Research challenges for the proposed framework are discussed and a research roadmap is presented.

2019-08-31

2019 ACM/IEEE 22nd International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C) (published)

doi.org

Toward Requirements Specification for Machine-Learned Components

Mona Rahimi

Jin L.C. Guo

Sahar Kokaly

Marsha Chechik

In current practice, the behavior of Machine-Learned Components (MLCs) is not sufficiently specified by the predefined requirements. Instead, they "learn" existing patterns from the available training data, and make predictions for unseen data when deployed. On the surface, their ability to extract patterns and to behave accordingly is specifically useful for hard-to-specify concepts in certain safety critical domains (e.g., the definition of a pedestrian in a pedestrian detection component in a vehicle). However, the lack of requirements specifications on their behaviors makes further software engineering tasks challenging for such components. This is especially concerning for tasks such as safety assessment and assurance. In this position paper, we call for more attention from the requirements engineering community on supporting the specification of requirements for MLCs in safety critical domains. Towards that end, we propose an approach to improve the process of requirements specification in which an MLC is developed and operates by explicitly specifying domain-related concepts. Our approach extracts a universally accepted benchmark for hard-to-specify concepts (e.g., "pedestrian") and can be used to identify gaps in the associated dataset and the constructed machine-learned model.

2019-08-31

2019 IEEE 27th International Requirements Engineering Conference Workshops (REW) (published)

doi.org

Online Continual Learning with Maximally Interfered Retrieval

Lucas Caccia

Massimo Caccia

Tinne Tuytelaars

Continual learning, the setting where a learning agent is faced with a never-ending stream of data, continues to be a great challenge for modern machine learning systems. In particular, the online or "single-pass through the data" setting has gained attention recently as a natural setting that is difficult to tackle. Methods based on replay, either generative or from a stored memory, have been shown to be effective approaches for continual learning, matching or exceeding the state of the art in a number of standard benchmarks. These approaches typically rely on randomly selecting samples from the replay memory or from a generative model, which is suboptimal. In this work, we consider a controlled sampling of memories for replay. We retrieve the samples which are most interfered, i.e. whose prediction will be most negatively impacted by the foreseen parameter update. We show a formulation for this sampling criterion in both the generative replay and the experience replay setting, producing consistent gains in performance and greatly reduced forgetting. We release an implementation of our method at this https URL.

2019-08-10

ArXiv (preprint)

arxiv.org
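The retrieval criterion described in the continual-learning abstract above (pick the stored samples whose loss would rise most after the foreseen parameter update) can be sketched on a toy problem. The example below is not the released implementation: it assumes a one-parameter linear model `y = w * x` with squared loss and a single plain gradient step, and the names `retrieve_most_interfered`, `lr`, and `k` are hypothetical.

```python
def loss(w, x, y):
    """Squared error of the toy model y_hat = w * x."""
    return (w * x - y) ** 2


def grad(w, batch):
    """Mean gradient of the squared error with respect to w over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def retrieve_most_interfered(w, incoming, memory, lr=0.1, k=2):
    """Return the k memory samples whose loss increases most after a
    virtual gradient step on the incoming batch."""
    # Foreseen update: one step on the incoming data, not yet applied to w.
    w_virtual = w - lr * grad(w, incoming)

    # Interference score: loss after the virtual step minus loss before it.
    def interference(sample):
        x, y = sample
        return loss(w_virtual, x, y) - loss(w, x, y)

    return sorted(memory, key=interference, reverse=True)[:k]


w = 1.0
incoming = [(1.0, 3.0)]
memory = [(1.0, 1.0), (2.0, 2.0), (1.0, 2.0), (3.0, 6.0)]
selected = retrieve_most_interfered(w, incoming, memory)
```

Here the two samples the current model already fits exactly, `(2.0, 2.0)` and `(1.0, 1.0)`, are the ones the update would hurt, so they are selected for replay, while the samples that the update happens to help are left in memory.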

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents

Felipe Petroski Such

Vashisht Madhavan

Rosanne Liu

Rui Wang

Pablo Samuel Castro

Yulun Li

Jiale Zhi

Ludwig Schubert

Marc-Emmanuel Bellemare

Jeff Clune

Joel Lehman

Much human and computational effort has aimed to improve how deep reinforcement learning (DRL) algorithms perform on benchmarks such as the Atari Learning Environment. Comparatively less effort has focused on understanding what has been learned by such methods, and investigating and comparing the representations learned by different families of DRL algorithms. Sources of friction include the onerous computational requirements, and general logistical and architectural complications for running DRL algorithms at scale. We lessen this friction by (1) training several algorithms at scale and releasing trained models, (2) integrating with a previous DRL model release, and (3) releasing code that makes it easy for anyone to load, visualize, and analyze such models. This paper introduces the Atari Zoo framework, which contains models trained across benchmark Atari games, in an easy-to-use format, as well as code that implements common modes of analysis and connects such models to a popular neural network visualization library. Further, to demonstrate the potential of this dataset and software package, we show initial quantitative and qualitative comparisons between the performance and representations of several DRL algorithms, highlighting interesting and previously unknown distinctions between them.

2019-08-09

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Interpolation Consistency Training for Semi-Supervised Learning

Juho Kannala

David Lopez-Paz

Arno Solin