Publications

Fast reinforcement learning with generalized policy updates

Andre Barreto

Shaobo Hou

Diana Borsa

David Silver

The combination of reinforcement learning with deep learning is a promising approach to tackle important sequential decision-making problems that are currently intractable. One obstacle to overcome is the amount of data needed by learning systems of this type. In this article, we propose to address this issue through a divide-and-conquer approach. We argue that complex decision problems can be naturally decomposed into multiple tasks that unfold in sequence or in parallel. By associating each task with a reward function, this problem decomposition can be seamlessly accommodated within the standard reinforcement-learning formalism. The specific way we do so is through a generalization of two fundamental operations in reinforcement learning: policy improvement and policy evaluation. The generalized versions of these operations allow one to leverage the solution of some tasks to speed up the solution of others. If the reward function of a task can be well approximated as a linear combination of the reward functions of tasks previously solved, we can reduce a reinforcement-learning problem to a simpler linear regression. When this is not the case, the agent can still exploit the task solutions by using them to interact with and learn about the environment. Both strategies considerably reduce the amount of data needed to solve a reinforcement-learning problem.

2020-08-16

Proceedings of the National Academy of Sciences of the United States of America (published)

doi.org

Mastering Rate based Curriculum Learning

Lucas Willems

Salem Lahlou

Yoshua Bengio

2020-08-13

ArXiv (preprint)

arxiv.org

Adaptive Learning of Tensor Network Structures

Tensor Networks (TN) offer a powerful framework to efficiently represent very high-dimensional objects. TN have recently shown their potential for machine learning applications and offer a unifying view of common tensor decomposition models such as Tucker, tensor train (TT) and tensor ring (TR). However, identifying the best tensor network structure from data for a given task is challenging. In this work, we leverage the TN formalism to develop a generic and efficient adaptive algorithm to jointly learn the structure and the parameters of a TN from data. Our method is based on a simple greedy approach starting from a rank-one tensor and successively identifying the most promising tensor network edges for small rank increments. Our algorithm can adaptively identify TN structures with a small number of parameters that effectively optimize any differentiable objective function. Experiments on tensor decomposition, tensor completion and model compression tasks demonstrate the effectiveness of the proposed algorithm. In particular, our method outperforms the state-of-the-art evolutionary topology search [Li and Sun, 2020] for tensor decomposition of images (while being orders of magnitude faster) and finds efficient tensor network structures to compress neural networks, outperforming popular TT-based approaches [Novikov et al., 2015].

2020-08-11

ArXiv (preprint)

openreview.net

Prediction, Not Association, Paves the Road to Precision Medicine

Danilo Bzdok

Gael Varoquaux

Ewout W. Steyerberg

2020-08-11

JAMA Psychiatry (published)

doi.org

Robust motion in-betweening

Félix Harvey

Mike Yurick

D. Nowrouzezahrai

Christopher Pal

In this work we present a novel, robust transition generation technique that can serve as a new tool for 3D animators, based on adversarial recurrent neural networks. The system synthesises high-quality motions that use temporally-sparse keyframes as animation constraints. This is reminiscent of the job of in-betweening in traditional animation pipelines, in which an animator draws motion frames between provided keyframes. We first show that a state-of-the-art motion prediction model cannot be easily converted into a robust transition generator when only adding conditioning information about future keyframes. To solve this problem, we then propose two novel additive embedding modifiers that are applied at each timestep to latent representations encoded inside the network's architecture. One modifier is a time-to-arrival embedding that allows variations of the transition length with a single model. The other is a scheduled target noise vector that allows the system to be robust to target distortions and to sample different transitions given fixed keyframes. To qualitatively evaluate our method, we present a custom MotionBuilder plugin that uses our trained model to perform in-betweening in production scenarios. To quantitatively evaluate performance on transitions and generalizations to longer time horizons, we present well-defined in-betweening benchmarks on a subset of the widely used Human3.6M dataset and on LaFAN1, a novel high-quality motion capture dataset that is more appropriate for transition generation. We are releasing this new dataset along with this work, with accompanying code for reproducing our baseline results.

2020-08-11

ACM Transactions on Graphics (published)

doi.org

arxiv.org

Meta-matching: a simple framework to translate phenotypic predictive models from big to small data

Tong He

Lijun An

Jiashi Feng

Danilo Bzdok

Avram J Holmes

Simon B. Eickhoff

B.T. Thomas Yeo

There is significant interest in using brain imaging data to predict non-brain-imaging phenotypes in individual participants. However, most prediction studies are underpowered, relying on fewer than a few hundred participants, leading to low reliability and inflated prediction performance. Yet, small sample sizes are unavoidable when studying clinical populations or addressing focused neuroscience questions. Here, we propose a simple framework, “meta-matching”, to translate predictive models from large-scale datasets to new unseen non-brain-imaging phenotypes in boutique studies. The key observation is that many large-scale datasets collect a wide range of inter-correlated phenotypic measures. Therefore, a unique phenotype from a boutique study likely correlates with (but is not the same as) some phenotypes in some large-scale datasets. Meta-matching exploits these correlations to boost prediction in the boutique study. We applied meta-matching to the problem of predicting non-brain-imaging phenotypes using resting-state functional connectivity (RSFC). Using the UK Biobank (N = 36,848), we demonstrated that meta-matching can boost the prediction of new phenotypes in small independent datasets by 100% to 400% in many scenarios. When considering relative prediction performance, meta-matching significantly improved phenotypic prediction even in samples with 10 participants. When considering absolute prediction performance, meta-matching significantly improved phenotypic prediction when there were at least 50 participants. With a growing number of large-scale population-level datasets collecting an increasing number of phenotypic measures, our results represent a lower bound on the potential of meta-matching to elevate small-scale boutique studies.

2020-08-10

bioRxiv (preprint)

doi.org

Hidden population modes in social brain morphology: Its parts are more than its sum

Hannah Kiesow

R. Nathan Spreng

Avram J. Holmes

M. Mallar Chakravarty

Andre F. Marquand

B.T. Thomas Yeo

Danilo Bzdok

The complexity of social interactions is a defining property of the human species. Many social neuroscience experiments have sought to map ‘perspective taking’, ‘empathy’, and other canonical psychological constructs to distinguishable brain circuits. This predominant research paradigm was seldom complemented by bottom-up studies of the unknown sources of variation that add up to measures of social brain structure, perhaps due to a lack of large population datasets. We aimed at a systematic de-construction of social brain morphology into its elementary building blocks in the UK Biobank cohort (n=~10,000). Coherent patterns of structural co-variation were explored within a recent atlas of social brain locations, enabled through translating autoencoder algorithms from deep learning. The artificial neural networks learned rich subnetwork representations that became apparent from social brain variation at population scale. The learned subnetworks carried essential information about the co-dependence configurations between social brain regions, with the nucleus accumbens, medial prefrontal cortex, and temporoparietal junction embedded at the core. Some of the uncovered subnetworks contributed to predicting examined social traits in general, while other subnetworks helped predict specific facets of social functioning, such as feelings of loneliness. Our population-level evidence indicates that hidden subsystems of the social brain underpin interindividual variation in dissociable aspects of social lifestyle.

2020-08-06

bioRxiv (preprint)

doi.org

“COGITO in Space”: a thought experiment in exo-neurobiology

Daniela de Paulis

Stephen Whitmarsh

Robert Oostenveld

Guillaume Dumas

Michael Sanders

2020-08-03

(published)

doi.org

SeroTracker: a global SARS-CoV-2 seroprevalence dashboard

Rahul K. Arora

Abel Joseph

Jordan Van Wyk

Simona Rocco

Austin Atmaja

Ewan May

Tingting Yan

Niklas Bobrovitz

Jonathan Chevrier

Matthew P. Cheng

Tyler Williamson

David L Buckeridge

2020-08-03

The Lancet Infectious Diseases (published)

doi.org

BDD-based optimization for the quadratic stable set problem

Jaime E. González

André Augusto Cire

Andrea Lodi

Louis-Martin Rousseau

2020-07-31

Discrete Optimization (published)

doi.org

Dynamic planning of redundant robots within a set-based task-priority inverse kinematics framework

Daniele Di Vito

Mathieux Bergeron

David Meger

Gregory Dudek

Gianluca Antonelli

This work presents the dynamic planning of redundant robots by merging a global and a local planner. The global planner is implemented as a sampling-based algorithm which works in the reduced dimensionality of the robot workspace, applying the Cartesian constraints only. The output trajectory is then checked within a framework of set-based task-priority inverse kinematics, verifying the fulfillment of the other task constraints. The inverse kinematics framework is also used in real time as local motion control to ensure a reactive behaviour that addresses, e.g., mismatch between the a priori information and on-line perception acquisition. During the movement, the motion planner runs in the background to adapt to changes in the environment or, in general, to continuously optimize the path. The proposed method is experimentally validated with a Kinova Jaco2 7-degree-of-freedom manipulator.

2020-07-31

Conference on Control Technology and Applications (published)

doi.org

Optimal Local and Remote Controllers With Unreliable Uplink Channels: An Elementary Proof

Mohammad Afshari

Aditya Mahajan

Recently, a model of a decentralized control system with local and remote controllers connected over unreliable channels was presented in [1]. The model has a nonclassical information structure that is not partially nested. Nonetheless, it is shown in [1] that the optimal control strategies are linear functions of the state estimate (which is a nonlinear function of the observations). Their proof is based on a fairly sophisticated dynamic programming argument. In this article, we present an alternative and elementary proof of the result which uses common information-based conditional independence and completion of squares.

2020-07-31

IEEE Transactions on Automatic Control (published)

doi.org

arxiv.org
