Publications

Not one model fits all: unfairness in RSFC-based prediction of behavioral data in African American
Jingwei Li
Avram J. Holmes
Thomas B.t. Yeo
Sarah Genon
14 Helmholtz AI kick-off meeting 5 Mar 2020, 14:17:33 Page 1/1 Abstract #14 | Poster Not one model fits all: unfairness in RSFC-based predic… (see more)tion of behavioral data in African American J. Li , D. Bzdok, A. Holmes, T. Yeo, S. Genon 1 Forschungszentrum Julich, Institute of Neuroscience and Medicine, Jülich, Germany 2 McGill University, Department of Biomedical Imaging, Montreal, Canada 3 National University of Singapore, ECE, CSC, CIRC, N.1 & MNP, Singapore, Singapore 4 Yale University, New Haven, United States of America While predictive models are expected to play a major role in personalized medicine approaches in the future, biases towards specific population groups have been evidenced, hence raising concerns about the risks of unfairness of machine learning algorithms. As great hopes and intense work have been invested recently in the prediction of behavioral phenotypes based on brain resting-state functional connectivity (RSFC), we here examined potential differences in RSFC-based predictive models of behavioral data between African American (AA) and White American (WA) samples matched for the main demographic, anthropometric, behavioral and in-scanner motion variables. We used resting-fMRI data with 58 behavioral measures of 953 subjects comprising 130 African American (AA) and 724 White American (WA). For each subject, a 419 x 419 matrix summarizing connectivity of 419 brain regions was computed. Matching between AA and WA was performed at the subject level by creating 102 pairs of AA and WA subjects, matched for 6 types of variables (age, sex, intracranial volume, education, in-scanner motion and behavioral scores). We performed 10-fold nested cross-validation by randomly splitting the 102 pairs across 10 sets. The remaining 749 subjects were also divided across the 10 sets. A predictive model was built for each behavioral variable by using kernel ridge regression. All analyses focused on the 102 matched AA and WA groups. After FDR correction (q 0.05), no significant difference was found between the matched AA and WA groups for the matching variables. Out of 58 behavioral variables, 38 showed significantly above chance prediction accuracies (based on permutation test, FDR corrected). Overall, average prediction performance for these variables was higher in the WA group than in the AA group. Furthermore, significant differences in prediction performance between the two groups were found in 35 behavioral variables (FDR corrected; q 0.05). Our results suggest that RSFC-based prediction models of behavioral phenotype trained on the entire HCP population show different prediction performance in different subsets of the population. This suggest that one model might not fit all that, in some cases, RSFC-based predictive models might have poorer prediction accuracies for African Americans compared to matched White Americans. Future work should evaluate the factors contributing to these discrepancies and the potential consequences, as well as possible recommendations.
RandomNet: Towards Fully Automatic Neural Architecture Design for Multimodal Learning
Almost all neural architecture search methods are evaluated in terms of performance (i.e. test accuracy) of the model structures that it fin… (see more)ds. Should it be the only metric for a good autoML approach? To examine aspects beyond performance, we propose a set of criteria aimed at evaluating the core of autoML problem: the amount of human intervention required to deploy these methods into real world scenarios. Based on our proposed evaluation checklist, we study the effectiveness of a random search strategy for fully automated multimodal neural architecture search. Compared to traditional methods that rely on manually crafted feature extractors, our method selects each modality from a large search space with minimal human supervision. We show that our proposed random search strategy performs close to the state of the art on the AV-MNIST dataset while meeting the desirable characteristics for a fully automated design process.
10,000 social brains: Sex differentiation in human brain anatomy
Hannah Kiesow
Robin I. M. Dunbar
Joseph W. Kable
Tobias Kalenscher
Kai Vogeley
Leonhard Schilbach
Andre F. Marquand
Thomas V. Wiecki
Population variability in social lifestyle is reflected in brain morphology in sex-dependent ways.
Multiple Kernel Learning-Based Transfer Regression for Electric Load Forecasting
Electric load forecasting, especially short-term load forecasting (STLF), is becoming more and more important for power system operation. We… (see more) propose to use multiple kernel learning (MKL) for residential electric load forecasting which provides more flexibility than traditional kernel methods. Computation time is an important issue for short-term forecasting, especially for energy scheduling. However, conventional MKL methods usually lead to complicated optimization problems. Another practical issue for this application is that there may be a very limited amount of data available to train a reliable forecasting model for a new house, while at the same time we may have historical data collected from other houses which can be leveraged to improve the prediction performance for the new house. In this paper, we propose a boosting-based framework for MKL regression to deal with the aforementioned issues for STLF. In particular, we first adopt boosting to learn an ensemble of multiple kernel regressors and then extend this framework to the context of transfer learning. Furthermore, we consider two different settings: homogeneous transfer learning and heterogeneous transfer learning. Experimental results on residential data sets demonstrate that forecasting error can be reduced by a large margin with the knowledge learned from other houses.
Seven pillars of precision digital health and medicine
Arash Shaban-Nejad
Martin Michalowski
Niels Peek
John S. Brownstein
David L Buckeridge
On Catastrophic Interference in Atari 2600 Games
William Fedus
Dibya Ghosh
John D. Martin
Bellemare Marc-Emmanuel
Model-free deep reinforcement learning is sample inefficient. One hypothesis -- speculated, but not confirmed -- is that catastrophic interf… (see more)erence within an environment inhibits learning. We test this hypothesis through a large-scale empirical study in the Arcade Learning Environment (ALE) and, indeed, find supporting evidence. We show that interference causes performance to plateau; the network cannot train on segments beyond the plateau without degrading the policy used to reach there. By synthetically controlling for interference, we demonstrate performance boosts across architectures, learning algorithms and environments. A more refined analysis shows that learning one segment of a game often increases prediction errors elsewhere. Our study provides a clear empirical link between catastrophic interference and sample efficiency in reinforcement learning.
Machine learning analysis of exome trios to contrast the genomic architecture of autism and schizophrenia
Sameer Sardaar
Bill Qi
Alexandre Dionne-Laporte
Guy. A. Rouleau
Yannis J. Trakadis
Machine learning (ML) algorithms and methods offer great tools to analyze large complex genomic datasets. Our goal was to compare the genomi… (see more)c architecture of schizophrenia (SCZ) and autism spectrum disorder (ASD) using ML. In this paper, we used regularized gradient boosted machines to analyze whole-exome sequencing (WES) data from individuals SCZ and ASD in order to identify important distinguishing genetic features. We further demonstrated a method of gene clustering to highlight which subsets of genes identified by the ML algorithm are mutated concurrently in affected individuals and are central to each disease (i.e., ASD vs. SCZ “hub” genes). In summary, after correcting for population structure, we found that SCZ and ASD cases could be successfully separated based on genetic information, with 86–88% accuracy on the testing dataset. Through bioinformatic analysis, we explored if combinations of genes concurrently mutated in patients with the same condition (“hub” genes) belong to specific pathways. Several themes were found to be associated with ASD, including calcium ion transmembrane transport, immune system/inflammation, synapse organization, and retinoid metabolic process. Moreover, ion transmembrane transport, neurotransmitter transport, and microtubule/cytoskeleton processes were highlighted for SCZ. Our manuscript introduces a novel comparative approach for studying the genetic architecture of genetically related diseases with complex inheritance and highlights genetic similarities and differences between ASD and SCZ.
Policy Evaluation Networks
Jean Harb
Tom Schaul
Many reinforcement learning algorithms use value functions to guide the search for better policies. These methods estimate the value of a si… (see more)ngle policy while generalizing across many states. The core idea of this paper is to flip this convention and estimate the value of many policies, for a single set of states. This approach opens up the possibility of performing direct gradient ascent in policy space without seeing any new data. The main challenge for this approach is finding a way to represent complex policies that facilitates learning and generalization. To address this problem, we introduce a scalable, differentiable fingerprinting mechanism that retains essential policy information in a concise embedding. Our empirical results demonstrate that combining these three elements (learned Policy Evaluation Network, policy fingerprints, gradient ascent) can produce policies that outperform those that generated the training data, in zero-shot manner.
Solving ODE with Universal Flows: Approximation Theory for Flow-Based Models
Normalizing flows are powerful invertible probabilistic models that can be used to translate two probability distributions, in a way that al… (see more)lows us to efficiently track the change of probability density. However, to trade for computational efficiency in sampling and in evaluating the log-density, special parameterization designs have been proposed at the cost of representational expressiveness. In this work, we propose to use ODEs as a framework to establish universal approximation theory for certain families of flow-based models.
Analysing brain networks in population neuroscience: a case for the Bayesian philosophy
Dorothea L. Floris
Andre F. Marquand
Network connectivity fingerprints are among today's best choices to obtain a faithful sampling of an individual's brain and cognition. Widel… (see more)y available MRI scanners can provide rich information tapping into network recruitment and reconfiguration that now scales to hundreds and thousands of humans. Here, we contemplate the advantages of analysing such connectome profiles using Bayesian strategies. These analysis techniques afford full probability estimates of the studied network coupling phenomena, provide analytical machinery to separate epistemological uncertainty and biological variability in a coherent manner, usher us towards avenues to go beyond binary statements on existence versus non-existence of an effect, and afford credibility estimates around all model parameters at play which thus enable single-subject predictions with rigorous uncertainty intervals. We illustrate the brittle boundary between healthy and diseased brain circuits by autism spectrum disorder as a recurring theme where, we argue, network-based approaches in neuroscience will require careful probabilistic answers. This article is part of the theme issue ‘Unifying the essential concepts of biological networks: biological insights and philosophical foundations’.
Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning
Huan Wang
Caiming Xiong
Richard Socher
oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions
David Venuto
Léonard Boussioux
Junhao Wang
Explicit engineering of reward functions for given environments has been a major hindrance to reinforcement learning methods. While Inverse … (see more)Reinforcement Learning (IRL) is a solution to recover reward functions from demonstrations only, these learned rewards are generally heavily \textit{entangled} with the dynamics of the environment and therefore not portable or \emph{robust} to changing environments. Modern adversarial methods have yielded some success in reducing reward entanglement in the IRL setting. In this work, we leverage one such method, Adversarial Inverse Reinforcement Learning (AIRL), to propose an algorithm that learns hierarchical disentangled rewards with a policy over options. We show that this method has the ability to learn \emph{generalizable} policies and reward functions in complex transfer learning tasks, while yielding results in continuous control benchmarks that are comparable to those of the state-of-the-art methods.