Publications

Is a Modular Architecture Enough?

Inspired by human cognition, machine learning systems are gradually revealing advantages of sparser and more modular architectures. Recent work demonstrates that not only do some modular architectures generalize well, but they also lead to better out-of-distribution generalization, scaling properties, learning speed, and interpretability. A key intuition behind the success of such systems is that the data generating system for most real-world settings is considered to consist of sparsely interacting parts, and endowing models with similar inductive biases will be helpful. However, the field has been lacking in a rigorous quantitative assessment of such systems because these real-world data distributions are complex and unknown. In this work, we provide a thorough assessment of common modular architectures, through the lens of simple and known modular data distributions. We highlight the benefits of modularity and sparsity and reveal insights on the challenges faced while optimizing modular systems. In doing so, we propose evaluation metrics that highlight the benefits of modularity, the regimes in which these benefits are substantial, as well as the sub-optimality of current end-to-end learned modular systems as opposed to their claimed potential.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

Multilingual Language Model Adaptive Fine-Tuning: A Study on African Languages

Jesujoba Oluwadara Alabi

David Ifeoluwa Adelani

Marius Mosbach

Dietrich Klakow

and XLM-R) and three NLP tasks (NER, news topic classification, and sentiment classification) shows that our approach is competitive with applying LAFT on individual languages while requiring significantly less disk space. Finally, we show that our adapted PLM also improves the zero-shot cross-lingual transfer abilities of parameter-efficient fine-tuning methods.

2021-12-31

arXiv.org (preprint)

doi.org

NeoRS: a neonatal resting state fMRI data preprocessing pipeline

V. Enguix

J. Kenley

D. Luck

J. Cohen-Adad

G.A. Lodygensky

Resting state fMRI (rsfMRI) has been shown to be a promising tool to study intrinsic functional connectivity and assess its integrity in cerebral development. In neonates, where fMRI is limited to few paradigms, rsfMRI was shown to be a relevant tool to explore regional interactions of brain networks. However, to identify the resting state networks, data needs to be carefully processed. Because of the non-collaborative nature of the neonates, the differences in brain size and the reversed contrast compared to adults, neonatal data cannot be processed with the existing adult pipelines. Therefore, we developed NeoRS. The main processing steps include atlas registration, skull stripping, segmentation, slice timing and head motion correction, and confounds regression. To address the specificity of neonatal brain imaging, particular attention was given to registration including neonatal atlas type and parameters, such as brain size variations, and contrast differences compared to adults. Furthermore, head motion was scrutinized and optimized, as it is a major issue when processing neonatal data. The pipeline includes visual quality control assessment checkpoints. To assess its effectiveness, we used the data from the Baby Connectome Project including 10 neonates. NeoRS was designed to work on both multi-band and single-band acquisitions and is applicable to smaller datasets. It also includes popular functional connectivity analysis features such as seed-based correlations. Language, default mode, dorsal attention, visual, ventral attention, motor and frontoparietal networks were evaluated. The different analyzed networks were in agreement with previously published studies in the neonate. NeoRS is coded in MATLAB, is open-source, and is available at https://github.com/venguix/NeoRS. NeoRS allows robust image processing of the neonatal rsfMRI data that can be readily customized to different datasets.

2021-12-31

Frontiers in Neuroinformatics (published)

doi.org

arxiv.org

Neural Attentive Circuits

Nasim Rahaman

Martin Weiss

Francesco Locatello

Chris Pal

Yoshua Bengio

Bernhard Schölkopf

Li Erran Li

Nicolas Ballas

Recent work has seen the development of general purpose neural architectures that can be trained to perform tasks across diverse data modalities. General purpose models typically make few assumptions about the underlying data-structure and are known to perform well in the large-data regime. At the same time, there has been growing interest in modular neural architectures that represent the data using sparsely interacting modules. These models can be more robust out-of-distribution, computationally efficient, and capable of sample-efficient adaptation to new data. However, they tend to make domain-specific assumptions about the data, and present challenges in how module behavior (i.e., parameterization) and connectivity (i.e., their layout) can be jointly learned. In this work, we introduce a general purpose, yet modular neural architecture called Neural Attentive Circuits (NACs) that jointly learns the parameterization and a sparse connectivity of neural modules without using domain knowledge. NACs are best understood as the combination of two systems that are jointly trained end-to-end: one that determines the module configuration and the other that executes it on an input. We demonstrate qualitatively that NACs learn diverse and meaningful module configurations on the NLVR2 dataset without additional supervision. Quantitatively, we show that by incorporating modularity in this way, NACs improve upon a strong non-modular baseline in terms of low-shot adaptation on CIFAR and CUBs dataset by about 10%, and OOD robustness on Tiny ImageNet-R by about 2.5%. Further, we find that NACs can achieve an 8x speedup at inference time while losing less than 3% performance. Finally, we find NACs to yield competitive results on diverse data modalities spanning point-cloud classification, symbolic processing and text-classification from ASCII bytes, thereby confirming its general purpose nature.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

Optimizing deep learning for Magnetoencephalography (MEG): From sensory perception to sex prediction and brain fingerprinting

Arthur Dehgan

Irina Rish

Karim Jerbi

2021-12-31

2022 Conference on Cognitive Computational Neuroscience (published)

doi.org

Orientation and Context Entangled Network for Retinal Vessel Segmentation

Xinxu Wei

Kaifu Yang

Danilo Bzdok

Yongjie Li

Most of the existing deep learning based methods for vessel segmentation neglect two important aspects of retinal vessels: one is the orientation information of vessels, and the other is the contextual information of the whole fundus region. In this paper, we propose a robust Orientation and Context Entangled Network (denoted as OCE-Net), which has the capability of extracting complex orientation and context information of the blood vessels. To achieve complex orientation awareness, a Dynamic Complex Orientation Aware Convolution (DCOA Conv) is proposed to extract complex vessels with multiple orientations for improving the vessel continuity. To simultaneously capture the global context information and emphasize the important local information, a Global and Local Fusion Module (GLFM) is developed to simultaneously model the long-range dependency of vessels and focus sufficient attention on local thin vessels. A novel Orientation and Context Entangled Non-local (OCE-NL) module is proposed to entangle the orientation and context information together. In addition, an Unbalanced Attention Refining Module (UARM) is proposed to deal with the unbalanced pixel numbers of background, thick and thin vessels. Extensive experiments were performed on several commonly used datasets (DRIVE, STARE and CHASEDB1) and some more challenging datasets (AV-WIDE, UoA-DR, RFMiD and UK Biobank). The ablation study shows that the proposed method achieves promising performance on maintaining the continuity of thin vessels and the comparative experiments demonstrate that our OCE-Net can achieve state-of-the-art performance on retinal vessel segmentation.

2021-12-31

SSRN Electronic Journal (unknown)

doi.org

arxiv.org

Overcoming Challenges in Leveraging GANs for Few-Shot Data Augmentation

Pau Rodríguez

Christopher Pal

In this paper, we explore the use of GAN-based few-shot data augmentation as a method to improve few-shot classification performance. We perform an exploration into how a GAN can be fine-tuned for such a task (one of which is in a class-incremental manner), as well as a rigorous empirical investigation into how well these models can perform to improve few-shot classification. We identify issues related to the difficulty of training such generative models under a purely supervised regime with very few examples, as well as issues regarding the evaluation protocols of existing works. We also find that in this regime, classification accuracy is highly sensitive to how the classes of the dataset are randomly split. Therefore, we propose a semi-supervised fine-tuning approach as a more pragmatic way forward to address these problems.

2021-12-31

CoLLAs (published)

doi.org

proceedings.mlr.press

PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding

We are now witnessing significant progress of deep learning methods in a variety of tasks (or datasets) of proteins. However, there is a lack of a standard benchmark to evaluate the performance of different methods, which hinders the progress of deep learning in this field. In this paper, we propose such a benchmark called PEER, a comprehensive and multi-task benchmark for Protein sEquence undERstanding. PEER provides a set of diverse protein understanding tasks including protein function prediction, protein localization prediction, protein structure prediction, protein-protein interaction prediction, and protein-ligand interaction prediction. We evaluate different types of sequence-based methods for each task including traditional feature engineering approaches, different sequence encoding methods as well as large-scale pre-trained protein language models. In addition, we also investigate the performance of these methods under the multi-task learning setting. Experimental results show that large-scale pre-trained protein language models achieve the best performance for most individual tasks, and jointly training multiple tasks further boosts the performance. The datasets and source codes of this benchmark are all available at https://github.com/DeepGraphLearning/PEER_Benchmark

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

Peer-to-Peer Energy Trading and Energy Conversion in Interconnected Multi-Energy Microgrids Using Multi-Agent Deep Reinforcement Learning

Tianyi Chen

Shengrong Bu

Xue Liu

Jikun Kang

F. Richard Yu

Zhu Han

A key aspect of multi-energy microgrids (MEMGs) is the capability to efficiently convert and store energy in order to reduce the costs and environmental impact. Peer-to-peer (P2P) energy trading is a novel paradigm for decentralised energy market designs. In this paper, we investigate the external P2P energy trading problem and internal energy conversion problem within interconnected residential, commercial and industrial MEMGs. These two problems are complex decision-making problems with enormous high-dimensional data and uncertainty, so a multi-agent deep reinforcement learning approach combining the multi-agent actor-critic algorithm with the twin delayed deep deterministic policy gradient algorithm is proposed. The proposed approach can handle the high-dimensional continuous action space and aligns with the nature of P2P energy trading with multiple MEMGs. Simulation results based on three real-world MG datasets show that the proposed approach significantly reduces each MG's average hourly operation cost. The impact of carbon tax pricing is also considered.

2021-12-31

IEEE Transactions on Smart Grid (published)

PRACTICAL GUIDE

Paolo Bellavista

2021-12-31

(published)

www.semanticscholar.org

Privacy-aware compression for federated data analysis

Kamalika Chaudhuri

Chuan Guo

Michael G. Rabbat

Federated data analytics is a framework for distributed data analysis where a server compiles noisy responses from a group of distributed low-bandwidth user devices to estimate aggregate statistics. Two major challenges in this framework are privacy, since user data is often sensitive, and compression, since the user devices have low network bandwidth. Prior work has addressed these challenges separately by combining standard compression algorithms with known privacy mechanisms. In this work, we take a holistic look at the problem and design a family of privacy-aware compression mechanisms that work for any given communication budget. We first propose a mechanism for transmitting a single real number that has optimal variance under certain conditions. We then show how to extend it to metric differential privacy for location privacy use-cases, as well as vectors, for application to federated learning. Our experiments illustrate that our mechanism can lead to better utility vs. compression trade-offs for the same privacy loss in a number of settings.

2021-12-31

UAI (published)

doi.org

proceedings.mlr.press

(Private)-Retroactive Carbon Pricing [(P)ReCaP]: A Market-based Approach for Climate Finance and Risk Assessment

Yoshua Bengio

Prateek Gupta

Dylan Radovic

Maarten Scholl

Andrew Williams

Christian Schroeder de Witt

Tianyu Zhang

Yang Zhang

Insufficient Social Cost of Carbon (SCC) estimation methods and short-term decision-making horizons have hindered the ability of carbon emitters to properly correct for the negative externalities of climate change, as well as the capacity of nations to balance economic and climate policy. To overcome these limitations, we introduce Retrospective Social Cost of Carbon Updating (ReSCCU), a novel mechanism that corrects for these limitations as empirically measured evidence is collected. To implement ReSCCU in the context of carbon taxation, we propose Retroactive Carbon Pricing (ReCaP), a market mechanism in which polluters offload the payment of ReSCCU adjustments to insurers. To alleviate systematic risks and minimize government involvement, we introduce the Private ReCaP (PReCaP) prediction market, which could see real-world implementation based on the engagement of a few high net-worth individuals or independent institutions.

2021-12-31

arXiv (preprint)

doi.org

arxiv.org
