Publications

Global fMRI signal topography differs systematically across the lifespan

Jason S. Nomi

Danilo Bzdok

Jingwei Li

Taylor Bolt

Catie Chang

Salome Kornfeld

Zachary T. Goodman

B.T. Thomas Yeo

R. Nathan Spreng

Lucina Q. Uddin

The global signal (GS) in resting-state fMRI, known to contain artifacts and non-neuronal physiological signals, also contains important neural information related to individual state and trait characteristics. Here we show distinct linear and curvilinear lifespan patterns of GS topography in a cross-sectional lifespan sample, demonstrating its importance for consideration in studies of development and aging. Subcortical brain regions such as the thalamus and putamen show linear associations with the GS across the lifespan. The thalamus has stronger coupling in older-age individuals compared with younger-aged individuals, while the putamen has stronger coupling in younger individuals compared with older individuals. The subcortical nucleus basalis shows a u-shaped pattern similar to cortical regions within the lateral frontoparietal network and dorsal attention network, where coupling with the GS is stronger at early and old age, with weaker coupling in middle age. This differentiation in coupling strength between subcortical and cortical brain activity across the lifespan supports a dual-layer model of GS composition, where subcortical aspects of the GS are differentiated from cortical aspects of the GS. We find that these subcortical-cortical contributions to the GS depend strongly on the lifespan stage of individuals. Our findings demonstrate how neurobiological information within the GS differs across development and highlight the need to carefully consider whether or not to remove this signal when investigating age-related functional differences in the brain.

2022-07-26

bioRxiv (preprint)

doi.org

H4rm0ny: A Competitive Zero-Sum Two-Player Markov Game for Multi-Agent Learning on Evasive Malware Generation and Detection

Christopher Molloy

Steven H. H. Ding

Benjamin C. M. Fung

Philippe Charland

To combat the increasingly versatile and mutable modern malware, Machine Learning (ML) is now a popular and effective complement to the existing signature-based techniques for malware triage and identification. However, ML is also a readily available tool for adversaries. Recent studies have shown that malware can be modified by deep Reinforcement Learning (RL) techniques to bypass AI-based and signature-based anti-virus systems without altering their original malicious functionalities. These studies only focus on generating evasive samples and assume a static detection system as the enemy. Malware detection and evasion essentially form a two-party cat-and-mouse game. Simulating the real-life scenarios, in this paper we present the first two-player competitive game for evasive malware detection and generation, following the zero-sum Multi-Agent Reinforcement Learning (MARL) paradigm. Our experiments on recent malware show that the produced malware detection agent is more robust against adversarial attacks. Also, the produced malware modification agent is able to generate more evasive samples fooling both AI-based and other anti-malware techniques.

2022-07-26

Computer Science Symposium in Russia (published)

doi.org

Implications of Topological Imbalance for Representation Learning on Biomedical Knowledge Graphs

Stephen Bonner

Ufuk Kirik

Ola Engkvist

Jian Tang

Ian P Barrett

Adoption of recently developed methods from machine learning has given rise to the creation of drug-discovery knowledge graphs (KG) that utilize the interconnected nature of the domain. Graph-based modelling of the data, combined with KG embedding (KGE) methods, is promising as it provides a more intuitive representation and is suitable for inference tasks such as predicting missing links. One common application is to produce ranked lists of genes for a given disease, where the rank is based on the perceived likelihood of association between the gene and the disease. It is thus critical that these predictions are not only pertinent but also biologically meaningful. However, KGs can be biased either directly due to the underlying data sources that are integrated or due to modeling choices in the construction of the graph, one consequence of which is that certain entities can get topologically overrepresented. We demonstrate the effect of these inherent structural imbalances, resulting in densely-connected entities being highly ranked no matter the context. We provide support for this observation across different datasets, models as well as predictive tasks. Further, we present various graph perturbation experiments which yield more support to the observation that KGE models can be more influenced by the frequency of entities rather than any biological information encoded within the relations. Our results highlight the importance of data modeling choices, and emphasize the need for practitioners to be mindful of these issues when interpreting model outputs and during KG composition.

2022-07-25

Briefings in Bioinformatics (unknown)

doi.org

arxiv.org

On the Expressivity of Markov Reward (Extended Abstract)

David Abel

Will Dabney

Anna Harutyunyan

Mark K. Ho

Michael L. Littman

Doina Precup

Satinder Singh

2022-07-22

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (published)

doi.org

Flaky Performances when Pre-Training on Relational Databases with a Plan for Future Characterization Efforts

Shengchao Liu

David Vázquez

Jian Tang

Pierre-Andre Noel

We explore the downstream task performances for graph neural network (GNN) self-supervised learning (SSL) methods trained on subgraphs extracted from relational databases (RDBs). Intuitively, this joint use of SSL and GNNs allows us to leverage more of the available data, which could translate to better results. However, while we observe positive transfer in some cases, others showed systematic performance degradation, including some spectacular ones. We hypothesize a mechanism that could explain this behaviour and draft the plan for future work testing it by characterizing how much relevant information different strategies can (theoretically and/or empirically) extract from (synthetic and/or real) RDBs.

2022-07-21

ICML.cc/2022/Workshop/Pre-Training (unknown)

openreview.net

Revisiting Hotels-50K and Hotel-ID

Aarash Feizi

Arantxa Casanova

Adriana Romero

Reihaneh Rabbany

In this paper, we propose revisited versions for two recent hotel recognition datasets: Hotels-50K and Hotel-ID. The revisited versions provide evaluation setups with different levels of difficulty to better align with the intended real-world application, i.e. countering human trafficking. Real-world scenarios involve hotels and locations that are not captured in the current data sets, therefore it is important to consider evaluation settings where classes are truly unseen. We test this setup using multiple state-of-the-art image retrieval models and show that as expected, the models’ performances decrease as the evaluation gets closer to the real-world unseen settings. The rankings of the best performing models also change across the different evaluation settings, which further motivates using the proposed revisited datasets.

2022-07-19

ArXiv (preprint)

doi.org

arxiv.org

On the Generalization and Adaption Performance of Causal Models

Nan Rosemary Ke

2022-07-19

ICML.cc/2022/Workshop/SCIS (poster)

doi.org

openreview.net

Characterizing User Behaviors in Open-Source Software User Forums: An Empirical Study

Jazlyn Hellman

Jiahao Chen

Md. Sami Uddin

Jinghui Cheng

Jin L.C. Guo

User forums of Open Source Software (OSS) enable end-users to collaboratively discuss problems concerning the OSS applications. Despite decades of research on OSS, we know very little about how end-users engage with OSS communities on these forums, in particular, the challenges that hinder their continuous and meaningful participation in the OSS community. Many previous works are developer-centric and overlook the importance of end-user forums. As a result, end-users' expectations are seldom reflected in OSS development. To better understand user behaviors in OSS user forums, we carried out an empirical study analyzing about 1.3 million posts from user forums of four popular OSS applications: Zotero, Audacity, VLC, and RStudio. Through analyzing the contribution patterns of three common user types (end-users, developers, and organizers), we observed that end-users not only initiated most of the threads (above 96% of threads in three projects, 86% in the other), but also acted as the significant contributors for responding to other users' posts, even though they tended to lack confidence in their activities as indicated by psycho-linguistic analyses. Moreover, we found end-users more open, reflecting a more positive emotion in communication than organizers and developers in the forums. Our work contributes new knowledge about end-users' activities and behaviors in OSS user forums that the vital OSS stakeholders can leverage to improve end-user engagement in the OSS development process.

2022-07-18

Proceedings of the 15th International Conference on Cooperative and Human Aspects of Software Engineering (published)

doi.org

arxiv.org

Generative Coarse-Graining of Molecular Conformations

Wujie Wang

Minkai Xu

Chen Cai

Benjamin K Miller

Tess Smidt

Yusu Wang

Jian Tang

Rafael Gomez-Bombarelli

Coarse-graining (CG) of molecular simulations simplifies the particle representation by grouping selected atoms into pseudo-beads and drastically accelerates simulation. However, such a CG procedure induces information losses, which makes accurate backmapping, i.e., restoring fine-grained (FG) coordinates from CG coordinates, a long-standing challenge. Inspired by the recent progress in generative models and equivariant networks, we propose a novel model that rigorously embeds the vital probabilistic nature and geometric consistency requirements of the backmapping transformation. Our model encodes the FG uncertainties into an invariant latent space and decodes them back to FG geometries via equivariant convolutions. To standardize the evaluation of this domain, we provide three comprehensive benchmarks based on molecular dynamics trajectories. Experiments show that our approach always recovers more realistic structures and outperforms existing data-driven methods with a significant margin.

2022-07-18

ICML (Accept for Short Presentation)

doi.org

proceedings.mlr.press

IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages

Emanuele Bugliarello

Fangyu Liu

Jonas Pfeiffer

Reliable evaluation benchmarks designed for replicability and comprehensiveness have driven progress in machine learning. Due to the lack of a multilingual benchmark, however, vision-and-language research has mostly focused on English language tasks. To fill this gap, we introduce the Image-Grounded Language Understanding Evaluation benchmark. IGLUE brings together - by both aggregating pre-existing datasets and creating new ones - visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages. Our benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting, but also in newly defined few-shot learning setups. Based on the evaluation of the available state-of-the-art models, we find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks. Moreover, downstream performance is partially explained by the amount of available unlabelled textual data for pretraining, and only weakly by the typological distance of target-source languages. We hope to encourage future research efforts in this area by releasing the benchmark to the community.

2022-07-18

ICML (Accept for Short Presentation)

doi.org

proceedings.mlr.press

Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

William Harvey

Michael Teng

Frank N. Wood

Hard visual attention is a promising approach to reduce the computational burden of modern computer vision methodologies. However, hard attention mechanisms can be difficult and slow to train, which is especially costly for applications like neural architecture search where multiple networks must be trained. We introduce a method to amortise the cost of training by generating an extra supervision signal for a subset of the training data. This supervision is in the form of sequences of ‘good’ locations to attend to for each image. We find that the best method to generate supervision sequences comes from framing hard attention for image classification as a Bayesian optimal experimental design (BOED) problem. From this perspective, the optimal locations to attend to are those which provide the greatest expected reduction in the entropy of the classification distribution. We introduce methodology from the BOED literature to approximate this optimal behaviour and generate ‘near-optimal’ supervision sequences. We then present a hard attention network training objective that makes use of these sequences and show that it allows faster training than prior work. We finally demonstrate the utility of faster hard attention training by incorporating supervision sequences in a neural architecture search, resulting in hard attention architectures which can outperform networks with access to the entire image.

2022-07-17

2022 International Joint Conference on Neural Networks (IJCNN) (published)

doi.org

arxiv.org

On the Effectiveness of Interpretable Feedforward Neural Network

Miles Q. Li

Benjamin C. M. Fung

Adel Abusitta

Deep learning models have achieved state-of-the-art performance in many classification tasks. However, most of them cannot provide an explanation for their classification results. Machine learning models that are interpretable are usually linear or piecewise linear and yield inferior performance. Non-linear models achieve much better classification performance, but it is usually hard to explain their classification results. As a counter-example, an interpretable feedforward neural network (IFFNN) is proposed to achieve both high classification performance and interpretability for malware detection. If the IFFNN can perform well in a more flexible and general form for other classification tasks while providing meaningful explanations, it may be of great interest to the applied machine learning community. In this paper, we propose a way to generalize the interpretable feedforward neural network to multi-class classification scenarios and any type of feedforward neural networks, and evaluate its classification performance and interpretability on interpretable datasets. We conclude by finding that the generalized IFFNNs achieve comparable classification performance to their normal feedforward neural network counterparts and provide meaningful explanations. Thus, this kind of neural network architecture has great practical use.

2022-07-17

2022 International Joint Conference on Neural Networks (IJCNN) (published)

doi.org

arxiv.org
