Publications

Universal Equivariant Multilayer Perceptrons

Group invariant and equivariant Multilayer Perceptrons (MLP), also known as Equivariant Networks, have achieved remarkable success in learning on a variety of data structures, such as sequences, images, sets, and graphs. Using tools from group theory, this paper proves the universality of a broad class of equivariant MLPs with a single hidden layer. In particular, it is shown that having a hidden layer on which the group acts regularly is sufficient for universal equivariance (invariance). A corollary is unconditional universality of equivariant MLPs for Abelian groups, such as CNNs with a single hidden layer. A second corollary is the universality of equivariant MLPs with a high-order hidden layer, where we give both group-agnostic bounds and means for calculating group-specific bounds on the order of the hidden layer that guarantees universal equivariance (invariance).
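A small numerical check may make the equivariance property concrete. The sketch below is not taken from the paper: it assumes a 1D signal acted on by the cyclic group of shifts (an Abelian group), and the helper names `circular_conv` and `layer` are hypothetical. It verifies that a circular convolution followed by a pointwise ReLU commutes with shifting the input.

```python
import numpy as np

def circular_conv(x, w):
    """Circular (periodic) convolution of signal x with filter w."""
    n = len(x)
    return np.array([
        sum(w[k] * x[(i - k) % n] for k in range(len(w)))
        for i in range(n)
    ])

def layer(x, w):
    """One hidden layer: circular convolution followed by a pointwise ReLU."""
    return np.maximum(circular_conv(x, w), 0.0)

rng = np.random.default_rng(0)
x = rng.normal(size=8)   # input signal (illustrative data)
w = rng.normal(size=3)   # filter weights (illustrative data)
shift = 3                # group element: cyclic shift by 3 positions

shift_then_layer = layer(np.roll(x, shift), w)
layer_then_shift = np.roll(layer(x, w), shift)

# Equivariance: acting on the input equals acting on the output.
print(np.allclose(shift_then_layer, layer_then_shift))
```

The check only illustrates what equivariance means for one layer; it says nothing about the universality result itself.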

2020-11-20

Proceedings of the 37th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

What can I do here? A Theory of Affordances in Reinforcement Learning

Khimya Khetarpal

Zafarali Ahmed

Gheorghe Comanici

David Abel

Doina Precup

Reinforcement learning algorithms usually assume that all actions are always available to an agent. However, both people and animals understand the general link between the features of their environment and the actions that are feasible. Gibson (1977) coined the term "affordances" to describe the fact that certain states enable an agent to do certain actions, in the context of embodied agents. In this paper, we develop a theory of affordances for agents who learn and plan in Markov Decision Processes. Affordances play a dual role in this case. On one hand, they allow faster planning, by reducing the number of actions available in any given situation. On the other hand, they facilitate more efficient and precise learning of transition models from data, especially when such models require function approximation. We establish these properties through theoretical results as well as illustrative examples. We also propose an approach to learn affordances and use it to estimate transition models that are simpler and generalize better.
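The planning benefit described above can be illustrated with a minimal sketch. It is not the paper's formal definition of affordances: it simply assumes a boolean state-action mask (here called `afford`, a hypothetical name) and runs value iteration over the afforded actions only, on a made-up three-state chain.

```python
import numpy as np

def value_iteration(P, R, afford, gamma=0.9, iters=200):
    """Value iteration restricted to afforded actions.

    P: (S, A, S) transition probabilities, R: (S, A) rewards,
    afford: (S, A) boolean mask; each state needs at least one True entry.
    """
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)            # (S, A) action values
        Q = np.where(afford, Q, -np.inf)   # drop actions that are not afforded
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)

# Toy chain with states 0, 1, 2; actions 0 = left, 1 = right; state 2 is absorbing.
S, A = 3, 2
P = np.zeros((S, A, S))
for s in range(S):
    P[s, 0, max(s - 1, 0)] = 1.0
    P[s, 1, min(s + 1, S - 1)] = 1.0
P[2] = 0.0
P[2, :, 2] = 1.0
R = np.zeros((S, A))
R[1, 1] = 1.0                      # reward for stepping from state 1 into state 2

afford = np.ones((S, A), dtype=bool)
afford[0, 0] = False               # assumed: "left" is not afforded at the wall

V, policy = value_iteration(P, R, afford)
print(V, policy)
```

Masking shrinks the maximization at each state, which is the sense in which fewer available actions can make planning cheaper.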

2020-11-20

Proceedings of the 37th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

An Effective Anti-Aliasing Approach for Residual Networks

Cristina Vasconcelos

Hugo Larochelle

Vincent Dumoulin

Nicolas Roux

Ross Goroshin

Image pre-processing in the frequency domain has traditionally played a vital role in computer vision and was even part of the standard pipeline in the early days of deep learning. However, with the advent of large datasets, many practitioners concluded that this was unnecessary due to the belief that these priors can be learned from the data itself. Frequency aliasing is a phenomenon that may occur when sub-sampling any signal, such as an image or feature map, causing distortion in the sub-sampled output. We show that we can mitigate this effect by placing non-trainable blur filters and using smooth activation functions at key locations, particularly where networks lack the capacity to learn them. These simple architectural changes lead to substantial improvements in out-of-distribution generalization on both image classification under natural corruptions on ImageNet-C [10] and few-shot learning on Meta-Dataset [17], without introducing additional trainable parameters and using the default hyper-parameters of open source codebases.
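The aliasing effect and the role of a fixed blur can be seen in a few lines. This is a 1D sketch under stated assumptions, not the architecture from the paper: the binomial kernel `[1, 2, 1] / 4`, the reflect padding, and the function names are illustrative choices.

```python
import numpy as np

def naive_subsample(signal, stride=2):
    """Keep every `stride`-th sample with no filtering."""
    return signal[::stride]

def blur_then_subsample(signal, stride=2):
    """Apply a fixed (non-trainable) low-pass filter, then subsample."""
    kernel = np.array([1.0, 2.0, 1.0]) / 4.0      # assumed binomial blur
    padded = np.pad(signal, 1, mode="reflect")    # keep the output length unchanged
    blurred = np.convolve(padded, kernel, mode="valid")
    return blurred[::stride]

# Highest-frequency input: alternating +1 / -1.
x = np.array([1.0, -1.0] * 8)

naive = naive_subsample(x)         # every kept sample is +1: looks like a constant signal
smooth = blur_then_subsample(x)    # the blur removes the frequency that cannot be represented

print(naive)
print(smooth)
```

Without the blur, the oscillation is aliased into a constant; with it, the sub-sampled output is zero, which matches the mean of the input.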

2020-11-19

ArXiv (preprint)

arxiv.org

Multiscale PHATE Exploration of SARS-CoV-2 Data Reveals Multimodal Signatures of Disease

Manik Kuchroo

Jessie Huang

Patrick Wong

Jean-Christophe Grenier

Dennis Shung

Alexander Tong

Carolina Lucas

Jon Klein

Daniel B. Burkhardt

Scott Gigante

Abhinav Godavarthi

Benjamin Israelow

Tianyang Mao

Ji Eun Oh

Julio Silva

Takehiro Takahashi

Camila D. Odio

Arnau Casanovas-Massana

John Fournier

Shelli Farhadian

Charles S. Dela Cruz

Albert I. Ko

F. Perry Wilson

Julie Hussin

Guy Wolf

Akiko Iwasaki

Smita Krishnaswamy

Abstract

The biomedical community is producing increasingly high dimensional datasets, integrated from hundreds of patient samples, which current computational techniques struggle to explore. To uncover biological meaning from these complex datasets, we present an approach called Multiscale PHATE, which learns abstracted biological features from data that can be directly predictive of disease. Built on a coarse graining process called diffusion condensation, Multiscale PHATE learns a data topology that can be analyzed at coarse levels for high level summarizations of data, as well as at fine levels for detailed representations on subsets. We apply Multiscale PHATE to study the immune response to COVID-19 in 54 million cells from 168 hospitalized patients. Through our analysis of patient samples, we identify CD16-hi,CD66b-lo neutrophil and IFNγ+,GranzymeB+ Th17 cell responses enriched in patients who die. Furthermore, we show that population groupings Multiscale PHATE discovers can be directly fed into a classifier to predict disease outcome. We also use Multiscale PHATE-derived features to construct two different manifolds of patients, one from abstracted flow cytometry features and another directly on patient clinical features, both associating immune subsets and clinical markers with outcome.

2020-11-16

bioRxiv (preprint)

doi.org

Using Open Source Licensing to Regulate the Assembly of LAWS: A Preliminary Analysis

Cheng Lin

AJung Moon

Lethal autonomous weapons (LAWS) are an emerging technology capable of automatically targeting and exercising lethal force. Many scholars and advocates have petitioned to ban the technology internationally for a myriad of reasons. However, there are practical challenges to implementing a ban. One such challenge is posed by the “intangible” nature of the software that LAWS depend on, which is incompatible with implementation mechanisms such as export control. Given the dual-use nature of software, and the fact that software is developed by teams of individuals, a number of soft governance mechanisms have been proposed to regulate this technology. In this paper, we investigate the feasibility of one particular approach: leveraging open source licenses as a means to prohibit the use of certain software in LAWS. This approach is largely motivated by the fact that open source software underpins all of technology, especially AI. Through a review of recent tech activism and open source activism, we evaluate whether open source licenses can feasibly limit the use of open source software to only non-LAWS applications. We distill the current challenges facing “ethics-driven” open source licensing efforts into three main obstacles: the need for clarity of licensing language, the lack of enforceability of licenses, and the lack of cohesiveness of the open source community. We propose that addressing these factors is also a success criterion for future anti-LAWS open source initiatives. We find that open source licenses provide more theoretical than practical promise in regulating LAWS, and conclude that cohesion in the open source community is the key to their potential practical success in the future.

2020-11-11

2020 IEEE International Symposium on Technology and Society (ISTAS) (published)

doi.org

Global Surveillance of COVID-19 by mining news media using a multi-source dynamic embedded topic model.

Yue Li

Pratheeksha Nair

Zhi Wen

Imane Chafi

Anya Okhmatovskaia

Guido Powell

Yannan Shen

David L. Buckeridge

As the COVID-19 pandemic continues to unfold, understanding the global impact of non-pharmacological interventions (NPI) is important for formulating effective intervention strategies, particularly as many countries prepare for future waves. We used a machine learning approach to distill latent topics related to NPI from large-scale international news media. We hypothesize that these topics are informative about the timing and nature of implemented NPI, dependent on the source of the information (e.g., local news versus official government announcements) and the target countries. Given a set of latent topics associated with NPI (e.g., self-quarantine, social distancing, online education, etc.), we assume that countries and media sources have different prior distributions over these topics, which are sampled to generate the news articles. To model the source-specific topic priors, we developed a semi-supervised, multi-source, dynamic, embedded topic model. Our model is able to simultaneously infer latent topics and learn a linear classifier to predict NPI labels using the topic mixtures as input for each news article. To learn these models, we developed an efficient end-to-end amortized variational inference algorithm. We applied our models to news data collected and labelled by the World Health Organization (WHO) and the Global Public Health Intelligence Network (GPHIN). Through comprehensive experiments, we observed superior topic quality and intervention prediction accuracy, compared to the baseline embedded topic models, which ignore information on media source and intervention labels. The inferred latent topics reveal distinct policies and media framing in different countries and media sources, and also characterize reaction to COVID-19 and NPI in a semantically meaningful manner. Our PyTorch code is available on GitHub (https://github.com/li-lab-mcgill/covid19_media).

2020-11-09

Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (published)

doi.org

On Posterior Collapse and Encoder Feature Dispersion in Sequence VAEs.

Teng Long

Yanshuai Cao

Jackie CK Cheung

Variational autoencoders (VAEs) hold great potential for modelling text, as they could in theory separate high-level semantic and syntactic properties from local regularities of natural language. Practically, however, VAEs with autoregressive decoders often suffer from posterior collapse, a phenomenon where the model learns to ignore the latent variables, causing the sequence VAE to degenerate into a language model. In this paper, we argue that posterior collapse is in part caused by the lack of dispersion in encoder features. We provide empirical evidence to verify this hypothesis, and propose a straightforward fix using pooling. This simple technique effectively prevents posterior collapse, allowing the model to achieve significantly better data log-likelihood than standard sequence VAEs. Compared to existing work, our proposed method is able to achieve comparable or superior performance while being more computationally efficient.

2020-11-09

(published)

www.semanticscholar.org

Approximate Planning and Learning for Partially Observed Systems

Aditya Mahajan

2020-11-08

International Conference of Control, Dynamic Systems, and Robotics (published)

doi.org

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning

Timo Milbich

Samarth Sinha

Björn Ommer

2020-11-06

Computer Vision – ECCV 2020 (published)

doi.org

arxiv.org

Effectiveness of quarantine and testing to prevent COVID-19 transmission from arriving travelers

Russell Wa

David L Buckeridge

2020-11-03

medRxiv (preprint)

doi.org

Explainability and Interpretability: Keys to Deep Medicine

Arash Shaban-Nejad

Martin Michalowski

David L Buckeridge

2020-11-02

Explainable AI in Healthcare and Medicine (published)

doi.org

A Study of Policy Gradient on a Class of Exactly Solvable Models

Gavin McCracken

Colin Daniels

Rosie Zhao

Anna M. Brandenberger

Prakash Panangaden

Doina Precup

Policy gradient methods are extensively used in reinforcement learning as a way to optimize expected return. In this paper, we explore the evolution of the policy parameters, for a special class of exactly solvable POMDPs, as a continuous-state Markov chain, whose transition probabilities are determined by the gradient of the distribution of the policy's value. Our approach relies heavily on random walk theory, specifically on affine Weyl groups. We construct a class of novel partially observable environments with controllable exploration difficulty, in which the value distribution, and hence the policy parameter evolution, can be derived analytically. Using these environments, we analyze the probabilistic convergence of policy gradient to different local maxima of the value function. To our knowledge, this is the first approach developed to analytically compute the landscape of policy gradient in POMDPs for a class of such environments, leading to interesting insights into the difficulty of this problem.

2020-11-02

ArXiv (preprint)

arxiv.org
