Publications

TIE: A Framework for Embedding-based Incremental Temporal Knowledge Graph Completion

Jiapeng Wu

Yingxue Zhang

Mark J. Coates

Jackie CK Cheung

Reasoning in a temporal knowledge graph (TKG) is a critical task for information retrieval and semantic search. It is particularly challenging when the TKG is updated frequently. The model has to adapt to changes in the TKG for efficient training and inference while preserving its performance on historical knowledge. Recent work approaches TKG completion (TKGC) by augmenting the encoder-decoder framework with a time-aware encoding function. However, naively fine-tuning the model at every time step using these methods does not address the problems of 1) catastrophic forgetting, 2) the model's inability to identify the change of facts (e.g., the change of the political affiliation and end of a marriage), and 3) the lack of training efficiency. To address these challenges, we present the Time-aware Incremental Embedding (TIE) framework, which combines TKG representation learning, experience replay, and temporal regularization. We introduce a set of metrics that characterizes the intransigence of the model and propose a constraint that associates the deleted facts with negative labels. Experimental results on Wikidata12k and YAGO11k datasets demonstrate that the proposed TIE framework reduces training time by about ten times and improves on the proposed metrics compared to vanilla full-batch training. It comes without a significant loss in performance for any traditional measures. Extensive ablation studies reveal performance trade-offs among different evaluation metrics, which is essential for decision-making around real-world TKG applications.

2021-07-10

Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (published)

doi.org

arxiv.org

Parallel and recurrent cascade models as a unifying force for understanding sub-cellular computation

Emerson F. Harkin

Peter R. Shen

Anish Goel

Blake A. Richards

Richard Naud

Neurons are very complicated computational devices, incorporating numerous non-linear processes, particularly in their dendrites. Biophysical models capture these processes directly by explicitly modelling physiological variables, such as ion channels, current flow, membrane capacitance, etc. However, another option for capturing the complexities of real neural computation is to use cascade models, which treat individual neurons as a cascade of linear and non-linear operations, akin to a multi-layer artificial neural network. Recent research has shown that cascade models can capture single-cell computation well, but there are still a number of sub-cellular, regenerative dendritic phenomena that they cannot capture, such as the interaction between sodium, calcium, and NMDA spikes in different compartments. Here, we propose that it is possible to capture these additional phenomena using parallel, recurrent cascade models, wherein an individual neuron is modelled as a cascade of parallel linear and non-linear operations that can be connected recurrently, akin to a multi-layer, recurrent, artificial neural network. Given their tractable mathematical structure, we show that neuron models expressed in terms of parallel recurrent cascades can themselves be integrated into multi-layered artificial neural networks and trained to perform complex tasks. We go on to discuss potential implications and uses of these models for artificial intelligence. Overall, we argue that parallel, recurrent cascade models provide an important, unifying tool for capturing single-cell computation and exploring the algorithmic implications of physiological phenomena.

2021-07-04

Neuroscience (published)

doi.org

The default mode network in cognition: a topographical perspective

Jonathan Smallwood

Boris C Bernhardt

Robert Leech

Danilo Bzdok

Elizabeth Jefferies

Daniel S. Margulies

2021-07-04

Nature Reviews Neuroscience (published)

doi.org

Meeting and Missing Minds: Children and Adults Use Alignment of Intuitions to Solve Pure Coordination Games

Daniel Perez-Zapata

Xavia McKenzie-Smart

Ian Charest

Ian Apperly

In pure coordination games players seek to coordinate responses with one another without communicating. Without a logically correct response, success depends upon players intuiting a response that is mutually obvious. Previous work suggests that such coordination requires a distinctive form of “group” thinking and sufficient mutual knowledge, but reveals little about the basis for the intuitive judgements themselves. Here, that question was addressed for the first time by examining the basis of coordination performance of groups whose intuitions might plausibly differ: children versus adults. Twenty-five 5-year-olds, 30 7-year-olds, and 25 adults undertook four types of coordination game, and novel metrics allowed “intuitive alignment” in responses to be evaluated within- and between-groups. All groups performed above chance, and adults showed higher levels of alignment than children, but adults and children showed different patterns in their intuitions. Implications for intergenerational understanding and mis-understanding are discussed.

2021-07-02

Journal of Cognitive Psychology (published)

doi.org

Fixed-Points for Quantitative Equational Logics

Radu Mardare

Prakash Panangaden

Gordon Plotkin

We develop a fixed-point extension of quantitative equational logic and give semantics in one-bounded complete quantitative algebras. Unlike previous related work about fixed-points in metric spaces, we are working with the notion of approximate equality rather than exact equality. The result is a novel theory of fixed points which can not only provide solutions to the traditional fixed-point equations but we can also define the rate of convergence to the fixed point. We show that such a theory is the quantitative analogue of a Conway theory and also of an iteration theory; and it reflects the metric coinduction principle. We study the Bellman equation for a Markov decision process as an illustrative example.

2021-07-01

2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) (published)

doi.org

arxiv.org

Universal Semantics for the Stochastic λ-Calculus

Pedro H. Azevedo de Amorim

Dexter Kozen

Radu Mardare

Prakash Panangaden

Michael Roberts

We define sound and adequate denotational and operational semantics for the stochastic lambda calculus. These two semantic approaches build on previous work that used an explicit source of randomness to reason about higher-order probabilistic programs.

2021-07-01

2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) (published)

doi.org

arxiv.org

Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization

Wesley Chung

Valentin Thomas

Marlos C. Machado

Nicolas Le Roux

Bandit and reinforcement learning (RL) problems can often be framed as optimization problems where the goal is to maximize average performance while having access only to stochastic estimates of the true gradient. Traditionally, stochastic optimization theory predicts that learning dynamics are governed by the curvature of the loss function and the noise of the gradient estimates. In this paper we demonstrate that this is not the case for bandit and RL problems. To allow our analysis to be interpreted in light of multi-step MDPs, we focus on techniques derived from stochastic optimization principles (e.g., natural policy gradient and EXP3) and we show that some standard assumptions from optimization theory are violated in these problems. We present theoretical results showing that, at least for bandit problems, curvature and noise are not sufficient to explain the learning dynamics and that seemingly innocuous choices like the baseline can determine whether an algorithm converges. These theoretical findings match our empirical evaluation, which we extend to multi-state MDPs.

2021-06-30

Proceedings of the 38th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

A Brief Study on the Effects of Training Generative Dialogue Models with a Semantic loss

Prasanna Parthasarathi

Mohamed Abdelsalam

Joelle Pineau

Sarath Chandar

Neural models trained for next utterance generation in dialogue tasks learn to mimic the n-gram sequences in the training set with training objectives like negative log-likelihood (NLL) or cross-entropy. Such commonly used training objectives do not foster generating alternate responses to a context. But the effects of minimizing an alternate training objective that fosters a model to generate an alternate response and score it on semantic similarity have not been well studied. We hypothesize that a language generation model can improve on its diversity by learning to generate alternate text during training and minimizing a semantic loss as an auxiliary objective. We explore this idea on two different sized data sets on the task of next utterance generation in goal oriented dialogues. We make two observations: (1) minimizing a semantic objective improved diversity in responses in the smaller data set (Frames) but only as-good-as minimizing the NLL in the larger data set (MultiWoZ); (2) large language model embeddings can be more useful as a semantic loss objective than as initialization for token embeddings.

2021-06-30

Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue (published)

doi.org

arxiv.org

Continuous Coordination As a Realistic Scenario for Lifelong Learning

Hadi Nekoei

Akilesh Badrinaaraayanan

Aaron Courville

Sarath Chandar

Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of LLL algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi -- a partially-observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods, and benchmark state-of-the-art LLL algorithms in limited memory and computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides us with a pragmatic way of going beyond centralized training which is the most commonly used training protocol in MARL. We empirically show that the agents trained in our setup are able to coordinate well with unseen agents, without any additional assumptions made by previous works. The code and all pre-trained models are available at https://github.com/chandar-lab/Lifelong-Hanabi.

2021-06-30

Proceedings of the 38th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation

Scott Fujimoto

David Meger

Doina Precup

Marginalized importance sampling (MIS), which measures the density ratio between the state-action occupancy of a target policy and that of a sampling distribution, is a promising approach for off-policy evaluation. However, current state-of-the-art MIS methods rely on complex optimization tricks and succeed mostly on simple toy problems. We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy. The successor representation can be trained through deep reinforcement learning methodology and decouples the reward optimization from the dynamics of the environment, making the resulting algorithm stable and applicable to high-dimensional domains. We evaluate the empirical performance of our approach on a variety of challenging Atari and MuJoCo environments.

2021-06-30

Proceedings of the 38th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

Directional Graph Networks: Anisotropic Aggregation in Graph Neural Networks via Directional Vector Fields

Dominique Beaini

Saro Passaro

Vincent Létourneau

William L. Hamilton

Gabriele Corso

Pietro Lio

The lack of anisotropic kernels in graph neural networks (GNNs) strongly limits their expressiveness, contributing to well-known issues such as over-smoothing. To overcome this limitation, we propose the first globally consistent anisotropic kernels for GNNs, allowing for graph convolutions that are defined according to topologically-derived directional flows. First, by defining a vector field in the graph, we develop a method of applying directional derivatives and smoothing by projecting node-specific messages into the field. Then, we propose the use of the Laplacian eigenvectors as such vector field. We show that the method generalizes CNNs on an

2021-06-30

Proceedings of the 38th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

Educating the future generation of researchers: A cross-disciplinary survey of trends in analysis methods

Taylor Bolt

Jason S. Nomi

Danilo Bzdok

Lucina Q. Uddin

Methods for data analysis in the biomedical, life, and social (BLS) sciences are developing at a rapid pace. At the same time, there is increasing concern that education in quantitative methods is failing to adequately prepare students for contemporary research. These trends have led to calls for educational reform to undergraduate and graduate quantitative research method curricula. We argue that such reform should be based on data-driven insights into within- and cross-disciplinary use of analytic methods. Our survey of peer-reviewed literature analyzed approximately 1.3 million openly available research articles to monitor the cross-disciplinary mentions of analytic methods in the past decade. We applied data-driven text mining analyses to the “Methods” and “Results” sections of a large subset of this corpus to identify trends in analytic method mentions shared across disciplines, as well as those unique to each discipline. We found that the t test, analysis of variance (ANOVA), linear regression, chi-squared test, and other classical statistical methods have been and remain the most mentioned analytic methods in biomedical, life science, and social science research articles. However, mentions of these methods have declined as a percentage of the published literature between 2009 and 2020. On the other hand, multivariate statistical and machine learning approaches, such as artificial neural networks (ANNs), have seen a significant increase in the total share of scientific publications. We also found unique groupings of analytic methods associated with each BLS science discipline, such as the use of structural equation modeling (SEM) in psychology, survival models in oncology, and manifold learning in ecology. We discuss the implications of these findings for education in statistics and research methods, as well as within- and cross-disciplinary collaboration.

2021-06-30

PLoS Biology (published)

doi.org
