Publications

Toward Next-Generation Artificial Intelligence: Catalyzing the NeuroAI Revolution
Anthony Zador
Bence Ölveczky
Sean Escola
Kwabena Boahen
Matthew Botvinick
Dmitri Chklovskii
Anne Churchland
Claudia Clopath
James DiCarlo
Surya Ganguli
Jeff Hawkins
Konrad Paul Kording
Alexei Koulakov
Yann LeCun
Timothy P. Lillicrap
Adam Marblestone
Bruno Olshausen
Alexandre Pouget
Cristina Savin
Terrence Sejnowski
Eero Simoncelli
Sara Solla
David Sussillo
Andreas S. Tolias
Doris Tsao
Towards Better Evaluation for Dynamic Link Prediction
Despite the prevalence of recent success in learning from static graphs, learning from time-evolving graphs remains an open challenge. In this work, we design new, more stringent evaluation procedures for link prediction specific to dynamic graphs, which reflect real-world considerations, to better compare the strengths and weaknesses of methods. First, we create two visualization techniques to understand the recurring patterns of edges over time and show that many edges recur at later time steps. Based on this observation, we propose a pure memorization-based baseline called EdgeBank. EdgeBank achieves surprisingly strong performance across multiple settings, which highlights that the negative edges used in the current evaluation are easy. To sample more challenging negative edges, we introduce two novel negative sampling strategies that improve robustness and better match real-world applications. Lastly, we introduce six new dynamic graph datasets from a diverse set of domains missing from current benchmarks, providing new challenges and opportunities for future research. Our code repository is accessible at https://github.com/fpour/DGB.git.
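The EdgeBank idea described above is simple enough to sketch directly: predict an edge as positive if and only if it has been observed at an earlier time step. The following minimal Python sketch illustrates such a memorization baseline; the class name and interface are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of a pure memorization baseline in the spirit of
# EdgeBank (illustrative; not the authors' implementation).

class EdgeMemoryBaseline:
    def __init__(self):
        self.seen = set()  # all (src, dst) pairs observed so far

    def update(self, edges):
        """Commit a batch of observed (src, dst) edges to memory."""
        self.seen.update(edges)

    def predict(self, edges):
        """Score 1.0 for previously observed edges, 0.0 otherwise."""
        return [1.0 if e in self.seen else 0.0 for e in edges]

# Stream snapshots in time order, predicting before updating.
model = EdgeMemoryBaseline()
for snapshot in [[(0, 1), (1, 2)], [(0, 1), (2, 3)]]:
    scores = model.predict(snapshot)
    model.update(snapshot)
```

That a lookup table scores well against randomly sampled negatives is exactly why the harder negative sampling strategies proposed in the paper are needed.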
Towards good validation metrics for generative models in offline model-based optimisation
Christopher Beckham
Alexandre Piché
David Vazquez
In this work we propose a principled evaluation framework for model-based optimisation to measure how well a generative model can extrapolate. We achieve this by interpreting the training and validation splits as draws from their respective ‘truncated’ ground truth distributions, where examples in the validation set contain scores much larger than those in the training set. Model selection is performed on the validation set for some prescribed validation metric. A major research question, however, is determining which validation metric correlates best with the expected value of generated candidates with respect to the ground truth oracle; work towards answering this question can translate into large economic gains, since evaluating the ground truth oracle in the real world is expensive. We compare various validation metrics for generative adversarial networks using our framework. We also discuss limitations of our framework with respect to existing datasets and how progress can be made to mitigate them.
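The truncated-split construction described above can be made concrete in a few lines: rank all examples by their ground-truth score and reserve the top scores exclusively for validation, so that model selection on the validation set rewards extrapolation. This is a minimal sketch under assumed names (truncated_splits, train_frac), not the paper's exact pipeline.

```python
# Minimal sketch of score-truncated train/validation splits
# (illustrative; the percentile and names are assumptions).
import numpy as np

def truncated_splits(X, y, train_frac=0.8):
    """Split so every validation score exceeds every training score."""
    order = np.argsort(y)                     # ascending by score
    cut = int(len(y) * train_frac)
    train_idx, valid_idx = order[:cut], order[cut:]
    return (X[train_idx], y[train_idx]), (X[valid_idx], y[valid_idx])

rng = np.random.default_rng(0)
X, y = rng.standard_normal((1000, 16)), rng.standard_normal(1000)
(train_X, train_y), (valid_X, valid_y) = truncated_splits(X, y)
assert train_y.max() <= valid_y.min()         # validation extrapolates
```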
Towards Painless Policy Optimization for Constrained MDPs
Sharan Vaswani
Csaba Szepesvari
We study policy optimization in an infinite horizon, …
Trajectory Balance: Improved Credit Assignment in GFlowNets
Generative flow networks (GFlowNets) are a method for learning a stochastic policy for generating compositional objects, such as graphs or strings, from a given unnormalized density by sequences of actions, where many possible action sequences may lead to the same object. We find previously proposed learning objectives for GFlowNets, flow matching and detailed balance, which are analogous to temporal difference learning, to be prone to inefficient credit propagation across long action sequences. We thus propose a new learning objective for GFlowNets, trajectory balance, as a more efficient alternative to previously used objectives. We prove that any global minimizer of the trajectory balance objective can define a policy that samples exactly from the target distribution. In experiments on four distinct domains, we empirically demonstrate the benefits of the trajectory balance objective for GFlowNet convergence, diversity of generated samples, and robustness to long action sequences and large action spaces.
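For a single trajectory ending in an object x, the trajectory balance objective squares the gap between the forward log-flow (log Z plus the summed forward-policy log-probabilities) and the backward log-flow (log R(x) plus the summed backward-policy log-probabilities). A minimal PyTorch sketch of that loss follows; the tensor names and training setup are illustrative assumptions.

```python
# Minimal sketch of the trajectory balance loss for one trajectory
# (illustrative; names and framework choice are assumptions).
import torch

def trajectory_balance_loss(log_Z, fwd_logprobs, bwd_logprobs, log_reward):
    """Squared difference between forward and backward log-flows.
    log_Z: learned scalar log-partition estimate.
    fwd_logprobs / bwd_logprobs: per-step policy log-probabilities.
    log_reward: log R(x) of the terminal object."""
    forward = log_Z + fwd_logprobs.sum()
    backward = log_reward + bwd_logprobs.sum()
    return (forward - backward) ** 2

log_Z = torch.tensor(0.0, requires_grad=True)
fwd = torch.log(torch.tensor([0.5, 0.25, 0.5]))
bwd = torch.zeros(3)                      # deterministic backward policy
loss = trajectory_balance_loss(log_Z, fwd, bwd, torch.tensor(-1.0))
loss.backward()                           # in practice, also into the policy
```

Because the loss touches the whole trajectory at once, credit reaches early actions without the step-by-step bootstrapping of flow matching or detailed balance.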
Trajectory of Mini-Batch Momentum: Batch Size Saturation and Convergence in High Dimensions
Kiwon Lee
Andrew Nicholas Cheng
Elliot Paquette
Understanding Generalization via Leave-One-Out Conditional Mutual Information
MAHDI HAGHIFAM
Shay Moran
Daniel M. Roy
Understanding the Evolution of Linear Regions in Deep Reinforcement Learning
Setareh Cohan
Nam Hee Gordon Kim
Michiel van de Panne
Policies produced by deep reinforcement learning are typically characterised by their learning curves, but they remain poorly understood in many other respects. ReLU-based policies result in a partitioning of the input space into piecewise linear regions. We seek to understand how observed region counts and their densities evolve during deep reinforcement learning, using empirical results that span a range of continuous control tasks and policy network dimensions. Intuitively, we may expect that during training the region density increases in the areas that are frequently visited by the policy, thereby affording fine-grained control. We use recent theoretical and empirical results on the linear regions induced by neural networks in supervised learning settings to ground and compare our results. Empirically, we find that the region density increases only moderately throughout training, as measured along fixed trajectories coming from the final policy. However, the trajectories themselves also increase in length during training, and thus the region densities decrease as seen from the perspective of the current trajectory. Our findings suggest that the complexity of deep reinforcement learning policies does not principally emerge from a significant growth in the complexity of functions observed on and around trajectories of the policy.
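The measurements described above reduce to counting distinct ReLU activation patterns along a path: each pattern corresponds to one linear region, so region density along a trajectory can be estimated by sampling points densely and counting pattern changes. The sketch below illustrates this procedure for a small random network; it is a simplified stand-in, not the authors' measurement code.

```python
# Minimal sketch of counting linear regions along a segment by
# tracking ReLU activation-pattern changes (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((32, 4)), rng.standard_normal(32)
W2, b2 = rng.standard_normal((32, 32)), rng.standard_normal(32)

def activation_pattern(x):
    """On/off state of every ReLU unit for input x."""
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0) + b2
    return np.concatenate([h1 > 0, h2 > 0])

def count_regions(start, end, n=2000):
    """Distinct consecutive activation patterns along the segment."""
    ts = np.linspace(0.0, 1.0, n)
    pats = [tuple(activation_pattern(start + t * (end - start))) for t in ts]
    return 1 + sum(p != q for p, q in zip(pats, pats[1:]))

print(count_regions(np.zeros(4), np.ones(4)))
```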
Unifying Likelihood-free Inference with Black-box Optimization and Beyond
Black-box optimization formulations for biological sequence design have drawn recent attention due to their promising potential impact on the pharmaceutical industry. In this work, we propose to unify two seemingly distinct worlds: likelihood-free inference and black-box optimization, under one probabilistic framework. In tandem, we provide a recipe for constructing various sequence design methods based on this framework. We show how previous optimization approaches can be "reinvented" in our framework, and further propose new probabilistic black-box optimization algorithms. Extensive experiments on sequence design applications illustrate the benefits of the proposed methodology.
Unsupervised Dependency Graph Network
Yikang Shen
Shawn Tan
Peng Li
Jie Zhou
Recent work has identified properties of pretrained self-attention models that mirror those of dependency parse structures. In particular, some self-attention heads correspond well to individual dependency types. Inspired by these developments, we propose a new competitive mechanism that encourages these attention heads to model different dependency relations. We introduce a new model, the Unsupervised Dependency Graph Network (UDGN), that can induce dependency structures from raw corpora and the masked language modeling task. Experimental results show that UDGN achieves very strong unsupervised dependency parsing performance without gold POS tags or any other external information. The competitive gated heads show a strong correlation with human-annotated dependency types. Furthermore, UDGN also achieves competitive performance on masked language modeling and sentence textual similarity tasks.
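One simple way to realize head competition of the kind described above is to normalize attention scores across heads as well as across positions, so that for each word pair the most confident head claims most of the mass. The sketch below shows that generic mechanism under assumed shapes; it is an illustration of competing heads, not the exact UDGN gating.

```python
# Hedged sketch of attention heads competing per word pair
# (illustrative mechanism, not the exact UDGN formulation).
import torch
import torch.nn.functional as F

def competitive_head_weights(scores):
    """scores: (heads, seq, seq) raw attention logits.
    Returns weights in which heads compete for each word pair."""
    head_competition = F.softmax(scores, dim=0)   # compete across heads
    position_attn = F.softmax(scores, dim=-1)     # usual attention rows
    return head_competition * position_attn

scores = torch.randn(4, 6, 6)     # 4 heads, sequence length 6
weights = competitive_head_weights(scores)
```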
Usefulness of School Absenteeism Data for Predicting Influenza Outbreaks
Joseph R. Egger
A. Hoen
John S. Brownstein
Donald R. Olson
Kevin James Konty
and second-round PCR were 94°C for 3 min, followed by 40 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 2 min. Expected amplification products were 458 bp (PCR-1) and 304 bp (PCR-2). Using dilutions of a synthetic template corresponding to the target sequence, we estimated the sensitivity of the amplification assay to be 5 copies of target sequence by limiting-dilution assay. Negative (sterile water) and positive controls (synthetic template dilutions) were …
Vision-Language Pretraining: Current Trends and the Future
Damien Teney
Aida Nematzadeh
In the last few years, there has been increased interest in building multimodal (vision-language) models that are pretrained on larger but noisier datasets where the two modalities (e.g., image and text) loosely correspond to each other (e.g., Lu et al., 2019; Radford et al., 2021). Given a task (such as visual question answering), these models are then often fine-tuned on task-specific supervised datasets (e.g., Lu et al., 2019; Chen et al., 2020; Tan and Bansal, 2019; Li et al., 2020a,b). In addition to the larger pretraining datasets, the transformer architecture (Vaswani et al., 2017), and in particular self-attention applied to two modalities, is responsible for the impressive performance of recent pretrained models on downstream tasks (Hendricks et al., 2021). In this tutorial, we focus on recent vision-language pretraining paradigms. Our goal is to first provide the background on image–language datasets, benchmarks, and modeling innovations that predate the multimodal pretraining area. Next, we discuss the different families of models used for vision-language pretraining, highlighting their strengths and shortcomings. Finally, we discuss the limits of vision-language pretraining through statistical learning, and the need for alternative approaches such as causal representation learning.