Publications

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini
Sejun Park
Manuela Girotti
Murat A Erdogdu
We study the problem of training a two-layer neural network (NN) of arbitrary width using stochastic gradient descent (SGD) where the input …
Organizing Principles of Astrocytic Nanoarchitecture in the Mouse Cerebral Cortex
Christopher K. Salmon
Tabish A Syed
J. Benjamin Kacerovsky
Nensi Alivodej
Alexandra L. Schober
Tyler F. W. Sloan
Michael T. Pratte
Michael P. Rosen
Miranda Green
Adario DasGupta
Shaurya Mehta
Affan Jilani
Yanan Wang
Hojatollah Vali
Craig A. Mandato
Keith K. Murai
Preclinical-to-clinical Anti-cancer Drug Response Prediction and Biomarker Identification Using TINDL
David Earl Hostallero
Lixuan Wei
Liewei Wang
Junmei Cairns
Protein Representation Learning by Geometric Structure Pretraining
Zuobai Zhang
Minghao Xu
Arian Rokkum Jamasb
Vijil Chenthamarakshan
Aurelie Lozano
Payel Das
Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein function or structure. Existing approaches usually pretrain protein language models on a large number of unlabeled amino acid sequences and then finetune the models with labeled data on downstream tasks. Despite the effectiveness of sequence-based approaches, the power of pretraining on known protein structures, which are available only in smaller numbers, has not been explored for protein property prediction, even though protein structures are known determinants of protein function. In this paper, we propose to pretrain protein representations according to their 3D structures. We first present a simple yet effective encoder that learns the geometric features of a protein. We then pretrain this protein graph encoder by leveraging multiview contrastive learning and different self-prediction tasks. Experimental results on both function prediction and fold classification tasks show that our proposed pretraining methods outperform or are on par with state-of-the-art sequence-based methods, while using much less pretraining data. Our implementation is available at https://github.com/DeepGraphLearning/GearNet.
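As a rough illustration of the multiview contrastive pretraining described in this abstract, the sketch below scores two augmented views of the same protein graph with a symmetric InfoNCE loss. The `encoder`, `view1`, and `view2` names are placeholders assumed for illustration, not the paper's actual API.

```python
import torch
import torch.nn.functional as F

def multiview_contrastive_loss(encoder, view1, view2, temperature=0.1):
    """Symmetric InfoNCE over a batch: matching views of the same protein
    are positives (the diagonal); all other pairs are negatives."""
    z1 = F.normalize(encoder(view1), dim=-1)   # (batch, dim)
    z2 = F.normalize(encoder(view2), dim=-1)
    logits = z1 @ z2.T / temperature           # pairwise cosine similarities
    targets = torch.arange(z1.shape[0])        # positive pairs on the diagonal
    # Each view predicts its counterpart, and the loss is symmetrized.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))
```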
Protein Sequence and Structure Co-Design with Equivariant Translation
Chence Shi
Chuanrui Wang
Jiarui Lu
Bozitao Zhong
Proteins are macromolecules that perform essential functions in all living organisms. Designing novel proteins with specific structures and desired functions has been a long-standing challenge in the field of bioengineering. Existing approaches generate both protein sequence and structure using either autoregressive models or diffusion models, both of which suffer from high inference costs. In this paper, we propose a new approach capable of protein sequence and structure co-design, which iteratively translates both protein sequence and structure into the desired state from random initialization, based on context features given a priori. Our model consists of a trigonometry-aware encoder that reasons about geometric constraints and interactions from context features, and a roto-translation equivariant decoder that translates protein sequence and structure interdependently. Notably, all amino acids of a protein are updated in one shot in each translation step, which significantly accelerates the inference process. Experimental results across multiple tasks show that our model outperforms previous state-of-the-art baselines by a large margin, and is able to design proteins of high fidelity in both sequence and structure, with running time orders of magnitude lower than that of sampling-based methods.
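The iterative translation loop in this abstract can be sketched roughly as follows, with hypothetical `encoder`, `decoder`, and `context` placeholders standing in for the trigonometry-aware encoder, the roto-translation equivariant decoder, and the a-priori context features; this is an illustrative sketch, not the paper's implementation.

```python
import torch

def co_design(encoder, decoder, context, num_steps=10, seq_len=128, num_aa=20):
    """Iteratively translate a randomly initialized sequence and structure
    toward the desired state, updating all residues in one shot per step."""
    seq_logits = torch.randn(seq_len, num_aa)  # random initial sequence state
    coords = torch.randn(seq_len, 3)           # random initial coordinates
    for _ in range(num_steps):
        # Encoder reasons about geometric constraints from context features.
        constraints = encoder(context, seq_logits, coords)
        # Equivariant decoder updates sequence and structure interdependently.
        seq_logits, coords = decoder(constraints, seq_logits, coords)
    return seq_logits.argmax(dim=-1), coords   # discrete sequence + structure
```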
Predictive Inference with Feature Conformal Prediction
Jiaye Teng
Chuan Wen
Dinghuai Zhang
Yang Gao
Yang Yuan
Reliability of CKA as a Similarity Measure in Deep Learning
MohammadReza Davari
Stefan Horoi
Amine Natik
Comparing learned neural representations in neural networks is a challenging but important problem, which has been approached in different ways. The Centered Kernel Alignment (CKA) similarity metric, particularly its linear variant, has recently become a popular approach and has been widely used to compare representations of a network's different layers, of architecturally similar networks trained differently, or of models with different architectures trained on the same data. A wide variety of claims about similarity and dissimilarity of these various representations have been made using CKA results. In this work we present analysis that formally characterizes CKA's sensitivity to a large class of simple transformations, which can naturally occur in the context of modern machine learning. This provides a concrete explanation of CKA's sensitivity to outliers, which has been observed in past works, and to transformations that preserve the linear separability of the data, an important generalization attribute. We empirically investigate several weaknesses of the CKA similarity metric, demonstrating situations in which it gives unexpected or counterintuitive results. Finally, we study approaches for modifying representations to maintain functional behaviour while changing the CKA value. Our results illustrate that, in many cases, the CKA value can be easily manipulated without substantial changes to the functional behaviour of the models, and call for caution when leveraging activation alignment metrics.
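For reference, the linear CKA variant the abstract analyzes can be computed in a few lines; the following is a minimal NumPy implementation of the standard formulation on centered feature matrices, together with a check of its invariance to orthogonal transforms and isotropic scaling.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between representations X (n, d1) and Y (n, d2)."""
    X = X - X.mean(axis=0, keepdims=True)  # center features across samples
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    return cross / (np.linalg.norm(X.T @ X, ord="fro")
                    * np.linalg.norm(Y.T @ Y, ord="fro"))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))
Q, _ = np.linalg.qr(rng.normal(size=(64, 64)))  # random orthogonal matrix
print(linear_cka(X, 3.0 * X @ Q))  # ~1.0: invariant to rotation and scaling
```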
Revisiting Populations in multi-agent Communication
Paul Michel
Mathieu Rita
Olivier Tieleman
Angeliki Lazaridou
Despite evidence from the cognitive sciences that larger groups of speakers tend to develop more structured languages in human communication, scaling up to populations has failed to yield significant benefits in emergent multi-agent communication. In this paper we advocate for an alternate population-level training paradigm for referential games based on the idea of "partitioning" the agents into sender-receiver pairs and limiting co-adaptation across pairs. We show that this results in optimizing a different objective at the population level, where agents maximize (1) their respective "internal" communication accuracy and (2) some measure of alignment between agents. In experiments, we find that this leads to the emergence of languages that are significantly more compositional. Moreover, when agents are trained in populations that are not fully connected (i.e., not all agent pairs interact at training time), this approach reduces multi-linguality and improves zero-shot communication with new agents (i.e., agents are able to communicate successfully with agents outside their training partners).
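A minimal sketch of the partitioned training paradigm, under the assumption of fixed sender-receiver pairs and a `game.play_and_update` interface invented here for illustration: each pair optimizes only its own communication accuracy, with no co-adaptation across pairs.

```python
import random

def train_partitioned_population(senders, receivers, game, epochs):
    """Train a population as disjoint sender-receiver pairs rather than
    letting every sender interact with every receiver."""
    assert len(senders) == len(receivers)
    pairs = list(zip(senders, receivers))  # fixed partition into pairs
    for _ in range(epochs):
        random.shuffle(pairs)              # order does not couple the pairs
        for sender, receiver in pairs:
            # Each pair maximizes its own "internal" referential-game accuracy.
            game.play_and_update(sender, receiver)
    return senders, receivers
```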
Robust and Controllable Object-Centric Learning through Energy-based Models
Ruixiang Zhang
Tong Che
Boris Ivanovic
Renhao Wang
Marco Pavone
Humans are remarkably good at understanding and reasoning about complex visual scenes. The capability of decomposing low-level observations into discrete objects allows us to build a grounded abstract representation and identify the compositional structure of the world. It is thus a crucial step for machine learning models to be capable of inferring objects and their properties from a visual scene without explicit supervision. However, existing works on object-centric representation learning either rely on tailor-made neural network modules or assume sophisticated models of the underlying generative and inference processes. In this work, we present EGO, a conceptually simple and general approach to learning object-centric representations through an energy-based model. By forming a permutation-invariant energy function using vanilla attention blocks that are readily available in Transformers, we can infer object-centric latent variables via gradient-based MCMC methods where permutation equivariance is automatically guaranteed. We show that EGO can be easily integrated into existing architectures and can effectively extract high-quality object-centric representations, leading to better segmentation accuracy and competitive downstream task performance. We empirically evaluate the robustness of the representations learned by EGO against distribution shift. Finally, we demonstrate the effectiveness of EGO in systematic compositional generalization, by recomposing learned energy functions for novel scene generation and manipulation.
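The gradient-based MCMC inference mentioned in the abstract can be sketched as Langevin dynamics on slot latents under a learned energy function; `energy_fn` is a placeholder assumed to score (latents, image) pairs, and this is an illustrative sketch rather than EGO's exact sampler.

```python
import torch

def langevin_infer(energy_fn, images, num_slots, slot_dim,
                   steps=30, step_size=1e-2):
    """Infer object-centric slot latents by descending a learned energy
    E(latents, image) with injected Gaussian noise (Langevin dynamics)."""
    batch = images.shape[0]
    latents = torch.randn(batch, num_slots, slot_dim, requires_grad=True)
    for _ in range(steps):
        energy = energy_fn(latents, images).sum()
        (grad,) = torch.autograd.grad(energy, latents)
        with torch.no_grad():  # update the leaf tensor in place
            latents -= 0.5 * step_size * grad
            latents += (step_size ** 0.5) * torch.randn_like(latents)
    return latents.detach()
```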
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Pierluca D'Oro
Max Schwarzer
Evgenii Nikishin
Increasing the replay ratio, the number of updates of an agent's parameters per environment interaction, is an appealing strategy for improving the sample efficiency of deep reinforcement learning algorithms. In this work, we show that fully or partially resetting the parameters of deep reinforcement learning agents causes better replay ratio scaling capabilities to emerge. We push the limits of the sample efficiency of carefully modified algorithms by training them using an order of magnitude more updates than usual, significantly improving their performance on the Atari 100k and DeepMind Control Suite benchmarks. We then provide an analysis of the design choices required for favorable replay ratio scaling to be possible and discuss inherent limits and tradeoffs.
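A rough sketch of the recipe described above, assuming generic `agent`, `env`, and `buffer` interfaces (a Gym-style `step`, an `update` per sampled batch, and a `reset_parameters` method), all of which are placeholders rather than the paper's code: many gradient updates per environment step, with periodic parameter resets while the replay buffer is kept.

```python
def train_with_resets(agent, env, buffer, total_steps,
                      replay_ratio=16, reset_every=40_000):
    """High replay-ratio training with periodic full parameter resets."""
    obs = env.reset()
    for step in range(total_steps):
        action = agent.act(obs)
        next_obs, reward, done, _ = env.step(action)
        buffer.add(obs, action, reward, next_obs, done)
        obs = env.reset() if done else next_obs
        # Replay ratio: several updates per single environment interaction.
        for _ in range(replay_ratio):
            agent.update(buffer.sample())
        # Reset the agent's parameters (not the buffer); the paper finds
        # this makes favorable replay-ratio scaling emerge.
        if (step + 1) % reset_every == 0:
            agent.reset_parameters()
    return agent
```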
Simplicial Embeddings in Self-Supervised Learning and Downstream Classification
Samuel Lavoie
Christos Tsirigotis
Max Schwarzer
Kenji Kawaguchi
Ankit Vani
Michael Noukhovitch
Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into …
Social isolation is linked to classical risk factors of Alzheimer’s disease-related dementias
Kimia Shafighi
Sylvia Villeneuve
Pedro Rosa‐Neto
AmanPreet Badhwar
Judes Poirier
Vaibhav Sharma
Yasser Iturria-Medina
Patricia P. Silveira
Laurette Dubé
David C. Glahn
Alzheimer’s disease and related dementias are a major public health burden, one that will compound over upcoming years due to increasing longevity. Recently, clinical evidence has hinted that the experience of social isolation may expedite dementia onset. In 502,506 UK Biobank participants and 30,097 participants from the Canadian Longitudinal Study of Aging, we revisited traditional risk factors for developing dementia in the context of loneliness and lacking social support. Across these measures of subjective and objective social deprivation, we identified strong links between individuals’ social capital and various indicators of Alzheimer’s disease and related dementias risk, which replicated across both population cohorts. The quality and quantity of daily social encounters had deep connections with key aetiopathological factors, which represent 1) personal habits and lifestyle factors, 2) physical health, 3) mental health, and 4) societal and external factors. Our population-scale assessment suggests that social lifestyle determinants are linked to most neurodegeneration risk factors, highlighting them as promising targets for preventive clinical action.