Publications

Simplicial Embeddings in Self-Supervised Learning and Downstream Classification
Samuel Lavoie
Christos Tsirigotis
Max Schwarzer
Kenji Kawaguchi
Ankit Vani
Michael Noukhovitch
Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into …
Social isolation is linked to classical risk factors of Alzheimer’s disease-related dementias
Kimia Shafighi
Sylvia Villeneuve
Pedro Rosa‐Neto
AmanPreet Badhwar
Judes Poirier
Vaibhav Sharma
Yasser Iturria-Medina
Patricia P. Silveira
Laurette Dubé
David C. Glahn
Alzheimer’s disease and related dementias are a major public health burden, compounding over upcoming years due to longevity. Recently, clinical evidence hinted at the experience of social isolation in expediting dementia onset. In 502,506 UK Biobank participants and 30,097 participants from the Canadian Longitudinal Study of Aging, we revisited traditional risk factors for developing dementia in the context of loneliness and lacking social support. Across these measures of subjective and objective social deprivation, we have identified strong links between individuals’ social capital and various indicators of Alzheimer’s disease and related dementias risk, which replicated across both population cohorts. The quality and quantity of daily social encounters had deep connections with key aetiopathological factors, which represent 1) personal habits and lifestyle factors, 2) physical health, 3) mental health, and 4) societal and external factors. Our population-scale assessment suggests that social lifestyle determinants are linked to most neurodegeneration risk factors, highlighting them as promising targets for preventive clinical action.
Spatio-temporal hard attention learning for skeleton-based activity recognition
Bahareh Nikpour
Stateful active facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning
Dianbo Liu
Vedant Shah
Oussama Boussif
Cristian Meo
Anirudh Goyal
Tianmin Shu
Michael Curtis Mozer
Nicolas Heess
Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions
David Bieber
Rishab Goel
Dan Zheng
Danny Tarlow
The execution behavior of a program often depends on external resources, such as program inputs or file contents, and so cannot be run in isolation. Nevertheless, software developers benefit from fast iteration loops where automated tools identify errors as early as possible, even before programs can be compiled and run. This presents an interesting machine learning challenge: can we predict runtime errors in a "static" setting, where program execution is not possible? Here, we introduce a real-world dataset and task for predicting runtime errors, which we show is difficult for generic models like Transformers. We approach this task by developing an interpreter-inspired architecture with an inductive bias towards mimicking program executions, which models exception handling and "learns to execute" descriptions of the contents of external resources. Surprisingly, we show that the model can also predict the location of the error, despite being trained only on labels indicating the presence/absence and kind of error. In total, we present a practical and difficult-yet-approachable challenge problem related to learning program execution and we demonstrate promising new capabilities of interpreter-inspired machine learning models for code.
Systematic Rectification of Language Models via Dead-end Analysis
Meng Cao
Mehdi Fatemi
Samira Shabanian
With adversarial or otherwise normal prompts, existing large language models (LLMs) can be pushed to generate toxic discourses. One way to reduce the risk of LLMs generating undesired discourses is to alter the training of the LLM. This can be very restrictive due to demanding computation requirements. Other methods rely on rule-based or prompt-based token elimination, which are limited as they dismiss future tokens and the overall meaning of the complete discourse. Here, we center detoxification on the probability that the finished discourse is ultimately considered toxic. That is, at each point, we advise against token selections proportional to how likely a finished text from this point will be toxic. To this end, we formally extend the dead-end theory from the recent reinforcement learning (RL) literature to also cover uncertain outcomes. Our approach, called rectification, utilizes a separate but significantly smaller model for detoxification, which can be applied to diverse LLMs as long as they share the same vocabulary. Importantly, our method does not require access to the internal representations of the LLM, but only the token probability distribution at each decoding step. This is crucial as many LLMs today are hosted in servers and only accessible through APIs. When applied to various LLMs, including GPT-3, our approach significantly improves the generated discourse compared to the base LLMs and other techniques in terms of both the overall language and detoxification performance.
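The decoding-time idea in the abstract above (discouraging tokens in proportion to how likely the finished text is to be toxic, using only the base model's token distribution) can be illustrated with a minimal sketch. The function name `rectify`, the multiplicative `1 - risk` rule, and the dictionary of per-token risk estimates are all assumptions made for illustration. They are not the paper's exact rectification rule, and in practice the risk values would come from the separate, smaller model rather than a hand-written table.

```python
from typing import Dict


def rectify(token_probs: Dict[str, float],
            toxic_risk: Dict[str, float]) -> Dict[str, float]:
    """Down-weight each candidate token by its estimated risk, then renormalize.

    token_probs: base LLM next-token distribution (the only LLM output needed).
    toxic_risk:  hypothetical estimate in [0, 1] that a text continued with this
                 token ends up toxic; tokens without an estimate count as 0.
    """
    weighted = {
        tok: p * (1.0 - toxic_risk.get(tok, 0.0))
        for tok, p in token_probs.items()
    }
    total = sum(weighted.values())
    if total == 0.0:
        # Every token was judged fully risky: fall back to the base distribution.
        return dict(token_probs)
    return {tok: w / total for tok, w in weighted.items()}


# Toy example with made-up numbers.
base = {"kind": 0.5, "rude": 0.5}
risk = {"kind": 0.0, "rude": 0.8}
rectified = rectify(base, risk)
```

In this toy case the probability mass shifts toward the token with the lower risk estimate, while the base model itself is left untouched.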
The clinical value of Aspergillus-specific IgG antibody test in the diagnosis of nonneutropenic invasive pulmonary aspergillosis.
Yajie Lu
Lulu Liu
Hongxing Li
Bilin Chen
Yu-hui Gu
Li Wang
Chunlai Feng
Cheng Chen
Yanbin Chen
Wenkui Sun
X. Cui
Min Cao
Yujian Tao
Jinjin Zhong
Huanhuan Zhong
Yueyan Ni
Yuchen Cai
M. Song
X. Liu
Yi Shi
Li Liu
… (1 more)
Xin Su
The Hidden Uniform Cluster Prior in Self-Supervised Learning
Mahmoud Assran
Randall Balestriero
Quentin Duval
Florian Bordes
Ishan Misra
Piotr Bojanowski
Nicolas Ballas
A successful paradigm in representation learning is to perform self-supervised pretraining using tasks based on mini-batch statistics (e.g., SimCLR, VICReg, SwAV, MSN). We show that in the formulation of all these methods is an overlooked prior to learn features that enable uniform clustering of the data. While this prior has led to remarkably semantic representations when pretraining on class-balanced data, such as ImageNet, we demonstrate that it can hamper performance when pretraining on class-imbalanced data. By moving away from conventional uniformity priors and instead preferring power-law distributed feature clusters, we show that one can improve the quality of the learned representations on real-world class-imbalanced datasets. To demonstrate this, we develop an extension of the Masked Siamese Networks (MSN) method to support the use of arbitrary feature priors.
Understanding Zero-shot Adversarial Robustness for Large-Scale Models
Chengzhi Mao
Scott Geng
Junfeng Yang
Xin Wang
Carl Vondrick
Pretrained large-scale vision-language models like CLIP have exhibited strong generalization over unseen tasks. Yet imperceptible adversarial perturbations can significantly reduce CLIP's performance on new tasks. In this work, we identify and explore the problem of adapting large-scale models for zero-shot adversarial robustness. We first identify two key factors during model adaptation, training losses and adaptation methods, that affect the model's zero-shot adversarial robustness. We then propose a text-guided contrastive adversarial training loss, which aligns the text embeddings and the adversarial visual features with contrastive learning on a small set of training data. We apply this training loss to two adaptation methods, model finetuning and visual prompt tuning. We find that visual prompt tuning is more effective in the absence of texts, while finetuning wins in the presence of text guidance. Overall, our approach significantly improves the zero-shot adversarial robustness over CLIP, seeing an average improvement of 31 points over ImageNet and 15 zero-shot datasets. We hope this work can shed light on understanding the zero-shot adversarial robustness of large-scale models.
A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games
Samuel Sokota
Ryan D'Orazio
J Zico Kolter
Nicolas Loizou
Marc Lanctot
Noam Brown
Christian Kroer
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
Mansheej Paul
Feng Chen
Brett W. Larsen
Jonathan Frankle
Surya Ganguli
Modern deep learning involves training costly, highly overparameterized networks, thus motivating the search for sparser networks that can still be trained to the same accuracy as the full network (i.e. matching). Iterative magnitude pruning (IMP) is a state-of-the-art algorithm that can find such highly sparse matching subnetworks, known as winning tickets. IMP operates by iterative cycles of training, masking smallest magnitude weights, rewinding back to an early training point, and repeating. Despite its simplicity, the underlying principles for when and how IMP finds winning tickets remain elusive. In particular, what useful information does an IMP mask found at the end of training convey to a rewound network near the beginning of training? How does SGD allow the network to extract this information? And why is iterative pruning needed? We develop answers in terms of the geometry of the error landscape. First, we find that
Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning
John Nguyen
Jianyu Wang
Kshitiz Malik
Maziar Sanjabi
AI Meta