Publications

Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions

Rishab Goel

Dan Zheng

Daniel Tarlow

The execution behavior of a program often depends on external resources, such as program inputs or file contents, and so cannot be run in isolation. Nevertheless, software developers benefit from fast iteration loops where automated tools identify errors as early as possible, even before programs can be compiled and run. This presents an interesting machine learning challenge: can we predict runtime errors in a "static" setting, where program execution is not possible? Here, we introduce a real-world dataset and task for predicting runtime errors, which we show is difficult for generic models like Transformers. We approach this task by developing an interpreter-inspired architecture with an inductive bias towards mimicking program executions, which models exception handling and "learns to execute" descriptions of the contents of external resources. Surprisingly, we show that the model can also predict the location of the error, despite being trained only on labels indicating the presence/absence and kind of error. In total, we present a practical and difficult-yet-approachable challenge problem related to learning program execution and we demonstrate promising new capabilities of interpreter-inspired machine learning models for code.

2023-02-01

ICLR.cc/2023/Conference (poster)

doi.org

openreview.net

Systematic Rectification of Language Models via Dead-end Analysis

Meng Cao

Mehdi Fatemi

Jackie Cheung

Samira Shabanian

With adversarial or otherwise normal prompts, existing large language models (LLM) can be pushed to generate toxic discourses. One way to reduce the risk of LLMs generating undesired discourses is to alter the training of the LLM. This can be very restrictive due to demanding computation requirements. Other methods rely on rule-based or prompt-based token elimination, which are limited as they dismiss future tokens and the overall meaning of the complete discourse. Here, we center detoxification on the probability that the finished discourse is ultimately considered toxic. That is, at each point, we advise against token selections proportional to how likely a finished text from this point will be toxic. To this end, we formally extend the dead-end theory from the recent reinforcement learning (RL) literature to also cover uncertain outcomes. Our approach, called rectification, utilizes a separate but significantly smaller model for detoxification, which can be applied to diverse LLMs as long as they share the same vocabulary. Importantly, our method does not require access to the internal representations of the LLM, but only the token probability distribution at each decoding step. This is crucial as many LLMs today are hosted in servers and only accessible through APIs. When applied to various LLMs, including GPT-3, our approach significantly improves the generated discourse compared to the base LLMs and other techniques in terms of both the overall language and detoxification performance.

2023-02-01

ICLR.cc/2023/Conference (poster)

doi.org

openreview.net

The clinical value of Aspergillus-specific IgG antibody test in the diagnosis of nonneutropenic invasive pulmonary aspergillosis.

Yajie Lu

Lulu Liu

Hongxing Li

Bilin Chen

Yu-hui Gu

Li Wang

Chunlai Feng

Cheng Chen

Yanbin Chen

Wenkui Sun

X. Cui

Min Cao

Yujian Tao

Jinjin Zhong

Huanhuan Zhong

Yueyan Ni

Yuchen Cai

M. Song

X. Liu

Yi Shi Li Liu … (see 1 more)

Xin Su

2023-02-01

Clinical Microbiology and Infection (published)

doi.org

The Hidden Uniform Cluster Prior in Self-Supervised Learning

Mahmoud Assran

Randall Balestriero

Quentin Duval

Florian Bordes

Ishan Misra

Piotr Bojanowski

Pascal Vincent

Michael Rabbat

Nicolas Ballas

A successful paradigm in representation learning is to perform self-supervised pretraining using tasks based on mini-batch statistics (e.g., SimCLR, VICReg, SwAV, MSN). We show that in the formulation of all these methods is an overlooked prior to learn features that enable uniform clustering of the data. While this prior has led to remarkably semantic representations when pretraining on class-balanced data, such as ImageNet, we demonstrate that it can hamper performance when pretraining on class-imbalanced data. By moving away from conventional uniformity priors and instead preferring power-law distributed feature clusters, we show that one can improve the quality of the learned representations on real-world class-imbalanced datasets. To demonstrate this, we develop an extension of the Masked Siamese Networks (MSN) method to support the use of arbitrary features priors.

2023-02-01

ICLR.cc/2023/Conference (poster)

doi.org

openreview.net

Understanding Zero-shot Adversarial Robustness for Large-Scale Models

Chengzhi Mao

Scott Geng

Junfeng Yang

Xin Wang

Carl Vondrick

Pretrained large-scale vision-language models like CLIP have exhibited strong generalization over unseen tasks. Yet imperceptible adversarial perturbations can significantly reduce CLIP's performance on new tasks. In this work, we identify and explore the problem of adapting large-scale models for zero-shot adversarial robustness. We first identify two key factors during model adaption--training losses and adaptation methods--that affect the model's zero-shot adversarial robustness. We then propose a text-guided contrastive adversarial training loss, which aligns the text embeddings and the adversarial visual features with contrastive learning on a small set of training data. We apply this training loss to two adaption methods, model finetuning and visual prompt tuning. We find that visual prompt tuning is more effective in the absence of texts, while finetuning wins in the existence of text guidance. Overall, our approach significantly improves the zero-shot adversarial robustness over CLIP, seeing an average improvement of 31 points over ImageNet and 15 zero-shot datasets. We hope this work can shed light on understanding the zero-shot adversarial robustness of large-scale models.

2023-02-01

ICLR.cc/2023/Conference (poster)

doi.org

openreview.net

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

Samuel Sokota

Ryan D'Orazio

J Zico Kolter

Nicolas Loizou

Marc Lanctot

Ioannis Mitliagkas

Noam Brown

Christian Kroer

2023-02-01

ICLR.cc/2023/Conference (poster)

openreview.net

Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?

Mansheej Paul

Feng Chen

Brett W. Larsen

Jonathan Frankle

Surya Ganguli

Gintare Karolina Dziugaite

Modern deep learning involves training costly, highly overparameterized networks, thus motivating the search for sparser networks that can still be trained to the same accuracy as the full network (i.e. matching). Iterative magnitude pruning (IMP) is a state of the art algorithm that can find such highly sparse matching subnetworks, known as winning tickets. IMP operates by iterative cycles of training, masking smallest magnitude weights, rewinding back to an early training point, and repeating. Despite its simplicity, the underlying principles for when and how IMP finds winning tickets remain elusive. In particular, what useful information does an IMP mask found at the end of training convey to a rewound network near the beginning of training? How does SGD allow the network to extract this information? And why is iterative pruning needed? We develop answers in terms of the geometry of the error landscape. First, we find that

2023-02-01

ICLR.cc/2023/Conference (notable)

doi.org

openreview.net

Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning

John Nguyen

Jianyu Wang

Kshitiz Malik

Maziar Sanjabi

Michael Rabbat

AI Meta

2023-02-01

ICLR.cc/2023/Conference (notable)

doi.org

openreview.net

BARVINN: Arbitrary Precision DNN Accelerator Controlled by a RISC-V CPU

Mohammadhossein Askarihemmat

Sean Wagner

Olexa Bilaniuk

Yassine Hariri

Yvon Savaria

Jean-Pierre David

2023-01-31

Proceedings of the 28th Asia and South Pacific Design Automation Conference (published)

doi.org

arxiv.org

Graph-based Time-Series Anomaly Detection: A Survey

Thi Kieu Khanh Ho

Ali Karami

Narges Armanfard

With the recent advances in technology, a wide range of systems continue to collect a large amount of data over time and thus generate time series. Time-Series Anomaly Detection (TSAD) is an important task in various time-series applications such as e-commerce, cybersecurity, vehicle maintenance, and healthcare monitoring. However, this task is very challenging as it requires considering both the intra-variable dependency and the inter-variable dependency, where a variable can be defined as an observation in time series data. Recent graph-based approaches have made impressive progress in tackling the challenges of this field. In this survey, we conduct a comprehensive and up-to-date review of Graph-based TSAD (G-TSAD). First, we explore the significant potential of graph representation learning for time-series data. Then, we review state-of-the-art graph anomaly detection techniques in the context of time series and discuss their strengths and drawbacks. Finally, we discuss the technical challenges and potential future directions for possible improvements in this research field.

2023-01-31

ArXiv (preprint)

doi.org

arxiv.org

Single-cell multi-omic topic embedding reveals cell-type-specific and COVID-19 severity-related immune signatures

Manqi Zhou

Hao Zhang

Zilong Bai

Dylan Mann-Krzisnik

Yi Wang

Yue Li

The advent of single-cell multi-omics sequencing technology makes it possible for researchers to leverage multiple modalities for individual cells and explore cell heterogeneity. However, the high dimensional, discrete, and sparse nature of the data make the downstream analysis particularly challenging. Most of the existing computational methods for single-cell data analysis are either limited to single modality or lack flexibility and interpretability. In this study, we propose an interpretable deep learning method called multi-omic embedded topic model (moETM) to effectively perform integrative analysis of high-dimensional single-cell multimodal data. moETM integrates multiple omics data via a product-of-experts in the encoder for efficient variational inference and then employs multiple linear decoders to learn the multi-omic signatures of the gene regulatory programs. Through comprehensive experiments on public single-cell transcriptome and chromatin accessibility data (i.e., scRNA+scATAC), as well as scRNA and proteomic data (i.e., CITE-seq), moETM demonstrates superior performance compared with six state-of-the-art single-cell data analysis methods on seven publicly available datasets. By applying moETM to the scRNA+scATAC data in human bone marrow mononuclear cells (BMMCs), we identified sequence motifs corresponding to the transcription factors that regulate immune gene signatures. Applying moETM analysis to CITE-seq data from the COVID-19 patients revealed not only known immune cell-type-specific signatures but also composite multi-omic biomarkers of critical conditions due to COVID-19, thus providing insights from both biological and clinical perspectives.

2023-01-31

bioRxiv (preprint)

doi.org

Technical Note—Risk-Averse Regret Minimization in Multistage Stochastic Programs

Mehran Poursoltani

Erick Delage

Angelos Georghiou

2023-01-30

Operations Research (published)

doi.org
