Xin Wang

Crowdkeeping in Last-mile Delivery

Xin Wang

Okan Arslan

Érick Delage

2024-02-29

Transportation Science (published)

doi.org

JaxPruner: A concise library for sparsity research

Joo Hyung Lee

Wonpyo Park

Nicole Elyse Mitchell

Jonathan Pilault

Johan Samir Obando Ceron

Han-Byul Kim

Namhoon Lee

Elias Frantar

Yun Long

Amir Yazdanbakhsh

Shivani Agrawal

Suvinay Subramanian

Xin Wang

Sheng-Chun Kao

Xingyao Zhang

Trevor Gale

Aart J.C. Bik

Woohyun Han

Milen Ferev

Zhonglin Han … (see 5 more)

Hong-Seok Kim

Yann Dauphin

Gintare Karolina Dziugaite

Pablo Samuel Castro

Utku Evci

This paper introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research. JaxPruner aims to accelerate research on sparse neural networks by providing concise implementations of popular pruning and sparse training algorithms with minimal memory and latency overhead. Algorithms implemented in JaxPruner use a common API and work seamlessly with the popular optimization library Optax, which, in turn, enables easy integration with existing JAX-based libraries. We demonstrate this ease of integration by providing examples in four different codebases (Scenic, t5x, Dopamine and FedJAX) and provide baseline experiments on popular benchmarks.

2024-01-08

Conference on Parsimony and Learning (published)

doi.org

openreview.net

Doubly Right Object Recognition: A Why Prompt for Visual Rationales

Chengzhi Mao

Revant Teotia

Amrutha Sundar

Sachit Menon

Junfeng Yang

Xin Wang

Carl Vondrick

Many visual recognition models are evaluated only on their classification accuracy, a metric for which they obtain strong performance. In this paper, we investigate whether computer vision models can also provide correct rationales for their predictions. We propose a “doubly right” object recognition benchmark, where the metric requires the model to simultaneously produce both the right labels as well as the right rationales. We find that state-of-the-art visual models, such as CLIP, often provide incorrect rationales for their categorical predictions. However, by transferring the rationales from language models into visual representations through a tailored dataset, we show that we can learn a “why prompt,” which adapts large visual representations to produce correct rationales. Visualizations and empirical experiments show that our prompts significantly improve performance on doubly right object recognition, in addition to zero-shot transfer to unseen tasks and datasets.

2023-06-17

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

doi.org

arxiv.org

Understanding Zero-shot Adversarial Robustness for Large-Scale Models

Chengzhi Mao

Scott Geng

Junfeng Yang

Xin Wang

Carl Vondrick

Pretrained large-scale vision-language models like CLIP have exhibited strong generalization over unseen tasks. Yet imperceptible adversarial perturbations can significantly reduce CLIP's performance on new tasks. In this work, we identify and explore the problem of adapting large-scale models for zero-shot adversarial robustness. We first identify two key factors during model adaption--training losses and adaptation methods--that affect the model's zero-shot adversarial robustness. We then propose a text-guided contrastive adversarial training loss, which aligns the text embeddings and the adversarial visual features with contrastive learning on a small set of training data. We apply this training loss to two adaption methods, model finetuning and visual prompt tuning. We find that visual prompt tuning is more effective in the absence of texts, while finetuning wins in the existence of text guidance. Overall, our approach significantly improves the zero-shot adversarial robustness over CLIP, seeing an average improvement of 31 points over ImageNet and 15 zero-shot datasets. We hope this work can shed light on understanding the zero-shot adversarial robustness of large-scale models.

2023-02-01

ICLR.cc/2023/Conference (poster)

doi.org

openreview.net

Robust Contrastive Learning against Noisy Views

Ching-Yao Chuang

(Rex) Devon Hjelm

Xin Wang

Vibhav Vineet

Neel Joshi

Antonio Torralba

Stefanie Jegelka

Yale Song

Contrastive learning relies on an assumption that positive pairs contain related views that share certain underlying information about an instance, e.g., patches of an image or co-occurring multimodal signals of a video. What if this assumption is violated? The literature suggests that contrastive learning produces suboptimal representations in the presence of noisy views, e.g., false positive pairs with no apparent shared information. In this work, we propose a new contrastive loss function that is robust against noisy views. We provide rigorous theoretical justifications by showing connections to robust symmetric losses for noisy binary classification and by establishing a new contrastive bound for mutual information maximization based on the Wasserstein distance measure. The proposed loss is completely modality-agnostic and a simple drop-in replacement for the InfoNCE loss, which makes it easy to apply to existing contrastive frameworks. We show that our approach provides consistent improvements over the state-of-the-art on image, video, and graph contrastive learning benchmarks that exhibit a variety of real-world noise patterns.

2022-06-18

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

doi.org

arxiv.org

Robust Contrastive Learning against Noisy Views

Ching-Yao Chuang

(Rex) Devon Hjelm

Xin Wang

Vibhav Vineet

Neel Joshi

Antonio Torralba

Stefanie Jegelka

Yale Song

Contrastive learning relies on an assumption that positive pairs contain related views that share certain underlying information about an instance, e.g., patches of an image or co-occurring multimodal signals of a video. What if this assumption is violated? The literature suggests that contrastive learning produces suboptimal representations in the presence of noisy views, e.g., false positive pairs with no apparent shared information. In this work, we propose a new contrastive loss function that is robust against noisy views. We provide rigorous theoretical justifications by showing connections to robust symmetric losses for noisy binary classification and by establishing a new contrastive bound for mutual information maximization based on the Wasserstein distance measure. The proposed loss is completely modality-agnostic and a simple drop-in replacement for the InfoNCE loss, which makes it easy to apply to existing contrastive frameworks. We show that our approach provides consistent improvements over the state-of-the-art on image, video, and graph contrastive learning benchmarks that exhibit a variety of real-world noise patterns.

2022-01-12

ArXiv (preprint)

doi.org

arxiv.org
