
Sangnie Bhardwaj

PhD - Université de Montréal


Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency
Tianhong Li
Sangnie Bhardwaj
Yonglong Tian
Han Zhang
Jarred Barber
Dina Katabi
Huiwen Chang
Dilip Krishnan
Current vision-language generative models rely on expansive corpora of paired image-text data to attain optimal performance and generalization capabilities. However, automatically collecting such data (e.g. via large-scale web scraping) leads to low quality and poor image-text correlation, while human annotation is more accurate but requires significant manual effort and expense. We introduce …
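The cycle-consistency idea named in the title can be sketched in miniature: with unpaired images, there is no ground-truth caption to supervise against, but an image → text → image round trip yields a reconstruction error that can serve as a training signal. The linear maps `F` and `G` below are hypothetical stand-ins for the image-to-text and text-to-image generative models; this is a minimal sketch of the general technique, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear stand-ins for the two generative directions.
F = rng.normal(size=(4, 8)) * 0.1   # "image" -> "text" embedding
G = rng.normal(size=(8, 4)) * 0.1   # "text" embedding -> "image"

def cycle_loss(x, F, G):
    """Reconstruction error after an image -> text -> image round trip.

    With unpaired data there is no paired caption for x, but the cycle
    x -> F(x) -> G(F(x)) should return close to x, so the squared
    reconstruction error is a usable training signal.
    """
    x_rec = G @ (F @ x)
    return float(np.sum((x - x_rec) ** 2))

x = rng.normal(size=8)              # an unpaired "image"
loss = cycle_loss(x, F, G)
print(loss >= 0.0)                  # → True
```

In a real system the same scalar would be backpropagated through both generators, so each direction is trained even for samples that have no paired counterpart.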
Steerable Equivariant Representation Learning
Sangnie Bhardwaj
Willie McClinton
Tongzhou Wang
Chen Sun
Phillip Isola
Dilip Krishnan
Pre-trained deep image representations are useful for post-training tasks such as classification through transfer learning, image retrieval, and object detection. Data augmentations are a crucial aspect of pre-training robust representations in both supervised and self-supervised settings. Data augmentations explicitly or implicitly promote invariance in the embedding space to the input image transformations. This invariance reduces generalization to those downstream tasks which rely on sensitivity to these particular data augmentations. In this paper, we propose a method of learning representations that are instead equivariant to data augmentations. We achieve this equivariance through the use of steerable representations. Our representations can be manipulated directly in embedding space via learned linear maps. We demonstrate that our resulting steerable and equivariant representations lead to better performance on transfer learning and robustness: e.g. we improve linear probe top-1 accuracy by 1% to 3% for transfer, and ImageNet-C accuracy by up to 3.4%. We further show that the steerability of our representations provides a significant speedup (nearly 50x) for test-time augmentations; by applying a large number of augmentations for out-of-distribution detection, we significantly improve OOD AUC on the ImageNet-C dataset over an invariant representation.
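The core mechanism, applying an augmentation directly in embedding space via a learned linear map, can be sketched with toy stand-ins: a linear "encoder" and a coordinate-permutation "augmentation" (both hypothetical; the paper uses deep networks and image augmentations). The invertible toy encoder is chosen so an exact steering map exists, which is what makes the cheap test-time steering possible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder": an invertible random linear map standing in for a
# pretrained network (hypothetical; chosen so an exact steering map exists).
W = rng.normal(size=(16, 16))
def encode(x):
    return W @ x

# Toy input augmentation: a fixed permutation of input coordinates.
perm = rng.permutation(16)
def augment(x):
    return x[perm]

# Fit a linear map M in embedding space so that
#   encode(augment(x)) ≈ M @ encode(x)    (equivariance via steerability)
X = rng.normal(size=(16, 100))       # 100 random inputs, one per column
Z = encode(X)                        # embeddings of the originals
Z_aug = encode(X[perm, :])           # embeddings of the augmented inputs
M = np.linalg.lstsq(Z.T, Z_aug.T, rcond=None)[0].T

# Steer a held-out embedding directly, without re-encoding the input.
x_new = rng.normal(size=16)
steered = M @ encode(x_new)
direct = encode(augment(x_new))
print(np.allclose(steered, direct, atol=1e-6))  # → True
```

Steering with `M` replaces a full encoder forward pass with one matrix multiply, which is the source of the reported test-time-augmentation speedup: many augmented "views" of an embedding can be produced without re-running the network.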