Publications

Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding

Vision-Language Models (VLMs), such as CLIP, exhibit strong image-text comprehension abilities, facilitating advances in several downstream … (see more)tasks such as zero-shot image classification, image-text retrieval, and text-to-image generation. However, the compositional reasoning abilities of existing VLMs remains subpar. The root of this limitation lies in the inadequate alignment between the images and captions in the pretraining datasets. Additionally, the current contrastive learning objective fails to focus on fine-grained grounding components like relations, actions, and attributes, resulting in"bag-of-words"representations. We introduce a simple and effective method to improve compositional reasoning in VLMs. Our method better leverages available datasets by refining and expanding the standard image-text contrastive learning framework. Our approach does not require specific annotations and does not incur extra parameters. When integrated with CLIP, our technique yields notable improvement over state-of-the-art baselines across five vision-language compositional benchmarks. We open-source our code at https://github.com/lezhang7/Enhance-FineGrained.

2024-06-16

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

doi.org

arxiv.org

Expressivity of Neural Networks with Fixed Weights and Learned Biases

Avery Hee-Woon Ryoo

Luca Mazzucato

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Gradient descent induces alignment between weights and the pre-activation tangents for deep non-linear networks

Daniel Beaglehole

Ioannis Mitliagkas

Atish Agarwala

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved p… (see more)roblems in supervised learning. Prior works have identified that the gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, in a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we clarify the nature of this correlation and explain its emergence at early training times. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and the newly defined pre-activation tangent kernel. We identify a centering of the NFA that isolates this alignment and is robust to initialization scale. We show that, through this centering, the speed of NFA development can be predicted analytically in terms of simple statistics of the inputs and labels.

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Gradient descent induces alignment between weights and the pre-activation tangents for deep non-linear networks

Daniel Beaglehole

Ioannis Mitliagkas

Atish Agarwala

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved p… (see more)roblems in supervised learning. Prior works have identified that the gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, in a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we clarify the nature of this correlation and explain its emergence at early training times. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and the newly defined pre-activation tangent kernel. We identify a centering of the NFA that isolates this alignment and is robust to initialization scale. We show that, through this centering, the speed of NFA development can be predicted analytically in terms of simple statistics of the inputs and labels.

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Gradient Dissent in Language Model Training and Saturation

Andrei Mircea

Ekaterina Lobacheva

Irina Rish

We seek to shed light on language model (LM) saturation from the perspective of learning dynamics. To this end, we define a decomposition o… (see more)f the cross-entropy gradient, which forms a shared low-dimensional basis for analyzing the training dynamics of models across scales. Intuitively, this decomposition consists of attractive and repulsive components that increase the logit of the correct class and decrease the logits of incorrect classes, respectively. Our analysis in this subspace reveals a phenomenon we term \textit{gradient dissent}, characterized by gradient components becoming systematically opposed such that loss cannot be improved along one component without being degraded along the other. Notably, we find that complete opposition, which we term \textit{total dissent}, reliably occurs in tandem with the saturation of smaller LMs. Based on these results, we hypothesize that gradient dissent can provide a useful foundation for better understanding and mitigating saturation.

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Inpainting Galaxy Counts onto N-Body Simulations over Multiple Cosmologies and Astrophysics

Matthew Ho

Laurence Perreault-Levasseur

2024-06-16

ICML.cc/2024/Workshop/AI4Science (poster)

openreview.net

Linear Weight Interpolation Leads to Transient Performance Gains

Gaurav Iyer

Gintare Karolina Dziugaite

David Rolnick

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Local lateral connectivity is sufficient for replicating cortex-like topographical organization in deep neural networks

Xinyu Qian

Amirozhan Dehghani

Asa Borzabadi Farahani

Pouya Bashivan

Across the primate cortex, neurons that perform similar functions tend to be spatially grouped together. In high-level visual cortex, this w… (see more)idely observed biological rule manifests itself as a modular organization of neuronal clusters, each tuned to a specific object category. The tendency toward short connections is one of the most widely accepted views of why such an organization exists in the brains of many animals. Yet, how such a feat is implemented at the neural level remains unclear. Here, using artificial deep neural networks as test beds, we demonstrate that a topographical organization similar to that in the primary, intermediate, and high-level human visual cortex emerges when units in these models are laterally connected and their weight parameters are tuned by top-down credit assignment. Importantly, the emergence of the modular organization in the absence of explicit topography-inducing learning rules and objectives questions their necessity and suggests that local lateral connectivity alone may be sufficient for the formation of the topographic organization across the cortex.

2024-06-16

ICML.cc/2024/Workshop/AI4Science (poster)

doi.org

openreview.net

Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology

Oren Kraus

Kian Kenyon-Dean

Saber Saberian

Maryam Fallah

Peter McLean

Jess Leung

Vasudev Sharma

Ayla Khan

Jia Balakrishnan

Safiye Celik

Dominique Beaini

Maciej Sypetkowski

Chi Vicky Cheng

Kristen Morse

Maureen Makes

Ben Mabey

Berton Earnshaw

2024-06-16

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

doi.org

arxiv.org

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Jack Urbanek

Florian Bordes

Pietro Astolfi

Mary Williamson

Vasu Sharma

Adriana Romero Soriano

2024-06-16

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

doi.org

arxiv.org

A Survey on Fairness Without Demographics

Patrik Joslin Kenfack

Éts Montréal

Samira Ebrahimi Kahou

Ulrich Aivodji

The issue of bias in Machine Learning (ML) models is a significant challenge for the machine learning community. Real-world biases can be em… (see more)bedded in the data used to train models, and prior studies have shown that ML models can learn and even amplify these biases. This can result in unfair treatment of individuals based on their inherent characteristics or sensitive attributes such as gender, race, or age. Ensuring fairness is crucial with the increasing use of ML models in high-stakes scenarios and has gained significant attention from researchers in recent years. However, the challenge of ensuring fairness becomes much greater when the assumption of full access to sensitive attributes does not hold. The settings where the hypothesis does not hold include cases where (1) only limited or noisy demographic information is available or (2) demographic information is entirely unobserved due to privacy restrictions. This survey reviews recent research efforts to enforce fairness when sensitive attributes are missing. We propose a taxonomy of existing works and, more importantly, highlight current challenges and future research directions to stimulate research in ML fairness in the setting of missing sensitive attributes.

2024-06-16

TMLR (accepted)

openreview.net

The Butterfly Effect: Tiny Perturbations Cause Neural Network Training to Diverge

Gül Sena Altıntaş

Devin Kwok

David Rolnick

Neural network training begins with a chaotic phase in which the network is sensitive to small perturbations, such as those caused by stocha… (see more)stic gradient descent (SGD). This sensitivity can cause identically initialized networks to diverge both in parameter space and functional similarity. However, the exact degree to which networks are sensitive to perturbation, and the sensitivity of networks as they transition out of the chaotic phase, is unclear. To address this uncertainty, we apply a controlled perturbation at a single point in training time and measure its effect on otherwise identical training trajectories. We find that both the

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Speed Science

Leading in a New Era

Supervision Requests

Publications

Speed Science

Leading in a New Era

Supervision Requests

Popular keywords:

Publications