Publications

Revisiting Successor Features for Inverse Reinforcement Learning

Arnav Kumar Jain

Harley Wiltzer

Jesse Farebrother

Irina Rish

Glen Berseth

Sanjiban Choudhury

2024-06-17

ICML.cc/2024/Workshop/MFHAIA (poster)

openreview.net

On The Local Geometry of Deep Generative Manifolds

Ahmed Imtiaz Humayun

Ibtihel Amara

Candice Schumann

Golnoosh Farnadi

Negar Rostamzadeh

Mohammad Havaei

In this paper, we study theoretically inspired local geometric descriptors of the data manifolds approximated by pre-trained generative mode… (voir plus)ls. The descriptors – local scaling (ψ), local rank (ν), and local complexity (δ) — characterize the uncertainty, dimensionality, and smoothness on the learned manifold, using only the network weights and architecture. We investigate and emphasize their critical role in understanding generative models. Our analysis reveals that the local geometry is intricately linked to the quality and diversity of generated outputs. Additionally, we see that the geometric properties are distinct for out-of-distribution (OOD) inputs as well as for prompts memorized by Stable Diffusion, showing the possible application of our proposed descriptors for downstream detection and assessment of pre-trained generative models.

2024-06-17

ICML.cc/2024/Workshop/GRaM (publié)

openreview.net

Using neural biomarkers to personalize dosing of vagus nerve stimulation

Antonin Berthon

Lorenz Wernisch

Myrta Stoukidi

Michael Thornton

Olivier Tessier-Lariviere

Pascal Fortier-Poisson

Jorin Mamen

Max Pinkney

Susannah Lee

Elvijs Sarkans

Luca Annecchino

Ben Appleton

Philip Garsed

Bret Patterson

Samuel Gonshaw

Matjaž Jakopec

Sudhakaran Shunmugam

Tristan Edwards

Aleksi Tukiainen

Joel Jennings … (voir 3 de plus)

Guillaume Lajoie

Emil Hewage

Oliver Armitage

2024-06-17

Bioelectronic Medicine (publié)

doi.org

Cell Morphology-Guided Small Molecule Generation with GFlowNets

Stephen Zhewen Lu

Ziqing Lu

Ehsan Hajiramezanali

Tommaso Biancalani

Yoshua Bengio

Gabriele Scalia

Michał Koziarski

2024-06-16

ICML.cc/2024/Workshop/AI4Science (poster)

doi.org

openreview.net

Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding

Le Zhang

Rabiul Awal

Aishwarya Agrawal

Vision-Language Models (VLMs), such as CLIP, exhibit strong image-text comprehension abilities, facilitating advances in several downstream … (voir plus)tasks such as zero-shot image classification, image-text retrieval, and text-to-image generation. However, the compositional reasoning abilities of existing VLMs remains subpar. The root of this limitation lies in the inadequate alignment between the images and captions in the pretraining datasets. Additionally, the current contrastive learning objective fails to focus on fine-grained grounding components like relations, actions, and attributes, resulting in"bag-of-words"representations. We introduce a simple and effective method to improve compositional reasoning in VLMs. Our method better leverages available datasets by refining and expanding the standard image-text contrastive learning framework. Our approach does not require specific annotations and does not incur extra parameters. When integrated with CLIP, our technique yields notable improvement over state-of-the-art baselines across five vision-language compositional benchmarks. We open-source our code at https://github.com/lezhang7/Enhance-FineGrained.

2024-06-16

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (publié)

doi.org

arxiv.org

Expressivity of Neural Networks with Fixed Weights and Learned Biases

Ezekiel Williams

Avery Hee-Woon Ryoo

Thomas Jiralerspong

Alexandre Payeur

Matt Perich

Luca Mazzucato

Guillaume Lajoie

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Gradient descent induces alignment between weights and the pre-activation tangents for deep non-linear networks

Daniel Beaglehole

Ioannis Mitliagkas

Atish Agarwala

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved p… (voir plus)roblems in supervised learning. Prior works have identified that the gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, in a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we clarify the nature of this correlation and explain its emergence at early training times. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and the newly defined pre-activation tangent kernel. We identify a centering of the NFA that isolates this alignment and is robust to initialization scale. We show that, through this centering, the speed of NFA development can be predicted analytically in terms of simple statistics of the inputs and labels.

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Gradient descent induces alignment between weights and the pre-activation tangents for deep non-linear networks

Daniel Beaglehole

Ioannis Mitliagkas

Atish Agarwala

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved p… (voir plus)roblems in supervised learning. Prior works have identified that the gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, in a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we clarify the nature of this correlation and explain its emergence at early training times. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and the newly defined pre-activation tangent kernel. We identify a centering of the NFA that isolates this alignment and is robust to initialization scale. We show that, through this centering, the speed of NFA development can be predicted analytically in terms of simple statistics of the inputs and labels.

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Gradient Dissent in Language Model Training and Saturation

Andrei Mircea

Ekaterina Lobacheva

Irina Rish

We seek to shed light on language model (LM) saturation from the perspective of learning dynamics. To this end, we define a decomposition o… (voir plus)f the cross-entropy gradient, which forms a shared low-dimensional basis for analyzing the training dynamics of models across scales. Intuitively, this decomposition consists of attractive and repulsive components that increase the logit of the correct class and decrease the logits of incorrect classes, respectively. Our analysis in this subspace reveals a phenomenon we term \textit{gradient dissent}, characterized by gradient components becoming systematically opposed such that loss cannot be improved along one component without being degraded along the other. Notably, we find that complete opposition, which we term \textit{total dissent}, reliably occurs in tandem with the saturation of smaller LMs. Based on these results, we hypothesize that gradient dissent can provide a useful foundation for better understanding and mitigating saturation.

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Inpainting Galaxy Counts onto N-Body Simulations over Multiple Cosmologies and Astrophysics

Antoine Bourdin

Ronan Legin

Matthew Ho

Alexandre Adam

Yashar Hezaveh

Laurence Perreault-Levasseur

2024-06-16

ICML.cc/2024/Workshop/AI4Science (poster)

openreview.net

Linear Weight Interpolation Leads to Transient Performance Gains

Gaurav Iyer

Gintare Karolina Dziugaite

David Rolnick

2024-06-16

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Local lateral connectivity is sufficient for replicating cortex-like topographical organization in deep neural networks

Xinyu Qian

Amirozhan Dehghani

Asa Borzabadi Farahani

Pouya Bashivan

Across the primate cortex, neurons that perform similar functions tend to be spatially grouped together. In high-level visual cortex, this w… (voir plus)idely observed biological rule manifests itself as a modular organization of neuronal clusters, each tuned to a specific object category. The tendency toward short connections is one of the most widely accepted views of why such an organization exists in the brains of many animals. Yet, how such a feat is implemented at the neural level remains unclear. Here, using artificial deep neural networks as test beds, we demonstrate that a topographical organization similar to that in the primary, intermediate, and high-level human visual cortex emerges when units in these models are laterally connected and their weight parameters are tuned by top-down credit assignment. Importantly, the emergence of the modular organization in the absence of explicit topography-inducing learning rules and objectives questions their necessity and suggests that local lateral connectivity alone may be sufficient for the formation of the topographic organization across the cortex.

2024-06-16

ICML.cc/2024/Workshop/AI4Science (poster)

doi.org

openreview.net

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Publications

Avantage IA

Mettre à profit l'IA pour un avenir durable

Bourse Mila en politiques de l'IA

Avantage IA

Mettre à profit l'IA pour un avenir durable

Mots-clés populaires:

Publications