Publications

Towards a "Universal Translator" for Neural Dynamics at Single-Cell, Single-Spike Resolution

Yizi Zhang

Yanchen Wang

Donato M. Jiménez-Benetó

Zixuan Wang

Mehdi Azabou

Blake Richards

Renee Tung

Olivier Winter

International Brain Laboratory

Eva L Dyer

Liam Paninski

Cole Lincoln Hurwitz

2024-09-25

NeurIPS.cc/2024/Conference (poster)

Trajectory Flow Matching with Applications to Clinical Time Series Modelling

Xi Zhang

Yuan Pu

Yuki Kawamura

Andrew Loza

Yoshua Bengio

Dennis Shung

Alexander Tong

Modeling stochastic and irregularly sampled time series is a challenging problem found in a wide range of applications, especially in medicine. Neural stochastic differential equations (Neural SDEs) are an attractive modeling technique for this problem, which parameterize the drift and diffusion terms of an SDE with neural networks. However, current algorithms for training Neural SDEs require backpropagation through the SDE dynamics, greatly limiting their scalability and stability. To address this, we propose **Trajectory Flow Matching** (TFM), which trains a Neural SDE in a *simulation-free* manner, bypassing backpropagation through the dynamics. TFM leverages the flow matching technique from generative modeling to model time series. In this work we first establish necessary conditions for TFM to learn time series data. Next, we present a reparameterization trick which improves training stability. Finally, we adapt TFM to the clinical time series setting, demonstrating improved performance on three clinical time series datasets both in terms of absolute performance and uncertainty prediction.

2024-09-25

NeurIPS.cc/2024/Conference (spotlight)

VisMin: Visual Minimal-Change Understanding

Rabiul Awal

Saba Ahmadi

Le Zhang

Aishwarya Agrawal

Fine-grained understanding of objects, attributes, and relationships between objects is crucial for visual-language models (VLMs). To evaluate VLMs' fine-grained understanding, existing benchmarks primarily focus on evaluating VLMs' capability to distinguish between two very similar captions given an image. In this paper, our focus is on evaluating VLMs' capability to distinguish between two very similar images given a caption. To this end, we introduce a new, challenging benchmark termed Visual Minimal-Change Understanding (VisMin), which requires models to predict the correct image-caption match given two images and two captions. Importantly, the image pair (as well as the caption pair) contains minimal changes, i.e., between the two images (as well as between the two captions), only one aspect changes at a time from among the following possible types of changes: object, attribute, count, and spatial relation. These four types of minimal changes are specifically designed to test the models' understanding of objects, attributes of objects (such as color, material, shape), counts of objects, and spatial relationships between objects. To curate our benchmark, we built an automatic pipeline using large language models and diffusion models, followed by a rigorous 4-step verification process by human annotators. Empirical experiments reveal that current VLMs exhibit notable deficiencies in understanding spatial relationships and counting abilities. Furthermore, leveraging the automated nature of our data creation process, we generate a large-scale training dataset, which we use to finetune CLIP (a foundational VLM) and Idefics2 (a multimodal large language model). Our findings show that both these models benefit significantly from fine-tuning on this data, as evident by marked improvements in fine-grained understanding across a wide range of benchmarks. Additionally, such fine-tuning improves CLIP's general image-text alignment capabilities too.
All resources including the benchmark, the training data, and the finetuned model checkpoints will be released.

2024-09-25

NeurIPS.cc/2024/Conference (poster)

Wasserstein Distributionally Robust Optimization through the Lens of Structural Causal Models and Individual Fairness

Ahmad Reza Ehyaei

Golnoosh Farnadi

Samira Samadi

2024-09-25

NeurIPS.cc/2024/Conference (poster)

When is an Embedding Model More Promising than Another?

Maxime DARRIN

Philippe Formont

Ismail Ben Ayed

Jackie Cheung

Pablo Piantanida

2024-09-25

NeurIPS.cc/2024/Conference (poster)

Frequency-based View Selection in Gaussian Splatting Reconstruction

Monica Li

Pierre-Yves Lajoie

Giovanni Beltrame

Three-dimensional reconstruction is a fundamental problem in robotics perception. We examine the problem of active view selection to perform 3D Gaussian Splatting reconstructions with as few input images as possible. Although 3D Gaussian Splatting has made significant progress in image rendering and 3D reconstruction, the quality of the reconstruction is strongly impacted by the selection of 2D images and the estimation of camera poses through Structure-from-Motion (SfM) algorithms. Current methods to select views that rely on uncertainties from occlusions, depth ambiguities, or neural network predictions directly are insufficient to handle the issue and struggle to generalize to new scenes. By ranking the potential views in the frequency domain, we are able to effectively estimate the potential information gain of new viewpoints without ground truth data. By overcoming current constraints on model architecture and efficacy, our method achieves state-of-the-art results in view selection, demonstrating its potential for efficient image-based 3D reconstruction.

2024-09-24

ArXiv (preprint)

A neuronal least-action principle for real-time learning in cortical circuits

Walter Senn

Dominik Dold

Akos F. Kungl

Benjamin Ellenberger

Jakob Jordan

Yoshua Bengio

João Sacramento

Mihai A. Petrovici

One of the most fundamental laws of physics is the principle of least action. Motivated by its predictive power, we introduce a neuronal least-action principle for cortical processing of sensory streams to produce appropriate behavioural outputs in real time. The principle postulates that the voltage dynamics of cortical pyramidal neurons prospectively minimize the local somato-dendritic mismatch error within individual neurons. For motor output neurons, it implies minimizing an instantaneous behavioural error. For deep network neurons, it implies a prospective firing to overcome integration delays and correct for possible output errors right in time. The neuron-specific errors are extracted in the apical dendrites of pyramidal neurons through a cortical microcircuit that tries to explain away the feedback from the periphery, and correct the trajectory on the fly. Any motor output is in a moving equilibrium with the sensory inputs and the motor feedback during the whole sensory-motor trajectory. Ongoing synaptic plasticity reduces the somato-dendritic mismatch error within each cortical neuron and performs gradient descent on the output cost at any moment in time. The neuronal least-action principle offers an axiomatic framework to derive local neuronal and synaptic dynamics for global real-time computation and learning in the brain and in physical substrates in general.

2024-09-23

bioRxiv (preprint)

Not Only the Last-Layer Features for Spurious Correlations: All Layer Deep Feature Reweighting

Humza Wajid Hameed

Géraldin Nanfack

Eugene Belilovsky

Spurious correlations are a major source of errors for machine learning models, in particular when aiming for group-level fairness. It has been recently shown that a powerful approach to combat spurious correlations is to re-train the last layer on a balanced validation dataset, isolating robust features for the predictor. However, key attributes can sometimes be discarded by neural networks towards the last layer. In this work, we thus consider retraining a classifier on a set of features derived from all layers. We utilize a recently proposed feature selection strategy to select unbiased features from all the layers. We observe this approach gives significant improvements in worst-group accuracy on several standard benchmarks.

2024-09-23

ArXiv (preprint)

Protein Language Models: Is Scaling Necessary?

Quentin Fournier

Robert M. Vernon

Almer van der Sloot

Benjamin Schulz

Sarath Chandar

Christopher James Langmead

2024-09-23

bioRxiv (preprint)

A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models

Stephen R. Pfohl

Heather Cole-Lewis

Rory A Sayres

Darlene Neal

Mercy Nyamewaa Asiedu

Awa Dieng

Nenad Tomašev

Qazi Mamunur Rashid

Shekoofeh Azizi

Negar Rostamzadeh

Liam G. McCoy

L. A. Celi

Yun Liu

Mike Schaekermann

Alanna Walton

Alicia Parrish

Chirag Nagpal

Preeti Singh

Akeiylah Dewitt

P. A. Mansfield

Sushant Prakash

Katherine Heller

Alan Karthikesalingam

Christopher Semturs

Joelle Barral

Greg C. Corrado

Yossi Matias

Jamila Smith-Loud

Ivor Horn

Karan Singhal

2024-09-23

Nature Medicine (published)

What Are They Doing? Joint Audio-Speech Co-Reasoning

Yingzhi Wang

Pooneh Mousavi

Artem Ploujnikov

Mirco Ravanelli

2024-09-22

ArXiv (preprint)