Publications

IDs for AI Systems

Alan Chan

Noam Kolt

Peter Wills

Usman Anwar

Christian Schroeder de Witt

Nitarshan Rajkumar

Lewis Hammond

David M. Krueger

Lennart Heim

Markus Anderljung

AI systems are increasingly pervasive, yet information needed to decide whether and how to engage with them may not exist or be accessible. A user may not be able to verify whether a system has certain safety certifications. An investigator may not know whom to investigate when a system causes an incident. It may not be clear whom to contact to shut down a malfunctioning system. Across a number of domains, IDs address analogous problems by identifying particular entities (e.g., a particular Boeing 747) and providing information about other entities of the same class (e.g., some or all Boeing 747s). We propose a framework in which IDs are ascribed to instances of AI systems (e.g., a particular chat session with Claude 3), and associated information is accessible to parties seeking to interact with that system. We characterize IDs for AI systems, provide concrete examples where IDs could be useful, argue that there could be significant demand for IDs from key actors, analyze how those actors could incentivize ID adoption, explore a potential implementation of our framework for deployers of AI systems, and highlight limitations and risks. IDs seem most warranted in settings where AI systems could have a large impact upon the world, such as in making financial transactions or contacting real humans. With further study, IDs could help to manage a world where AI systems pervade society.

2024-06-16

arXiv (preprint)

doi.org

arxiv.org

Improving Molecular Modeling with Geometric GNNs: an Empirical Study

Fragkiskos D. Malliaros

Alexandre AGM Duval

2024-06-16

ICML.cc/2024/Workshop/ML4LMS (poster)

doi.org

openreview.net

Joint Multimodal Transformer for Emotion Recognition in the Wild

Paul Waligora

Muhammad Haseeb Aslam

Muhammad Osama Zeeshan

Soufiane Belharbi

Alessandro Lameiras Koerich

Marco Pedersoli

Simon Bacon

Eric Granger

Multimodal emotion recognition (MMER) systems typically outperform unimodal systems by leveraging the inter- and intra-modal relationships between, e.g., visual, textual, physiological, and auditory modalities. This paper proposes an MMER method that relies on a joint multi-modal transformer (JMT) for fusion with key-based cross-attention. This framework can exploit the complementary nature of diverse modalities to improve predictive accuracy. Separate backbones capture intra-modal spatiotemporal dependencies within each modality over video sequences. Subsequently, our JMT fusion architecture integrates the individual modality embeddings, allowing the model to effectively capture inter- and intra-modal relationships. Extensive experiments on two challenging expression recognition tasks – (1) dimensional emotion recognition on the Affwild2 dataset (with face and voice) and (2) pain estimation on the Biovid dataset (with face and biosensors) – indicate that our JMT fusion can provide a cost-effective solution for MMER. Empirical results show that MMER systems with our proposed fusion allow us to outperform relevant baseline and state-of-the-art methods. Code is available at: https://github.com/PoloWlg/Joint-Multimodal-Transformer-6th-ABAW

2024-06-16

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (published)

doi.org

arxiv.org

Learning Generative Population Models From Multiple Clinical Datasets Via Probabilistic Programming

João Loula

Katherine M. Collins

Ulrich Schaechtle

Joshua B. Tenenbaum

Adrian Weller

Feras Saad

Timothy J. O'Donnell

Vikash Mansinghka

Accurate, efficient generative models of clinical populations could accelerate clinical research and improve patient outcomes. For example, such models could infer probable treatment outcomes for different subpopulations, generate high-fidelity synthetic data that can be shared across organizational boundaries, and discover new relationships among clinical variables. Using Bayesian structure learning, we show that it is possible to learn probabilistic program models of clinical populations by combining data from multiple, sparsely overlapping clinical datasets. Through experiments with multiple clinical trials and real-world evidence from census health surveys, we show that our model generates higher quality synthetic data than neural network baselines, supports more accurate inferences across datasets than traditional statistical methods, and can be queried more efficiently than both, opening up new avenues for accessible and efficient AI assistance in clinical research.

2024-06-16

ICML.cc/2024/Workshop/AccMLBio (poster)

openreview.net

Lost in Translation: The Algorithmic Gap Between LMs and the Brain

Tosato Tommaso

Tikeng Notsawo Pascal Junior

Helbling Saskia

Irina Rish

Guillaume Dumas

Language Models (LMs) have achieved impressive performance on various linguistic tasks, but their relationship to human language processing in the brain remains unclear. This paper examines the gaps and overlaps between LMs and the brain at different levels of analysis, emphasizing the importance of looking beyond input-output behavior to examine and compare the internal processes of these systems. We discuss how insights from neuroscience, such as sparsity, modularity, internal states, and interactive learning, can inform the development of more biologically plausible language models. Furthermore, we explore the role of scaling laws in bridging the gap between LMs and human cognition, highlighting the need for efficiency constraints analogous to those in biological systems. By developing LMs that more closely mimic brain function, we aim to advance both artificial intelligence and our understanding of human cognition.

2024-06-16

ICML.cc/2024/Workshop/LLMs_and_Cognition (poster)

openreview.net

Neural Ratio Estimators Meet Distributional Shift and Mode Misspecification: A Cautionary Tale from Strong Gravitational Lensing

Andreas Filipp

Yashar Hezaveh

Laurence Perreault-Levasseur

In recent years, there has been increasing interest in the field of astrophysics in applying Neural Ratio Estimators (NREs) to large-scale inference problems where both amortization and marginalization over a large number of nuisance parameters are needed. Here, in order to assess the true potential of this method to produce unbiased inference on real data, we investigate the robustness of NREs to distribution shifts and model misspecification in the specific scientific application of the measurement of dark matter population-level parameters using strong gravitational lensing. We investigate the behaviour of a trained NRE for test data presenting distributional shifts inside the bounds of training, as well as out of distribution, both in the linear and non-linear parameters of this problem. While our results show that NREs perform well when tested perfectly in distribution, we find that they exhibit significant biases and drawbacks when confronted with slight deviations from the examples seen in the training distribution. This indicates the necessity for caution when applying NREs to real astrophysical data, where underlying distributions are not perfectly known and models do not perfectly reconstruct the true underlying distributions.

2024-06-16

ICML.cc/2024/Workshop/SPIGM (poster)

openreview.net

Revisiting Successor Features for Inverse Reinforcement Learning

Sanjiban Choudhury

2024-06-16

ICML.cc/2024/Workshop/MFHAIA (poster)

openreview.net

On The Local Geometry of Deep Generative Manifolds

Ahmed Imtiaz Humayun

Candice Schumann

In this paper, we study theoretically inspired local geometric descriptors of the data manifolds approximated by pre-trained generative models. The descriptors – local scaling (ψ), local rank (ν), and local complexity (δ) – characterize the uncertainty, dimensionality, and smoothness on the learned manifold, using only the network weights and architecture. We investigate and emphasize their critical role in understanding generative models. Our analysis reveals that the local geometry is intricately linked to the quality and diversity of generated outputs. Additionally, we see that the geometric properties are distinct for out-of-distribution (OOD) inputs as well as for prompts memorized by Stable Diffusion, showing the possible application of our proposed descriptors for downstream detection and assessment of pre-trained generative models.

2024-06-16

ICML.cc/2024/Workshop/GRaM (published)

openreview.net

Cell Morphology-Guided Small Molecule Generation with GFlowNets

Stephen Zhewen Lu

Ziqing Lu

Ehsan Hajiramezanali

Tommaso Biancalani

Yoshua Bengio

Gabriele Scalia

Michał Koziarski

2024-06-15

ICML.cc/2024/Workshop/AI4Science (poster)

doi.org

openreview.net

Expressivity of Neural Networks with Fixed Weights and Learned Biases

Ezekiel Williams

Avery Hee-Woon Ryoo

Thomas Jiralerspong

Alexandre Payeur

Matthew G Perich

Luca Mazzucato

Guillaume Lajoie

2024-06-15

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net

Gradient descent induces alignment between weights and the pre-activation tangents for deep non-linear networks

Daniel Beaglehole

Ioannis Mitliagkas

Atish Agarwala

Understanding the mechanisms through which neural networks extract statistics from input-label pairs is one of the most important unsolved problems in supervised learning. Prior works have identified that the Gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, in a statement known as the Neural Feature Ansatz (NFA). However, the reason these quantities become correlated during training is poorly understood. In this work, we clarify the nature of this correlation and explain its emergence at early training times. We identify that the NFA is equivalent to alignment between the left singular structure of the weight matrices and the newly defined pre-activation tangent kernel. We identify a centering of the NFA that isolates this alignment and is robust to initialization scale. We show that, through this centering, the speed of NFA development can be predicted analytically in terms of simple statistics of the inputs and labels.

2024-06-15

ICML.cc/2024/Workshop/HiLD (poster)

openreview.net
