Publications

There are a thousand ways to caption an image. Contrastive Language-Image Pretraining (CLIP), on the other hand, works by mapping an image and its caption to a single vector, which limits how well CLIP-like models can represent the diverse ways to describe an image. In this work, we introduce Llip, Latent Language Image Pretraining, which models the diversity of captions that could match an image. Llip's vision encoder outputs a set of visual features that are mixed into a final representation by conditioning on information derived from the text. We show that Llip outperforms non-contextualized baselines like CLIP and SigLIP on a variety of tasks even with large-scale encoders. Llip improves zero-shot classification by an average of 2.9% across zero-shot classification benchmarks with a ViT-G/14 encoder. Specifically, Llip attains a zero-shot top-1 accuracy of 83.5% on ImageNet, outperforming a similarly sized CLIP by 1.4%. We also demonstrate improvement on zero-shot retrieval on MS-COCO by 6.0%. We provide a comprehensive analysis of the components introduced by the method and demonstrate that Llip leads to richer visual representations.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net
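
The mixing step described in the Llip abstract above (a set of visual features combined into one representation, conditioned on the text) can be illustrated with a small softmax-attention sketch. This is only an assumption about the general shape of such a mechanism, not the authors' implementation: the function name `mix_visual_tokens`, the dot-product scoring, and the `temperature` parameter are all hypothetical choices made for illustration.

```python
import math

def mix_visual_tokens(visual_tokens, text_query, temperature=1.0):
    """Mix K visual tokens into one vector, weighted by similarity to a text query.

    Hypothetical helper for illustration; not taken from the Llip paper.
    """
    # Dot-product similarity between the text-derived query and each visual token.
    scores = [sum(v * q for v, q in zip(token, text_query)) / temperature
              for token in visual_tokens]
    # Numerically stable softmax over the K tokens.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum of tokens is a caption-dependent image representation.
    dim = len(visual_tokens[0])
    mixed = [sum(w * token[i] for w, token in zip(weights, visual_tokens))
             for i in range(dim)]
    return mixed, weights
```

With this sketch, two different captions of the same image yield two different mixing weights and therefore two different image vectors, in contrast to a CLIP-style single vector per image.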

Nearest Neighbour Score Estimators for Diffusion Generative Models

Matthew Niedoba

Dylan Green

Saeid Naderiparizi

Vasileios Lioutas

Jonathan Wilder Lavington

Xiaoxuan Liang

Yunpeng Liu

Ke Zhang

Setareh Dabiri

Adam Ścibior

Berend Zwartsenberg

Frank Wood

Score function estimation is the cornerstone of both training and sampling from diffusion generative models. Despite this fact, the most commonly used estimators are either biased neural network approximations or high variance Monte Carlo estimators based on the conditional score. We introduce a novel nearest neighbour score function estimator which utilizes multiple samples from the training set to dramatically decrease estimator variance. We leverage our low variance estimator in two compelling applications. Training consistency models with our estimator, we report a significant increase in both convergence speed and sample quality. In diffusion models, we show that our estimator can replace a learned network for probability-flow ODE integration, opening promising new avenues of future research. Code will be released upon paper acceptance.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net
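
To give a feel for the idea of estimating a score from several training samples at once, here is a minimal sketch. It is not the estimator from the paper: it assumes additive Gaussian noise of scale `sigma` and simply truncates the exact score of the Gaussian-smoothed empirical distribution to the `k` nearest training points. The function name `knn_score` and its parameters are hypothetical.

```python
import math

def knn_score(x, data, sigma, k):
    """Approximate the score of the Gaussian-smoothed training distribution at x.

    Uses only the k training points nearest to x (an illustrative truncation,
    not the paper's estimator). With k == len(data) this is the exact score of
    the mixture (1/N) * sum_i N(x; x_i, sigma^2 I).
    """
    # Squared Euclidean distance from x to every training point.
    sq_dists = [sum((xi - di) ** 2 for xi, di in zip(x, point)) for point in data]
    # Indices of the k nearest training points.
    nearest = sorted(range(len(data)), key=lambda i: sq_dists[i])[:k]
    # Posterior weights over the neighbours: softmax of -d^2 / (2 sigma^2).
    logits = [-sq_dists[i] / (2.0 * sigma ** 2) for i in nearest]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted mean of the neighbours, then score = (mean - x) / sigma^2.
    posterior_mean = [sum(w * data[i][d] for w, i in zip(weights, nearest))
                      for d in range(len(x))]
    return [(m - xd) / sigma ** 2 for m, xd in zip(posterior_mean, x)]
```

Because distant training points receive exponentially small weights, restricting the sum to near neighbours changes the result very little when `sigma` is small, which is the intuition this sketch is meant to convey.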

A Persuasive Approach to Combating Misinformation

Safwan Hossain

Andjela Mladenovic

Yiling Chen

Gauthier Gidel

Bayesian Persuasion is proposed as a tool for social media platforms to combat the spread of misinformation. Since platforms can use machine learning to predict the popularity and misinformation features of to-be-shared posts, and users are largely motivated to share popular content, platforms can strategically signal this informational advantage to change user beliefs and persuade them not to share misinformation. We characterize the optimal signaling scheme with imperfect predictions as a linear program and give sufficient and necessary conditions on the classifier to ensure optimal platform utility is non-decreasing and continuous. Next, this interaction is considered under a performative model, wherein platform intervention affects the user's future behaviour. The convergence and stability of optimal signaling under this performative process are fully characterized. Lastly, we experimentally validate that our approach significantly reduces misinformation in both the single round and performative setting.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net

Position: Cracking the Code of Cascading Disparity Towards Marginalized Communities

Golnoosh Farnadi

Mohammad Havaei

Negar Rostamzadeh

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net

Randomized Confidence Bounds for Stochastic Partial Monitoring

Maxime Heuillet

Ola Ahmad

Audrey Durand

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net

A Reinforcement Learning Pipeline for Band Gap-directed Crystal Generation

Prashant Govindarajan

Mathieu Reymond

Santiago Miret

Antoine Clavaud

Mariano Phielipp

Sarath Chandar

Property-driven AI-automated material discovery presents unique challenges owing to the complex nature of the chemical structural space and computationally expensive simulations. For crystalline solids, the band gap is an important property for designing semiconductors and batteries. However, optimizing crystals for a target band gap is difficult and not well-explored. Reinforcement learning (RL) shows promise towards optimizing crystals, as it can freely explore the chemical space. However, it relies on regular band gap evaluations, which can only be accurately computed through expensive Density Functional Theory (DFT) simulations. In this study, we propose an active learning-inspired pipeline that combines RL and DFT simulations for optimizing crystal compositions given a target band gap. The pipeline includes an RL policy for predicting atom types and a band gap network that is fine-tuned with DFT data. Preliminary results indicate the need for furthering the state-of-the-art to address the inherent challenges of the problem.

2024-07-08

BOKU.ac.at/2024/AI4Mat (poster)

openreview.net

Robust Data-driven Prescriptiveness Optimization

Mehran Poursoltani

Érick Delage

Angelos Georghiou

The abundance of data has led to the emergence of a variety of optimization techniques that attempt to leverage available side information to provide more anticipative decisions. The wide range of methods and contexts of application have motivated the design of a universal unitless measure of performance known as the coefficient of prescriptiveness. This coefficient was designed to quantify both the quality of contextual decisions compared to a reference one and the prescriptive power of side information. To identify policies that maximize the former in a data-driven context, this paper introduces a distributionally robust contextual optimization model where the coefficient of prescriptiveness substitutes for the classical empirical risk minimization objective. We present a bisection algorithm to solve this model, which relies on solving a series of linear programs when the distributional ambiguity set has an appropriate nested form and polyhedral structure. Studying a contextual shortest path problem, we evaluate the robustness of the resulting policies against alternative methods when the out-of-sample dataset is subject to varying amounts of distribution shift.

2024-07-08

Proceedings of the 41st International Conference on Machine Learning (published)

doi.org

openreview.net
