Publications

2025-04-22

Proceedings of the Algorithmic Fairness Through the Lens of Metrics and Evaluation (published)

Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery

Amin Abyaneh

Mahrokh G. Boroujeni

Hsiu-Chin Lin

Giancarlo Ferrari-Trecate

Imitation learning is a data-driven approach to learning policies from expert behavior, but it is prone to unreliable outcomes in out-of-sam… (see more)ple (OOS) regions. While previous research relying on stable dynamical systems guarantees convergence to a desired state, it often overlooks transient behavior. We propose a framework for learning policies modeled by contractive dynamical systems, ensuring that all policy rollouts converge regardless of perturbations, and in turn, enable efficient OOS recovery. By leveraging recurrent equilibrium networks and coupling layers, the policy structure guarantees contractivity for any parameter choice, which facilitates unconstrained optimization. We also provide theoretical upper bounds for worst-case and expected loss to rigorously establish the reliability of our method in deployment. Empirically, we demonstrate substantial OOS performance improvements for simulated robotic manipulation and navigation tasks.

2025-04-22

ICLR (Accept (Poster))

openreview.net

Different Horses for Different Courses: Comparing Bias Mitigation Algorithms in ML

Prakhar Ganesh

Usman Gohar

Lu Cheng

With fairness concerns gaining significant attention in Machine Learning (ML), several bias mitigation techniques have been proposed, often … (see more)compared against each other to find the best method. These benchmarking efforts tend to use a common setup for evaluation under the assumption that providing a uniform environment ensures a fair comparison. However, bias mitigation techniques are sensitive to hyperparameter choices, random seeds, feature selection, etc., meaning that comparison on just one setting can unfairly favour certain algorithms. In this work, we show significant variance in fairness achieved by several algorithms and the influence of the learning pipeline on fairness scores. We highlight that most bias mitigation techniques can achieve comparable performance, given the freedom to perform hyperparameter optimization, suggesting that the choice of the evaluation parameters-rather than the mitigation technique itself-can sometimes create the perceived superiority of one method over another. We hope our work encourages future research on how various choices in the lifecycle of developing an algorithm impact fairness, and trends that guide the selection of appropriate algorithms.

2025-04-22

Proceedings of the Algorithmic Fairness Through the Lens of Metrics and Evaluation (published)

Distilling semantically aware orders for autoregressive image generation

Rishav Pramanik

Antoine Poupon

Juan A. Rodriguez

Masih Aminbeidokhti

David Vázquez

Christopher Pal

Zhaozheng Yin

Marco Pedersoli

2025-04-22

ArXiv (preprint)

arxiv.org

Fair Resource Allocation in Weakly Coupled Markov Decision Processes

Xiaohui Tu

Yossiri Adulyasak

Nima Akbarzadeh

Erick Delage

We consider fair resource allocation in sequential decision-making environments modeled as weakly coupled Markov decision processes, where r… (see more)esource constraints couple the action spaces of

2025-04-22

Proceedings of The 28th International Conference on Artificial Intelligence and Statistics (published)

A flaw in using pre-trained pLLMs in protein-protein interaction inference models

Joseph Szymborski

Amin Emad

With the growing pervasiveness of pre-trained protein large language models (pLLMs), pLLM-based methods are increasingly being put forward f… (see more)or the protein-protein interaction (PPI) inference task. Here, we identify and confirm that existing pre-trained pLLMs are a source of data leakage for the downstream PPI task. We characterize the extent of the data leakage problem by training and comparing small and efficient pLLMs on a dataset that controls for data leakage (“strict”) with one that does not (“non-strict”). While data leakage from pre-trained pLLMs cause measurable inflation of testing scores, we find that this does not necessarily extend to other, non-paired biological tasks such as protein keyword annotation. Further, we find no connection between the context-lengths of pLLMs and the performance of pLLM-based PPI inference methods on proteins with sequence lengths that surpass it. Furthermore, we show that pLLM-based and non-pLLM-based models fail to generalize in tasks such as prediction of the human-SARS-CoV-2 PPIs or the effect of point mutations on binding-affinities. This study demonstrates the importance of extending existing protocols for the evaluation of pLLM-based models applied to paired biological datasets and identifies areas of weakness of current pLLM models.

2025-04-22

bioRxiv (preprint)

Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation

Zhixiang Chi

Li Gu

Huan Liu

Ziqiang Wang

Yanan Wu

Yang Wang

Konstantinos N Plataniotis

Few-shot Test-Time Domain Adaptation focuses on adapting a model at test time to a specific domain using only a few unlabeled examples, addr… (see more)essing domain shift. Prior methods leverage CLIP's strong out-of-distribution (OOD) abilities by generating domain-specific prompts to guide its generalized, frozen features. However, since downstream datasets are not explicitly seen by CLIP, solely depending on the feature space knowledge is constrained by CLIP's prior knowledge. Notably, when using a less robust backbone like ViT-B/16, performance significantly drops on challenging real-world benchmarks. Departing from the state-of-the-art of inheriting the intrinsic OOD capability of CLIP, this work introduces learning directly on the input space to complement the dataset-specific knowledge for frozen CLIP. Specifically, an independent side branch is attached in parallel with CLIP and enforced to learn exclusive knowledge via revert attention. To better capture the dataset-specific label semantics for downstream adaptation, we propose to enhance the inter-dispersion among text features via greedy text ensemble and refinement. The text and visual features are then progressively fused in a domain-aware manner by a generated domain prompt to adapt toward a specific domain. Extensive experiments show our method's superiority on 5 large-scale benchmarks (WILDS and DomainNet), notably improving over smaller networks like ViT-B/16 with gains of \textbf{+5.1} in F1 for iWildCam and \textbf{+3.1\%} in WC Acc for FMoW.

2025-04-22

International Conference on Learning Representations (Accept (Poster))

openreview.net

Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

Lazar Atanackovic

Xi Zhang

Brandon Amos

Mathieu Blanchette

Leo J. Lee

Yoshua Bengio

Alexander Tong

Kirill Neklyudov

Numerous biological and physical processes can be modeled as systems of interacting entities evolving continuously over time, e.g. the dynam… (see more)ics of communicating cells or physical particles. Learning the dynamics of such systems is essential for predicting the temporal evolution of populations across novel samples and unseen environments. Flow-based models allow for learning these dynamics at the population level - they model the evolution of the entire distribution of samples. However, current flow-based models are limited to a single initial population and a set of predefined conditions which describe different dynamics. We argue that multiple processes in natural sciences have to be represented as vector fields on the Wasserstein manifold of probability densities. That is, the change of the population at any moment in time depends on the population itself due to the interactions between samples. In particular, this is crucial for personalized medicine where the development of diseases and their respective treatment response depend on the microenvironment of cells specific to each patient. We propose Meta Flow Matching (MFM), a practical approach to integrate along these vector fields on the Wasserstein manifold by amortizing the flow model over the initial populations. Namely, we embed the population of samples using a Graph Neural Network (GNN) and use these embeddings to train a Flow Matching model. This gives MFM the ability to generalize over the initial distributions, unlike previously proposed methods. We demonstrate the ability of MFM to improve the prediction of individual treatment responses on a large-scale multi-patient single-cell drug screen dataset.

2025-04-22

International Conference on Learning Representations (Accept (Poster))

openreview.net

Multilingual Hallucination Gaps

Cléa Chataigner

Afaf Taïk

Large language models (LLMs) are increasingly used as alternatives to traditional searchengines given their capacity to generate text that r… (see more)esembles human language. However, thisshift is concerning, as LLMs often generate hallucinations—misleading or false informationthat appears highly credible. In this study, we explore the phenomenon of hallucinationsacross multiple languages in free-form text generation, focusing on what we call multilingualhallucination gaps. These gaps reflect differences in the frequency of hallucinated answersdepending on the prompt and language used. To quantify such hallucinations, we used theFActScore metric and extended its framework to a multilingual setting. We conductedexperiments using LLMs from the LLaMA, Qwen, and Aya families, generating biographiesin 19 languages and comparing the results to Wikipedia pages. Our results reveal varia-tions in hallucination rates, especially between high- and low-resource languages, raisingimportant questions about LLM multilingual performance and the challenges in evaluatinghallucinations in multilingual free-form text generation.

2025-04-22

Proceedings of the Algorithmic Fairness Through the Lens of Metrics and Evaluation (published)

Performative Prediction on Games and Mechanism Design

Fernando P. Santos

2025-04-22

Proceedings of The 28th International Conference on Artificial Intelligence and Statistics (published)

Planning and Learning in Risk-Aware Restless Multi-Arm Bandits

Nima Akbarzadeh

Yossiri Adulyasak

Erick Delage

2025-04-22

Proceedings of The 28th International Conference on Artificial Intelligence and Statistics (published)

Privacy-Preserving Group Fairness in Cross-Device Federated Learning

Sikha Pentyala

Nicola Neophytou

Anderson Nascimento

Martine De Cock

Group fairness ensures that the outcome of machine learning (ML) based decision making systems are notbiased towards a certain group of peop… (see more)le defined by a sensitive attribute such as gender or ethnicity. Achievinggroup fairness in Federated Learning (FL) is challenging because mitigating bias inherently requires usingthe sensitive attribute values of all clients, while FL is aimed precisely at protecting privacy by not givingaccess to the clients’ data. As we show in this paper, this conflict between fairness and privacy in FL can beresolved by combining FL with Secure Multiparty Computation (MPC) and Differential Privacy (DP). Tothis end, we propose a privacy-preserving approach to calculate group fairness notions in the cross-device FLsetting. Then, we propose two bias mitigation pre-processing and post-processing techniques in cross-deviceFL under formal privacy guarantees, without requiring the clients to disclose their sensitive attribute values.Empirical evaluations on real world datasets demonstrate the effectiveness of our solution to train fair andaccurate ML models in federated cross-device setups with privacy guarantees to the users.

2025-04-22

Proceedings of the Algorithmic Fairness Through the Lens of Metrics and Evaluation (published)