Pablo Piantanida

Associate Academic Member
Full Professor, Université Paris-Saclay
Director, International Laboratory on Learning Systems (ILLS), McGill University
Associate Professor, École de technologie supérieure (ÉTS), Department of Systems Engineering
Research Topics
AI Safety
Information Theory
Machine Learning Theory
Natural Language Processing

Biography

I am a professor at CentraleSupélec (Université Paris-Saclay) with the French National Centre for Scientific Research (CNRS), and Director of the International Laboratory on Learning Systems (ILLS), which brings together McGill University, École de technologie supérieure (ÉTS), Mila – Quebec AI Institute, the CNRS, Université Paris-Saclay, and CentraleSupélec.

My research applies advanced statistical and information-theoretic techniques to machine learning. I am interested in developing rigorous techniques based on information measures and concepts for building safe and trustworthy AI systems and establishing confidence in their behavior and robustness, thereby securing their use in society. My primary areas of expertise include information theory, information geometry, learning theory, privacy, and fairness, with applications to computer vision and natural language processing.

I completed my undergraduate studies at the University of Buenos Aires and pursued graduate studies in applied mathematics at Université Paris-Saclay in France. Throughout my career, I have also held visiting positions at INRIA, Université de Montréal, and École de technologie supérieure (ÉTS), among others.

My earlier research spanned information theory and related areas, including distributed compression, statistical decision, universal source coding, cooperation, feedback, index coding, key generation, security, and privacy.

I teach courses on machine learning, information theory, and deep learning, covering topics such as statistical learning theory, information measures, and the statistical principles of neural networks.

Current Students

PhD - McGill University
Principal supervisor:
PhD - McGill University
Principal supervisor:
PhD - École de technologie supérieure
Collaborating researcher - Sorbonne Université
PhD - École de technologie supérieure
Postdoctorate - École de technologie supérieure
Co-supervisor:
Collaborating researcher - University of Toulon
PhD - McGill University
Principal supervisor:
PhD - Université Paris Dauphine-PSL
Master's Research - École de technologie supérieure
Collaborating researcher - Sorbonne Université

Publications

BayesAdapter: enhanced uncertainty estimation in CLIP few-shot adaptation
Pablo Morales-Álvarez
Stergios Christodoulidis
Maria Vakalopoulou
Jose Dolz
The emergence of large pre-trained vision-language models (VLMs) represents a paradigm shift in machine learning, with unprecedented results in a broad span of visual recognition tasks. CLIP, one of the most popular VLMs, has exhibited remarkable zero-shot and transfer learning capabilities in classification. To transfer CLIP to downstream tasks, adapters constitute a parameter-efficient approach that avoids backpropagation through the large model (unlike related prompt learning methods). However, CLIP adapters have been developed to target discriminative performance, and the quality of their uncertainty estimates has been overlooked. In this work we show that the discriminative performance of state-of-the-art CLIP adapters does not always correlate with their uncertainty estimation capabilities, which are essential for safe deployment in real-world scenarios. We also demonstrate that one such adapter is obtained through MAP inference from a more general probabilistic framework. Based on this observation we introduce BayesAdapter, which leverages Bayesian inference to estimate a full probability distribution instead of a single point, better capturing the variability inherent in the parameter space. In a comprehensive empirical evaluation we show that our approach obtains high-quality uncertainty estimates in the predictions, standing out in calibration and selective classification. Our code will be publicly available upon acceptance of the paper.
Learning Task-Agnostic Representations through Multi-Teacher Distillation
Eric Granger
Jackie CK Cheung
Ismail Ben Ayed
Mohammadhadi Shateri
Casting complex inputs into tractable representations is a critical step across various fields. Diverse embedding models emerge from differences in architectures, loss functions, input modalities and datasets, each capturing unique aspects of the input. Multi-teacher distillation leverages this diversity to enrich representations but often remains tailored to specific tasks. In this paper, we introduce a task-agnostic framework based on a "majority vote" objective function. We demonstrate that this function is bounded by the mutual information between student and teachers' embeddings, leading to a task-agnostic distillation loss that eliminates dependence on task-specific labels or prior knowledge. Our evaluations across text, vision models, and molecular modeling show that our method effectively leverages teacher diversity, resulting in representations enabling better performance for a wide range of downstream tasks such as classification, clustering, or regression. Additionally, we train and release state-of-the-art embedding models, enhancing downstream performance in various modalities.
THUNDER: Tile-level Histopathology image UNDERstanding benchmark
Pierre Marza
Leo Fillioux
Sofiène Boutaj
Kunal Mahatha
Christian Desrosiers
Jose Dolz
Stergios Christodoulidis
Maria Vakalopoulou
Progress in a research field can be hard to assess, in particular when many concurrent methods are proposed in a short period of time. This is the case in digital pathology, where many foundation models have been released recently to serve as feature extractors for tile-level images, being used in a variety of downstream tasks, both for tile- and slide-level problems. Benchmarking available methods then becomes paramount to get a clearer view of the research landscape. In particular, in critical domains such as healthcare, a benchmark should not only focus on evaluating downstream performance, but also provide insights about the main differences between methods, and importantly, further consider uncertainty and robustness to ensure a reliable usage of proposed models. For these reasons, we introduce THUNDER, a tile-level benchmark for digital pathology foundation models, allowing for efficient comparison of many models on diverse datasets with a series of downstream tasks, studying their feature spaces and assessing the robustness and uncertainty of predictions informed by their embeddings. THUNDER is a fast, easy-to-use, dynamic benchmark that can already support a large variety of state-of-the-art foundation models, as well as local user-defined models, for direct tile-based comparison. In this paper, we provide a comprehensive comparison of 23 foundation models on 16 different datasets covering diverse tasks, feature analysis, and robustness. The code for THUNDER is publicly available at https://github.com/MICS-Lab/thunder.
Collaborative Rational Speech Act: Pragmatic Reasoning for Multi-Turn Dialog
Lautaro Estienne
Gabriel Ben Zenou
Nona Naderi
Jackie Chi Kit Cheung
As AI systems take on collaborative roles, they must reason about shared goals and beliefs, not just generate fluent language. The Rational Speech Act (RSA) framework offers a principled approach to pragmatic reasoning, but existing extensions face challenges in scaling to multi-turn, collaborative scenarios. In this paper, we introduce Collaborative Rational Speech Act (CRSA), an information-theoretic (IT) extension of RSA that models multi-turn dialog by optimizing a gain function adapted from rate-distortion theory. This gain extends the one maximized in the original RSA model, but takes into account the scenario in which both agents in a conversation have private information and produce utterances conditioned on the dialog. We demonstrate the effectiveness of CRSA on referential games and template-based doctor-patient dialogs in the medical domain. Empirical results show that CRSA yields more consistent, interpretable, and collaborative behavior than existing baselines, paving the way for more pragmatic and socially aware language agents.
Rational Retrieval Acts: Leveraging Pragmatic Reasoning to Improve Sparse Retrieval
Arthur Satouf
Gabriel Ben-Zenou
Benjamin Piwowarski
Habiboulaye Amadou-Boubacar
Current sparse neural information retrieval (IR) methods, and to a lesser extent more traditional models such as BM25, do not take into account the document collection and the complex interplay between different term weights when representing a single document. In this paper, we show how the Rational Speech Acts (RSA), a linguistics framework used to minimize the number of features to be communicated when identifying an object in a set, can be adapted to the IR case, and in particular to the high number of potential features (here, tokens). RSA dynamically modulates token-document interactions by considering the influence of other documents in the dataset, better contrasting document representations. Experiments show that incorporating RSA consistently improves multiple sparse retrieval models and achieves state-of-the-art performance on out-of-domain datasets from the BEIR benchmark. Code: https://github.com/arthur-75/Rational-Retrieval-Acts
Multiple-model coding scheme for electrical signal compression
Corentin Presvôts
Michel Kieffer
Thibault Prevost
Patrick Panciatici
Zuxing Li
A Strong Baseline for Molecular Few-Shot Learning
Hugo Jeannin
Ismail Ben Ayed
Few-shot learning has recently attracted significant interest in drug discovery, with a recent, fast-growing literature mostly involving convoluted meta-learning strategies. We revisit the more straightforward fine-tuning approach for molecular data, and propose a regularized quadratic-probe loss based on the Mahalanobis distance. We design a dedicated block-coordinate descent optimizer, which avoids the degenerate solutions of our loss. Interestingly, our simple fine-tuning approach achieves highly competitive performance in comparison to state-of-the-art methods, while being applicable to black-box settings and removing the need for specific episodic pre-training strategies. Furthermore, we introduce a new benchmark to assess the robustness of the competing methods to domain shifts. In this setting, our fine-tuning baseline obtains consistently better results than meta-learning methods.
Membership Inference Risks in Quantized Models: A Theoretical and Empirical Study
Quantizing machine learning models has demonstrated its effectiveness in lowering memory and inference costs while maintaining performance levels comparable to the original models. In this work, we investigate the impact of quantization procedures on the privacy of data-driven models, specifically focusing on their vulnerability to membership inference attacks. We derive an asymptotic theoretical analysis of Membership Inference Security (MIS), characterizing the privacy implications of quantized algorithm weights against the most powerful (and possibly unknown) attacks. Building on these theoretical insights, we propose a novel methodology to empirically assess and rank the privacy levels of various quantization procedures. Using synthetic datasets, we demonstrate the effectiveness of our approach in assessing the MIS of different quantizers. Furthermore, we explore the trade-off between privacy and performance using real-world data and models in the context of molecular modeling.
On Estimating the Strength of Differentially Private Mechanisms in a Black-Box Setting
Daniele Gorla
Louis Jalouzot
Federica Granese
Catuscia Palamidessi
We analyze to what extent final users can infer information about the level of protection of their data when the data obfuscation mechanism is a priori unknown to them (the so-called “black-box” scenario). In particular, we explore four notions of differential privacy, namely local/central
When is an Embedder More Promising than Another?
Ismail Ben Ayed
Jackie Chi Kit Cheung
Embedders play a central role in machine learning, projecting any object into numerical representations that can, in turn, be leveraged to perform various downstream tasks. The evaluation of embedding models typically depends on domain-specific empirical approaches utilizing downstream tasks, primarily because of the lack of a standardized framework for comparison. However, acquiring adequately large and representative datasets for conducting these assessments is not always viable and can prove to be prohibitively expensive and time-consuming. In this paper, we present a unified approach to evaluate embedders. First, we establish theoretical foundations for comparing embedding models, drawing upon the concepts of sufficiency and informativeness. We then leverage these concepts to devise a tractable comparison criterion (information sufficiency), leading to a task-agnostic and self-supervised ranking procedure. We demonstrate experimentally that our approach aligns closely with the capability of embedding models to facilitate various downstream tasks in both natural language processing and molecular biology. This effectively offers practitioners a valuable tool for prioritizing model trials.
Perfectly Accurate Membership Inference by a Dishonest Central Server in Federated Learning
Georg Pichler
Marco Romanelli
Leonardo Rey Vega
Federated Learning is expected to provide strong privacy guarantees, as only gradients or model parameters but no plain text training data is ever exchanged either between the clients or between the clients and the central server. In this paper, we challenge this claim by introducing a simple but still very effective membership inference attack algorithm, which relies only on a single training step. In contrast to the popular honest-but-curious model, we investigate a framework with a dishonest central server. Our strategy is applicable to models with ReLU activations and uses the properties of this activation function to achieve perfect accuracy. Empirical evaluation on visual classification tasks with the MNIST, CIFAR10, CIFAR100 and CelebA datasets shows that our method provides perfect accuracy in identifying one sample in a training set with thousands of samples. Occasional failures of our method led us to discover duplicate images in the CIFAR100 and CelebA datasets.
GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews
Scientific peer review is essential for the quality of academic publications. However, the increasing number of paper submissions to conferences has strained the reviewing process. This surge poses a burden on area chairs who have to carefully read an ever-growing volume of reviews and discern each reviewer's main arguments as part of their decision process. In this paper, we introduce GLIMPSE, a summarization method designed to offer a concise yet comprehensive overview of scholarly reviews. Unlike traditional consensus-based methods, GLIMPSE extracts both common and unique opinions from the reviews. We introduce novel uniqueness scores based on the Rational Speech Act framework to identify relevant sentences in the reviews. Our method aims to provide a pragmatic glimpse into all reviews, offering a balanced perspective on their opinions. Our experimental results with both automatic metrics and human evaluation show that GLIMPSE generates more discriminative summaries than baseline methods in terms of human evaluation while achieving comparable performance with these methods in terms of automatic metrics.