Irene Tenison

Knowledge Distillation for Federated Learning: a Practical Guide

Paolo Bellavista

Federated Learning (FL) enables the training of Deep Learning models without centrally collecting possibly sensitive raw data. This paves th… (see more)e way for stronger privacy guarantees when building predictive models. The most used algorithms for FL are parameter-averaging based schemes (e.g., Federated Averaging) that, however, have well known limits: (i) Clients must implement the same model architecture; (ii) Transmitting model weights and model updates implies high communication cost, which scales up with the number of model parameters; (iii) In presence of non-IID data distributions, parameter-averaging aggregation schemes perform poorly due to client model drifts. Federated adaptations of regular Knowledge Distillation (KD) can solve and/or mitigate the weaknesses of parameter-averaging FL algorithms while possibly introducing other trade-offs. In this article, we provide a review of KD-based algorithms tailored for specific FL issues.

2024-07-31

Proceedings of the Thirty-ThirdInternational Joint Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Gradient Masked Averaging for Federated Learning

Irene Tenison

Sai Aravind Sreeramadas

Vaikkunth Mugunthan

Edouard Oyallon

Irina Rish

Eugene Belilovsky

Federated learning (FL) is an emerging paradigm that permits a large number of clients with heterogeneous data to coordinate learning of a u… (see more)nified global model without the need to share data amongst each other. A major challenge in federated learning is the heterogeneity of data across client, which can degrade the performance of standard FL algorithms. Standard FL algorithms involve averaging of model parameters or gradient updates to approximate the global model at the server. However, we argue that in heterogeneous settings, averaging can result in information loss and lead to poor generalization due to the bias induced by dominant client gradients. We hypothesize that to generalize better across non-i.i.d datasets, the algorithms should focus on learning the invariant mechanism that is constant while ignoring spurious mechanisms that differ across clients. Inspired from recent works in Out-of-Distribution generalization, we propose a gradient masked averaging approach for FL as an alternative to the standard averaging of client updates. This aggregation technique for client updates can be adapted as a drop-in replacement in most existing federated algorithms. We perform extensive experiments on multiple FL algorithms with in-distribution, real-world, feature-skewed out-of-distribution, and quantity imbalanced datasets and show that it provides consistent improvements, particularly in the case of heterogeneous clients.

2023-10-23

Transactions on Machine Learning Research (accepted)

PRACTICAL GUIDE

Paolo Bellavista

2021-12-31

(published)

www.semanticscholar.org

Gradient Masked Federated Optimization

Irene Tenison

Sreya Francis

Irina Rish

2021-04-20

ArXiv (preprint)

arxiv.org

Towards Causal Federated Learning For Enhanced Robustness and Privacy

Sreya Francis

Irene Tenison

Irina Rish

Federated Learning is an emerging privacy-preserving distributed machine learning approach to building a shared model by performing distribu… (see more)ted training locally on participating devices (clients) and aggregating the local models into a global one. As this approach prevents data collection and aggregation, it helps in reducing associated privacy risks to a great extent. However, the data samples across all participating clients are usually not independent and identically distributed (non-iid), and Out of Distribution(OOD) generalization for the learned models can be poor. Besides this challenge, federated learning also remains vulnerable to various attacks on security wherein a few malicious participating entities work towards inserting backdoors, degrading the generated aggregated model as well as inferring the data owned by participating entities. In this paper, we propose an approach for learning invariant (causal) features common to all participating clients in a federated learning setup and analyze empirically how it enhances the Out of Distribution (OOD) accuracy as well as the privacy of the final learned model.

2021-04-13

ArXiv (preprint)

arxiv.org

AI Policy Fellowship Publications

Mila Ventures Launchpad

AI Policy Compass

Irene Tenison

Publications

AI Policy Fellowship Publications

Mila Ventures Launchpad

AI Policy Compass

Popular keywords:

Irene Tenison

Publications