Publications

Re-Weighted Softmax Cross-Entropy to Control Forgetting in Federated Learning
Gwen Legate
Lucas Caccia
In Federated Learning, a global model is learned by aggregating model updates computed at a set of independent client nodes. To reduce communication costs, multiple gradient steps are performed at each node prior to aggregation. A key challenge in this setting is data heterogeneity across clients resulting in differing local objectives. This can lead clients to overly minimize their own local objective, consequently diverging from the global solution. We demonstrate that individual client models experience catastrophic forgetting with respect to data from other clients and propose an efficient approach that modifies the cross-entropy objective on a per-client basis by re-weighting the softmax logits prior to computing the loss. This approach shields classes outside a client's label set from abrupt representation change, and we empirically demonstrate it can alleviate client forgetting and provide consistent improvements to standard federated learning algorithms. Our method is particularly beneficial under the most challenging federated learning settings where data heterogeneity is high and client participation in each round is low.
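
A minimal sketch of the per-client logit re-weighting described above, assuming the weights are derived from each client's local label frequencies; the function name `reweighted_softmax_ce` and the weight construction are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def reweighted_softmax_ce(logits, targets, class_weights, eps=1e-12):
    """Cross-entropy over a per-client re-weighted softmax.

    Adding log-weights to the logits multiplies each class's softmax
    mass by its weight; classes with (near-)zero weight are effectively
    shielded from the local update.
    """
    weighted_logits = logits + torch.log(class_weights + eps)
    return F.cross_entropy(weighted_logits, targets)

# Hypothetical per-client usage: weights from local label counts, so
# classes absent from this client's data receive ~zero weight.
num_classes = 10
client_labels = torch.tensor([0, 0, 1, 3, 3, 3])
counts = torch.bincount(client_labels, minlength=num_classes).float()
class_weights = counts / counts.sum()

logits = torch.randn(4, num_classes)
targets = torch.tensor([0, 1, 3, 3])
loss = reweighted_softmax_ce(logits, targets, class_weights)
```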
Robust and Versatile Bipedal Jumping Control through Multi-Task Reinforcement Learning
Zhongyu Li
Xue Bin Peng
Pieter Abbeel
Sergey Levine
Koushil Sreenath
Sample Boosting Algorithm (SamBA) - An interpretable greedy ensemble classifier based on local expertise for fat data
Baptiste Bauvin
Cécile Capponi
Florence Clerc
Sokol Koço
Jacques Corbeil
Scaling Self-Supervised End-to-End Driving with Multi-View Attention Learning
Yi Xiao
Felipe Codevilla
Diego Porres
Antonio M. López
Screening methods for congenital anomalies in low and lower-middle income countries: A systematic review.
Justina O. Seyi-Olajide
Xiya Ma
Elena Guadagno
Adesoji Ademuyiwa
Self-Influence Guided Data Reweighting for Language Model Pre-training
Megh Thakkar
Tolga Bolukbasi
Sriram Ganapathy
Shikhar Vashishth
Partha Talukdar
Language Models (LMs) pre-trained with self-supervision on large text corpora have become the default starting point for developing models for various NLP tasks. Once the pre-training corpus has been assembled, all data samples in the corpus are treated with equal importance during LM pre-training. However, due to varying levels of relevance and quality of data, giving equal importance to all data samples may not be the optimal choice. While data reweighting has been explored in the context of task-specific supervised learning and LM fine-tuning, model-driven reweighting for pre-training data has not been explored. We fill this important gap and propose PRESENCE, a method for jointly reweighting samples by leveraging self-influence (SI) scores as an indicator of sample importance and pre-training. PRESENCE promotes novelty and stability for model pre-training. Through extensive analysis spanning multiple model sizes, datasets, and tasks, we present PRESENCE as an important first step in the research direction of sample reweighting for pre-training language models.
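
The sample-level reweighting can be sketched as follows; since the abstract does not spell out PRESENCE's exact SI estimator or weighting rule, this sketch assumes the common squared-gradient-norm notion of self-influence and a hypothetical softmax-based down-weighting of high-SI samples.

```python
import torch

def self_influence_scores(model, loss_fn, examples):
    """Per-example self-influence, approximated (an assumption, not
    necessarily PRESENCE's estimator) as the squared norm of the loss
    gradient w.r.t. the model parameters, TracIn-style."""
    params = [p for p in model.parameters() if p.requires_grad]
    scores = []
    for x, y in examples:
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        scores.append(sum(g.pow(2).sum() for g in grads).item())
    return torch.tensor(scores)

def sample_weights(si_scores, temperature=1.0):
    # Hypothetical rule: down-weight high-SI samples, renormalized so
    # the average weight over the batch stays at 1.
    w = torch.softmax(-si_scores / temperature, dim=0)
    return w * len(si_scores)
```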
SORBETmatcher results for OAEI 2023.
Francis Gosselin
Of Stances, Themes, and Anomalies in COVID-19 Mask-Wearing Tweets
Jwen Fai Low
Farkhund Iqbal
COVID-19 is an opportunity to study public acceptance of a “new” healthcare intervention, universal masking, which, unlike vaccination, is mostly alien to the Anglosphere public despite being practiced in ages past. Using a collection of over two million tweets, we studied the ways in which proponents and opponents of masking vied for influence, as well as the themes driving the discourse. Pro-mask tweets encouraging others to mask up dominated Twitter early in the pandemic, though their continued dominance has been eroded by anti-mask tweets criticizing others for their masking behavior. Engagement, represented by the counts of likes, retweets, and replies, and controversiality and disagreeableness, represented by ratios of the aforementioned counts, initially favored pro-mask tweets, with anti-mask tweets slowly gaining ground. Additional analysis raised the possibility of the platform owners suppressing certain parts of the mask-wearing discussion.
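
The abstract defines engagement as raw counts and controversiality/disagreeableness as ratios of those counts without giving the exact ratios; the reading below is therefore purely hypothetical.

```python
def tweet_metrics(likes, retweets, replies):
    """Hypothetical metric definitions for illustration only; the
    paper's exact ratios are not given in the abstract above."""
    engagement = likes + retweets + replies
    # One plausible reading: replies relative to endorsements (likes +
    # retweets) as a proxy for controversiality/disagreement.
    controversiality = replies / max(likes + retweets, 1)
    return engagement, controversiality
```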
Stochastic Generative Flow Networks
Ling Pan
Dinghuai Zhang
Moksh J. Jain
Longbo Huang
Generative Flow Networks (or GFlowNets for short) are a family of probabilistic agents that learn to sample complex combinatorial structures through the lens of “inference as control”. They have shown great potential in generating high-quality and diverse candidates from a given energy landscape. However, existing GFlowNets can be applied only to deterministic environments, and fail in more general tasks with stochastic dynamics, which can limit their applicability. To overcome this challenge, this paper introduces Stochastic GFlowNets, a new algorithm that extends GFlowNets to stochastic environments. By decomposing state transitions into two steps, Stochastic GFlowNets isolate environmental stochasticity and learn a dynamics model to capture it. Extensive experimental results demonstrate that Stochastic GFlowNets offer significant advantages over standard GFlowNets, as well as MCMC- and RL-based approaches, on a variety of standard benchmarks with stochastic dynamics.
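
A minimal sketch of the two-step transition decomposition described above: an agent step s → (s, a) under the forward policy, then an environment step (s, a) → s' captured by a learned dynamics model fit by maximum likelihood. Network shapes, discrete state/action spaces, and training details are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticGFNTransition(nn.Module):
    # Hypothetical module; the paper's architecture is not specified here.
    def __init__(self, state_dim, num_actions, num_states):
        super().__init__()
        self.policy = nn.Linear(state_dim, num_actions)                 # P_F(a | s): agent step
        self.dynamics = nn.Linear(state_dim + num_actions, num_states)  # p(s' | s, a): environment step

    def dynamics_loss(self, s, a_onehot, next_state_idx):
        """Fit the dynamics model by maximum likelihood on observed
        transitions, isolating environmental stochasticity from the
        agent's learnable action choice."""
        logits = self.dynamics(torch.cat([s, a_onehot], dim=-1))
        return F.cross_entropy(logits, next_state_idx)
```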
SUMMIT: Scaffolding OSS Issue Discussion Through Summarization
Saskia Gilmer
Avinash Bhat
Shuvam Shah
Kevin Cherry
Jinghui Cheng