Publications

Re-Weighted Softmax Cross-Entropy to Control Forgetting in Federated Learning
Gwen Legate
Lucas Caccia
In Federated Learning, a global model is learned by aggregating model updates computed at a set of independent client nodes. To reduce communication costs, multiple gradient steps are performed at each node prior to aggregation. A key challenge in this setting is data heterogeneity across clients resulting in differing local objectives. This can lead clients to overly minimize their own local objective, consequently diverging from the global solution. We demonstrate that individual client models experience catastrophic forgetting with respect to data from other clients and propose an efficient approach that modifies the cross-entropy objective on a per-client basis by re-weighting the softmax logits prior to computing the loss. This approach shields classes outside a client's label set from abrupt representation change, and we empirically demonstrate it can alleviate client forgetting and provide consistent improvements to standard federated learning algorithms. Our method is particularly beneficial under the most challenging federated learning settings where data heterogeneity is high and client participation in each round is low.
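
A minimal sketch of the per-client logit re-weighting described above, assuming the weights are derived from each client's local label frequencies; the function name `reweighted_softmax_ce` and the weight construction are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def reweighted_softmax_ce(logits, targets, class_weights, eps=1e-12):
    """Cross-entropy over a per-client re-weighted softmax.

    Adding log-weights to the logits multiplies each class's softmax
    mass by its weight; classes with (near-)zero weight are effectively
    shielded from the local update.
    """
    weighted_logits = logits + torch.log(class_weights + eps)
    return F.cross_entropy(weighted_logits, targets)

# Hypothetical per-client usage: weights from local label counts, so
# classes absent from this client's data receive ~zero weight.
num_classes = 10
client_labels = torch.tensor([0, 0, 1, 3, 3, 3])
counts = torch.bincount(client_labels, minlength=num_classes).float()
class_weights = counts / counts.sum()

logits = torch.randn(4, num_classes)
targets = torch.tensor([0, 1, 3, 3])
loss = reweighted_softmax_ce(logits, targets, class_weights)
```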
Robust and Versatile Bipedal Jumping Control through Multi-Task Reinforcement Learning
Zhongyu Li
Xue Bin Peng
Pieter Abbeel
Sergey Levine
Koushil Sreenath
Sample Boosting Algorithm (SamBA) - An interpretable greedy ensemble classifier based on local expertise for fat data
Baptiste Bauvin
Cécile Capponi
Florence Clerc
Sokol Koço
Jacques Corbeil
Scaling Self-Supervised End-to-End Driving with Multi-View Attention Learning
Yi Xiao
Felipe Codevilla
Diego Porres
Antonio M. López
Screening methods for congenital anomalies in low and lower-middle income countries: A systematic review.
Justina O. Seyi-Olajide
Xiya Ma
Elena Guadagno
Adesoji Ademuyiwa
Self-Influence Guided Data Reweighting for Language Model Pre-training
Megh Thakkar
Tolga Bolukbasi
Sriram Ganapathy
Shikhar Vashishth
Partha Talukdar
Language Models (LMs) pre-trained with self-supervision on large text corpora have become the default starting point for developing models for various NLP tasks. Once the pre-training corpus has been assembled, all data samples in the corpus are treated with equal importance during LM pre-training. However, due to varying levels of relevance and quality of data, giving equal importance to all data samples may not be the optimal choice. While data reweighting has been explored in the context of task-specific supervised learning and LM fine-tuning, model-driven reweighting for pre-training data has not been explored. We fill this important gap and propose PRESENCE, a method for jointly reweighting samples by leveraging self-influence (SI) scores as an indicator of sample importance and pre-training. PRESENCE promotes novelty and stability for model pre-training. Through extensive analysis spanning multiple model sizes, datasets, and tasks, we present PRESENCE as an important first step in the research direction of sample reweighting for pre-training language models.
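
The sample-level reweighting can be sketched as follows; since the abstract does not spell out PRESENCE's exact SI estimator or weighting rule, this sketch assumes the common squared-gradient-norm notion of self-influence and a hypothetical softmax-based down-weighting of high-SI samples.

```python
import torch

def self_influence_scores(model, loss_fn, examples):
    """Per-example self-influence, approximated (an assumption, not
    necessarily PRESENCE's estimator) as the squared norm of the loss
    gradient w.r.t. the model parameters, TracIn-style."""
    params = [p for p in model.parameters() if p.requires_grad]
    scores = []
    for x, y in examples:
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        scores.append(sum(g.pow(2).sum() for g in grads).item())
    return torch.tensor(scores)

def sample_weights(si_scores, temperature=1.0):
    # Hypothetical rule: down-weight high-SI samples, renormalized so
    # the average weight over the batch stays at 1.
    w = torch.softmax(-si_scores / temperature, dim=0)
    return w * len(si_scores)
```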
SORBETmatcher results for OAEI 2023.
Francis Gosselin
Of Stances, Themes, and Anomalies in COVID-19 Mask-Wearing Tweets
Jwen Fai Low
Farkhund Iqbal
COVID-19 is an opportunity to study public acceptance of a “new” healthcare intervention, universal masking, which, unlike vaccination, is mostly alien to the Anglosphere public despite being practiced in ages past. Using a collection of over two million tweets, we studied the ways in which proponents and opponents of masking vied for influence, as well as the themes driving the discourse. Pro-mask tweets encouraging others to mask up dominated Twitter early in the pandemic, though their continued dominance has been eroded by anti-mask tweets criticizing others for their masking behavior. Engagement, represented by the counts of likes, retweets, and replies, and controversiality and disagreeableness, represented by ratios of the aforementioned counts, initially favored pro-mask tweets, with anti-mask tweets slowly gaining ground. Additional analysis raised the possibility of the platform owners suppressing certain parts of the mask-wearing discussion.
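
The abstract defines engagement as raw counts and controversiality/disagreeableness as ratios of those counts without giving the exact ratios; the reading below is therefore purely hypothetical.

```python
def tweet_metrics(likes, retweets, replies):
    """Hypothetical metric definitions for illustration only; the
    paper's exact ratios are not given in the abstract above."""
    engagement = likes + retweets + replies
    # One plausible reading: replies relative to endorsements (likes +
    # retweets) as a proxy for controversiality/disagreement.
    controversiality = replies / max(likes + retweets, 1)
    return engagement, controversiality
```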
Stochastic Generative Flow Networks
Ling Pan
Dinghuai Zhang
Moksh J. Jain
Longbo Huang
Generative Flow Networks (or GFlowNets for short) are a family of probabilistic agents that learn to sample complex combinatorial structures through the lens of “inference as control”. They have shown great potential in generating high-quality and diverse candidates from a given energy landscape. However, existing GFlowNets can be applied only to deterministic environments, and fail in more general tasks with stochastic dynamics, which can limit their applicability. To overcome this challenge, this paper introduces Stochastic GFlowNets, a new algorithm that extends GFlowNets to stochastic environments. By decomposing state transitions into two steps, Stochastic GFlowNets isolate environmental stochasticity and learn a dynamics model to capture it. Extensive experimental results demonstrate that Stochastic GFlowNets offer significant advantages over standard GFlowNets, as well as MCMC- and RL-based approaches, on a variety of standard benchmarks with stochastic dynamics.
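
A minimal sketch of the two-step transition decomposition described above: an agent step s → (s, a) under the forward policy, then an environment step (s, a) → s' captured by a learned dynamics model fit by maximum likelihood. Network shapes, discrete state/action spaces, and training details are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticGFNTransition(nn.Module):
    # Hypothetical module; the paper's architecture is not specified here.
    def __init__(self, state_dim, num_actions, num_states):
        super().__init__()
        self.policy = nn.Linear(state_dim, num_actions)                 # P_F(a | s): agent step
        self.dynamics = nn.Linear(state_dim + num_actions, num_states)  # p(s' | s, a): environment step

    def dynamics_loss(self, s, a_onehot, next_state_idx):
        """Fit the dynamics model by maximum likelihood on observed
        transitions, isolating environmental stochasticity from the
        agent's learnable action choice."""
        logits = self.dynamics(torch.cat([s, a_onehot], dim=-1))
        return F.cross_entropy(logits, next_state_idx)
```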
SUMMIT: Scaffolding OSS Issue Discussion Through Summarization
Saskia Gilmer
Avinash Bhat
Shuvam Shah
Kevin Cherry
Jinghui Cheng