Publications

Leveraging Function Space Aggregation for Federated Learning at Scale
Nikita Dhawan
Nicole Elyse Mitchell
Zachary Charles
Zachary Garrett
The federated learning paradigm has motivated the development of methods for aggregating multiple client updates into a global server model, without sharing client data. Many federated learning algorithms, including the canonical Federated Averaging (FedAvg), take a direct (possibly weighted) average of the client parameter updates, motivated by results in distributed optimization. In this work, we adopt a function space perspective and propose a new algorithm, FedFish, that aggregates local approximations to the functions learned by clients, using an estimate based on their Fisher information. We evaluate FedFish on realistic, large-scale cross-device benchmarks. While the performance of FedAvg can suffer as client models drift further apart, we demonstrate that FedFish is more robust to longer local training. Our evaluation across several settings in image and language benchmarks shows that FedFish outperforms FedAvg as local training epochs increase. Further, FedFish results in global networks that are more amenable to efficient personalization via local fine-tuning on the same or shifted data distributions. For instance, federated pretraining on the C4 dataset, followed by few-shot personalization on Stack Overflow, results in a 7% improvement in next-token prediction by FedFish over FedAvg.
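To make the contrast in this abstract concrete, the sketch below compares a plain (optionally weighted) parameter average, as in FedAvg, with a generic diagonal-Fisher-weighted average. It is not the FedFish algorithm itself, whose details the abstract does not give. The function names, the use of flat parameter lists, the per-coordinate Fisher estimates, and the `eps` constant are all assumptions made for illustration.

```python
def fedavg(client_params, client_weights=None):
    """Plain (optionally weighted) average of client parameter vectors."""
    n = len(client_params)
    if client_weights is None:
        client_weights = [1.0] * n
    total = sum(client_weights)
    dim = len(client_params[0])
    return [
        sum(w * p[j] for w, p in zip(client_weights, client_params)) / total
        for j in range(dim)
    ]


def fisher_weighted_average(client_params, client_fishers, eps=1e-8):
    """Per-coordinate average weighted by diagonal Fisher estimates.

    Hypothetical illustration: each client supplies one non-negative
    Fisher estimate per parameter; coordinates a client is more
    "certain" about (larger Fisher) pull the aggregate towards it.
    """
    dim = len(client_params[0])
    aggregated = []
    for j in range(dim):
        num = sum(f[j] * p[j] for f, p in zip(client_fishers, client_params))
        den = sum(f[j] for f in client_fishers) + eps  # eps avoids 0/0
        aggregated.append(num / den)
    return aggregated


# Two clients, two parameters (made-up numbers).
params = [[0.0, 2.0], [2.0, 0.0]]
fishers = [[1.0, 3.0], [1.0, 1.0]]
plain = fedavg(params)
weighted = fisher_weighted_average(params, fishers)
```

With equal Fisher estimates the two functions agree up to `eps`. They differ only where clients disagree about how informative a coordinate is.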
Metrics reloaded: recommendations for image analysis validation.
Lena Maier-Hein
Annika Reinke
Evangelia Christodoulou
Ben Glocker
Patrick Godau
Fabian Isensee
Jens Kleesiek
Michal Kozubek
Mauricio Reyes
Michael A. Riegler
Manuel Wiesenfarth
Michael Baumgartner
Matthias Eisenmann
Doreen Heckmann-Nötzel
A. Emre Kavur
Tim Rädsch
Minu Dietlinde Tizabi
Laura Acion
Michela Antonelli
Spyridon Bakas
Peter Bankhead
Allison Benis
M. Jorge Cardoso
Veronika Cheplygina
Beth A. Cimini
Gary S. Collins
Keyvan Farahani
Bram van Ginneken
Daniel A. Hashimoto
Michael M. Hoffman
Merel Huisman
Pierre Jannin
Charles E. Kahn
Alexandros Karargyris
Alan Karthikesalingam
H. Kenngott
Annette Kopp-Schneider
Anna Kreshuk
Tahsin Kurc
Bennett Landman
Geert Litjens
Amin Madani
Klaus Maier-Hein
Anne L. Martel
Peter Mattson
Erik Meijering
Bjoern Menze
David Moher
Karel G.M. Moons
Henning Müller
Felix Nickel
Jens Petersen
Nasir Rajpoot
Nicola Rieke
Julio Saez-Rodriguez
Clarisa Sánchez Gutiérrez
Shravya Shetty
M. Smeden
Carole H. Sudre
Ronald M. Summers
Abdel Aziz Taha
Sotirios A. Tsaftaris
Ben Van Calster
Paul F. Jäger
The impact of gender on pediatric surgical access and outcomes in Africa
Sacha Williams
Olivia Serhan
Jenny Wang
Christian Guindi
Elena Guadagno
Maeve Trudeau
Emmanuel Ameh
Kokila Lakhoo
Girls, whose care is often affected by barriers steeped in gender inequity, may be at higher risk of poor surgical outcomes. This study explored the impact of gender on pediatric surgical care in Africa. Differences in access to care and clinical outcomes for boys and girls were examined for pediatric surgical conditions that do not differ by physiological sex. A systematic review of African pediatric surgical studies ensued, followed by a random effects meta-analysis, and risk of bias assessment. Of the 12281 records retrieved, 54 were selected for review. Most studies were retrospective (57.4%), single-site (94.4%), from Egypt, Nigeria, Ghana, or Ethiopia (55.6%), focussed on gastrointestinal conditions (63.0%), published in 2010 or sooner (85.1%), had study durations of 5 years or less (68.5%), and cohorts of less than 200 children (57.4%). Sixty percent reported the outcome of mortality. Meta-analysis odds ratios revealed surgery was performed 3.6 times more often on boys (95% CI: 2.6, 4.9); and mortality was 1.6 times greater for girls (95% CI: 1.3, 2.0). African girls appear to face gender inequities in pediatric surgical care. Findings will be further explored in a mixed-methods study. Gender disparities in global surgical care have been documented in the African adult population. However, gender-specific differentials in surgical access and outcomes have yet to be documented for African pediatric populations. This study provides first-time evidence of gender inequity in pediatric surgical care in Africa.
The Leukemoid Reaction in Severe Alcoholic Hepatitis: A Case Report
Sachin Agrawal
Sunil Kumar
Sourya Acharya
Deep Learning for Data-Driven Districting-and-Routing
Arthur Ferraz
Cheikh Ahmed
Thibaut Vidal
In-Context Learning Can Re-learn Forbidden Tasks
Despite significant investment into safety training, large language models (LLMs) deployed in the real world still suffer from numerous vulnerabilities. One perspective on LLM safety training is that it algorithmically forbids the model from answering toxic or harmful queries. To assess the effectiveness of safety training, in this work, we study forbidden tasks, i.e., tasks the model is designed to refuse to answer. Specifically, we investigate whether in-context learning (ICL) can be used to re-learn forbidden tasks despite the explicit fine-tuning of the model to refuse them. We first examine a toy example of refusing sentiment classification to demonstrate the problem. Then, we use ICL on a model fine-tuned to refuse to summarise made-up news articles. Finally, we investigate whether ICL can undo safety training, which could represent a major security risk. For the safety task, we look at Vicuna-7B, Starling-7B, and Llama2-7B. We show that the attack works out-of-the-box on Starling-7B and Vicuna-7B but fails on Llama2-7B. Finally, we propose an ICL attack that uses the chat template tokens like a prompt injection attack to achieve a better attack success rate on Vicuna-7B and Starling-7B. Trigger Warning: the appendix contains LLM-generated text with violence, suicide, and misinformation.
When is Momentum Extragradient Optimal? A Polynomial-Based Analysis
Junhyung Lyle Kim
Anastasios Kyrillidis
Fabian Pedregosa
The extragradient method has gained popularity due to its robust convergence properties for differentiable games. Unlike single-objective optimization, game dynamics involve complex interactions reflected by the eigenvalues of the game vector field's Jacobian scattered across the complex plane. This complexity can cause the simple gradient method to diverge, even for bilinear games, while the extragradient method achieves convergence. Building on the recently proven accelerated convergence of the momentum extragradient method for bilinear games (Azizian et al., 2020), we use a polynomial-based analysis to identify three distinct scenarios where this method exhibits further accelerated convergence. These scenarios encompass situations where the eigenvalues reside on the (positive) real line, lie on the real line alongside complex conjugates, or exist solely as complex conjugates. Furthermore, we derive the hyperparameters for each scenario that achieve the fastest convergence rate.
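The claim that the simple gradient method can diverge on a bilinear game while extragradient converges can be checked on the smallest such game, min over x, max over y of x*y. The sketch below uses plain extragradient, not the momentum variant analyzed in the paper. The step size, starting point, and iteration count are arbitrary choices for illustration.

```python
import math


def field(x, y):
    """Game vector field for f(x, y) = x * y: (df/dx, -df/dy)."""
    return y, -x


def gradient_step(x, y, lr):
    """Simultaneous gradient descent-ascent step."""
    gx, gy = field(x, y)
    return x - lr * gx, y - lr * gy


def extragradient_step(x, y, lr):
    """Extrapolate to a look-ahead point, then step using its field."""
    gx, gy = field(x, y)
    x_half, y_half = x - lr * gx, y - lr * gy
    gx, gy = field(x_half, y_half)
    return x - lr * gx, y - lr * gy


def run(step, steps=200, lr=0.1, start=(1.0, 1.0)):
    """Return the distance to the equilibrium (0, 0) after `steps` updates."""
    x, y = start
    for _ in range(steps):
        x, y = step(x, y, lr)
    return math.hypot(x, y)


initial_distance = math.hypot(1.0, 1.0)
gd_distance = run(gradient_step)
eg_distance = run(extragradient_step)
```

For this game each gradient step multiplies the squared distance to the equilibrium by 1 + lr², so the iterates spiral outward. Each extragradient step multiplies it by 1 - lr² + lr⁴, which is below 1 for any 0 < lr < 1.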
Feature learning as alignment: a structural property of gradient descent in non-linear neural networks
Daniel Beaglehole
Atish Agarwala
Understanding the mechanisms through which neural networks extract statistics from input-label pairs through feature learning is one of the most important unsolved problems in supervised learning. Prior works demonstrated that the gram matrices of the weights (the neural feature matrices, NFM) and the average gradient outer products (AGOP) become correlated during training, in a statement known as the neural feature ansatz (NFA). Through the NFA, the authors introduce mapping with the AGOP as a general mechanism for neural feature learning. However, these works do not provide a theoretical explanation for this correlation or its origins. In this work, we further clarify the nature of this correlation, and explain its emergence. We show that this correlation is equivalent to alignment between the left singular structure of the weight matrices and the newly defined pre-activation tangent features at each layer. We further establish that the alignment is driven by the interaction of weight changes induced by SGD with the pre-activation features, and analyze the resulting dynamics analytically at early times in terms of simple statistics of the inputs and labels. We prove the derivative alignment occurs with high probability in specific high dimensional settings. Finally, motivated by the observation that the NFA is driven by this centered correlation, we introduce a simple optimization rule that dramatically increases the NFA correlations at any given layer and improves the quality of features learned.
AICOM-MP: an AI-based Monkeypox Detector for Resource-Constrained Environments
Tianyi Yang
Tianze Yang
Andrew Liu
Na An
Jie Tang
Shaoshan Liu
Xue Liu
Polynomial Lawvere Logic
Giorgio Bacci
Radu Mardare
Gordon D. Plotkin
Toward Human-AI Alignment in Large-Scale Multi-Player Games
Sugandha Sharma
Guy Davidson
Anssi Kanervisto
Udit Arora
Katja Hofmann
Ida Momennejad
Achieving human-AI alignment in complex multi-agent games is crucial for creating trustworthy AI agents that enhance gameplay. We propose a method to evaluate this alignment using an interpretable task-sets framework, focusing on high-level behavioral tasks instead of low-level policies. Our approach has three components. First, we analyze extensive human gameplay data from Xbox's Bleeding Edge (100K+ games), uncovering behavioral patterns in a complex task space. This task space serves as a basis set for a behavior manifold capturing interpretable axes: fight-flight, explore-exploit, and solo-multi-agent. Second, we train an AI agent to play Bleeding Edge using a Generative Pretrained Causal Transformer and measure its behavior. Third, we project human and AI gameplay to the proposed behavior manifold to compare and contrast. This allows us to interpret differences in policy as higher-level behavioral concepts, e.g., we find that while human players exhibit variability in fight-flight and explore-exploit behavior, AI players tend towards uniformity. Furthermore, AI agents predominantly engage in solo play, while humans often engage in cooperative and competitive multi-agent patterns. These stark differences underscore the need for interpretable evaluation, design, and integration of AI in human-aligned applications. Our study advances the alignment discussion in AI and especially generative AI research, offering a measurable framework for interpretable human-agent alignment in multiplayer gaming.
Carthago Delenda Est: Co-opetitive Indirect Information Diffusion Model for Influence Operations on Online Social Media
Jwen Fai Low
Benjamin C. M. Fung
Farkhund Iqbal
Claude Fachkha