Publications

Untold stories: A qualitative investigation of patient and family experiences with congenital diaphragmatic hernia.
Alexandra Dimmer
Zanib Nafees
Sabrina Beauseigle
Franco A Carnevale
Elena Guadagno
Pramod Puligandla
In Which Areas of Technical AI Safety Could Geopolitical Rivals Cooperate?
Ben Bucknall
Saad Siddiqui
Robert Trager
Lara Thurnherr
Conor McGurk
Ben Harack
International cooperation is common in AI research, including between geopolitical rivals. While many experts advocate for greater international cooperation on AI safety to address shared global risks, some view cooperation on AI with suspicion, arguing that it can pose unacceptable risks to national security. However, the extent to which cooperation on AI safety poses such risks, as well as provides benefits, depends on the specific area of cooperation. In this paper, we consider technical factors that impact the risks of international cooperation on AI safety research, focusing on the degree to which such cooperation can advance dangerous capabilities, result in the sharing of sensitive information, or provide opportunities for harm. We begin by examining why nations historically cooperate on strategic technologies and analyse current US-China cooperation in AI as a case study. We further argue that existing frameworks for managing associated risks can be supplemented with consideration of key risks specific to cooperation on technical AI safety research. Through our analysis, we find that research into AI verification mechanisms and shared protocols may be suitable areas for such cooperation. Through this analysis we aim to help researchers and governments identify and mitigate the risks of international cooperation on AI safety research, so that the benefits of cooperation can be fully realised.
An Empirical Study on Method-Level Performance Evolution in Open-Source Java Projects
Kaveh Shahedi
Nana Gyambrah
Heng Li
Maxime Lamothe
Performance is a critical quality attribute in software development, yet the impact of method-level code changes on performance evolution remains poorly understood. While developers often make intuitive assumptions about which types of modifications are likely to cause performance regressions or improvements, these beliefs lack empirical validation at a fine-grained level. We conducted a large-scale empirical study analyzing performance evolution in 15 mature open-source Java projects hosted on GitHub. Our analysis encompassed 739 commits containing 1,499 method-level code changes, using Java Microbenchmark Harness (JMH) for precise performance measurement and rigorous statistical analysis to quantify both the significance and magnitude of performance variations. We employed bytecode instrumentation to capture method-specific execution metrics and systematically analyzed four key aspects: temporal performance patterns, code change type correlations, developer and complexity factors, and domain-size interactions. Our findings reveal that 32.7% of method-level changes result in measurable performance impacts, with regressions occurring 1.3 times more frequently than improvements. Contrary to conventional wisdom, we found no significant differences in performance impact distributions across code change categories, challenging risk-stratified development strategies. Algorithmic changes demonstrate the highest improvement potential but carry substantial regression risk. Senior developers produce more stable changes with fewer extreme variations, while code complexity correlates with increased regression likelihood. Domain-size interactions reveal significant patterns, with web server + small projects exhibiting the highest performance instability. Our study provides empirical evidence for integrating automated performance testing into continuous integration pipelines.
Low-Rank Expert Merging for Multi-Source Domain Adaptation in Person Re-Identification
Taha Mustapha Nehdi
Nairouz Mrabah
Atif Belal
Eric Granger
Spatio-Temporal Conditional Diffusion Models for Forecasting Future Multiple Sclerosis Lesion Masks Conditioned on Treatments
Gian Mario Favero
Ge Ya Luo
Douglas Arnold
Christopher Pal
CISO: Species Distribution Modeling Conditioned on Incomplete Species Observations
Hager Radi Abdelwahed
Mélisande Teng
Robin Zbinden
Laura Pollock
Devis Tuia
Species distribution models (SDMs) are widely used to predict species' geographic distributions, serving as critical tools for ecological research and conservation planning. Typically, SDMs relate species occurrences to environmental variables representing abiotic factors, such as temperature, precipitation, and soil properties. However, species distributions are also strongly influenced by biotic interactions with other species, which are often overlooked. While some methods partially address this limitation by incorporating biotic interactions, they often assume symmetrical pairwise relationships between species and require consistent co-occurrence data. In practice, species observations are sparse, and the availability of information about the presence or absence of other species varies significantly across locations. To address these challenges, we propose CISO, a deep learning-based method for species distribution modeling Conditioned on Incomplete Species Observations. CISO enables predictions to be conditioned on a flexible number of species observations alongside environmental variables, accommodating the variability and incompleteness of available biotic data. We demonstrate our approach using three datasets representing different species groups: sPlotOpen for plants, SatBird for birds, and a new dataset, SatButterfly, for butterflies. Our results show that including partial biotic information improves predictive performance on spatially separate test sets. When conditioned on a subset of species within the same dataset, CISO outperforms alternative methods in predicting the distribution of the remaining species. Furthermore, we show that combining observations from multiple datasets can improve performance. CISO is a promising ecological tool, capable of incorporating incomplete biotic information and identifying potential interactions between species from disparate taxa.
Long Range Navigator (LRN): Extending robot planning horizons beyond metric maps
Matt Schmittle
Rohan Baijal
Nathan Hatch
Rosario Scalise
Mateo Guaman Castro
Sidharth Talia
Byron Boots
Siddhartha Srinivasa
Personalized Feature Translation for Expression Recognition: An Efficient Source-Free Domain Adaptation Method
Masoumeh Sharafi
Soufiane Belharbi
Houssem Ben Salem
Ali Etemad
Alessandro Lameiras Koerich
Simon Bacon
Eric Granger
Facial expression recognition (FER) models are employed in many video-based affective computing applications, such as human-computer interaction and healthcare monitoring. However, deep FER models often struggle with subtle expressions and high inter-subject variability, limiting their performance in real-world applications. To improve their performance, source-free domain adaptation (SFDA) methods have been proposed to personalize a pretrained source model using only unlabeled target domain data, thereby avoiding data privacy, storage, and transmission constraints. This paper addresses a challenging scenario where source data is unavailable for adaptation, and only unlabeled target data consisting solely of neutral expressions is available. SFDA methods are not typically designed to adapt using target data from only a single class. Further, using models to generate facial images with non-neutral expressions can be unstable and computationally intensive. In this paper, personalized feature translation (PFT) is proposed for SFDA. Unlike current image translation methods for SFDA, our lightweight method operates in the latent space. We first pre-train the translator on the source domain data to transform the subject-specific style features from one source subject into another. Expression information is preserved by optimizing a combination of expression consistency and style-aware objectives. Then, the translator is adapted on neutral target data, without using source data or image synthesis. By translating in the latent space, PFT avoids the complexity and noise of face expression generation, producing discriminative embeddings optimized for classification. Using PFT eliminates the need for image synthesis, reduces computational overhead (using a lightweight translator), and only adapts part of the model, making the method efficient compared to image-based translation.
Bias-inducing geometries: an exactly solvable data model with fairness implications
Stefano Sarao Mannelli
Federica Gerace
Luca Saglietti
Revealing dynamic temporal trajectories and underlying regulatory networks with Cflows
Manik Kuchroo
Shabarni Gupta
Aarthi Venkat
Chen Liu
Beatriz P. San Juan
Laura Rangel
Brandon Zhu
John G. Lock
Christine L. Chaffer
While single-cell technologies provide snapshots of tumor states, building continuous trajectories and uncovering causative gene regulatory networks remains a significant challenge. We present Cflows, an AI framework that combines neural ODE networks with Granger causality to infer continuous cell state transitions and gene regulatory interactions from static scRNA-seq data. In a new 5-time point dataset capturing tumorsphere development over 30 days, Cflows reconstructs two types of trajectories leading to tumorsphere formation or apoptosis. Trajectory-based cell-of-origin analysis delineated a novel cancer stem cell profile characterized by CD44hi EPCAM+ CAV1+, and uncovered a cell cycle–dependent enrichment of tumorsphere-initiating potential in G2/M or S-phase cells. Cflows uncovers ESRRA as a crucial causal driver of the tumor-forming gene regulatory network. Indeed, ESRRA inhibition significantly reduces tumor growth and metastasis in vivo. Cflows offers a powerful framework for uncovering cellular transitions and dynamic regulatory networks from static single-cell data.
Whither symbols in the era of advanced neural networks?
Thomas L. Griffiths
Brenden M. Lake
R. Thomas McCoy
Ellie Pavlick
Some of the strongest evidence that human minds should be thought about in terms of symbolic systems has been the way they combine ideas, produce novelty, and learn quickly. We argue that modern neural networks -- and the artificial intelligence systems built upon them -- exhibit similar abilities. This undermines the argument that the cognitive processes and representations used by human minds are symbolic, although the fact that these neural networks are typically trained on data generated by symbolic systems illustrates that such systems play an important role in characterizing the abstract problems that human minds have to solve. This argument leads us to offer a new agenda for research on the symbolic basis of human thought.
Persistent Instability in LLM's Personality Measurements: Effects of Scale, Reasoning, and Conversation History
Yorguin-Jose Mantilla-Ramos
Mahmood Hegazy
Alberto Tosato
D. Lemay