Publications

FairFLRep: Fairness aware fault localization and repair of Deep Neural Networks
Moses Openja
Paolo Arcaini
Fuyuki Ishikawa
EZH2 Inhibition Induces an Integrated Stress Response Driving Glutamine-Dependent Vulnerability in TNBC
Lucas Porras
Marina Fukano
Ann-Sophie Gironne
Elise Quadri
Gabriel Alzial
Hugo Philippeau
Yousef Aleassa
Anie Monast
Faustine Gorse
Myriame Saint-Arnaud
Mariana De Sa Tavares Russo
Sylvie Mader
Daina Avizonis
Morag Park
Geneviève Deblois
EZH2, the catalytic subunit of Polycomb Repressive Complex II, is highly expressed and associated with poor prognosis in triple-negative breast cancer (TNBC). Despite inducing significant changes in chromatin profiles and gene expression, EZH2 inhibition in TNBC models has limited impact on growth, suggesting adaptive compensatory mechanisms. Here, we demonstrate that EZH2 inhibition induces accumulation of double-stranded RNA and misfolded proteins in TNBC, activating an integrated stress response (ISR) via the PKR/PERK-eIF2α pathway. We identify Activating Transcription Factor 4 (ATF4) as a key effector upon EZH2 inhibition, driving metabolic changes characterized by increased amino acid uptake and glutamine dependency. Targeting this ISR-ATF4-mediated metabolic response using a glutaminase inhibitor in combination with EZH2 inhibition significantly impairs TNBC cell proliferation and tumor progression. These findings reveal a stress-driven metabolic adaptation that enables TNBC survival upon EZH2 blockade, highlighting inhibition of this pathway as a strategy to enhance the efficacy of EZH2 inhibitors in TNBC.
An AI system to help scientists write expert-level empirical software
Eser Aygün
Gheorghe Comanici
Marc Coram
Hao Cui
Jake Garrison
Renee Johnston
Anton Kast
Cory Y. McLean
Peter C. Norgaard
Zahra Shamsi
David Smalling
James Thompson
Subhashini Venugopalan
Brian P Williams
Chujun He
Sarah Martinson
Martyna Plomecka
Lai Wei
Yuchen Zhou
Qian-Ze Zhu
Matthew Abraham
Erica Brand
Anna Bulanova
Jeffrey A. Cardille
Chris Co
Scott Ellsworth
Grace Joseph
Malcolm Kane
Ryan K. Krueger
Johan Kartiwa
D. Liebling
Jan-Matthis Lueckmann
Paul Raccuglia
Xuefei Wang
Katherine Chou
James Manyika
Yossi Matias
J.C. Platt
Lizzie Dorfman
Shibl Mourad
Michael P. Brenner
The cycle of scientific discovery is frequently bottlenecked by the slow, manual creation of software to support computational experiments. To address this, we present an AI system that creates expert-level scientific software whose goal is to maximize a quality metric. The system uses a Large Language Model (LLM) and Tree Search (TS) to systematically improve the quality metric and intelligently navigate the large space of possible solutions. The system achieves expert-level results when it explores and integrates complex research ideas from external sources. The effectiveness of tree search is demonstrated across a wide range of benchmarks. In bioinformatics, it discovered 40 novel methods for single-cell data analysis that outperformed the top human-developed methods on a public leaderboard. In epidemiology, it generated 14 models that outperformed the CDC ensemble and all other individual models for forecasting COVID-19 hospitalizations. Our method also produced state-of-the-art software for geospatial analysis, neural activity prediction in zebrafish, time series forecasting and numerical solution of integrals. By devising and implementing novel solutions to diverse tasks, the system represents a significant step towards accelerating scientific progress.
Task Robustness via Re-Labelling Vision-Action Robot Data
Massive Extremely High-velocity Outflow in the Quasar J164653.72+243942.2
Paola Rodríguez Hidalgo
Hyunseop Choi (최현섭)
Patrick B. Hall
Karen M. Leighly
Liliana Flores
Mikel M. Charles
Cora DeFrancesco
We present the analysis of one of the most extreme quasar outflows found to date in our survey of extremely high velocity outflows (EHVO). J164653.72+243942.2 (z ~ 3.04) shows variable CIV1548,1551 absorption at speeds larger than 0.1c, accompanied by SiIV, NV and Lya, and disappearing absorption at lower speeds. We perform absorption measurements using the Apparent Optical Depth method and SimBAL. We find the absorption to be very broad (Δv ~ 35,100 km/s in the first epoch and ~ 13,000 km/s in the second one) and fast (vmax ~ -50,200 km/s and -49,000 km/s, respectively). We measure large column densities (
Warming Up for Zeroth-Order Federated Pre-Training with Low Resource Clients
Federated learning enables collaborative model training across numerous edge devices without requiring participants to share data; however, memory and communication constraints on these edge devices may preclude their participation in training. We consider a setting in which a subset of edge devices are below a critical memory or communication threshold required to conduct model updates. Under typical federated optimization algorithms, these devices are excluded from training, which renders their data inaccessible and increases system-induced bias. We are inspired by MeZO, a zeroth-order method used for memory-efficient fine-tuning. The increased variance inherent to zeroth-order gradient approximations has relegated previous zeroth-order optimizers exclusively to the domain of fine-tuning, a limitation we seek to correct. We devise a federated, memory-efficient zeroth-order optimizer, ZOWarmUp, that permits zeroth-order training from a random initialization. ZOWarmUp leverages differing client capabilities and careful variance reduction techniques to facilitate participation of under-represented, low-resource clients in model training. Like other federated zeroth-order methods, ZOWarmUp eliminates the need for edge devices to transmit their full gradients to the server and instead relies on only a small set of random seeds, rendering the up-link communication cost negligible. We present experiments using various datasets and model architectures to show that ZOWarmUp is a robust algorithm that can be applied under a wide variety of circumstances. For systems with a high proportion of edge devices that would otherwise be excluded from training, this algorithm provides access to a greater volume and diversity of data, thus improving training outcomes.
Behaviour Discovery and Attribution for Explainable Reinforcement Learning
Rishav
S Ebrahimi Kahou
Learning Laplacian Eigenvectors: a Pre-training Method for Graph Neural Networks
Howard Dai
Nyambura Njenga
Catherine Ma
Ryan Pellico
Ian Adelstein
DIVERS-Bench: Evaluating Language Identification Across Domain Shifts and Code-Switching
Early Deforestation Detection in the Tropics using L-band SAR and Optical multi-sensor data and Bayesian Statistics
Africa I. Flores-Anderson
Jeffrey A. Cardille
Josef Kellndorfer
Franz J. Meyer
Pontus Olofsson
FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation
Yusuf Cem Sübakan
Mirco Ravanelli
Neural audio codecs are a fundamental component of modern generative audio pipelines. Although recent codecs achieve strong low-bitrate reconstruction and provide powerful representations for downstream tasks, most are non-streamable, limiting their use in real-time applications. We present FocalCodec-Stream, a hybrid codec based on focal modulation that compresses speech into a single binary codebook at 0.55 - 0.80 kbps with a theoretical latency of 80 ms. Our approach combines multi-stage causal distillation of WavLM with targeted architectural improvements, including a lightweight refiner module that enhances quality under latency constraints. Experiments show that FocalCodec-Stream outperforms existing streamable codecs at comparable bitrates, while preserving both semantic and acoustic information. The result is a favorable trade-off between reconstruction quality, downstream task performance, latency, and efficiency. Code and checkpoints will be released at https://github.com/lucadellalib/focalcodec.
Identifying birdsong syllables without labelled data
Identifying sequences of syllables within birdsongs is key to tackling a wide array of challenges, including bird individual identification and better understanding of animal communication and sensory-motor learning. Recently, machine learning approaches have demonstrated great potential to alleviate the need for experts to label long audio recordings by hand. However, they still typically rely on the availability of labelled data for model training, restricting applicability to a few species and datasets. In this work, we build the first fully unsupervised algorithm to decompose birdsong recordings into sequences of syllables. We first detect syllable events, then cluster them to extract templates -- syllable representations -- before performing matching pursuit to decompose the recording as a sequence of syllables. We evaluate our automatic annotations against human labels on a dataset of Bengalese finch songs and find that our unsupervised method achieves high performance. We also demonstrate that our approach can distinguish individual birds within a species through their unique vocal signatures, for both Bengalese finches and another species, the great tit.