Publications

CellForge: Agentic Design of Virtual Cell Models

Xiangru Tang

Zhuoyun Yu

Jiapeng Chen

Yan Cui

Yanjun Shao

Weixu Wang

Fang Wu

Yuchen Zhuang

Wenqi Shi

Zhi Huang

Arman Cohan

Xihong Lin

Fabian Theis

Smita Krishnaswamy

Mark B. Gerstein

Virtual cell modeling represents an emerging frontier at the intersection of artificial intelligence and biology, aiming to predict quantities such as responses to diverse perturbations quantitatively. However, autonomously building computational models for virtual cells is challenging due to the complexity of biological systems, the heterogeneity of data modalities, and the need for domain-specific expertise across multiple disciplines. Here, we introduce CellForge, an agentic system that leverages a multi-agent framework that transforms presented biological datasets and research objectives directly into optimized computational models for virtual cells. More specifically, given only raw single-cell multi-omics data and task descriptions as input, CellForge outputs both an optimized model architecture and executable code for training virtual cell models and inference. The framework integrates three core modules: Task Analysis for presented dataset characterization and relevant literature retrieval, Method Design, where specialized agents collaboratively develop optimized modeling strategies, and Experiment Execution for automated generation of code. The agents in the Design module are separated into experts with differing perspectives and a central moderator, and have to collaboratively exchange solutions until they achieve a reasonable consensus. We demonstrate CellForge's capabilities in single-cell perturbation prediction, using six diverse datasets that encompass gene knockouts, drug treatments, and cytokine stimulations across multiple modalities. CellForge consistently outperforms task-specific state-of-the-art methods. Overall, CellForge demonstrates how iterative interaction between LLM agents with differing perspectives provides better solutions than directly addressing a modeling challenge. Our code is publicly available at https://github.com/gersteinlab/CellForge.

2025-08-04

ArXiv (preprint)

Infrared Object Detection with Ultra Small ConvNets: Is ImageNet Pretraining Still Useful?

Srikanth Muralidharan

Heitor Rapela Medeiros

Masih Aminbeidokhti

Eric Granger

Marco Pedersoli

Many real-world applications require recognition models that are robust to different operational conditions and modalities, but at the same time run on small embedded devices, with limited hardware. While for normal size models, pre-training is known to be very beneficial in accuracy and robustness, for small models, that can be employed for embedded and edge devices, its effect is not clear. In this work, we investigate the effect of ImageNet pretraining on increasingly small backbone architectures (ultra-small models, with

2025-08-04

ArXiv (preprint)

A Guide to Misinformation Detection Data and Evaluation

Camille Thibault

Jacob-Junqi Tian

Gabrielle Péloquin-Skulski

Taylor Lynn Curtis

James Zhou

Florence Laflamme

Luke Yuxiang Guan

Jean-François Godbout

Kellin Pelrine

Misinformation is a complex societal issue, and mitigating solutions are difficult to create due to data deficiencies. To address this problem, we have curated the largest collection of (mis)information datasets in the literature, totaling 75. From these, we evaluated the quality of all of the 36 datasets that consist of statements or claims, as well as the 9 datasets that consist of data in purely paragraph form. We assess these datasets to identify those with solid foundations for empirical work and those with flaws that could result in misleading and non-generalizable results, such as insufficient label quality or spurious correlations. We further provide state-of-the-art baselines on all these datasets, but show that regardless of label quality, categorical labels may no longer give an accurate evaluation of detection model performance. We discuss alternatives to mitigate this problem. Overall, this guide aims to provide a roadmap for obtaining higher-quality data and conducting more effective evaluations, ultimately improving research in misinformation detection. All datasets and other artifacts are available at [anonymized].

2025-08-03

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (published)

Responsible AI Day

Ebrahim Bagheri

Faezeh Ensan

Calvin Hillis

Robin Cohen

Benjamin Fung

Sébastien Gambs

2025-08-03

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (published)

Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models

Istabrak Abbes

Gopeshh Subbaraj

Matthew D Riemer

Nizar Islah

Benjamin Therien

Tsuguchika Tabaru

Hiroaki Kingetsu

Sarath Chandar

Irina Rish

Training large language models (LLMs) typically involves pre-training on massive corpora, only to restart the process entirely when new data becomes available. A more efficient and resource-conserving approach would be continual pre-training, where models are updated with new data rather than retraining from scratch. However, the introduction of new data often causes distribution shifts, leading to performance degradation on previously learned tasks. In this paper, we take a deeper look at two popular proposals for addressing this distribution shift within the continual learning literature: experience replay and gradient alignment. We consider continual pre-training of models within the Llama family of architectures at a large scale across languages with 100 billion tokens of training data in each language, finding that both replay and gradient alignment lead to more stable learning without forgetting. This conclusion holds both as we vary the model scale and as we vary the number and diversity of tasks. Moreover, we are the first to demonstrate the effectiveness of gradient alignment techniques in the context of LLM pre-training and propose an efficient implementation of meta-experience replay (MER) that imbues experience replay with the benefits of gradient alignment despite negligible compute and memory overhead. Our scaling analysis across model sizes and replay rates indicates that small rates of replaying old examples are definitely a more valuable use of compute than investing in model size, but that it is more compute efficient to scale the size of the model than invest in high rates of replaying old examples.

2025-08-03

ArXiv (preprint)