Publications

Advancing Science- and Evidence-based AI Policy
Rishi Bommasani
Sanjeev Arora
Jennifer Chayes
Yejin Choi
Mariano-Florentino Cuéllar
Li Fei-Fei
Daniel E. Ho
Daniel Jurafsky
Sanmi Koyejo
Hima Lakkaraju
Arvind Narayanan
Alondra Nelson
Emma Pierson
Scott Singer
Suresh Venkatasubramanian
Ion Stoica
Percy Liang
Dawn Song
Policy must be informed by, but also facilitate the generation of, scientific evidence.
Computing Approximate Nash Equilibria for Integer Programming Games
Aloïs Duguet
Gabriele Dragotto
Sandra Ulrich Ngueveu
Co-Producing AI: Toward an Augmented, Participatory Lifecycle
Rashid A. Mushkani
Toumadher Ammar
Cassandre Chatonnier
Despite efforts to mitigate the inherent risks and biases of artificial intelligence (AI) algorithms, these algorithms can disproportionately impact culturally marginalized groups. A range of approaches has been proposed to address or reduce these risks, including the development of ethical guidelines and principles for responsible AI, as well as technical solutions that promote algorithmic fairness. Drawing on design justice, expansive learning theory, and recent empirical work on participatory AI, we argue that mitigating these harms requires a fundamental re‑architecture of the AI production pipeline. This re‑design should center co‑production, diversity, equity, inclusion (DEI), and multidisciplinary collaboration. We introduce an augmented AI lifecycle consisting of five interconnected phases: co‑framing, co‑design, co‑implementation, co‑deployment, and co‑maintenance. The lifecycle is informed by four multidisciplinary workshops and grounded in themes of distributed authority and iterative knowledge exchange. Finally, we relate the proposed lifecycle to several leading ethical frameworks and outline key research questions that remain for scaling participatory governance.
Evaluating and Improving LitLLMs with Deep Research
Issam Hadj Laradji
Krishnamurthy Dj Dvijotham
Jason Stanley
Literature reviews are an essential component of scientific research, but they remain time-intensive and challenging to write, especially due to the recent influx of research papers. This paper explores the zero-shot abilities of recent Large Language Models (LLMs) in assisting with the writing of literature reviews based on an abstract. We decompose the task into two components: (1) Retrieving related works given a query abstract and (2) Writing a literature review based on the retrieved results. We analyze how effective LLMs are for both components. For retrieval, we introduce a novel two-step search strategy that first uses an LLM to extract meaningful keywords from the abstract of a paper and then retrieves potentially relevant papers by querying an external knowledge base. Additionally, we study a prompting-based re-ranking mechanism with attribution and show that re-ranking doubles the normalized recall compared to naive search methods while providing insights into the LLM's decision-making process. In the generation phase, we propose a two-step approach that first outlines a plan for the review and then executes steps in the plan to generate the actual review. To evaluate different LLM-based literature review methods, we create test sets from arXiv papers using a protocol designed for rolling use with newly released LLMs to avoid test set contamination in zero-shot evaluations. We release this evaluation protocol to promote additional research and development in this regard. Our empirical results suggest that LLMs show promising potential for writing literature reviews when the task is decomposed into smaller components of retrieval and planning. Particularly, our "Deep Research" retrieval variant improves coverage by over 5x compared to standard keyword search, addressing a key bottleneck in the pipeline. Further, we demonstrate that our planning-based approach achieves higher-quality reviews by minimizing hallucinated references in the generated review by 18-26% compared to existing simpler LLM-based generation methods.
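The two-step retrieval strategy described in the abstract (LLM keyword extraction followed by a knowledge-base query and re-ranking) can be sketched as follows. This is an illustrative toy, not the paper's implementation: `llm_extract_keywords` stands in for an LLM prompt and uses a crude word-length heuristic so the example runs without any model or API access, and the corpus is a hypothetical in-memory dictionary rather than a real knowledge base.

```python
# Hypothetical sketch of LLM-keyword retrieval; the LLM call is mocked.

def llm_extract_keywords(abstract: str, k: int = 3) -> list[str]:
    """Placeholder for an LLM keyword-extraction prompt: keep the k
    longest distinct words, a crude proxy for salient terms."""
    words = {w.strip(".,").lower() for w in abstract.split()}
    return sorted(words, key=lambda w: (-len(w), w))[:k]

def retrieve(abstract: str, corpus: dict[str, str], top_n: int = 2) -> list[str]:
    """Score each candidate paper by keyword overlap and return the
    top_n paper ids; the paper's re-ranking step would further reorder
    these with an attribution prompt."""
    keywords = llm_extract_keywords(abstract)
    def score(text: str) -> int:
        return sum(kw in text.lower() for kw in keywords)
    ranked = sorted(corpus, key=lambda pid: score(corpus[pid]), reverse=True)
    return ranked[:top_n]

corpus = {
    "p1": "A survey of retrieval-augmented generation for literature review writing.",
    "p2": "Convolutional networks for image classification.",
    "p3": "Keyword-based retrieval from scholarly knowledge bases.",
}
query = "Automating literature review writing with retrieval from knowledge bases."
print(retrieve(query, corpus))  # prints ['p1', 'p3']
```

In the actual system, the mocked extraction step would be an LLM prompt and the corpus lookup would be a query against an external scholarly knowledge base.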
Computational Tracking of Cell Origins Using CellSexID from Single-Cell Transcriptomes
Huilin Tai
Qian Li
Jingtao Wang
Jiahui Tan
Bowen Zhao
Ryann Lang
Basil J. Petrof
Cell tracking in chimeric models is essential yet challenging, particularly in developmental biology, regenerative medicine, and transplantation research. Existing methods such as fluorescent labeling and genetic barcoding are technically demanding, costly, and often impractical for dynamic or heterogeneous tissues. Here, we introduce CellSexID, a computational framework that leverages sex as a surrogate marker for cell origin inference. Using a machine learning model trained on single-cell transcriptomic data, CellSexID accurately predicts the sex of individual cells, enabling in silico distinction between donor and recipient cells in sex-mismatched settings. The model identifies minimal sex-linked gene sets through ensemble feature selection and has been validated using both public datasets and experimental flow sorting, confirming the biological relevance of predicted populations. We further demonstrate CellSexID’s applicability beyond chimeric models, including organ transplantation and multiplexed sample demultiplexing. As a scalable and cost-effective alternative to physical labeling, CellSexID facilitates precise cell tracking and supports diverse biomedical applications involving mixed cellular origins.
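The core idea above (predict each cell's sex from sex-linked genes, then map sex to donor or recipient in a sex-mismatched chimera) can be sketched with a toy voting rule. The marker genes shown (XIST for female cells, DDX3Y and RPS4Y1 on the Y chromosome) are well-known sex markers but are assumptions here, not the paper's selected panel, and the threshold rule stands in for the trained machine-learning model.

```python
# Toy sketch of sex-based origin inference; the real framework trains
# a classifier over an ensemble-selected gene panel.

FEMALE_MARKERS = ["XIST"]           # expressed from the inactive X
MALE_MARKERS = ["DDX3Y", "RPS4Y1"]  # Y-linked genes

def predict_sex(cell: dict[str, float]) -> str:
    """Vote by summed marker expression per cell."""
    f = sum(cell.get(g, 0.0) for g in FEMALE_MARKERS)
    m = sum(cell.get(g, 0.0) for g in MALE_MARKERS)
    return "F" if f > m else "M"

def assign_origin(cell: dict[str, float], donor_sex: str = "M") -> str:
    """In a sex-mismatched transplant, predicted sex identifies origin."""
    return "donor" if predict_sex(cell) == donor_sex else "recipient"

cells = [
    {"XIST": 5.2, "DDX3Y": 0.1},                 # female-like profile
    {"XIST": 0.0, "DDX3Y": 3.4, "RPS4Y1": 1.1},  # male-like profile
]
print([assign_origin(c) for c in cells])  # prints ['recipient', 'donor']
```

The in silico step replaces physical labeling: once per-cell sex is predicted, donor/recipient assignment is a simple lookup against the known sexes of donor and recipient.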
Towards a General Recipe for Combinatorial Optimization with Multi-Filter GNNs
Capacity-Constrained Continual Learning
Zheng Wen
Benjamin Van Roy
Satinder Singh
Any agents we can possibly build are subject to capacity constraints, as memory and compute resources are inherently finite. However, comparatively little attention has been dedicated to understanding how agents with limited capacity should allocate their resources for optimal performance. The goal of this paper is to shed some light on this question by studying a simple yet relevant continual learning problem: the capacity-constrained linear-quadratic-Gaussian (LQG) sequential prediction problem. We derive a solution to this problem under appropriate technical conditions. Moreover, for problems that can be decomposed into a set of sub-problems, we also demonstrate how to optimally allocate capacity across these sub-problems in the steady state. We view the results of this paper as a first step in the systematic theoretical study of learning under capacity constraints.
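For readers unfamiliar with the setting, the standard LQG sequential prediction setup can be written as follows; the notation is the conventional one and is assumed here, not taken from the paper:

```latex
% Linear-Gaussian latent dynamics with noisy observations.
% The agent predicts y_{t+1} from a capacity-limited summary of the past.
x_{t+1} = A x_t + w_t, \qquad w_t \sim \mathcal{N}(0, W), \\
y_t = C x_t + v_t, \qquad v_t \sim \mathcal{N}(0, V).
```

The capacity constraint limits the resources (for instance, the dimension of the agent's internal state) available for forming the prediction of the next observation.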
Tree semantic segmentation from aerial image time series
A systematic review of risk stratification for pediatric appendicitis
Mahshid Mortazavi
Alexandra Dimmer
Elena Guadagno
Sherif Emil
Sparsity regularization via tree-structured environments for disentangled representations
Many causal systems such as biological processes in cells can only be observed indirectly via measurements, such as gene expression. Causal representation learning---the task of correctly mapping low-level observations to latent causal variables---could advance scientific understanding by enabling inference of latent variables such as pathway activation. In this paper, we develop methods for inferring latent variables from multiple related datasets (environments) and tasks. As a running example, we consider the task of predicting a phenotype from gene expression, where we often collect data from multiple cell types or organisms that are related in known ways. The key insight is that the mapping from latent variables driven by gene expression to the phenotype of interest changes sparsely across closely related environments. To model sparse changes, we introduce Tree-Based Regularization (TBR), an objective that minimizes both prediction error and regularizes closely related environments to learn similar predictors. We prove that under assumptions about the degree of sparse changes, TBR identifies the true latent variables up to some simple transformations. We evaluate the theory empirically with both simulations and ground-truth gene expression data. We find that TBR recovers the latent causal variables better than related methods across these settings, even when some assumptions of the theory are violated.
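The TBR objective described above might be sketched as follows; the symbols are illustrative assumptions, not the paper's notation. Environments sit at the leaves of a known tree, each node $n$ carries a predictor $\theta_n$, and edges penalize changes between parent and child:

```latex
% Hypothetical form of Tree-Based Regularization: prediction loss at
% the leaf environments plus an l1 penalty that makes predictors
% change sparsely along tree edges.
\min_{\{\theta_n\}} \;
  \sum_{e \in \mathrm{leaves}} \mathcal{L}_e(\theta_e)
  \; + \; \lambda \sum_{(n \to c) \in \mathrm{edges}} \lVert \theta_c - \theta_n \rVert_1
```

The $\ell_1$ edge penalty encodes the stated insight that closely related environments should learn similar predictors, differing only in a sparse set of coordinates.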