Publications

Evaluating and Improving LitLLMs with Deep Research
Abhay Puri
Issam Hadj Laradji
Krishnamurthy Dj Dvijotham
Jason Stanley
Literature reviews are an essential component of scientific research, but they remain time-intensive and challenging to write, especially due to the recent influx of research papers. This paper explores the zero-shot abilities of recent Large Language Models (LLMs) in assisting with the writing of literature reviews based on an abstract. We decompose the task into two components: (1) Retrieving related works given a query abstract and (2) Writing a literature review based on the retrieved results. We analyze how effective LLMs are for both components. For retrieval, we introduce a novel two-step search strategy that first uses an LLM to extract meaningful keywords from the abstract of a paper and then retrieves potentially relevant papers by querying an external knowledge base. Additionally, we study a prompting-based re-ranking mechanism with attribution and show that re-ranking doubles the normalized recall compared to naive search methods while providing insights into the LLM's decision-making process. In the generation phase, we propose a two-step approach that first outlines a plan for the review and then executes steps in the plan to generate the actual review. To evaluate different LLM-based literature review methods, we create test sets from arXiv papers using a protocol designed for rolling use with newly released LLMs to avoid test set contamination in zero-shot evaluations. We release this evaluation protocol to promote additional research and development in this regard. Our empirical results suggest that LLMs show promising potential for writing literature reviews when the task is decomposed into smaller components of retrieval and planning. In particular, our "Deep Research" retrieval variant improves coverage by over 5x compared to standard keyword search, addressing a key bottleneck in the pipeline. Further, we demonstrate that our planning-based approach achieves higher-quality reviews by minimizing hallucinated references in the generated review by 18-26% compared to existing simpler LLM-based generation methods.
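The abstract does not spell out implementation details, but the two-step retrieval with prompting-based re-ranking it describes can be sketched roughly as follows. This is a minimal sketch, assuming an abstract `llm` text-in/text-out callable and the public Semantic Scholar search endpoint as the external knowledge base; the prompts and parsing are illustrative, not the paper's actual pipeline.

```python
# Hypothetical sketch of the two-step retrieval described in the abstract:
# (1) an LLM extracts keywords from a query abstract, (2) an external
# knowledge base is searched, then candidates are re-ranked by prompting.
from typing import Callable
import requests

S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"

def retrieve_related(abstract: str, llm: Callable[[str], str], k: int = 20) -> list[dict]:
    # Step 1: keyword extraction via prompting (prompt wording is illustrative).
    keywords = llm(
        "Extract 3-5 search keywords for finding related work.\n"
        f"Abstract: {abstract}\nKeywords:"
    )
    # Step 2: query an external knowledge base (Semantic Scholar here).
    resp = requests.get(
        S2_SEARCH,
        params={"query": keywords, "limit": k, "fields": "title,abstract"},
        timeout=30,
    )
    candidates = resp.json().get("data", [])
    # Step 3: prompting-based re-ranking; attribution (per-paper reasons)
    # could be requested in the same prompt and logged separately.
    listing = "\n".join(f"[{i}] {c['title']}" for i, c in enumerate(candidates))
    order = llm(
        "Rank these papers by relevance to the abstract, most relevant first. "
        "Reply with the indices only, comma-separated.\n"
        f"Abstract: {abstract}\nPapers:\n{listing}\nRanking:"
    )
    ranked = [int(t) for t in order.replace(",", " ").split() if t.isdigit()]
    ranked = list(dict.fromkeys(ranked))  # drop duplicate indices, keep order
    return [candidates[i] for i in ranked if i < len(candidates)]
```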
Towards a General Recipe for Combinatorial Optimization with Multi-Filter GNNs
Capacity-Constrained Continual Learning
Zheng Wen
Benjamin Van Roy
Satinder Singh
Any agents we can possibly build are subject to capacity constraints, as memory and compute resources are inherently finite. However, comparatively little attention has been dedicated to understanding how agents with limited capacity should allocate their resources for optimal performance. The goal of this paper is to shed some light on this question by studying a simple yet relevant continual learning problem: the capacity-constrained linear-quadratic-Gaussian (LQG) sequential prediction problem. We derive a solution to this problem under appropriate technical conditions. Moreover, for problems that can be decomposed into a set of sub-problems, we also demonstrate how to optimally allocate capacity across these sub-problems in the steady state. We view the results of this paper as a first step in the systematic theoretical study of learning under capacity constraints.
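As a rough illustration of the problem class, a standard LQG sequential-prediction setup with a dimension-capped internal state can be written as below; the specific constraint shown (a fixed internal-state dimension d) is an illustrative reading, and the paper's exact formulation may differ.

```latex
% Illustrative LQG sequential prediction under a capacity constraint
% (the paper's exact formulation may differ).
\begin{align*}
  x_{t+1} &= A x_t + w_t, & w_t &\sim \mathcal{N}(0, W), \\
  y_t     &= C x_t + v_t, & v_t &\sim \mathcal{N}(0, V).
\end{align*}
% A capacity-constrained agent keeps a compressed internal state
% $z_t \in \mathbb{R}^d$, with $d$ fixed by its capacity, and predicts:
\begin{align*}
  z_{t+1} &= F z_t + G y_t, \qquad \hat{y}_{t+1} = H z_{t+1}, \\
  \min_{F, G, H} \;& \limsup_{T \to \infty} \frac{1}{T}
    \sum_{t=1}^{T} \mathbb{E}\!\left[ \lVert y_t - \hat{y}_t \rVert^2 \right].
\end{align*}
```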
Tree semantic segmentation from aerial image time series
A systematic review of risk stratification for pediatric appendicitis
Mahshid Mortazavi
Alexandra Dimmer
Elena Guadagno
Sherif Emil
Sparsity regularization via tree-structured environments for disentangled representations
Many causal systems such as biological processes in cells can only be observed indirectly via measurements, such as gene expression. Causal representation learning, the task of correctly mapping low-level observations to latent causal variables, could advance scientific understanding by enabling inference of latent variables such as pathway activation. In this paper, we develop methods for inferring latent variables from multiple related datasets (environments) and tasks. As a running example, we consider the task of predicting a phenotype from gene expression, where we often collect data from multiple cell types or organisms that are related in known ways. The key insight is that the mapping from latent variables driven by gene expression to the phenotype of interest changes sparsely across closely related environments. To model sparse changes, we introduce Tree-Based Regularization (TBR), an objective that minimizes prediction error while regularizing closely related environments to learn similar predictors. We prove that under assumptions about the degree of sparse changes, TBR identifies the true latent variables up to simple transformations. We evaluate the theory empirically with both simulations and ground-truth gene expression data. We find that TBR recovers the latent causal variables better than related methods across these settings, even when some assumptions of the theory are violated.
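One plausible way to write such an objective, purely for illustration (the paper's exact TBR formulation may differ), is a per-environment prediction loss plus an L1 penalty on predictor differences along the edges of the known environment tree, so that parameters change sparsely between closely related environments:

```latex
% Illustrative tree-structured sparsity objective: per-environment
% predictors $\theta_e$, with $\ell_1$-penalized differences along the
% edges of the environment tree $\mathcal{T}$.
\[
  \min_{\{\theta_e\}} \;
  \sum_{e} \frac{1}{n_e} \sum_{i=1}^{n_e}
    \ell\!\left( f_{\theta_e}\!\left(x_i^{(e)}\right),\, y_i^{(e)} \right)
  \;+\; \lambda \sum_{(e,\, e') \in \mathcal{T}}
    \bigl\lVert \theta_e - \theta_{e'} \bigr\rVert_1
\]
```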
Comparative genomics of Pseudomonas paraeruginosa
Maxime Déraspe
Lori L. Burrows
Romé Voulhoux
Daniela Centrón
Paul H Roy
The PA7-clade (or group 3) of Pseudomonas aeruginosa is now recognized as a distinct species, Pseudomonas paraeruginosa. We report here the genomic sequences of six new strains of P. paraeruginosa: Zw26 (the first complete genome of a cystic fibrosis isolate of P. paraeruginosa), draft genomes of four burn and wound strains from Argentina very closely related to PA7, and of Pa5196, the strain in which arabinosylation of type IV pili was documented. We compared the genomes of 82 strains of P. paraeruginosa and confirmed that the species is divided into two sub-clades. Core genomes are very similar, while most differences are found in "regions of genomic plasticity" (RGPs). Several genomic deletions were identified, and most are common to the CR1 sub-clade that includes Zw26 and Pa5196. All strains lack the type 3 secretion system (T3SS) and instead use an alternative virulence strategy involving an exolysin, a characteristic shared with group 5 P. aeruginosa. All strains tend to be multiresistant like PA7, with a significant proportion of carbapenem-resistant strains, either oprD mutants or carrying carbapenemase genes. Although P. paraeruginosa is still relatively rare, it has a worldwide distribution. Its multiresistance and its alternative virulence strategy need to be considered in future therapeutic development. IMPORTANCE Pseudomonas aeruginosa is an important opportunistic pathogen causing respiratory infections, notably in cystic fibrosis, and burn and wound infections. Our study reports six new genomes of Pseudomonas paraeruginosa, a new species recently reported as distinct from P. aeruginosa. The number of sequenced genomes of P. paraeruginosa is only about 1% that of P. aeruginosa. We compare the genomic content of nearly all strains of P. paraeruginosa in GenBank, highlighting the differences in core and accessory genomes, antimicrobial resistance genes, and virulence factors. This novel species is very similar in environmental spectrum to P. aeruginosa but is notably resistant to last-line antibiotics and uses an alternative virulence strategy based on exolysin; this strategy is shared with some P. aeruginosa outliers.
ReCatcher: Towards LLMs Regression Testing for Code Generation
Altaf Allah Abbassi
Leuson Da Silva
Amin Nikanjam
Large Language Models (LLMs) for code generation evolve rapidly through fine-tuning, merging, or new model releases. However, such updates can introduce regressions, not only in correctness but also in code quality and performance. To address this, we present ReCatcher, a regression testing framework for Python code generation. ReCatcher systematically compares two LLMs, typically a current model and a candidate update, across three dimensions: logical correctness, static code quality, and execution performance. We apply ReCatcher to assess regressions across three update scenarios, fine-tuning, merging, and model release, using CodeLlama, DeepSeek-Coder, and GPT-4o. Our evaluation shows that fine-tuning with cross-language datasets increases syntax errors by up to 12%. Merging with general-purpose models like Llama2 leads to regressions in correctness by up to 18%. GPT-4o introduces regressions of up to 50% in handling missing imports compared to GPT-3.5-turbo, while GPT-4o-mini suffers up to 80% performance degradation in execution time versus GPT-4o. Overall, logical correctness, performance, and error handling (e.g., syntax errors and missing imports) are the most regression-prone areas. Compared with baseline solutions, ReCatcher achieves better and more consistent accuracy across logical and performance aspects. ReCatcher highlights the importance of systematic regression evaluation before adopting new models, while assisting researchers and practitioners in making more informed update decisions.
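ReCatcher's actual interface is not described in the abstract; the hypothetical harness below only illustrates the three comparison dimensions (logical correctness via tests, static quality via parsing, execution time as a performance proxy), with all names invented for the sketch.

```python
# Hypothetical sketch of ReCatcher-style regression comparison between a
# current model and a candidate update. Names and thresholds are
# illustrative, not ReCatcher's actual API.
import ast
import time
from typing import Callable

def score(code: str, tests: Callable[[str], bool]) -> dict:
    """Score one generated solution on the three dimensions."""
    try:
        ast.parse(code)                    # static quality: does it parse?
        syntax_ok = True
    except SyntaxError:
        syntax_ok = False
    start = time.perf_counter()
    correct = syntax_ok and tests(code)    # logical correctness via tests
    runtime = time.perf_counter() - start  # crude execution-time proxy
    return {"syntax_ok": syntax_ok, "correct": correct, "runtime": runtime}

def find_regressions(current: Callable[[str], str],
                     candidate: Callable[[str], str],
                     suite: list[tuple[str, Callable[[str], bool]]]) -> list[dict]:
    """Flag tasks where the candidate model regresses against the current one."""
    regressions = []
    for prompt, tests in suite:
        a, b = score(current(prompt), tests), score(candidate(prompt), tests)
        lost_correctness = a["correct"] and not b["correct"]
        much_slower = b["runtime"] > 2 * a["runtime"]  # arbitrary threshold
        if lost_correctness or much_slower:
            regressions.append({"prompt": prompt, "current": a, "candidate": b})
    return regressions
```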
Multiscale Neural PDE Surrogates for Prediction and Downscaling: Application to Ocean Currents
Abdessamad El-Kabid
Redouane Lguensat
Alex Hernández-García
Accurate modeling of physical systems governed by partial differential equations is a central challenge in scientific computing. In oceanography, high-resolution current data are critical for coastal management, environmental monitoring, and maritime safety. However, available satellite products, such as Copernicus data for sea water velocity at ~0.08 degrees spatial resolution, and global ocean models often lack the spatial granularity required for detailed local analyses. In this work, we (a) introduce a supervised deep learning framework based on neural operators for solving PDEs and providing arbitrary-resolution solutions, and (b) propose downscaling models with an application to Copernicus ocean current data. Additionally, our method can model surrogate PDEs and predict solutions at arbitrary resolution, regardless of the input resolution. We evaluate our model on real-world Copernicus ocean current data and synthetic Navier-Stokes simulation datasets.
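A minimal sketch of the arbitrary-resolution idea, assuming a coordinate-conditioned decoder over an encoded coarse field; the actual neural-operator architecture used in the paper is not specified here, so every layer choice below is an assumption.

```python
# Illustrative coordinate-conditioned downscaler: encode a coarse velocity
# field, then query values at arbitrary continuous coordinates, so the
# output grid can be finer than the input. Architecture is an assumption.
import torch
import torch.nn as nn

class CoordinateDownscaler(nn.Module):
    def __init__(self, channels: int = 2, hidden: int = 64):
        super().__init__()
        # Encoder over the low-resolution field (e.g., coarse currents).
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.GELU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.GELU(),
            nn.AdaptiveAvgPool2d(1),       # global context vector
        )
        # Decoder maps (context, query coordinate) -> field value there.
        self.decoder = nn.Sequential(
            nn.Linear(hidden + 2, hidden), nn.GELU(),
            nn.Linear(hidden, channels),
        )

    def forward(self, coarse: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # coarse: (B, C, H, W) low-res field; coords: (B, N, 2) query points.
        ctx = self.encoder(coarse).flatten(1)              # (B, hidden)
        ctx = ctx.unsqueeze(1).expand(-1, coords.shape[1], -1)
        return self.decoder(torch.cat([ctx, coords], dim=-1))  # (B, N, C)

# Usage: predict a 4x finer grid from a 16x16 coarse patch.
model = CoordinateDownscaler()
coarse = torch.randn(1, 2, 16, 16)
ys, xs = torch.meshgrid(torch.linspace(0, 1, 64),
                        torch.linspace(0, 1, 64), indexing="ij")
coords = torch.stack([ys, xs], dim=-1).reshape(1, -1, 2)
fine = model(coarse, coords)                               # (1, 4096, 2)
```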