Publications

Comparative genomics of
<i>Pseudomonas paraeruginosa</i>
Maxime Déraspe
Lori L. Burrows
Romé Voulhoux
Daniela Centrón
J. Corbeil
Paul H Roy
Comparative genomics of Pseudomonas paraeruginosa
Maxime Déraspe
Lori L. Burrows
Romé Voulhoux
Daniela Centrón
J. Corbeil
Paul H Roy
ABSTRACT The PA7-clade (or group 3) of Pseudomonas aeruginosa is now recognized as a distinct species, Pseudomonas paraeruginosa. We report … (see more)here the genomic sequences of six new strains of P. paraeruginosa: Zw26 (the first complete genome of a cystic fibrosis isolate of P. paraeruginosa), draft genomes of four burn and wound strains from Argentina very closely related to PA7, and of Pa5196, the strain in which arabinosylation of type IV pili was documented. We compared the genomes of 82 strains of P. paraeruginosa and confirmed that the species is divided into two sub-clades. Core genomes are very similar, while most differences are found in “regions of genomic plasticity” (RGPs). Several genomic deletions were identified, and most are common to the CR1 sub-clade that includes Zw26 and Pa5196. All strains lack the type 3 secretion system (T3SS) and instead use an alternative virulence strategy involving an exolysin, a characteristic shared with group 5 P. aeruginosa. All strains tend to be multiresistant like PA7, with a significant proportion of carbapenem-resistant strains, either oprD mutants or carrying carbapenemase genes. Although P. paraeruginosa is still relatively rare, it has a worldwide distribution. Its multiresistance and its alternative virulence strategy need to be considered in future therapeutic development. IMPORTANCE Pseudomonas aeruginosa is an important opportunistic pathogen causing respiratory infections, notably in cystic fibrosis, and burn and wound infections. Our study reports six new genomes of Pseudomonas paraeruginosa, a new species recently reported as distinct from P. aeruginosa. The number of sequenced genomes of P. paraeruginosa is only about 1% that of P. aeruginosa. We compare the genomic content of nearly all strains of P. paraeruginosa in GenBank, highlighting the differences in core and accessory genomes, antimicrobial resistance genes, and virulence factors. This novel species is very similar in environmental spectrum to P. aeruginosa but is notably resistant to last-line antibiotics and uses an alternative virulence strategy based on exolysin—this strategy being shared with some P. aeruginosa outliers.
ReCatcher: Towards LLMs Regression Testing for Code Generation
Altaf Allah Abbassi
Leuson Da Silva
Amin Nikanjam
TRUTH: Teaching LLMs to Rerank for Truth in Misinformation Detection
WiSE-OD: Benchmarking Robustness in Infrared Object Detection
Heitor Rapela Medeiros
Atif Belal
Masih Aminbeidokhti
Eric Granger
Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation
Meng Cao
Leila Pishdad
Yanshuai Cao
Jackie CK Cheung
Final-answer-based metrics are commonly used for evaluating large language models (LLMs) on math word problems, often taken as proxies for r… (see more)easoning ability. However, such metrics conflate two distinct sub-skills: abstract formulation (capturing mathematical relationships using expressions) and arithmetic computation (executing the calculations). Through a disentangled evaluation on GSM8K and SVAMP, we find that the final-answer accuracy of Llama-3 and Qwen2.5 (1B-32B) without CoT is overwhelmingly bottlenecked by the arithmetic computation step and not by the abstract formulation step. Contrary to the common belief, we show that CoT primarily aids in computation, with limited impact on abstract formulation. Mechanistically, we show that these two skills are composed conjunctively even in a single forward pass without any reasoning steps via an abstract-then-compute mechanism: models first capture problem abstractions, then handle computation. Causal patching confirms these abstractions are present, transferable, composable, and precede computation. These behavioural and mechanistic findings highlight the need for disentangled evaluation to accurately assess LLM reasoning and to guide future improvements.
Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data
Jiaming Zhou
Abbas Ghaddar
Ge Zhang
Yaochen Hu
Soumyasundar Pal
Mark J. Coates
Jianye HAO
B. Wang
Yingxue Zhang
Multiscale Neural PDE Surrogates for Prediction and Downscaling: Application to Ocean Currents
Abdessamad El-Kabid
Redouane Lguensat
Accurate modeling of physical systems governed by partial differential equations is a central challenge in scientific computing. In oceanogr… (see more)aphy, high-resolution current data are critical for coastal management, environmental monitoring, and maritime safety. However, available satellite products, such as Copernicus data for sea water velocity at ~0.08 degrees spatial resolution and global ocean models, often lack the spatial granularity required for detailed local analyses. In this work, we (a) introduce a supervised deep learning framework based on neural operators for solving PDEs and providing arbitrary resolution solutions, and (b) propose downscaling models with an application to Copernicus ocean current data. Additionally, our method can model surrogate PDEs and predict solutions at arbitrary resolution, regardless of the input resolution. We evaluated our model on real-world Copernicus ocean current data and synthetic Navier-Stokes simulation datasets.
AURA: A Multi-Modal Medical Agent for Understanding, Reasoning&Annotation
Recent advancements in Large Language Models (LLMs) have catalyzed a paradigm shift from static prediction systems to agentic AI agents capa… (see more)ble of reasoning, interacting with tools, and adapting to complex tasks. While LLM-based agentic systems have shown promise across many domains, their application to medical imaging remains in its infancy. In this work, we introduce AURA, the first visual linguistic explainability agent designed specifically for comprehensive analysis, explanation, and evaluation of medical images. By enabling dynamic interactions, contextual explanations, and hypothesis testing, AURA represents a significant advancement toward more transparent, adaptable, and clinically aligned AI systems. We highlight the promise of agentic AI in transforming medical image analysis from static predictions to interactive decision support. Leveraging Qwen-32B, an LLM-based architecture, AURA integrates a modular toolbox comprising: (i) a segmentation suite with phase grounding, pathology segmentation, and anatomy segmentation to localize clinically meaningful regions; (ii) a counterfactual image-generation module that supports reasoning through image-level explanations; and (iii) a set of evaluation tools including pixel-wise difference-map analysis, classification, and advanced state-of-the-art components to assess diagnostic relevance and visual interpretability.
Enhancing Changepoint Detection: Penalty Learning through Deep Learning Techniques
Tung L. Nguyen
Tracing Optimization for Performance Modeling and Regression Detection
Kaveh Shahedi
Heng Li
Maxime Lamothe
Software performance modeling plays a crucial role in developing and maintaining software systems. A performance model analytically describe… (see more)s the relationship between the performance of a system and its runtime activities. This process typically examines various aspects of a system's runtime behavior, such as the execution frequency of functions or methods, to forecast performance metrics like program execution time. By using performance models, developers can predict expected performance and thereby effectively identify and address unexpected performance regressions when actual performance deviates from the model's predictions. One common and precise method for capturing performance behavior is software tracing, which involves instrumenting the execution of a program, either at the kernel level (e.g., system calls) or application level (e.g., function calls). However, due to the nature of tracing, it can be highly resource-intensive, making it impractical for production environments where resources are limited. In this work, we propose statistical approaches to reduce tracing overhead by identifying and excluding performance-insensitive code regions, particularly application-level functions, from tracing while still building accurate performance models that can capture performance degradations. By selecting an optimal set of functions to be traced, we can construct optimized performance models that achieve an R-2 score of up to 99% and, sometimes, outperform full tracing models (models using non-optimized tracing data), while significantly reducing the tracing overhead by more than 80% in most cases. Our optimized performance models can also capture performance regressions in our studied programs effectively, demonstrating their usefulness in real-world scenarios. Our approach is fully automated, making it ready to be used in production environments with minimal human effort.
Corrigendum to "Child- and Proxy-reported Differences in Patient-reported Outcome and Experience Measures in Pediatric Surgery: Systematic Review and Meta-analysis" [Journal of Pediatric Surgery 60 (2025) 162172].
Zanib Nafees
Siena O'Neill
Alexandra Dimmer
Elena Guadagno
Julia Ferreira
Nancy Mayo