Publications
TRUTH: Teaching LLMs to Rerank for Truth in Misinformation Detection
Final-answer-based metrics are commonly used for evaluating large language models (LLMs) on math word problems, often taken as proxies for reasoning ability. However, such metrics conflate two distinct sub-skills: abstract formulation (capturing mathematical relationships using expressions) and arithmetic computation (executing the calculations). Through a disentangled evaluation on GSM8K and SVAMP, we find that the final-answer accuracy of Llama-3 and Qwen2.5 (1B-32B) without CoT is overwhelmingly bottlenecked by the arithmetic computation step and not by the abstract formulation step. Contrary to the common belief, we show that CoT primarily aids in computation, with limited impact on abstract formulation. Mechanistically, we show that these two skills are composed conjunctively even in a single forward pass without any reasoning steps via an abstract-then-compute mechanism: models first capture problem abstractions, then handle computation. Causal patching confirms these abstractions are present, transferable, composable, and precede computation. These behavioural and mechanistic findings highlight the need for disentangled evaluation to accurately assess LLM reasoning and to guide future improvements.
Software performance modeling plays a crucial role in developing and maintaining software systems. A performance model analytically describes the relationship between the performance of a system and its runtime activities. This process typically examines various aspects of a system's runtime behavior, such as the execution frequency of functions or methods, to forecast performance metrics like program execution time. By using performance models, developers can predict expected performance and thereby effectively identify and address unexpected performance regressions when actual performance deviates from the model's predictions. One common and precise method for capturing performance behavior is software tracing, which involves instrumenting the execution of a program, either at the kernel level (e.g., system calls) or application level (e.g., function calls). However, due to the nature of tracing, it can be highly resource-intensive, making it impractical for production environments where resources are limited. In this work, we propose statistical approaches to reduce tracing overhead by identifying and excluding performance-insensitive code regions, particularly application-level functions, from tracing while still building accurate performance models that can capture performance degradations. By selecting an optimal set of functions to be traced, we can construct optimized performance models that achieve an R² score of up to 99% and, sometimes, outperform full tracing models (models using non-optimized tracing data), while significantly reducing the tracing overhead by more than 80% in most cases. Our optimized performance models can also capture performance regressions in our studied programs effectively, demonstrating their usefulness in real-world scenarios. Our approach is fully automated, making it ready to be used in production environments with minimal human effort.
2025-07-21
ACM Transactions on Software Engineering and Methodology (published)
Corrigendum to "Child- and Proxy-reported Differences in Patient-reported Outcome and Experience Measures in Pediatric Surgery: Systematic Review and Meta-analysis" [Journal of Pediatric Surgery 60 (2025) 162172].
Corrigendum to "Virtual Reality for Pediatric Trauma Education - A Preliminary Face and Content Validation Study" [Journal of Pediatric Surgery 60 (2025) 161951].
The impact of statistical adjustment for assay performance on inferences from SARS-CoV-2 serological surveillance studies
Jiacheng Chen
Yuan Yu
Sheila F O’Brien
Carmen L Charlton
Steven J Drews
Jane M Heffernan
Amber M Smith
Yu Nakagama
Yasutoshi Kido
David L Buckeridge
W Alton Russell
Choice of immunoassay influences population seroprevalence estimates. Post hoc adjustments for assay performance could improve comparability of estimates across studies and enable pooled analyses. We assessed post hoc adjustment methods using data from 2021 to 2023 SARS-CoV-2 serosurveillance studies in Alberta, Canada: one that tested 124 008 blood donations using Roche immunoassays (SARS-CoV-2 nucleocapsid total antibody and anti–SARS-CoV-2 S) and another that tested 214 780 patient samples using Abbott immunoassays (SARS-CoV-2 IgG and anti–SARS-CoV-2 S). Comparing datasets, seropositivity for antibodies against nucleocapsid (anti-N) diverged after May 2022 due to differential loss of sensitivity as a function of time since infection. The commonly used Rogan-Gladen adjustment did not reduce this divergence. Regression-based adjustments using the assays’ semiquantitative results produced more similar estimates of anti-N seroprevalence and rolling incidence proportion (proportion of individuals infected in recent months). Seropositivity for antibodies targeting SARS-CoV-2 spike protein was similar without adjustment, and concordance was not improved when applying an alternative, functional threshold. These findings suggest that assay performance substantially impacted population inferences from SARS-CoV-2 serosurveillance studies in the Omicron period. Unlike methods that ignore time-varying assay sensitivity, regression-based methods using the semiquantitative assay resulted in increased concordance in estimated anti-N seropositivity and rolling incidence between cohorts using different assays.
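For readers unfamiliar with the Rogan-Gladen adjustment named in this abstract, the following is a minimal sketch of the standard estimator, which corrects an apparent (test-positive) prevalence using a single fixed sensitivity and specificity. The function name, the clamping to [0, 1], and the numeric inputs are illustrative assumptions and are not taken from the study. Because sensitivity enters as one constant, the sketch also shows why such an adjustment cannot account for sensitivity that declines with time since infection.

```python
def rogan_gladen(apparent_prevalence: float,
                 sensitivity: float,
                 specificity: float) -> float:
    """Standard Rogan-Gladen estimator of true prevalence.

    adjusted = (apparent + specificity - 1) / (sensitivity + specificity - 1)

    The result is clamped to [0, 1]; the clamping is a common practical
    choice and an assumption of this sketch.
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0.0:
        # The assay is no better than chance, so the estimator is undefined.
        raise ValueError("sensitivity + specificity must exceed 1")
    adjusted = (apparent_prevalence + specificity - 1.0) / denominator
    return min(1.0, max(0.0, adjusted))


# Illustrative values only (not from the study): 50% of samples test
# positive with an assay assumed to have 90% sensitivity, 99% specificity.
estimate = rogan_gladen(0.50, sensitivity=0.90, specificity=0.99)
print(f"adjusted seroprevalence: {estimate:.3f}")
```

With a perfect assay (sensitivity and specificity both 1) the estimate equals the apparent prevalence; lower assumed sensitivity pushes the adjusted estimate upward.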
Single-cell spatial transcriptomics such as in-situ hybridization or sequencing technologies can provide subcellular resolution that enables the identification of individual cell identities, locations, and a deep understanding of subcellular mechanisms. However, accurate segmentation and annotation that allows individual cell boundaries to be determined remains a major challenge that limits all the above and downstream insights. Current machine learning methods heavily rely on nuclei or cell body staining, resulting in the significant loss of both transcriptome depth and the limited ability to learn latent representations of spatial colocalization relationships. Here, we propose Bering, a graph deep learning model that leverages transcript colocalization relationships for joint noise-aware cell segmentation and molecular annotation in 2D and 3D spatial transcriptomics data. Graph embeddings for the cell annotation are transferred as a component of multi-modal input for cell segmentation, which is employed to enrich gene relationships throughout the process. To evaluate performance, we benchmarked Bering with state-of-the-art methods and observed significant improvement in cell segmentation accuracies and numbers of detected transcripts across various spatial technologies and tissues. To streamline segmentation processes, we constructed expansive pre-trained models, which yield high segmentation accuracy in new data through transfer learning and self-distillation, demonstrating the generalizability of Bering.
As AI systems take on collaborative roles, they must reason about shared goals and beliefs, not just generate fluent language. The Rational Speech Act (RSA) framework offers a principled approach to pragmatic reasoning, but existing extensions face challenges in scaling to multi-turn, collaborative scenarios. In this paper, we introduce Collaborative Rational Speech Act (CRSA), an information-theoretic (IT) extension of RSA that models multi-turn dialog by optimizing a gain function adapted from rate-distortion theory. This gain is an extension of the gain model that is maximized in the original RSA model but takes into account the scenario in which both agents in a conversation have private information and produce utterances conditioned on the dialog. We demonstrate the effectiveness of CRSA on referential games and template-based doctor-patient dialogs in the medical domain. Empirical results show that CRSA yields more consistent, interpretable, and collaborative behavior than existing baselines, paving the way for more pragmatic and socially aware language agents.
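As background for the abstract above, here is a minimal sketch of the original single-turn RSA recursion (literal listener, pragmatic speaker, pragmatic listener) on a textbook scalar-implicature example. It is not the CRSA model described in the paper; the function names, the toy lexicon, the uniform prior, and the omission of utterance costs are all assumptions made for illustration.

```python
def normalize(weights):
    """Scale non-negative weights to sum to 1 (all zeros stay zeros)."""
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else [0.0] * len(weights)


def rsa(lexicon, prior, alpha=1.0):
    """One round of vanilla RSA with no utterance costs.

    lexicon[u][m] is 1 if utterance u is literally true of meaning m, else 0.
    prior[m] is the listener's prior over meanings.
    Returns (L0, S1, L1): L0 and L1 are indexed [u][m], S1 is indexed [m][u].
    """
    n_u, n_m = len(lexicon), len(prior)
    # Literal listener: L0(m | u) is proportional to truth(u, m) * prior(m).
    L0 = [normalize([lexicon[u][m] * prior[m] for m in range(n_m)])
          for u in range(n_u)]
    # Pragmatic speaker: S1(u | m) is proportional to L0(m | u) ** alpha,
    # i.e. exp(alpha * log L0(m | u)) with zero cost for every utterance.
    S1 = [normalize([L0[u][m] ** alpha for u in range(n_u)])
          for m in range(n_m)]
    # Pragmatic listener: L1(m | u) is proportional to S1(u | m) * prior(m).
    L1 = [normalize([S1[m][u] * prior[m] for m in range(n_m)])
          for u in range(n_u)]
    return L0, S1, L1


# Toy example: meanings = ["some but not all", "all"],
# utterances = ["some", "all"]. "some" is literally true of both meanings.
lexicon = [[1, 1],   # "some"
           [0, 1]]   # "all"
prior = [0.5, 0.5]
L0, S1, L1 = rsa(lexicon, prior)
print("L1('some'):", L1[0])
```

In this toy setting the pragmatic listener hearing "some" shifts probability toward "some but not all", because a speaker who meant "all" would more likely have said "all".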
The immune system’s most basic task is to decide what is “self” and “non-self”, but a precise definition of self versus non-self remains challenging. According to the discontinuity theory of immunity, effector responses depend on how quickly an antigenic stimulus changes: rapid change triggers an immune response, whereas gradual change fosters tolerance. We present a model of adaptive immune dynamics including T cells, Tregs and cytokines that reproduces the hallmarks of the discontinuity theory. The model allows for sharp discrimination between acute and chronic infections based on the growth rate of the immune challenge, and vaccination-like acute dynamics upon presentation of a bolus of immune challenge. We further show that the model behavior only depends on a handful of testable assumptions that we map to geometric constraints in phase space. This suggests that the model properties are generic and robust across alternative mechanistic details. We also examine the impact of multiple concurrent immune challenges in this model, and demonstrate the occurrence of dynamical antagonism, wherein, in some parameter regimes, slow-growing challenges hinder acute responses to fast-growing ones, with further counter-intuitive behaviors for sequential co-infections. Together, these results place the discontinuity theory on firm mathematical footing and encourage further investigation of interferences of multi-agent immune challenges, from chronic viral co-infections to cancer immunoediting.