Publications
Inferring Metabolic States from Single Cell Transcriptomic Data via Geometric Deep Learning
Retriever-augmented instruction-following models are attractive alternatives to fine-tuned approaches for information-seeking tasks such as question answering (QA). By simply prepending retrieved documents to their input along with an instruction, these models can be adapted to various information domains and tasks without additional fine-tuning. While the model responses tend to be natural and fluent, the additional verbosity makes traditional QA evaluation metrics such as exact match (EM) and F1 unreliable for accurately quantifying model performance. In this work, we investigate the performance of instruction-following models across three information-seeking QA tasks. We use both automatic and human evaluation to evaluate these models along two dimensions: 1) how well they satisfy the user's information need (correctness), and 2) whether they produce a response based on the provided knowledge (faithfulness). Guided by human evaluation and analysis, we highlight the shortcomings of traditional metrics for both correctness and faithfulness. We then propose simple token-overlap based and model-based metrics that reflect the true performance of these models. Our analysis reveals that instruction-following models are competitive, and sometimes even outperform fine-tuned models for correctness. However, these models struggle to stick to the provided knowledge and often hallucinate in their responses. We hope our work encourages a more holistic evaluation of instruction-following models for QA. Our code and data are available at https://github.com/McGill-NLP/instruct-qa
2024-05-16
Transactions of the Association for Computational Linguistics (published)
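The abstract above on instruction-following QA models notes that verbose answers make exact match (EM) and F1 unreliable, and that simple token-overlap metrics can track correctness better. The sketch below illustrates that effect on a single example. It is not the authors' implementation: the `normalize` helper, the SQuAD-style normalization rules, and the choice of token recall as the alternative metric are assumptions made for illustration.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation and English articles, split on whitespace.

    This normalization is an assumption (a common SQuAD-style convention).
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 only if the normalized token sequences are identical."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall; penalizes extra words."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def token_recall(prediction: str, reference: str) -> float:
    """Fraction of reference tokens found in the prediction; ignores verbosity."""
    pred, ref = normalize(prediction), normalize(reference)
    if not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    return overlap / len(ref)


# Hypothetical example: a fluent, correct, but verbose model answer.
reference = "Paris"
prediction = "The capital of France is Paris."

em = exact_match(prediction, reference)
f1 = token_f1(prediction, reference)
recall = token_recall(prediction, reference)
print(f"EM={em:.2f}  F1={f1:.2f}  recall={recall:.2f}")
```

On this example the answer is correct, yet EM is zero and F1 is low because the extra words reduce precision, while recall gives full credit. Recall alone rewards padding, which is one reason the abstract pairs overlap metrics with model-based ones and with a separate faithfulness dimension.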
Plasma RNAemia, delayed antibody responses and inflammation predict COVID-19 outcomes, but the mechanisms underlying these immunovirological patterns are poorly understood. We profile 782 longitudinal plasma samples from 318 hospitalized COVID-19 patients. Integrated analysis using k-means reveals four patient clusters in a discovery cohort: mechanically ventilated critically ill cases are subdivided into good prognosis and high-fatality clusters (reproduced in a validation cohort), while non-critical survivors are delineated by high and low antibody responses. Only the high-fatality cluster is enriched for transcriptomic signatures associated with COVID-19 severity, and each cluster has distinct RBD-specific antibody elicitation kinetics. Both critical and non-critical clusters with delayed antibody responses exhibit sustained IFN signatures, which negatively correlate with contemporaneous RBD-specific IgG levels and absolute SARS-CoV-2-specific B and CD4+ T cell frequencies. These data suggest that the Interferon paradox previously described in murine LCMV models is operative in COVID-19, with excessive IFN signaling delaying development of adaptive virus-specific immunity.
Mastery of Key Performance Indicators (KPIs) in the realm of photovoltaic solar power plants is pivotal for evaluating their effectiveness and fine-tuning their operational efficiency. The assessment of these plants' performance has consistently stood as a focal point in scientific research. Nevertheless, the investigation into the process of selecting a framework for classifying KPIs, particularly through their categorization based on criteria, sub-criteria, or aspects, has been relatively limited in research. This article addresses this gap by conducting a comprehensive literature review on various KPIs and, drawing upon both literature and practical experience, formulating a set of criteria to serve as the foundation for a Multi-Criteria Decision Analysis (MCDA) method. This intricate taxonomic framework enhances the understanding of infrastructure performance for stakeholders in the solar industry. By streamlining decision-making, it simplifies the selection of KPIs tailored to specific requirements, thus mitigating the complexity arising from the abundance of KPIs in the literature. As a result, decision-makers can make well-informed choices regarding the monitoring and evaluation framework that best suits the performance goals of their solar plant.
2024-05-16
2024 4th International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET) (published)
In order to be deployed safely, Large Language Models (LLMs) must be capable of dynamically adapting their behavior based on their level of knowledge and uncertainty associated with specific topics. This adaptive behavior, which we refer to as self-restraint, is non-trivial to teach since it depends on the internal knowledge of an LLM. By default, LLMs are trained to maximize the next token likelihood, which does not teach the model to modulate its answer based on its level of uncertainty. In order to learn self-restraint, we devise a utility function that can encourage the model to produce responses only when it is confident in them. This utility function can be used to score generations of different lengths as well as abstention. To optimize this function, we introduce ReSearch, a process of "self-reflection" consisting of iterative self-prompting and self-evaluation. We use the ReSearch algorithm to generate synthetic data on which we finetune our models. Compared to their original versions, our resulting models generate fewer hallucinations overall at no additional inference cost, for both known and unknown topics, as the model learns to selectively restrain itself. In addition, our method elegantly incorporates the ability to abstain by augmenting the samples generated by the model during the search procedure with an answer expressing abstention.
The capacity for embryonic cells to differentiate relies on a large-scale reprogramming of the oocyte and sperm nucleus into a transient totipotent state. In zebrafish, this reprogramming step is achieved by the pioneer factors Nanog, Pou5f3, and Sox19b (NPS). Yet, it remains unclear whether cells lacking this reprogramming step are directed towards wild-type states or towards novel developmental canals in the Waddington landscape of embryonic development. Here we investigate the developmental fate of embryonic cells mutant for NPS by analyzing their single-cell gene expression profiles. We find that cells lacking the first developmental reprogramming steps can acquire distinct cell states. These states are manifested by gene expression modules that result from a failure of nuclear reprogramming, the persistence of the maternal program, and the activation of somatic compensatory programs. As a result, most mutant cells follow new developmental canals and acquire new mixed cell states in development. In contrast, a group of mutant cells acquire primordial germ cell-like states, suggesting that NPS-dependent reprogramming is dispensable for these cell states. Together, these results demonstrate that developmental reprogramming after fertilization is required to differentiate most canonical developmental programs, and loss of the transient totipotent state canalizes embryonic cells into new developmental states in vivo.
In this work, we propose Salient Sparse Federated Learning (SSFL), a streamlined approach for sparse federated learning with efficient communication. SSFL identifies a sparse subnetwork prior to training, leveraging parameter saliency scores computed separately on local client data in non-IID scenarios, and then aggregated, to determine a global mask. Only the sparse model weights are communicated each round between the clients and the server. We validate SSFL's effectiveness using standard non-IID benchmarks, noting marked improvements in the sparsity-accuracy trade-offs. Finally, we deploy our method in a real-world federated learning framework and report improvement in communication time.
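The mask-selection step described in the SSFL abstract (saliency computed per client, aggregated, then thresholded into one global mask) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SSFL implementation: the saliency definition |w * dL/dw| on a linear least-squares model, the dataset-size weighting of client scores, the top-k selection rule, and the names `local_saliency` and `global_mask` are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def local_saliency(weights: np.ndarray, X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Per-parameter saliency |w * dL/dw| on one client's data.

    Assumes a linear model with mean squared error loss (illustrative only).
    """
    residual = X @ weights - y
    grad = X.T @ residual / len(y)
    return np.abs(weights * grad)


def global_mask(saliencies, sizes, sparsity: float) -> np.ndarray:
    """Average client saliencies (weighted by dataset size, an assumption)
    and keep the top (1 - sparsity) fraction of parameters."""
    agg = np.average(np.stack(saliencies), axis=0,
                     weights=np.asarray(sizes, dtype=float))
    k = max(1, int(round((1.0 - sparsity) * agg.size)))
    keep = np.argsort(agg)[-k:]  # indices of the k largest aggregated scores
    mask = np.zeros(agg.size, dtype=bool)
    mask[keep] = True
    return mask


# Synthetic stand-in for three clients with different amounts of local data.
rng = np.random.default_rng(0)
dim = 10
init_weights = rng.normal(size=dim)  # shared initialization sent to all clients
clients = []
for n in (20, 50, 30):
    X = rng.normal(size=(n, dim))
    y = rng.normal(size=n)
    clients.append((X, y))

# Each client scores parameters locally; the server aggregates once, before training.
saliencies = [local_saliency(init_weights, X, y) for X, y in clients]
mask = global_mask(saliencies, [len(y) for _, y in clients], sparsity=0.7)

# Only the entries selected by the mask would be trained and communicated.
sparse_weights = init_weights * mask
print("kept parameters:", np.flatnonzero(mask))
```

Because the mask is fixed before training, each round would only need to exchange the kept entries, which is where the communication savings mentioned in the abstract would come from.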
Identifying functionally important cell states and structure within a heterogeneous tumor remains a significant biological and computational challenge. Moreover, current clustering or trajectory-based computational models are ill-equipped to address the notion that cancer cells reside along a phenotypic continuum. To address this, we present Archetypal Analysis network (AAnet), a neural network that learns key archetypal cell states within a phenotypic continuum of cell states in single-cell data. Applied to single-cell RNA sequencing data from pre-clinical models and a cohort of 34 clinical breast cancers, AAnet identifies archetypes that resolve distinct biological cell states and processes, including cell proliferation, hypoxia, metabolism and immune interactions. Notably, archetypes identified in primary tumors are recapitulated in matched liver, lung and lymph node metastases, demonstrating that a significant component of intratumoral heterogeneity is driven by cell intrinsic properties. Using spatial transcriptomics as orthogonal validation, AAnet-derived archetypes show discrete spatial organization within tumors, supporting their distinct archetypal biology. We further reveal that ligand:receptor cross-talk between cancer and adjacent stromal cells contributes to intra-archetypal biological mimicry. Finally, we use AAnet archetype identifiers to validate GLUT3 as a critical mediator of a hypoxic cell archetype harboring a cancer stem cell population, which we validate in human triple-negative breast cancer specimens. AAnet is a powerful tool to reveal functional cell states within complex samples from multimodal single-cell data.