TRAIL : IA responsable pour les professionnels et les leaders
Apprenez à intégrer des pratique d'IA responsable dans votre organisation avec le programme TRAIL. Inscrivez-vous à la prochaine cohorte qui débutera le 15 avril.
Avantage IA : productivité dans la fonction publique
Apprenez à tirer parti de l’IA générative pour soutenir et améliorer votre productivité au travail. La prochaine cohorte se déroulera en ligne les 28 et 30 avril 2026.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Lecteur Multimédia
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
While neural networks are capable of achieving human-like performance in many tasks such as image classification, the impressive performance… (voir plus) of each model is limited to its own dataset. Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data, thus, increasing data privacy. Diversity in representation space can be vital to a model`s adaptability in varied and difficult domains. In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor. Motivated by the improved predictive performance of ensembles, we propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors with Distinct Backbone Architectures (DBA). Although diversity in feature space is increased, the unconstrained mutual information (MI) maximization may potentially introduce amplification of weak hypotheses. Thus we introduce the Weak Hypothesis Penalization (WHP) regularizer as a mitigation strategy. Our work proposes Penalized Diversity (PD) where the synergy of DBA and WHP is applied to unsupervised source-free domain adaptation for covariate shift. In addition, PD is augmented with a weighted MI maximization objective for label distribution shift. Empirical results on natural, synthetic, and medical domains demonstrate the effectiveness of PD under different distributional shifts.
Uncovering executive function profiles within interindividual variability: A data driven clustering exploration of design fluency in school-aged children
Large collections of high-dimensional data have become nearly ubiquitous across many academic fields and application domains, ranging from b… (voir plus)iology to the humanities. Since working directly with high-dimensional data poses challenges, the demand for algorithms that create low-dimensional representations, or embeddings, for data visualization, exploration, and analysis is now greater than ever. In recent years, numerous embedding algorithms have been developed, and their usage has become widespread in research and industry. This surge of interest has resulted in a large and fragmented research field that faces technical challenges alongside fundamental debates, and it has left practitioners without clear guidance on how to effectively employ existing methods. Aiming to increase coherence and facilitate future work, in this review we provide a detailed and critical overview of recent developments, derive a list of best practices for creating and using low-dimensional embeddings, evaluate popular approaches on a variety of datasets, and discuss the remaining challenges and open problems in the field.
We investigate the robustness of Neural Ratio Estimators (NREs) and Neural Posterior Estimators (NPEs) to distributional shifts in the conte… (voir plus)xt of measuring the abundance of dark matter subhalos using strong gravitational lensing data. While these data-driven inference frameworks can be accurate on test data from the same distribution as the training sets, in real applications, it is expected that simulated training data and true observational data will differ in their distributions. We explore the behavior of a trained NRE and trained sequential NPEs to estimate the population-level parameters of dark matter subhalos from a large sample of images of strongly lensed galaxies with test data presenting distributional shifts within and beyond the bounds of the training distribution in the nuisance parameters (e.g., the background source morphology). While our results show that NREs and NPEs perform well when tested perfectly in distribution, they exhibit significant biases when confronted with slight deviations from the examples seen in the training distribution. This indicates the necessity for caution when applying NREs and NPEs to real astrophysical data, where high-dimensional underlying distributions are not perfectly known.
RadiSeq: a single- and bulk-cell whole-genome DNA sequencing simulator for radiation-damaged cell models
Felix Mathew
Luc Galarneau
J. Kildea
Objective To build and validate a simulation framework to perform single-cell and bulk-cell whole genome sequencing simulation of radiation-… (voir plus)exposed Monte Carlo cell models to assist radiation genomics studies. Approach Sequencing the genomes of radiation-damaged cells can provide useful insight into radiation action for radiobiology research. However, carrying out post-irradiation sequencing experiments can often be challenging, expensive, and time-consuming. Although computational simulations have the potential to provide solutions to these experimental challenges, and aid in designing optimal experiments, the absence of tools currently limits such application. Monte Carlo toolkits exist to simulate radiation exposures of cell models but there are no tools to simulate single- and bulk-cell sequencing of cell models containing radiation-damaged DNA. Therefore, we aimed to develop a Monte Carlo simulation framework to address this gap by designing a tool capable of simulating sequencing processes for radiation-damaged cells. Main Results We developed RadiSeq – a multi-threaded whole-genome DNA sequencing simulator written in C++. RadiSeq can be used to simulate Illumina sequencing of radiation-damaged cell models produced by Monte Carlo simulations. RadiSeq has been validated through comparative analysis, where simulated data were matched against experimentally obtained data, demonstrating reasonable agreement between the two. Additionally, it comes with numerous features designed to closely resemble actual whole-genome sequencing. RadiSeq is also highly customizable with a single input parameter file. Significance RadiSeq enables the research community to perform complex simulations of radiation-exposed DNA sequencing, supporting the optimization, planning, and validation of costly and time-intensive radiation biology experiments. This framework provides a powerful tool for advancing radiation genomics research.
Foundation models based on large language models (LLMs) have shown great success in handling various tasks and modalities. However, adapting… (voir plus) these models for general-purpose audio-language tasks is challenging due to differences in acoustic environments and task variations. In this work, we introduce LiSTEN Learning Soft Token Embeddings for Neural Audio LLMs), a framework for adapting LLMs to speech and audio tasks. LiSTEN uses a dynamic prompt selection strategy with learnable key-value pairs, allowing the model to balance general and task-specific knowledge while avoiding overfitting in a multitask setting. Our approach reduces dependence on large-scale ASR or captioning datasets, achieves competitive performance with fewer trainable parameters, and simplifies training by using a single-stage process. Additionally, LiSTEN enhances interpretability by analyzing the diversity and overlap of selected prompts across different tasks.