Foundational vision-language models (VLMs) excel across diverse tasks, but adapting them to new domains without forgetting prior knowledge remains a critical challenge. Continual Learning (CL) addresses this challenge by enabling models to learn sequentially from new data while mitigating the forgetting of prior information, typically under supervised settings involving label shift. Nonetheless, abrupt distribution shifts can still cause substantial forgetting, potentially nullifying the benefits of supervised updates, especially when storing or replaying past data is infeasible. In this work, we propose leveraging unlabeled test-time data in an unsupervised manner to reinforce prior task performance without requiring replay or stored examples. Unlike traditional Test-Time Adaptation (TTA), which primarily focuses on domain shift or corruption, our method improves performance on earlier tasks by exploiting representative test samples encountered during deployment. We introduce a simple teacher-student framework with gradient-based sparse parameter updates, and show that it effectively mitigates forgetting in class-incremental CL for VLMs, offering a memory-free alternative to episodic replay with strong empirical results.
2026-01-29
Transactions on Machine Learning Research (accepted)
Perplexity -- a function measuring a model's overall level of "surprise" when encountering a particular output -- has gained significant traction in recent years, both as a loss function and as a simple-to-compute metric of model quality. Prior studies have pointed out several limitations of perplexity, often empirically. Here we leverage recent results on Transformer continuity to show rigorously how perplexity may be an unsuitable metric for model selection. Specifically, we prove that, if there is any sequence that a compact decoder-only Transformer model predicts accurately and confidently -- a necessary prerequisite for strong generalisation -- it must imply the existence of another sequence with very low perplexity that is not predicted correctly by that same model. Further, by analytically studying iso-perplexity plots, we find that perplexity will not always select for the more accurate model -- rather, any increase in model confidence must be accompanied by a commensurate rise in accuracy for the new model to be selected.
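For readers unfamiliar with the metric, the sketch below computes perplexity under its standard definition, the exponential of the mean negative log-likelihood of the observed tokens. It is an illustration only and not taken from the paper; the function name `perplexity` and the probability values are hypothetical.

```python
import math


def perplexity(token_probs):
    """Perplexity of one sequence.

    token_probs holds the probability the model assigned to each token that
    actually occurred (illustrative input; a real model would supply these).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Mean negative log-likelihood per token.
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Perplexity is the exponential of that mean.
    return math.exp(mean_nll)


# Hypothetical per-token probabilities for two sequences.
confident = perplexity([0.9, 0.9, 0.9, 0.9])      # model is rarely "surprised"
uncertain = perplexity([0.25, 0.25, 0.25, 0.25])  # like a uniform guess over 4 options
```

Note that the score depends only on the probabilities assigned to the observed tokens, so a low value reflects confidence rather than correctness, which is the kind of gap the abstract examines.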
The integrity of time in distributed Internet of Things (IoT) devices is crucial for reliable operation in energy cyber-physical systems, such as smart grids and microgrids. However, IoT systems are vulnerable to clock drift, time-synchronization manipulation, and timestamp discontinuities, such as the Year 2038 (Y2K38) Unix overflow, all of which disrupt temporal ordering. Conventional anomaly-detection models, which assume reliable timestamps, fail to capture temporal inconsistencies. This paper introduces STGAT (Spatio-Temporal Graph Attention Network), a framework that models both temporal distortion and inter-device consistency in energy IoT systems. STGAT combines drift-aware temporal embeddings and temporal self-attention to capture corrupted time evolution at individual devices, and uses graph attention to model spatial propagation of timing errors. A curvature-regularized latent representation geometrically separates normal clock evolution from anomalies caused by drift, synchronization offsets, and overflow events. Experimental results on energy IoT telemetry with controlled timing perturbations show that STGAT achieves 95.7% accuracy, outperforming recurrent, transformer, and graph-based baselines with significant improvements (d>1.8, p<0.001). Additionally, STGAT reduces detection delay by 26%, achieving a 2.3-time-step delay while maintaining stable performance under over
Federated learning (FL) has become an effective paradigm for privacy-preserving, distributed Intrusion Detection Systems (IDS) in cyber-physical and Internet of Things (IoT) networks, where centralized data aggregation is often infeasible due to privacy and bandwidth constraints. Despite its advantages, most existing FL-based IDS assume closed-set learning and lack mechanisms such as uncertainty estimation, semantic generalization, and explicit modeling of epistemic ambiguity in zero-day attack scenarios. Additionally, robustness to heterogeneous and unreliable clients remains a challenge in practical applications. This paper introduces a semantics-driven federated IDS framework that incorporates language-derived semantic supervision into federated optimization, enabling open-set and zero-shot intrusion detection for previously unseen attack behaviors. The approach constructs semantic attack prototypes using a Tri-LLM ensemble of GPT-4o, DeepSeek-V3, and LLaMA-3-8B, aligning distributed telemetry features with high-level attack concepts. Inter-LLM semantic disagreement is modeled as epistemic uncertainty for zero-day risk estimation, while a trust-aware aggregation mechanism dynamically weights client updates based on reliability. Experimental results show stable semantic alignment across heterogeneous clients and consistent convergence. The framework achieves over 80% zero-shot detection accuracy on unseen attack patterns, improving zero-day discrimination by more than 10% compared to similarity-based baselines, while maintaining low aggregation instability in the presence of unreliable or compromised clients.
Lysergic acid diethylamide (LSD) profoundly alters conscious experience, yet the electrophysiological mechanisms by which it reshapes neural dynamics remain incompletely understood. A hallmark of psychedelic states is widespread cortical desynchronization, typically inferred from reductions in spectral power, but whether such effects reflect genuine weakening of neural oscillations or are confounded by shifts in oscillatory peak frequencies remains unresolved. Here, we address this gap by combining source-resolved magnetoencephalography (MEG), spectral parameterization, temporal complexity metrics, and interpretable machine learning in an LSD versus placebo design, with and without music. We show that LSD induces robust, spatially structured increases in alpha and beta peak frequencies alongside genuine attenuation of oscillatory power, with these effects displaying partly dissociable cortical patterns. Beyond rhythmic activity, LSD is associated with flattening of the aperiodic 1/f spectral slope and increased neural signal fractality and complexity, preferentially affecting sensory, language, emotion, and imagery-related networks while sparing motor cortex. Machine-learning analyses further identify peak-frequency shifts, aperiodic parameters, and complexity measures as key discriminators of the psychedelic state. Music does not robustly amplify these neural signatures and instead shows a trend toward attenuation. Together, these findings provide a comprehensive electrophysiological account of how LSD reorganizes large-scale human brain dynamics and highlight features that may differentiate its neural signature from that of other psychedelics.
Patient safety in operating rooms has globally improved through interventions such as the World Health Organization (WHO) Surgical Safety Checklist and multidisciplinary team training. However, while evidence from high-income countries is well documented, there remains limited consolidated knowledge on the understanding, application, and effectiveness of safety culture interventions in African surgical settings, which this review seeks to address. This systematic review examined factors and protocols affecting surgical safety in African operating rooms. We hypothesized that persistent systemic barriers undermine safety culture despite adoption of global measures. Following PRISMA 2020, we searched eight databases (Medline, Embase, Cochrane, Africa-Wide, CINAHL, Global Health, Global Index Medicus, Web of Science) from inception to 5 December 2024, using variations of text words present in the title, abstract, or keyword fields, alongside relevant subject headings, to identify articles addressing surgical safety and culture throughout Africa. Included studies involved operating room professionals in African countries and used quantitative, qualitative, or mixed-methods designs. We excluded non-operating room settings, patient-only studies, inaccessible full texts, reviews, editorials, letters, conference abstracts, and duplicates. Two reviewers independently screened and appraised studies using the Mixed Methods Appraisal Tool. Findings were synthesized narratively with subgroup analysis by study type and theme. Out of 9,875 identified records, 22 studies from 12 African countries (2014–2024) met inclusion criteria, with Ethiopia contributing the highest number (n = 4).
Various assessment tools, including the Hospital Survey on Patient Safety Culture, the Safety Attitudes Questionnaire, and the National Surgical, Obstetric, and Anaesthesia Plans interview manual, revealed recurring challenges: inadequate non-punitive responses to errors, communication barriers, hierarchical structures, and resource constraints. Four interventions showed promise: implementation and training on the WHO Surgical Safety Checklist, Safe Surgery 2020 initiatives, Non-Technical Skills for Surgeons training, and multidisciplinary training. The heterogeneity of study designs, sample sizes, and outcome measures limited direct comparisons and precluded meta-analysis. Nonetheless, the review highlights persistent barriers and emerging opportunities to strengthen patient safety culture in African operating rooms. While the WHO Surgical Safety Checklist remains valuable, sustainable progress requires multi-level strategies that address systemic constraints and incorporate context-sensitive adaptations. PROSPERO, CRD42024627076.
Equality saturation enables compilers to explore many semantically equivalent program variants, deferring optimization decisions to a final extraction phase. However, existing frameworks exhibit sequential execution and hard-coded saturation loops. This limits scalability and requires significant engineering effort to customize saturation behavior. This paper addresses these limitations using three novel techniques. First, it shows how saturation can be parallelized thanks to the use of thread-safe data structures and the notion of deferred e-graph updates. Second, it provides an extensible mechanism to express custom and composable saturation strategies. Third, it generalizes e-graph metadata to support custom e-graph annotations. The implementation, written in Scala, is evaluated on four use-cases: classical program optimization, idiom recognition, scalability strategies and incremental equality saturation. The results show that it outperforms several existing equality saturation engines, including the highly optimized egglog library. When used to reimplement an existing idiom recognition technique, the new design finds higher-quality idioms, 16× faster. Additionally, the design is able to natively express state-of-the-art custom equality saturation behavior such as incremental equality saturation and multi-phase rewriting strategies without any modification to the core library.
2026-01-27
ACM SIGPLAN International Conference on Compiler Construction (published)
Earth System Models (ESM) are our main tool for projecting the impacts of climate change. However, running these models at sufficient resolution for local-scale risk-assessments is not computationally feasible. Deep learning-based super-resolution models offer a promising solution to downscale ESM outputs to higher resolutions by learning from data. Yet, due to regional variations in climatic processes, these models typically require retraining for each geographical area, demanding high-resolution observational data, which is unevenly available across the globe. This highlights the need to assess how well these models generalize across geographic regions. To address this, we introduce RainShift, a dataset and benchmark for evaluating downscaling under geographic distribution shifts. We evaluate state-of-the-art downscaling approaches including GANs and diffusion models in generalizing across data gaps between the Global North and Global South. Our findings reveal substantial performance drops in out-of-distribution regions, depending on model and geographic area. While expanding the training domain generally improves generalization, it is insufficient to overcome shifts between geographically distinct regions. We show that addressing these shifts through, for example, domain adaptation can improve spatial generalization. Our work advances the global applicability of downscaling methods and represents a step toward reducing inequities in access to high-resolution climate information.
In this work, we introduce Sudanese-Flores, an extension of the popular Flores+ machine translation (MT) benchmark to the Sudanese Arabic dialect. We translate both the DEV and DEVTEST splits of the Modern Standard Arabic dataset into the corresponding Sudanese dialect, resulting in a total of 2,009 sentences. While the dialect was recently introduced in Google Translate, there is no available benchmark for this dialect despite it being spoken by over 40 million people. Our evaluation of two leading LLMs, GPT-4.1 and Gemini 2.5 Flash, showed that while their English-to-Arabic performance is impressive (more than 23 BLEU), they struggle on the Sudanese dialect (less than 11 BLEU) in zero-shot settings. In the few-shot scenario, we achieved only a slight boost in performance.
2026-01-26
AfricaNLP @ Conference of European Chapter of Association for Computational Linguistics (published)
Diffusion policies have emerged as powerful generative models for offline policy learning, whose sampling process can be rigorously characterized by a score function guiding a Stochastic Differential Equation (SDE). However, the same score-based SDE modeling that grants diffusion policies the flexibility to learn diverse behavior also incurs solver and score-matching errors, large data requirements, and inconsistencies in action generation. While less critical in image generation, these inaccuracies compound and lead to failure in continuous control settings. We introduce **C**ontractive **D**iffusion **P**olicies (CDPs) to induce contractive behavior in the diffusion sampling dynamics. Contraction pulls nearby flows closer to enhance robustness against solver and score-matching errors while reducing unwanted action variance. We develop an in-depth theoretical analysis along with a practical implementation recipe to incorporate CDPs into existing diffusion policy architectures with minimal modification and computational cost. We evaluate CDPs for offline learning by conducting extensive experiments in simulation and real-world settings. Across benchmarks, CDPs often outperform baseline policies, with pronounced benefits under data scarcity. Project page: https://contractive-diffusion.github.io
2026-01-25
International Conference on Learning Representations (poster)