Publications

Tri-LLM Cooperative Federated Zero-Shot Intrusion Detection with Semantic Disagreement and Trust-Aware Aggregation

Saeid Jamshidi

Omar Abdel Wahab

Kawser Wazed Nafi

Federated learning (FL) has become an effective paradigm for privacy-preserving, distributed Intrusion Detection Systems (IDS) in cyber-physical and Internet of Things (IoT) networks, where centralized data aggregation is often infeasible due to privacy and bandwidth constraints. Despite its advantages, most existing FL-based IDS assume closed-set learning and lack mechanisms such as uncertainty estimation, semantic generalization, and explicit modeling of epistemic ambiguity in zero-day attack scenarios. Additionally, robustness to heterogeneous and unreliable clients remains a challenge in practical applications. This paper introduces a semantics-driven federated IDS framework that incorporates language-derived semantic supervision into federated optimization, enabling open-set and zero-shot intrusion detection for previously unseen attack behaviors. The approach constructs semantic attack prototypes using a Tri-LLM ensemble of GPT-4o, DeepSeek-V3, and LLaMA-3-8B, aligning distributed telemetry features with high-level attack concepts. Inter-LLM semantic disagreement is modeled as epistemic uncertainty for zero-day risk estimation, while a trust-aware aggregation mechanism dynamically weights client updates based on reliability. Experimental results show stable semantic alignment across heterogeneous clients and consistent convergence. The framework achieves over 80% zero-shot detection accuracy on unseen attack patterns, improving zero-day discrimination by more than 10% compared to similarity-based baselines, while maintaining low aggregation instability in the presence of unreliable or compromised clients.

2026-01-29

arXiv (preprint)

arxiv.org

Boosting CVaR Policy Optimization with Quantile Gradients

Yudong Luo

Erick Delage

Optimizing Conditional Value-at-Risk (CVaR) using policy gradient (a.k.a. CVaR-PG) faces significant challenges of sample inefficiency. This inefficiency stems from the fact that it focuses on tail-end performance and overlooks many sampled trajectories. We address this problem by augmenting CVaR with an expected quantile term. Quantile optimization admits a dynamic programming formulation that leverages all sampled data, thus improving sample efficiency. This does not alter the CVaR objective since CVaR corresponds to the expectation of the quantile over the tail. Empirical results in domains with verifiable risk-averse behavior show that our algorithm within the Markovian policy class substantially improves upon CVaR-PG and consistently outperforms other existing methods.

2026-01-28

arXiv (preprint)

doi.org

arxiv.org

CNeuroMod-THINGS, a densely-sampled fMRI dataset for visual neuroscience

Marie St-Laurent

Basile Pinsard

Oliver Contier

Elizabeth DuPré

Katja Seeliger

Valentina Borghesani

Julie A. Boyle

Lune Bellec

Martin N. Hebart

Data-hungry neuro-AI modelling requires ever larger neuroimaging datasets. CNeuroMod-THINGS meets this need by capturing neural representations for a wide set of semantic concepts using well-characterized images in a new densely-sampled, large-scale fMRI dataset. Importantly, CNeuroMod-THINGS exploits synergies between two existing projects: the THINGS initiative (THINGS) and the Courtois Project on Neural Modelling (CNeuroMod). THINGS has developed a common set of thoroughly annotated images broadly sampling natural and man-made objects which is used to acquire a growing collection of multimodal neural responses. Meanwhile, CNeuroMod is acquiring hundreds of hours of fMRI data from a core set of participants during controlled and naturalistic tasks, including visual tasks like movie watching and videogame playing. For CNeuroMod-THINGS, four CNeuroMod participants each completed 33-36 sessions of a continuous recognition paradigm using 4320 images from the THINGS stimulus set spanning 720 categories. We report behavioural and neuroimaging metrics that showcase the quality of the data. By bridging together large existing resources, CNeuroMod-THINGS expands our capacity to model human vision in controlled and naturalistic settings.

2026-01-28

Scientific Data (published)

doi.org

arxiv.org

LSD Reconfigures Cortical Dynamics Through Faster Brain Rhythms and Increased Fractal Dimension

Venkatesh Subramani

Timothy Nest

Annalisa Pascarella

Jérémy Brunel

Yorguin José Mantilla Ramos

Yann Harel

Suresh Muthukumaraswamy

Robin Carhart-Harris

Giulia Lioi

Nicolas Farrugia

Karim Jerbi

Lysergic acid diethylamide (LSD) profoundly alters conscious experience, yet the electrophysiological mechanisms by which it reshapes neural dynamics remain incompletely understood. A hallmark of psychedelic states is widespread cortical desynchronization, typically inferred from reductions in spectral power, but whether such effects reflect genuine weakening of neural oscillations or are confounded by shifts in oscillatory peak frequencies remains unresolved. Here, we address this gap by combining source-resolved magnetoencephalography (MEG), spectral parameterization, temporal complexity metrics, and interpretable machine learning in an LSD versus placebo design, with and without music. We show that LSD induces robust, spatially structured increases in alpha and beta peak frequencies alongside genuine attenuation of oscillatory power, with these effects displaying partly dissociable cortical patterns. Beyond rhythmic activity, LSD is associated with flattening of the aperiodic 1/f spectral slope and increased neural signal fractality and complexity, preferentially affecting sensory, language, emotion, and imagery-related networks while sparing motor cortex. Machine-learning analyses further identify peak-frequency shifts, aperiodic parameters, and complexity measures as key discriminators of the psychedelic state. Music does not robustly amplify these neural signatures and instead shows a trend toward attenuation. Together, these findings provide a comprehensive electrophysiological account of how LSD reorganizes large-scale human brain dynamics and highlight features that may differentiate its neural signature from that of other psychedelics.

2026-01-28

bioRxiv (preprint)

doi.org

Patient safety culture in the operating room of African hospitals: a systematic review

Jacques Fadhili Bake

Naïcen Ghanmi

Elena Guadagno

K. M. Claude

Tsongo Kibendelwa Zacharie

Dan Poenaru

Patient safety in operating rooms has globally improved through interventions such as the World Health Organization (WHO) Surgical Safety Checklist and multidisciplinary team training. However, while evidence from high-income countries is well documented, there remains limited consolidated knowledge on the understanding, application, and effectiveness of safety culture interventions in African surgical settings, which this review seeks to address. This systematic review examined factors and protocols affecting surgical safety in African operating rooms. We hypothesized that persistent systemic barriers undermine safety culture despite adoption of global measures. Following PRISMA 2020, we searched eight databases (Medline, Embase, Cochrane, Africa-Wide, CINAHL, Global Health, Global Index Medicus, Web of Science) from inception to 5 December 2024, using variations of text words present in the title, abstract, or keyword fields, alongside relevant subject headings, to identify articles addressing surgical safety and culture throughout Africa. Included studies involved operating room professionals in African countries and used quantitative, qualitative, or mixed-methods designs. We excluded non-operating room settings, patient-only studies, inaccessible full texts, reviews, editorials, letters, conference abstracts, and duplicates. Two reviewers independently screened and appraised studies using the Mixed Methods Appraisal Tool. Findings were synthesized narratively with subgroup analysis by study type and theme. Out of 9,875 identified records, 22 studies from 12 African countries (2014–2024) met inclusion criteria, with Ethiopia contributing the highest number (n = 4). Various assessment tools, including the Hospital Survey on Patient Safety Culture, the Safety Attitudes Questionnaire, and the National Surgical, Obstetric, and Anaesthesia Plans interview manual, revealed recurring challenges: inadequate non-punitive responses to errors, communication barriers, hierarchical structures, and resource constraints. Four interventions showed promise: implementation and training on the WHO Surgical Safety Checklist, Safe Surgery 2020 initiatives, Non-Technical Skills for Surgeons training, and multidisciplinary training. The heterogeneity of study designs, sample sizes, and outcome measures limited direct comparisons and precluded meta-analysis. Nonetheless, the review highlights persistent barriers and emerging opportunities to strengthen patient safety culture in African operating rooms. While the WHO Surgical Safety Checklist remains valuable, sustainable progress requires multi-level strategies that address systemic constraints and incorporate context-sensitive adaptations. PROSPERO, CRD42024627076.

2026-01-28

Patient Safety in Surgery (published)

doi.org

Parallel and Customizable Equality Saturation

Jonathan Van der Cruysse

Abd-El-Aziz Zayed

Mai Jacob Peng

Christophe Dubach

Equality saturation enables compilers to explore many semantically equivalent program variants, deferring optimization decisions to a final extraction phase. However, existing frameworks exhibit sequential execution and hard-coded saturation loops. This limits scalability and requires significant engineering effort to customize saturation behavior. This paper addresses these limitations using three novel techniques. First, it shows how saturation can be parallelized thanks to the use of thread-safe data structures and the notion of deferred e-graph updates. Second, it provides an extensible mechanism to express custom and composable saturation strategies. Third, it generalizes e-graph metadata to support custom e-graph annotations. The implementation, written in Scala, is evaluated on four use-cases: classical program optimization, idiom recognition, scalability strategies and incremental equality saturation. The results show that it outperforms several existing equality saturation engines, including the highly optimized egglog library. When used to reimplement an existing idiom recognition technique, the new design finds higher-quality idioms, 16× faster. Additionally, the design is able to natively express state-of-the-art custom equality saturation behavior such as incremental equality saturation and multi-phase rewriting strategies without any modification to the core library.

2026-01-27

ACM SIGPLAN International Conference on Compiler Construction (published)

doi.org

Signal from Structure: Exploiting Submodular Upper Bounds in Generative Flow Networks

Alexandre Larouche

Audrey Durand

Generative Flow Networks (GFlowNets; GFNs) are a class of generative models that learn to sample compositional objects proportionally to their a priori unknown value, their reward. We focus on the case where the reward has a specified, actionable structure, namely that it is submodular. We show submodularity can be harnessed to retrieve upper bounds on the reward of compositional objects that have not yet been observed. We provide in-depth analyses of the probability of such bounds occurring, as well as how many unobserved compositional objects can be covered by a bound. Following the Optimism in the Face of Uncertainty principle, we then introduce SUBo-GFN, which uses the submodular upper bounds to train a GFN. We show that SUBo-GFN generates orders of magnitude more training data than classical GFNs for the same number of queries to the reward function. We demonstrate the effectiveness of SUBo-GFN in terms of distribution matching and high-quality candidate generation on synthetic and real-world submodular tasks.

2026-01-27

arXiv (preprint)

doi.org

arxiv.org

Benchmarking the geographic generalization of deep learning models for precipitation downscaling

Paula Harder

Luca Schmidt

Francis Pelletier

Nicole Ludwig

Matthew Chantry

Christian Lessig

Alex Hernández-García

David Rolnick

Earth System Models (ESM) are our main tool for projecting the impacts of climate change. However, running these models at sufficient resolution for local-scale risk assessments is not computationally feasible. Deep learning-based super-resolution models offer a promising solution to downscale ESM outputs to higher resolutions by learning from data. Yet, due to regional variations in climatic processes, these models typically require retraining for each geographical area, demanding high-resolution observational data, which is unevenly available across the globe. This highlights the need to assess how well these models generalize across geographic regions. To address this, we introduce RainShift, a dataset and benchmark for evaluating downscaling under geographic distribution shifts. We evaluate state-of-the-art downscaling approaches including GANs and diffusion models in generalizing across data gaps between the Global North and Global South. Our findings reveal substantial performance drops in out-of-distribution regions, depending on model and geographic area. While expanding the training domain generally improves generalization, it is insufficient to overcome shifts between geographically distinct regions. We show that addressing these shifts through, for example, domain adaptation can improve spatial generalization. Our work advances the global applicability of downscaling methods and represents a step toward reducing inequities in access to high-resolution climate information.

2026-01-26

Scientific Reports (published)

doi.org

Sudanese-Flores: Extending FLORES+ to Sudanese Arabic Dialect

Hadia Mohmmedosman Ahmed Samil

David Ifeoluwa Adelani

In this work, we introduce Sudanese-Flores, an extension of the popular Flores+ machine translation (MT) benchmark to the Sudanese Arabic dialect. We translate both the DEV and DEVTEST splits of the Modern Standard Arabic dataset into the corresponding Sudanese dialect, resulting in a total of 2,009 sentences. While the dialect was recently introduced in Google Translate, there is no available benchmark for this dialect despite it being spoken by over 40 million people. Our evaluation of two leading LLMs, GPT-4.1 and Gemini 2.5 Flash, showed that while their English-to-Arabic performance is impressive (more than 23 BLEU), they struggle on the Sudanese dialect (less than 11 BLEU) in zero-shot settings. In the few-shot scenario, we achieved only a slight boost in performance.

2026-01-26

AfricaNLP @ Conference of European Chapter of Association for Computational Linguistics (published)

openreview.net

Anatomically-aware conformal prediction for medical image segmentation with random walks

Melanie Gaillochet

Christian Desrosiers

Hervé Lombaert

2026-01-25

arXiv (preprint)

arxiv.org

ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning

Roger Creus Castanyer

Faisal Mohamed

Pablo Samuel Castro

Cyrus Neary

Glen Berseth

Reinforcement learning (RL) algorithms are highly sensitive to reward function specification, which remains a central challenge limiting their broad applicability. We present ARM-FM: Automated Reward Machines via Foundation Models, a framework for automated, compositional reward design in RL that leverages the high-level reasoning capabilities of foundation models (FMs). Reward machines (RMs) - an automata-based formalism for reward specification - are used as the mechanism for RL objective specification, and are automatically constructed via the use of FMs. The structured formalism of RMs yields effective task decompositions, while the use of FMs enables objective specifications in natural language. Concretely, we (i) use FMs to automatically generate RMs from natural language specifications; (ii) associate language embeddings with each RM automata-state to enable generalization across tasks; and (iii) provide empirical evidence of ARM-FM's effectiveness in a diverse suite of challenging environments, including evidence of zero-shot generalization.

2026-01-25

International Conference on Learning Representations (poster)

doi.org

openreview.net

Contractive Diffusion Policies

Charlotte Morissette

Diffusion policies have emerged as powerful generative models for offline policy learning, whose sampling process can be rigorously characterized by a score function guiding a Stochastic Differential Equation (SDE). However, the same score-based SDE modeling that grants diffusion policies the flexibility to learn diverse behavior also incurs solver and score-matching errors, large data requirements, and inconsistencies in action generation. While less critical in image generation, these inaccuracies compound and lead to failure in continuous control settings. We introduce Contractive Diffusion Policies (CDPs) to induce contractive behavior in the diffusion sampling dynamics. Contraction pulls nearby flows closer to enhance robustness against solver and score-matching errors while reducing unwanted action variance. We develop an in-depth theoretical analysis along with a practical implementation recipe to incorporate CDPs into existing diffusion policy architectures with minimal modification and computational cost. We evaluate CDPs for offline learning by conducting extensive experiments in simulation and real-world settings. Across benchmarks, CDPs often outperform baseline policies, with pronounced benefits under data scarcity. Project page: https://contractive-diffusion.github.io

2026-01-25

International Conference on Learning Representations (poster)

openreview.net
