Build foundational skills in responsible artificial intelligence (AI) through self-paced courses led by internationally recognized Mila experts.
The Mila AI Policy Fellowship turns deep AI expertise into rigorous public-interest policy. Read the latest publication, Combler la disparité en matière d’expertise : mécanismes de transfert des connaissances pour la réglementation de l’IA, by Moritz von Knebel.
This program supports AI startups year-round. Benefit from cutting-edge resources and tailored guidance to accelerate the development of your technology.
Publications
Stackelberg Dynamic Location Planning under Cumulative Demand
Dynamic facility location problems predominantly suppose a monopoly over the service or product provided. Nonetheless, this premise can be a severe oversimplification in the presence of market competitors, as customers may prefer facilities installed by one of them. The monopolistic assumption can particularly worsen planning performance when demand depends on prior location decisions of the market participants, namely, when unmet demand from one period carries over to the next. Such a demand behaviour creates an intrinsic relationship between customer demand and location decisions of all market participants, and requires the decision-maker to anticipate the competitor's response. This work studies a novel competitive facility location problem that combines cumulative demand and market competition to devise high-quality solutions. We propose bilevel mixed-integer programming formulations for two variants of our problem, prove that the optimistic variant is
Graph Neural Networks (GNNs) have become essential in high-stakes domains such as drug discovery, yet their black-box nature remains a significant barrier to trustworthiness. While self-explainable GNNs attempt to bridge this gap, they often rely on standard message-passing backbones that inherit fundamental limitations, including the 1-Weisfeiler-Lehman (1-WL) expressivity barrier and a lack of fine-grained interpretability. To address these challenges, we propose SymGraph, a symbolic framework designed to transcend these constraints. By replacing continuous message passing with discrete structural hashing and topological role-based aggregation, our architecture theoretically surpasses the 1-WL barrier, achieving superior expressiveness without the overhead of differentiable optimization. Extensive empirical evaluations demonstrate that SymGraph achieves state-of-the-art performance, outperforming existing self-explainable GNNs. Notably, SymGraph delivers 10x to 100x speedups in training time using only CPU execution. Furthermore, SymGraph generates rules with superior semantic granularity compared to existing rule-based methods, offering great potential for scientific discovery and explainable AI.
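To make the 1-WL expressivity barrier mentioned above concrete, here is a minimal sketch of classical 1-WL color refinement. It is not SymGraph's method; the function names and the example graphs are illustrative assumptions. It shows the textbook failure case: a 6-cycle and two disjoint triangles receive identical color histograms, so any model bounded by 1-WL cannot tell them apart.

```python
from collections import Counter


def wl_color_histogram(adjacency, rounds=3):
    """Run 1-WL color refinement and return the histogram of final colors.

    `adjacency` maps each node to a list of its neighbours. Colors are kept
    as nested tuples so that histograms are comparable across graphs.
    """
    colors = {node: 0 for node in adjacency}  # uniform initial color
    for _ in range(rounds):
        # New color = (own color, sorted multiset of neighbour colors).
        colors = {
            node: (colors[node], tuple(sorted(colors[nb] for nb in adjacency[node])))
            for node in adjacency
        }
    return Counter(colors.values())


def cycle(n, offset=0):
    """Adjacency of an n-cycle whose nodes are labelled offset..offset+n-1."""
    return {offset + i: [offset + (i - 1) % n, offset + (i + 1) % n] for i in range(n)}


# Hypothetical example graphs, chosen only for illustration.
hexagon = cycle(6)
two_triangles = {**cycle(3), **cycle(3, offset=3)}
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

print(wl_color_histogram(hexagon) == wl_color_histogram(two_triangles))
print(wl_color_histogram(hexagon) == wl_color_histogram(path))
```

Every node in both the hexagon and the pair of triangles has degree 2, so refinement never separates them, whereas the path's endpoints break the symmetry immediately.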
Interpretability research on large language models (LLMs) has yielded important insights into model behaviour, yet recurring pitfalls persist: findings that do not generalise, and causal interpretations that outrun the evidence. Our position is that causal inference specifies what constitutes a valid mapping from model activations to invariant high-level structures, the data or assumptions needed to achieve it, and the inferences it can support. Specifically, Pearl's causal hierarchy clarifies what an interpretability study can justify. Observations establish associations between model behaviour and internal components. Interventions (e.g., ablations or activation patching) support claims about how these edits affect a behavioural metric (e.g., average change in token probabilities) over a set of prompts. However, counterfactual claims -- i.e., asking what the model output would have been for the same prompt under an unobserved intervention -- remain largely unverifiable without controlled supervision. We show how causal representation learning (CRL) operationalises this hierarchy, specifying which variables are recoverable from activations and under what assumptions. Together, these motivate a diagnostic framework that helps practitioners select methods and evaluations matching claims to evidence such that findings generalise.
We challenge black-box purely deep neural approaches for molecules and graph generation, which are limited in controllability and lack formal guarantees. We introduce Neuro-Symbolic Graph Generative Modeling (NSGGM), a neurosymbolic framework that reapproaches molecule generation as a scaffold and interaction learning task with symbolic assembly. An autoregressive neural model proposes scaffolds and refines interaction signals, and a CPU-efficient SMT solver constructs full graphs while enforcing chemical validity, structural rules, and user-specific constraints, yielding molecules that are correct by construction and interpretable control that pure neural methods cannot provide. NSGGM delivers strong performance on both unconstrained generation and constrained generation tasks, demonstrating that neuro-symbolic modeling can match state-of-the-art generative performance while offering explicit controllability and guarantees. To evaluate more nuanced controllability, we also introduce a Logical-Constraint Molecular Benchmark, designed to test strict hard-rule satisfaction in workflows that require explicit, interpretable specifications together with verifiable compliance.
Rolling stock scheduling and crew scheduling are two fundamental problems that arise in the planning of urban rail operations and that are especially important in the case of flexible operations in real-world networks. These problems are often solved separately and sequentially in different planning stages, resulting in limited options to adjust crew schedules after rolling stock decisions have been made. To better adjust these two decision-making processes and achieve better solutions, this paper studies a joint rolling stock and crew scheduling problem in urban rail networks. A novel optimization model is formulated with the aim of reducing the operational cost of rolling stock units and crew members. In addition, the multi-train composition mode is considered to adequately match different frequency requirements and rolling stock transport capacities. To solve the model, a customized branch-and-price-and-cut solution algorithm is proposed to find the optimal schedule schemes, in which Benders decomposition is used to solve the linear programming relaxation of the path-based reformulation. Two customized column generation methods with label correcting are embedded to solve the master problem and pricing subproblem for generating paths (columns) corresponding to rolling stock units and crew groups, respectively. Finally, a branch-and-bound procedure with several acceleration techniques is proposed to find integer solutions. To demonstrate the computational performance and the robustness of the proposed approaches, a series of numerical experiments are performed in real-world instances of the Beijing urban rail network under different settings. The computational results confirm the high efficiency of the solution methodology and the benefits of the flexible operation schemes based on the solutions found by the proposed methods. Funding: This work was supported by National Natural Science Foundation of China [Grants 72288101, 72322022, 72371015]. The first author sincerely thanks the China Scholarship Council for supporting his visiting PhD program [Grant 202407090173]. Supplemental Material: The electronic companion is available at https://doi.org/10.1287/trsc.2024.0905.
Deep neural nets achieve remarkable performance when training and test data share the same distribution, but this assumption frequently breaks in real-world deployment, where data undergoes continual distributional shifts. Continual Test-Time Adaptation (CTTA) addresses this challenge by adapting pretrained models to non-stationary target distributions on-the-fly, without access to source data or labeled targets, while mitigating two critical failure modes: catastrophic forgetting of source knowledge and error accumulation from noisy pseudo-labels over extended time horizons. In this comprehensive survey, we formally define the CTTA problem, analyze the diverse continual domain shift patterns that characterize different evaluation protocols, and propose a hierarchical taxonomy that categorizes existing methods into three families: optimization-based strategies (entropy minimization, pseudo-labeling, parameter restoration), parameter-efficient methods (normalization layer adaptation, adaptive parameter selection), and architecture-based approaches (teacher-student frameworks, adapters, visual prompting, masked modeling). We systematically review representative methods within each category and present comparative benchmarks and experimental results across standard evaluation settings. Finally, we discuss limitations of current approaches and highlight emerging research directions, including adaptation of foundation models and black-box systems, providing a roadmap for future research in robust continual test-time adaptation. We encourage visiting our repository at https://github.com/sarthaxxxxx/Awesome-Continual-Test-Time-Adaptation.
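Entropy minimization, the first optimization-based strategy named in the taxonomy above, can be illustrated with a toy sketch. It is not taken from the survey: as a deliberate simplification it adapts only a shared per-class bias added to fixed logits, and the batch values, learning rate, and function names are all hypothetical. The update uses the analytic gradient of the Shannon entropy of a softmax with respect to its logits, dH/dz_i = -p_i (log p_i + H).

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)


def mean_entropy(batch_logits, bias):
    """Average prediction entropy over an unlabeled batch."""
    rows = [softmax([z + b for z, b in zip(row, bias)]) for row in batch_logits]
    return sum(entropy(p) for p in rows) / len(rows)


def entropy_step(batch_logits, bias, lr=0.5):
    """One gradient-descent step on mean entropy w.r.t. the shared bias."""
    grad = [0.0] * len(bias)
    for row in batch_logits:
        probs = softmax([z + b for z, b in zip(row, bias)])
        h = entropy(probs)
        for i, p in enumerate(probs):
            # dH/dz_i = -p_i * (log p_i + H); averaged over the batch.
            grad[i] += -p * (math.log(p) + h) / len(batch_logits)
    return [b - lr * g for b, g in zip(bias, grad)]


# Hypothetical unlabeled test batch: two samples, three classes.
batch = [[2.0, 1.0, 0.1], [0.5, 1.5, 0.2]]
bias_before = [0.0, 0.0, 0.0]
bias_after = entropy_step(batch, bias_before)

print(mean_entropy(batch, bias_before), mean_entropy(batch, bias_after))
```

No labels are used anywhere in the update, which is what makes this family applicable at test time; it is also why repeated steps on noisy predictions can accumulate errors, the failure mode the abstract highlights.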
Model diffing methods aim to identify how fine-tuning changes a model's internal representations. Crosscoders approach this by learning shared dictionaries of interpretable latent directions between base and fine-tuned models. However, existing formulations struggle with narrow fine-tuning, where behavioral changes are localized and asymmetric. We introduce Delta-Crosscoder, which combines BatchTopK sparsity with a delta-based loss prioritizing directions that change between models, plus an implicit contrastive signal from paired activations on matched inputs. Evaluated across 10 model organisms, including synthetic false facts, emergent misalignment, subliminal learning, and taboo word guessing (Gemma, LLaMA, Qwen; 1B-9B parameters), Delta-Crosscoder reliably isolates latent directions causally responsible for fine-tuned behaviors and enables effective mitigation, outperforming SAE-based baselines while matching the non-SAE-based baselines. Our results demonstrate that crosscoders remain a powerful tool for model diffing.
The primary focus of multi-agent reinforcement learning (MARL) has been to study interactions among a fixed number of agents embedded in an environment. However, in the real world, the number of agents is neither fixed nor known a priori. Moreover, an agent can decide to create other agents (for example, a cell may divide, or a company may spin off a division). In this paper, we propose a framework that allows agents to create other agents; we call this a fluid-agent environment. We present game-theoretic solution concepts for fluid-agent games and empirically evaluate the performance of several MARL algorithms within this framework. Our experiments include fluid variants of established benchmarks such as Predator-Prey and Level-Based Foraging, where agents can dynamically spawn, as well as a new environment we introduce that highlights how fluidity can unlock novel solution strategies beyond those observed in fixed-population settings. We demonstrate that this framework yields agent teams that adjust their size dynamically to match environmental demands.
Importance: Among individuals with high levels of amyloid-β (Aβ), women exhibit higher insoluble tau burden and accumulation than age-matched men. It remains unclear whether this sex difference is influenced by soluble phosphorylated tau (p-tau), a biomarker that changes early in Alzheimer disease. Objective: To investigate whether sex and aggregated Aβ synergistically predict plasma phosphorylated tau 217 (p-tau217) levels and whether levels of p-tau217 predict cross-sectional and longitudinal tau aggregation in a sex-specific manner (as measured by positron emission tomography [PET]). Design, Setting, and Participants: This longitudinal study analyzed data between September 7, 2024, and October 29, 2025, from 1 clinical trial cohort and 4 observational study cohorts including men and women without cognitive impairment who had undergone multiple assessments via tau PET (18F-flortaucipir or 18F-MK-6240) and plasma p-tau217 assay at baseline. Cognitive performance was measured with the Preclinical Alzheimer Cognitive Composite. Data on cognitive performance were available from 3 of the 5 cohorts for a mean of 4.6 years (SD, 3.1 years). Across the 5 cohorts, the mean follow-up for tau PET was 3.6 years (SD, 1.7 years). Exposures: Self-reported sex (male or female), tau PET, and p-tau217 assay. Main Outcomes and Measures: The primary analyses used linear and mixed-effects models to assess baseline and longitudinal sex × p-tau217 interactions for 9 tau PET regions. The secondary analyses assessed sex × p-tau217 interactions for cognitive change using the Preclinical Alzheimer Cognitive Composite. Results: Across the 5 cohorts, there were a total of 1292 participants (63.6% women; mean age, 70.6 [SD, 6.4] years) with tau PET assessments. Compared with men, women had significantly higher baseline p-tau217 levels at higher aggregated Aβ Centiloid levels (β, -0.21 [95% CI, -0.37 to -0.05], P = .009; highest interaction was found in the Anti-Amyloid Treatment in Asymptomatic Alzheimer's Disease/Longitudinal Evaluation of Amyloid Risk and Neurodegeneration [A4/LEARN] cohort). The sex × p-tau217 interactions at baseline were significant for 1 tau PET region in the Harvard Aging Brain Study (HABS) cohort, for 2 tau PET regions in the A4/LEARN cohort, for 6 tau PET regions in the Wisconsin Registry of Alzheimer's Prevention (WRAP) cohort, and for 4 tau PET regions in the Presymptomatic Evaluation of Experimental or Novel Treatments for Alzheimer's Disease (PREVENT-AD) cohort. Longitudinal interactions were significant for 4 tau PET regions in the A4/LEARN cohort, for 5 tau PET regions in both the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort and the WRAP cohort, and for 2 PET regions in both the HABS cohort and the PREVENT-AD cohort. Compared with men, women displayed greater tau deposition and accumulation at higher p-tau217 levels. Use of a secondary model showed women with higher p-tau217 levels also exhibited faster rates of cognitive decline relative to men in both the WRAP cohort and the ADNI cohort. Conclusion and Relevance: These findings add to growing evidence that women have a differential tau response to Aβ that may emerge at the point of p-tau secretion. These findings have implications for the therapeutics and diagnostics of preclinical Alzheimer disease.
Large-scale analysis of axon and myelin morphometry in nervous tissues is fundamental to neuroscience research, yet manual quantification remains a profound bottleneck, limiting the scale and efficiency of studies. To address this, we introduce the Axon Segmentation Training Initiative for Histology (ASTIH), a publicly accessible resource designed to propel the development and validation of automated histomorphometry tools. ASTIH comprises five meticulously curated datasets, standardized for machine learning applications, featuring over 69,000 manually segmented axon fibers. These datasets exhibit significant diversity, spanning three microscopy modalities (TEM, SEM, bright-field), three species (mouse, rat, rabbit), and three distinct anatomical regions (brain, spinal cord, peripheral nerves) with varying pixel resolutions (from ~0.2 to 0.002
Medical imaging systems increasingly rely on large vision language foundation models (VLFMs) trained on diverse biomedical corpora, yet these models remain difficult to adapt to new clinical tasks without costly fine-tuning and large annotated datasets. We present PIKACHU (Prototypical In-Context Knowledge Adaptation for Clinical Heterogeneous Usage), a lightweight and generalizable framework that enables rapid few-shot adaptation of frozen medical FMs using only a handful of labeled examples. Unlike prior approaches that modify backbone weights or introduce heavy attention-based adapters, PIKACHU performs all task adaptation directly in the FM feature space through in-context prototypical reasoning. Given a small support set, the framework constructs class prototypes by averaging normalized embeddings from a frozen VLFM image encoder and performs prediction on query images using temperature-scaled cosine similarity. Only a single temperature parameter is learned. We evaluate PIKACHU across three heterogeneous medical imaging datasets, dermatological images (ISIC), Optical Coherence Tomography (OCT), and Diabetic Retinopathy (DR), using established vision models (SigLIP, PubMedCLIP, DinoV2, and ViT) as backbones. The proposed in-context learning (ICL) strategy consistently outperforms the baseline (zero-shot) approaches across all datasets and architectures, achieving substantial improvements in both accuracy and AUC. Notably, with PubMedCLIP as the backbone, PIKACHU achieves 0.69/0.76 (Acc./AUC) on the ISIC dataset, 0.72/0.78 on OCT, and 0.79/0.88 on DR, demonstrating robust generalization across diverse clinical imaging modalities. These results highlight the promise of feature-space in-context learning as an efficient and deployable paradigm for test-time adaptation of foundation models, without the need for extensive retraining.
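The prototype-and-cosine-similarity step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings are hypothetical 2-D vectors standing in for the output of a frozen image encoder, the temperature is fixed rather than learned, and re-normalizing each prototype after averaging is an assumption of this sketch.

```python
import math


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_prototypes(support_embeddings, support_labels):
    """Average the normalized support embeddings of each class."""
    sums, counts = {}, {}
    for emb, label in zip(support_embeddings, support_labels):
        unit = l2_normalize(emb)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
            counts[label] = 0
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
        counts[label] += 1
    # Assumption: prototypes are re-normalized so dot products are cosines.
    return {
        label: l2_normalize([s / counts[label] for s in sums[label]])
        for label in sums
    }


def predict_proba(query_embedding, prototypes, temperature=0.1):
    """Softmax over temperature-scaled cosine similarities to each prototype."""
    query = l2_normalize(query_embedding)
    labels = sorted(prototypes)
    logits = [
        sum(q * p for q, p in zip(query, prototypes[label])) / temperature
        for label in labels
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


# Hypothetical support set: two labeled examples per class.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["benign", "benign", "malignant", "malignant"]
prototypes = build_prototypes(support, labels)
probs = predict_proba([1.0, 0.05], prototypes)

print(max(probs, key=probs.get))
```

Because the encoder stays frozen, the only quantity left to fit in such a setup is the temperature, which controls how sharply similarity differences translate into class probabilities.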