In recent years, many industries have adopted machine learning (ML) models in their systems. Ideally, ML models should be trained on and applied to data from the same distribution. However, in many application areas the data evolves over time, leading to data and concept drift, which in turn degrades the performance of ML models. Therefore, keeping ML models up to date plays a critical role in the MLOps pipeline. Existing ML model maintenance approaches are often computationally intensive, costly, time-consuming, and model-dependent. Thus, we propose an improved MLOps pipeline, a new model maintenance approach, and a Similarity Based Model Reuse (SimReuse) tool to address the challenges of ML model maintenance. In a preliminary study, we identify seasonal and recurrent distribution patterns in time series datasets. Recurrent distribution patterns enable us to reuse previously trained models for similar distributions in the future, thus avoiding frequent retraining. We then integrate the model reuse approach into the MLOps pipeline to obtain our improved pipeline. Furthermore, we develop SimReuse, a tool that implements the new components of our MLOps pipeline to store models and reuse them for inference on data segments with similar data distributions in the future. Our evaluation results on four time series datasets demonstrate that our model reuse approach can maintain the performance of models while significantly reducing maintenance time and costs. Our model reuse approach achieves ML performance comparable to the best baseline, while being 15 times more efficient in terms of computation time and costs. Therefore, industries and practitioners can benefit from our approach and use our tool to maintain the performance of their ML models in the deployment phase while reducing their maintenance costs.
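The reuse idea in the abstract above (serve a new data segment with a stored model whose training data had a similar distribution) can be illustrated with a minimal sketch. This is not the SimReuse implementation: the abstract does not say which similarity measure is used, so the two-sample Kolmogorov-Smirnov statistic, the `ModelStore` class, its method names, and the `threshold` value are all assumptions chosen for illustration.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


class ModelStore:
    """Hypothetical store that keeps each model with a sample of its training data."""

    def __init__(self, threshold=0.3):
        # Assumed cut-off: distances above it mean "no stored model is similar enough".
        self.threshold = threshold
        self.entries = []  # list of (reference_sample, model)

    def add(self, sample, model):
        self.entries.append((list(sample), model))

    def find_reusable(self, segment):
        """Return the stored model with the closest distribution, or None to retrain."""
        best_model, best_dist = None, self.threshold
        for sample, model in self.entries:
            dist = ks_statistic(sample, segment)
            if dist <= best_dist:
                best_model, best_dist = model, dist
        return best_model


store = ModelStore(threshold=0.3)
store.add([1, 2, 3, 4, 5], "winter_model")
store.add([101, 102, 103, 104, 105], "summer_model")
print(store.find_reusable([2, 3, 4, 5, 6]))  # winter_model
print(store.find_reusable([50, 51, 52]))     # None, so a new model would be trained
```

Returning `None` when nothing is close enough keeps the fallback explicit: the caller retrains and adds the new model to the store, so a recurring seasonal pattern is only paid for once.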
Envisioning digital health ecosystem transformation in Canada «a conceptual foundation en Neuf Etapes.»
Nitika Pant Pai
Samira Abbasgholizadeh Rahimi
Juhi Tulsi
Susan Bartlett
Steven Grover
Ervin Sejdic
Canada’s journey towards digital health transformation is in a phase that precedes widespread catalytic change, trailing peer nations. In this perspective piece, we discuss the nine steps that are key to catalyzing digital health transformation. We highlight the importance of foundational investments for health systems redesign. These investments in interoperability, a unified digital core and ID, and scalable health data systems will enable precision-focused clinical care and prevention-focused public health with Smart Care Everywhere models. Our conceptual foundation highlights the importance of an agile health system with a unified digital core, capable of integrating multiple AI-enhanced digital tools and managing the deluge of multimodal data. We advocate for a Smart, Scalable, Digitized, “Care Everywhere” model that can expand health care access to all populations, both the served and the underserved. An essential component of the foundation is the creation of agile health systems and business models that prevent provider burnout while promoting collaborative, connected care that reaches served and underserved populations with caring and compassion, enabling improved engagement and connection. We also call for investment in the continuous training of healthcare and data professionals, and for an ethical, efficient implementation of AI and digital solutions everywhere from hospitals to community care settings. We further highlight the necessity of data governance policies to safeguard patient autonomy, promote data ownership, and ensure health data privacy, security, and confidentiality. This nine-step approach offers a framework for a unified, connected, patient-centred health ecosystem, made efficient with digital and AI solutions for patient communities and enabled by connectivity, caring, and compassion.
Together, these nine steps serve as a conceptual foundation to enable a sustainable health system that advances access, equity, and efficiency in caring in health care nationwide.
Reinforcement learning from human feedback (RLHF) with proximal policy optimization (PPO) is widely used but often yields less diverse outputs than supervised fine-tuning (SFT), suggesting an effect in which the policy’s support contracts during on-policy optimization. We formalize this “policy contraction” with the Support Retention Ratio (SRR), the share of SFT completions that retain non-negligible probability under the RL policy, and additionally track token entropy, Kullback–Leibler (KL) divergence to the reference, and repetition. We propose Contraction-Aware PPO (CaPPO), a minimum-norm multi-gradient update that co-optimizes reward, entropy, and KL, paired with a controller that steers exploration toward a target token entropy. On HH-RLHF, Summarize-from-Feedback, and UltraFeedback with Qwen2-7B, Qwen2.5-14B, Mistral-7B-Instruct, and Llama-3-8B-Instruct, CaPPO increases win rate by 2 to 4 points over PPO and improves diversity, raising SRR by 0.2 to 0.3. The gains persist under decoding sweeps and are robust to reward scaling and critic variance. Treating reward, diversity, and stability as first-class objectives, CaPPO mitigates contraction without sacrificing alignment performance.
2025-12-31
International Conference on Learning Representations (Accept (Poster))
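The Support Retention Ratio described in the RLHF abstract above is a simple fraction, and a minimal sketch makes the definition concrete. The abstract does not state what counts as "non-negligible" probability, so the `log_threshold` parameter, the use of summed token log-probabilities, and both function names are assumptions for illustration, not the authors' implementation.

```python
def sequence_logprob(token_logprobs):
    """Log-probability of a completion as the sum of its token log-probabilities."""
    return sum(token_logprobs)


def support_retention_ratio(rl_seq_logprobs, log_threshold):
    """Share of SFT completions whose log-probability under the RL policy
    stays at or above an assumed 'non-negligible' threshold."""
    if not rl_seq_logprobs:
        raise ValueError("need at least one completion")
    retained = sum(1 for lp in rl_seq_logprobs if lp >= log_threshold)
    return retained / len(rl_seq_logprobs)


# Log-probabilities that a hypothetical RL policy assigns to four SFT completions.
rl_scores = [-5.0, -12.0, -40.0, -8.0]
print(support_retention_ratio(rl_scores, log_threshold=-20.0))  # 0.75
```

An SRR of 1.0 would mean the RL policy still supports every SFT completion, and lower values indicate contraction. The value depends on the chosen threshold, so comparisons are only meaningful at a fixed threshold.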
Protein-protein interactions (PPIs) are mediated at the residue level. Most sequence-based PPI models consider residue-residue interactions across two proteins, which can yield accurate interaction scores but are too slow to scale. At proteome scale, identifying candidate PPIs requires evaluating nearly *all possible protein pairs*. For
2025-12-31
International Conference on Learning Representations (Accept (Poster))
Fast Sphere Decoding of Short Systematic Polar-like Codes
Huayi Zhou
Y. Liu
Xiaosi Tan
Chen Ji
Warren J. Gross
Chuan Zhang
Short polar-like codes are competitive for low-latency requirements in future communications. Systematic polar codes have not been shown to offer substantial benefits for decoders beyond improving the bit error rate. In this paper, we demonstrate that the sparsity of the equivalent generator matrix of systematic polar codes significantly reduces computational complexity when using sphere decoding (SD). We propose a fast SD (Fast-SD) for systematic polar codes. Numerical results indicate that the proposed Fast-SD reduces computational complexity by up to 33.25% compared to SD on short high-rate codes while maintaining maximum likelihood performance.
2025-12-31
IEEE Transactions on Vehicular Technology (published)
The segmentation clock is an emergent embryonic oscillator that controls the periodic formation of vertebrae precursors (or somites). It relies on the self-organization at the presomitic mesoderm (PSM) level of multiple coupled cellular oscillators. Dissociation-reaggregation experiments have further revealed that ensembles made of such cellular oscillators self-organize into an oscillatory bidimensional system, showing concentric waves around multiple foci. Here, we systematically study the dynamics of a two-dimensional lattice of phase oscillators locally coupled to their nearest neighbors through a biharmonic coupling function of the form $\sin\theta + \Lambda\sin^{2}\theta$. This coupling was inferred from the phase response curve of entrainment experiments on cell cultures, leading to the formulation of a minimal Elliptic Radial Isochron Cycle (ERIC) phase model. We show that such ERIC-based coupling parsimoniously explains the emergence of self-organized concentric phase wave patterns around multiple foci for a range of weak couplings and wide distributions of initial random phases, closely mimicking experimental conditions. We further study extended modalities of this problem to derive an atlas of possible behaviors. In particular, we predict the dominant observation of spirals over target wave patterns for initial phase distributions wider than approximately π. Since PSM cells further display properties of an excitable system, we also introduce excitability into our simple model and show that it also supports the observation of concentric phase waves for the conditions of the experiment. Our work suggests important modifications that can be made to the simple phase model with Kuramoto coupling, which can provide further layers of complexity and aid in the explanation of the spatial aspects of self-organization in the segmentation clock.
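To make the lattice model in the abstract above concrete, here is a minimal simulation sketch. Only the shape of the coupling function (sin θ + Λ sin² θ) comes from the abstract. The equation of motion dθ_i/dt = ω + K Σ_j H(θ_j − θ_i) over the four nearest neighbors, the periodic boundaries, the forward-Euler integrator, and every parameter name and value are assumptions for illustration.

```python
import math
import random


def coupling(delta, lam):
    """Biharmonic coupling H(delta) = sin(delta) + lam * sin(delta)**2."""
    return math.sin(delta) + lam * math.sin(delta) ** 2


def euler_step(theta, omega, k, lam, dt):
    """Advance every oscillator by one forward-Euler step.

    theta is a list of rows; neighbors wrap around (assumed periodic boundaries).
    """
    rows, cols = len(theta), len(theta[0])
    new = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            neighbors = (
                theta[(i - 1) % rows][j],
                theta[(i + 1) % rows][j],
                theta[i][(j - 1) % cols],
                theta[i][(j + 1) % cols],
            )
            drive = sum(coupling(nb - theta[i][j], lam) for nb in neighbors)
            new[i][j] = theta[i][j] + dt * (omega + k * drive)
    return new


# Random initial phases on a small lattice, then a few integration steps.
random.seed(0)
lattice = [[random.uniform(0.0, 2.0 * math.pi) for _ in range(8)] for _ in range(8)]
for _ in range(100):
    lattice = euler_step(lattice, omega=1.0, k=0.1, lam=0.5, dt=0.05)
```

Because sin² is even in the phase difference, the Λ term pushes an oscillator forward whether its neighbor leads or lags, which is what distinguishes this coupling from the purely odd Kuramoto term sin θ.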
Modern deep learning is increasingly characterized by the use of open-weight foundation models that can be fine-tuned on specialized datasets. This has led to a proliferation of expert models and adapters, often shared via platforms like HuggingFace and AdapterHub. Model merging has recently emerged as an effective way to leverage these existing resources, enabling the composition of capabilities from different model checkpoints. A natural pipeline has thus formed to harness the benefits of transfer learning and amortize sunk training costs: models are pre-trained on general data, fine-tuned on specific tasks, and then multiple checkpoints are merged to obtain a more capable model. A prevailing assumption is that improvements at one stage of this pipeline propagate downstream, leading to gains at subsequent steps. In this work, we challenge that assumption by examining how expert fine-tuning affects model merging. We show that long fine-tuning of experts that optimizes for their individual performance leads to degraded merging performance across vision and language modalities, multiple model scales, and both fully fine-tuned and LoRA-adapted models. We trace this degradation to the memorization of a small set of difficult examples that dominate late fine-tuning steps. This causes negative parameter interference and encodes knowledge that is forgotten during merging. Finally, we demonstrate that task-dependent aggressive early stopping strategies can significantly improve model merging performance.
2025-12-31
International Conference on Machine Learning (Accept (regular))
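For readers unfamiliar with the merging step in the pre-train, fine-tune, merge pipeline described in the abstract above, the sketch below shows one common scheme: adding scaled task vectors (expert weights minus base weights) back onto the base model. The abstract does not say which merging methods were evaluated, so this choice, the function name, and the `scale` parameter are assumptions for illustration only. Parameters are modeled as plain dictionaries of float lists to keep the example self-contained.

```python
def merge_task_vectors(base, experts, scale=1.0):
    """Merge experts into one model: base + scale * sum(expert - base), per parameter.

    base and each expert map parameter names to equally sized lists of floats.
    """
    merged = {}
    for name, base_weights in base.items():
        merged[name] = [
            b + scale * sum(expert[name][i] - b for expert in experts)
            for i, b in enumerate(base_weights)
        ]
    return merged


base = {"w": [1.0, 2.0]}
expert_a = {"w": [2.0, 2.0]}  # fine-tuning moved only the first weight
expert_b = {"w": [1.0, 4.0]}  # fine-tuning moved only the second weight
print(merge_task_vectors(base, [expert_a, expert_b], scale=0.5))  # {'w': [1.5, 3.0]}
```

In this toy case the two experts changed different weights, so their task vectors do not interfere. When experts move the same parameters in conflicting directions, the summed update partially cancels, which is one simple picture of parameter interference.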
Human-level agentic intelligence transcends low-level geometric perception, evolving from knowing where things are to understanding what they are for. While existing benchmarks effectively evaluate the foundational geometric perception capabilities of multimodal LLMs (MLLMs), they fall short of probing the higher-order cognitive abilities essential for grounded intelligence. To bridge this gap, we introduce the Spatial-Functional Intelligence Benchmark (SFI-Bench), a video-based benchmark with over 1500 expert-annotated questions derived from diverse, egocentric indoor video scans. SFI-Bench is designed to systematically evaluate two complementary dimensions of advanced reasoning: 1) Structured Spatial Reasoning, understanding complex layouts and forming coherent spatial representations, and 2) Functional Reasoning, inferring object affordances and context-dependent utility. Its tasks, including conditional counting, multi-hop relational reasoning, functional pairing, and knowledge-grounded troubleshooting, directly challenge a model's ability to integrate perception, memory, and inference. Our experiments reveal that current MLLMs consistently struggle to integrate spatial memory with functional and external knowledge, highlighting a critical bottleneck. SFI-Bench thus provides an essential tool for measuring and driving progress towards more cognitively capable and truly grounded multimodal agents.
Gait training combined with transcutaneous spinal stimulation to enhance lower limbs motor recovery in people with spinal cord injury: Pilot Study
Most applications of generative AI involve a sequential interaction in which a person inputs a prompt and waits for a response, and where reaction time and adaptivity are not important factors. In contrast, live jamming is a collaborative interaction that requires real-time coordination and adaptation without access to the other player’s future moves, while preserving diversity to sustain a creative flow. Reinforcement learning post-training enables effective adaptation through on-policy interaction, yet it often reduces output diversity by exploiting coherence-based rewards. This collapse, known as “reward hacking”, affects many RL post-training pipelines, but is especially harmful in live jamming, where musical creativity relies on dynamic variation and mutual responsiveness. In this paper, we propose a novel adversarial training method on policy-generated trajectories to mitigate reward hacking in RL post-training for melody-to-chord accompaniment. A co-evolving discriminator separates policy trajectories from the data distribution, while the policy maximizes the discriminator output in addition to coherence rewards to prevent collapse to trivial outputs. We evaluate accompaniment quality and output diversity in simulation with both fixed test melodies and learned melody agents, and we conduct a user study with the model deployed in a real-time interactive system with expert musicians. Quantitative evaluation and user feedback demonstrate improved output diversity, harmonic coherence, adaptation speed and user agency. Our results demonstrate a simple yet effective method to mitigate reward hacking in RL post-training of generative sequence models.
2025-12-31
International Conference on Learning Representations (Accept (Poster))