Développez des compétences fondamentales en intelligence artificielle (IA) responsable grâce à des cours autodirigés, animés par des expert·e·s de Mila reconnu·e·s à l’échelle internationale.
Le Fellowship Mila en politiques de l'IA transforme l'expertise approfondie en IA en politiques rigoureuses d'intérêt public. Découvrez la dernière publication Combler la disparité en matière d’expertise : mécanismes de transfert des connaissances pour la réglementation de l’IA par Moritz von Knebel.
Ce programme soutient les startups spécialisées en IA à tout moment de l'année. Bénéficiez de ressources de pointe et d'un accompagnement sur mesure pour accélérer le développement de votre technologie.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Lecteur Multimédia
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Overparameterization is central to the success of deep learning, yet the mechanisms by which it improves optimization remain incompletely un… (voir plus)derstood. We analyze weight-space symmetries in neural networks and show that overparameterization introduces additional symmetries that benefit optimization in two distinct ways. First, we prove that these symmetries act as a form of diagonal preconditioning on the Hessian, enabling the existence of better-conditioned minima within each equivalence class of functionally identical solutions. Second, we show that overparameterization increases the probability mass of global minima near typical initializations, making these favorable solutions more reachable. Teacher-student network experiments validate our theoretical predictions: as width increases, the Hessian trace decreases, condition numbers improve, and convergence accelerates. Our analysis provides a unified framework for understanding overparameterization and width growth as a geometric transformation of the loss landscape.
Adapting a pretrained diffusion model to new objectives at inference time remains an open problem in generative modeling. Existing steering … (voir plus)methods suffer from inaccurate value estimation, especially at high noise levels, which biases guidance. Moreover, information from past runs is not reused to improve sample quality, resulting in inefficient use of compute. Inspired by the success of Monte Carlo Tree Search, we address these limitations by casting inference-time alignment as a search problem that reuses past computations. We introduce a tree-based approach that samples from the reward-aligned target density by propagating terminal rewards back through the diffusion chain and iteratively refining value estimates with each additional generation. Our proposed method, Diffusion Tree Sampling (DTS), produces asymptotically exact samples from the target distribution in the limit of infinite rollouts, and its greedy variant, Diffusion Tree Search (DTS
2025-12-02
Neural Information Processing Systems (Accept (poster))
This paper introduces the framework of multi-armed sampling, as the sampling counterpart to the optimization problem of multi-arm bandits. O… (voir plus)ur primary motivation is to rigorously examine the exploration-exploitation trade-off in the context of sampling. We systematically define plausible notions of regret for this framework and establish corresponding lower bounds. We then propose a simple algorithm that achieves these optimal regret bounds. Our theoretical results demonstrate that in contrast to optimization, sampling does not require exploration. To further connect our findings with those of multi-armed bandits, we define a continuous family of problems and associated regret measures that smoothly interpolates and unifies multi-armed sampling and multi-armed bandit problems using a temperature parameter. We believe the multi-armed sampling framework, and our findings in this setting can have a foundational role in the study of sampling including recent neural samplers, akin to the role of multi-armed bandits in reinforcement learning. In particular, our work sheds light on the need for exploration and the convergence properties of algorithm for entropy-regularized reinforcement learning, fine-tuning of pretrained models and reinforcement learning with human feedback (RLHF).