AI Perspectives for Policymakers
Co-led by Mila and CIFAR, this program connects policymakers with leading AI researchers through a combination of open consultations and policy feasibility-testing exercises. The next session will take place on October 9 and 10.
Hugo Larochelle appointed Scientific Director of Mila
An adjunct professor at Université de Montréal and former head of Google's AI research lab in Montreal, Hugo Larochelle is a deep learning pioneer and one of the most respected researchers in Canada.
Mila is hosting its first quantum computing hackathon on November 21: a unique day to explore quantum prototyping and AI, collaborate on the Quandela and IBM platforms, and learn, exchange, and network in a stimulating environment at the heart of Quebec's AI and quantum ecosystem.
A new initiative to strengthen ties between the research community, partners, and AI experts across Quebec and Canada through in-person meetings and events focused on industry adoption of AI.
Publications
Cervical Spinal Cord Magnetization Transfer Ratio and Its Relationship With Clinical Outcomes in Multiple Sclerosis
Augmenting large language models (LLMs) with external context significantly improves their performance in natural language processing (NLP) tasks. However, LLMs struggle to answer queries reliably when the provided context lacks information, often resorting to ungrounded speculation or internal knowledge. Groundedness, i.e. generating responses strictly supported by the context, is essential for ensuring factual consistency and trustworthiness. This study focuses on detecting whether a given query is grounded in a document provided in context before the costly answer generation by LLMs. Such a detection mechanism can significantly reduce both inference time and resource consumption. We show that lightweight, task-specific encoder models such as RoBERTa and NomicBERT, fine-tuned on curated datasets, can achieve accuracy comparable to state-of-the-art LLMs, such as Llama3 8B and GPT4o, in groundedness detection while reducing inference latency by orders of magnitude. The code is available at: https://github.com/chandarlab/Hallucinate-less
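To make the detection setup concrete, here is a minimal sketch of how a fine-tuned encoder could gate answer generation before the LLM is called. It assumes a RoBERTa-style binary classifier via Hugging Face transformers; the checkpoint name and decision threshold are placeholders, not the paper's released artifacts (see the repository above for those).

```python
# Hypothetical sketch: score whether a query is grounded in a context
# document with a fine-tuned encoder, and only call the LLM if it is.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "roberta-base"  # placeholder; swap in the fine-tuned groundedness checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)
model.eval()

def is_grounded(query: str, document: str, threshold: float = 0.5) -> bool:
    """Return True if the query appears answerable from the document alone."""
    # Encode the (query, document) pair; truncate overly long documents.
    inputs = tokenizer(query, document, truncation=True, max_length=512,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    p_grounded = torch.softmax(logits, dim=-1)[0, 1].item()
    return p_grounded >= threshold

# Skip the expensive generation step for ungrounded queries.
if is_grounded("Who chaired the meeting?", "Minutes: the meeting was chaired by A. Roy."):
    pass  # invoke the LLM to generate a context-supported answer
```

The cheap encoder pass runs orders of magnitude faster than LLM decoding, which is where the latency savings reported above come from.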
Dynamic graph learning methods have recently emerged as powerful tools for modelling relational data evolving through time. However, despite extensive benchmarking efforts, it remains unclear whether current Temporal Graph Neural Networks (TGNNs) effectively capture core temporal patterns such as periodicity, cause-and-effect, and long-range dependencies. In this work, we introduce the Temporal Graph Reasoning Benchmark (T-GRAB), a comprehensive set of synthetic tasks designed to systematically probe the capabilities of TGNNs to reason across time. T-GRAB provides controlled, interpretable tasks that isolate key temporal skills: counting/memorizing periodic repetitions, inferring delayed causal effects, and capturing long-range dependencies over both spatial and temporal dimensions. We evaluate 11 temporal graph learning methods on these tasks, revealing fundamental shortcomings in their ability to generalize temporal patterns. Our findings offer actionable insights into the limitations of current models, highlight challenges hidden by traditional real-world benchmarks, and motivate the development of architectures with stronger temporal reasoning abilities. The code for T-GRAB can be found at: https://github.com/alirezadizaji/T-GRAB
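As an illustration of what a controlled periodicity task might look like, here is a toy generator in the spirit described above; it is not the benchmark's actual API (see the T-GRAB repository for the real tasks), and all names are illustrative.

```python
# Toy sketch of a periodic temporal-graph task: a fixed cycle of edge sets
# recurs with a known period, and a model must predict the next snapshot.
import numpy as np

def periodic_snapshots(num_nodes: int, period: int, steps: int, seed: int = 0):
    """Yield (t, adjacency) pairs where the edge set recurs every `period` steps."""
    rng = np.random.default_rng(seed)
    # One random adjacency matrix per phase of the cycle.
    phases = [rng.integers(0, 2, size=(num_nodes, num_nodes)) for _ in range(period)]
    for t in range(steps):
        yield t, phases[t % period]

# A model that has truly learned the periodicity can predict snapshot t
# by recalling snapshot t - period.
snaps = dict(periodic_snapshots(num_nodes=8, period=3, steps=12))
assert (snaps[9] == snaps[6]).all()
```

Because the generating rule is known exactly, failure on such a task isolates a specific missing skill (here, counting/memorizing repetitions) rather than a confound of a real-world dataset.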
Pixel-wise annotations are notoriously laborious and costly to obtain in the medical domain. To mitigate this burden, weakly supervised approaches based on bounding box annotations, which are much easier to acquire, offer a practical alternative. Vision foundation models have recently shown noteworthy segmentation performance when provided with prompts such as points or bounding boxes. Prompt learning exploits these models by adapting them to downstream tasks and automating segmentation, thereby reducing user intervention. However, existing prompt learning approaches depend on fully annotated segmentation masks. This paper proposes a novel framework that combines the representational power of foundation models with the annotation efficiency of weakly supervised segmentation. More specifically, our approach automates prompt generation for foundation models using only bounding box annotations. Our proposed optimization scheme integrates multiple constraints derived from box annotations with pseudo-labels generated by the prompted foundation model. Extensive experiments across multi-modal datasets reveal that our weakly supervised method achieves an average Dice score of 84.90% in a limited data setting, outperforming existing fully supervised and weakly supervised approaches. The code will be available upon acceptance.
2025-06-24
IEEE Transactions on Biomedical Engineering (published)
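To ground the idea of box-prompted pseudo-labels, here is a minimal sketch assuming SAM as the prompted foundation model; the abstract does not name a specific model, and the checkpoint path, function names, and the simple inside-the-box constraint are all placeholders rather than the paper's method.

```python
# Hypothetical sketch: generate a pseudo-mask from a bounding box alone by
# prompting a segmentation foundation model (here, SAM), then enforce a
# box-derived constraint on the result.
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")  # placeholder path
predictor = SamPredictor(sam)

def pseudo_label_from_box(image: np.ndarray, box_xyxy: np.ndarray) -> np.ndarray:
    """Return a pseudo-mask for one object given only its bounding box."""
    predictor.set_image(image)  # expects an HWC uint8 RGB array
    masks, scores, _ = predictor.predict(box=box_xyxy, multimask_output=False)
    mask = masks[0]
    # Box constraint: the predicted mask must lie inside the annotated box.
    x0, y0, x1, y1 = box_xyxy.astype(int)
    outside = np.ones_like(mask, dtype=bool)
    outside[y0:y1, x0:x1] = False
    mask[outside] = False
    return mask
```

In a weakly supervised pipeline of this kind, such pseudo-masks would then serve as training targets alongside other box-derived constraints, with no pixel-wise ground truth required.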
Coordination between independent learning agents in a multi-agent environment is an important problem where AI systems may impact each other's learning process. In this paper, we study how individual agents converge to the optimal equilibrium in multi-agent settings where coordination is necessary to achieve optimality.
Specifically, we cover the case of coordination to maximize each individual payoff and coordination to maximize the collective payoff (cooperation). We study the emergence of such coordination behaviours in two-player matrix games with unknown payoff matrices and noisy bandit feedback. We consider five different environments along with widely used deterministic and stochastic bandit strategies. We study how different learning strategies and observation noise influence convergence to the optimal equilibrium. Our results indicate that coordination often emerges more easily from interactions between deterministic agents, especially when they follow the same learning behaviour. However, stochastic learning strategies appear to be more robust in the presence of many optimal joint actions. Overall, noisy observations often help stabilize learning behaviours.
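For intuition, here is a minimal sketch of this kind of experiment: two independent epsilon-greedy bandit learners in a common-payoff 2x2 coordination game with noisy feedback. The payoff matrix, noise level, and learning rule are illustrative choices, not the paper's exact environments or strategies.

```python
# Two independent epsilon-greedy agents in a noisy 2x2 coordination game.
# Joint action (1, 1) is the optimal equilibrium both must discover.
import numpy as np

rng = np.random.default_rng(0)
payoff = np.array([[0.5, 0.0],
                   [0.0, 1.0]])   # shared payoff: coordinate or get nothing
noise, eps = 0.1, 0.1
q = [np.zeros(2), np.zeros(2)]   # per-agent action-value estimates
n = [np.zeros(2), np.zeros(2)]   # per-agent action counts

for t in range(5000):
    # Each agent acts greedily w.p. 1 - eps, uniformly at random otherwise.
    acts = [a if rng.random() > eps else int(rng.integers(2))
            for a in (int(q[0].argmax()), int(q[1].argmax()))]
    r = payoff[acts[0], acts[1]] + rng.normal(0.0, noise)  # noisy bandit feedback
    for i in (0, 1):
        n[i][acts[i]] += 1
        q[i][acts[i]] += (r - q[i][acts[i]]) / n[i][acts[i]]  # sample-average update

print("greedy joint action:", int(q[0].argmax()), int(q[1].argmax()))
```

Each agent sees only its own noisy reward, never the other's action, which is what makes convergence to the optimal joint action non-trivial.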
Representation learning methods are an important tool for addressing the challenges posed by complex observation spaces in sequential decision-making problems. Recently, a wide variety of approaches have been used to learn meaningful state representations in reinforcement learning, enabling better sample efficiency, generalization, and performance. This survey provides a broad categorization of these methods within a model-free online setting, exploring how they tackle the learning of state representations differently. We categorize the methods into six main classes, detailing their mechanisms, benefits, and limitations. Through this taxonomy, we aim to enhance the understanding of this field and provide a guide for new researchers. We also discuss techniques for assessing the quality of representations and detail relevant future directions.
Cooperation between people is not always obvious. Sometimes we benefit from actions that others have taken even when we are unaware that they took those actions. For example, if your neighbor chooses not to take a parking spot in front of your house when you are not there, you can benefit, even without being aware that they took this action. These “hidden gifts” represent an interesting challenge for multi-agent reinforcement learning (MARL), since assigning credit to your own actions correctly when the beneficial actions of others are hidden is non-trivial. Here, we study the impact of hidden gifts with a very simple MARL task. In this task, agents in a grid-world environment have individual doors to unlock in order to obtain individual rewards. As well, if all the agents unlock their door the group receives a larger collective reward. However, there is only one key for all of the doors, such that the collective reward can only be obtained when the agents drop the key for others after they use it. Notably, there is nothing to indicate to an agent that the other agents have dropped the key, thus the act of dropping the key for others is a “hidden gift”. We show that several different state-of-the-art RL algorithms, including MARL algorithms, fail to learn how to obtain the collective reward in this simple task. Interestingly, we find that independent model-free policy gradient agents can solve the task when we provide them with information about their action history, but MARL agents still cannot solve the task with action history. Finally, we derive a correction term for these independent agents, inspired by learning aware approaches, which reduces the variance in learning and helps them to converge to collective success more reliably. These results show how credit assignment in multi-agent settings can be particularly challenging in the presence of “hidden gifts”, and demonstrate that learning awareness can benefit these settings.
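The reward structure that creates the credit-assignment problem can be captured in a few lines. The sketch below is a toy two-agent version with illustrative class and method names, not the paper's environment code; the grid-world navigation is abstracted away to isolate the key-passing logic.

```python
# Toy sketch of the "hidden gift" structure: one shared key, individual
# doors and rewards, and a collective bonus reachable only if the key
# holder gives the key away, an action the other agent never observes.
class HiddenGiftEnv:
    """Two agents, one key, two doors, and a collective reward."""

    def __init__(self):
        self.key_holder = 0              # agent 0 starts with the key
        self.unlocked = [False, False]

    def step(self, agent: int, action: str) -> float:
        reward = 0.0
        if action == "unlock" and self.key_holder == agent and not self.unlocked[agent]:
            self.unlocked[agent] = True
            reward += 1.0                # individual reward for your own door
        elif action == "drop_key" and self.key_holder == agent:
            # The hidden gift: the other agent receives the key but gets
            # no observation indicating that this action ever happened.
            self.key_holder = 1 - agent
        if all(self.unlocked):
            reward += 10.0               # collective reward, only via the gift
        return reward
```

From the first agent's perspective, dropping the key yields no immediate reward and no observable consequence, so standard temporal-difference or policy-gradient credit assignment struggles to reinforce it, which is exactly the failure mode the abstract describes.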