Le Studio d'IA pour le climat de Mila vise à combler l’écart entre la technologie et l'impact afin de libérer le potentiel de l'IA pour lutter contre la crise climatique rapidement et à grande échelle.
Le programme a récemment publié sa première note politique, intitulée « Considérations politiques à l’intersection des technologies quantiques et de l’intelligence artificielle », réalisée par Padmapriya Mohan.
Hugo Larochelle nommé directeur scientifique de Mila
Professeur associé à l’Université de Montréal et ancien responsable du laboratoire de recherche en IA de Google à Montréal, Hugo Larochelle est un pionnier de l’apprentissage profond et fait partie des chercheur·euses les plus respecté·es au Canada.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Multimedia Player
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
The Roles of Neural Networks in Language Acquisition
How can modern neural networks like language models be useful to the field of language acquisition, and more broadly cognitive science, if t… (voir plus)hey are not a priori designed to be cognitive models? As developments towards natural language understanding and generation have improved leaps and bounds, with models like GPT‐4, the question of how they can inform our understanding of human language acquisition has re‐emerged. As such, it is critical to examine how in practice linking hypotheses between models and human learners can be safely established. To address these questions, we propose a model taxonomy, including four modelling approaches, each having differing goals, from exploratory hypothesis generation to hypothesis differentiation and testing. We show how the goals of these approaches align with the overarching goals of science and linguistics by connecting our taxonomy to the realist versus instrumentalist approaches in philosophy of science. We survey recent work having adopted each of our modelling approaches and address the importance of computational modelling in language acquisition studies.
The dominant paradigm for RLHF is online and on-policy RL: synchronously generating from the large language model (LLM) policy, labelling wi… (voir plus)th a reward model, and learning using feedback on the LLM's own outputs. While performant, this paradigm is computationally inefficient. Inspired by classical deep RL literature, we propose separating generation and learning in RLHF. This enables asynchronous generation of new samples while simultaneously training on old samples, leading to faster training and more compute-optimal scaling. However, asynchronous training relies on an underexplored regime, online but off-policy RLHF: learning on samples from previous iterations of our model. To understand the challenges in this regime, we investigate a fundamental question: how much off-policyness can we tolerate for asynchronous training to speed up learning but maintain performance? Among several RLHF algorithms we tested, we find that online DPO is most robust to off-policy data, and robustness increases with the scale of the policy model. We study further compute optimizations for asynchronous RLHF but find that they come at a performance cost, giving rise to a trade-off. Finally, we verify the scalability of asynchronous RLHF by training LLaMA 3.1 8B on an instruction-following task 40% faster than a synchronous run while matching final performance.
The primary motor cortex is involved in initiation and adaptive control of locomotion. However, the role of the motor cortex in controlling … (voir plus)gait trajectories remains unclear. In animals, cortical neuromodulation allows for precise control of step height. We hypothesized that a similar control framework applies to humans, whereby cortical stimulation would primarily increase foot elevation. Transcranial magnetic stimulation (TMS) was applied over the motor cortex to assess the involvement of the corticospinal tract over the limb trajectory during human walking. Eight healthy adults (aged 20-32 years) participated in treadmill walking at 1.5 km/h. TMS was applied over the left motor cortex at an intensity of 120% of the threshold to elicit a dorsiflexion of the right ankle during the swing phase of gait. Electromyographic (EMG) measurements and three-dimensional (3D) lower limb kinematics were collected. When delivered during the early swing phase, TMS led to a significant increase in the maximum height of the right toe by a mean of 40.7% ± 14.9% (25.6mm ± 9.4 mm, p = 0.0352) and knee height by 57.8%± 16.8%; (32mm ± 9.3 mm; p = 0.008) across participants. These findings indicate that TMS can influence limb trajectory during walking, highlighting its potential as a tool for studying cortical control of locomotion.
Current multi-task reinforcement learning (MTRL) methods have the ability to perform a large number of tasks with a single policy. However w… (voir plus)hen attempting to interact with a new domain, the MTRL agent would need to be re-trained due to differences in domain dynamics and structure. Because of these limitations, we are forced to train multiple policies even though tasks may have shared dynamics, leading to needing more samples and is thus sample inefficient. In this work, we explore the ability of MTRL agents to learn in various domains with various dynamics by simultaneously learning in multiple domains, without the need to fine-tune extra policies. In doing so we find that a MTRL agent trained in multiple domains induces an increase in sample efficiency of up to 70\% while maintaining the overall success rate of the MTRL agent.
Generative modeling of symmetric densities has a range of applications in AI for science, from drug discovery to physics simulations. The ex… (voir plus)isting generative modeling paradigm for invariant densities combines an invariant prior with an equivariant generative process. However, we observe that this technique is not necessary and has several drawbacks resulting from the limitations of equivariant networks. Instead, we propose to model a learned slice of the density so that only one representative element per orbit is learned. To accomplish this, we learn a group-equivariant canonicalization network that maps training samples to a canonical pose and train a non-equivariant generative model over these canonicalized samples. We implement this idea in the context of diffusion models. Our preliminary experimental results on molecular modeling are promising, demonstrating improved sample quality and faster inference time.
Recent advancements in large language models (LLMs) have sparked interest in developing autonomous web agents capable of performing digital … (voir plus)tasks through web interfaces in a human-like manner. However, even the strongest closed-source models often struggle to achieve robust results on several benchmarks, while a notable performance gap exists between them and open-source counterparts. This study investigates the potential of fine-tuning to enhance the performance of a smaller, lower-performing but cost-efficient LLM by leveraging successful traces from stronger LLMs, referred to as experts. We outline a comprehensive pipeline for data collection, filtering, and supervised fine-tuning and explore various behavior cloning parameters. Our experiments provide key insights into the challenges of fine-tuning LLMs into web agents on benchmarks like MiniWoB and WorkArena. Notably, we find that the fine-tuned agents' ability to predict expert trajectories does not consistently lead to improved downstream task performance. This raises issues such as off-policy bias and the loss of reasoning abilities during fine-tuning. We discuss potential solutions to these challenges and make both the codebase and a dataset of 140M tokens open-source for the community to build upon.
Autonomous mobile clinics (AMCs) have the potential to revolutionize healthcare delivery by bringing healthcare services to patients at the … (voir plus)order of patient’s fingertips. Particularly, AMCs can act as an essential touch point of integrated care, which is a worldwide response to the fragmented delivery of health by focusing on more coordinated and integrated forms of care provision. However, the impact of AMCs on the health satisfaction outcome effectiveness still remains unknown. In this article, in collaboration with United Family Healthcare (UFH), we study the potential effectiveness improvement of integrated care delivery through AMCs. Supplementary Information The online version contains supplementary material available at 10.1038/s41598-024-75611-x.
The growth in prominence of large language models (LLMs) in everyday life can be largely attributed to their generative abilities, yet some … (voir plus)of this is also owed to the risks and costs associated with their use. On one front is their tendency to hallucinate false or misleading information, limiting their reliability. On another is the increasing focus on the computational limitations associated with traditional self-attention based LLMs, which has brought about new alternatives, in particular recurrent models, meant to overcome them. Yet it remains uncommon to consider these two concerns simultaneously. Do changes in architecture exacerbate/alleviate existing concerns about hallucinations? Do they affect how and where they occur? Through an extensive evaluation, we study how these architecture-based inductive biases affect the propensity to hallucinate. While hallucination remains a general phenomenon not limited to specific architectures, the situations in which they occur and the ease with which specific types of hallucinations can be induced can significantly differ based on the model architecture. These findings highlight the need for better understanding both these problems in conjunction with each other, as well as consider how to design more universal techniques for handling hallucinations.