Mila organise son premier hackathon en informatique quantique le 21 novembre. Une journée unique pour explorer le prototypage quantique et l’IA, collaborer sur les plateformes de Quandela et IBM, et apprendre, échanger et réseauter dans un environnement stimulant au cœur de l’écosystème québécois en IA et en quantique.
Une nouvelle initiative pour renforcer les liens entre la communauté de recherche, les partenaires et les expert·e·s en IA à travers le Québec et le Canada, grâce à des rencontres et événements en présentiel axés sur l’adoption de l’IA dans l’industrie.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Multimedia Player
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Inverse Reinforcement Learning aims to recover reward models from expert demonstrations, but traditional methods yield"black-box"models that… (voir plus) are difficult to interpret and debug. In this work, we introduce GRACE (Generating Rewards As CodE), a method for using Large Language Models within an evolutionary search to reverse-engineer an interpretable, code-based reward function directly from expert trajectories. The resulting reward function is executable code that can be inspected and verified. We empirically validate GRACE on the BabyAI and AndroidWorld benchmarks, where it efficiently learns highly accurate rewards, even in complex, multi-task settings. Further, we demonstrate that the resulting reward leads to strong policies, compared to both competitive Imitation Learning and online RL approaches with ground-truth rewards. Finally, we show that GRACE is able to build complex reward APIs in multi-task setups.
We examine the capability of Multimodal Large Language Models (MLLMs) to tackle diverse domains that extend beyond the traditional language … (voir plus)and vision tasks these models are typically trained on. Specifically, our focus lies in areas such as Embodied AI, Games, UI Control, and Planning. To this end, we introduce a process of adapting an MLLM to a Generalist Embodied Agent (GEA). GEA is a single unified model capable of grounding itself across these varied domains through a multi-embodiment action tokenizer. GEA is trained with supervised learning on a large dataset of embodied experiences and with online RL in interactive simulators. We explore the data and algorithmic choices necessary to develop such a model. Our findings reveal the importance of training with cross-domain data and online RL for building generalist agents. The final GEA model achieves strong generalization performance to unseen tasks across diverse benchmarks compared to other generalist models and benchmark-specific approaches.
We examine the capability of Multimodal Large Language Models (MLLMs) to tackle diverse domains that extend beyond the traditional language … (voir plus)and vision tasks these models are typically trained on. Specifically, our focus lies in areas such as Embodied AI, Games, UI Control, and Planning. To this end, we introduce a process of adapting an MLLM to a Generalist Embodied Agent (GEA). GEA is a single unified model capable of grounding itself across these varied domains through a multi-embodiment action tokenizer. GEA is trained with supervised learning on a large dataset of embodied experiences and with online RL in interactive simulators. We explore the data and algorithmic choices necessary to develop such a model. Our findings reveal the importance of training with cross-domain data and online RL for building generalist agents. The final GEA model achieves strong generalization performance to unseen tasks across diverse benchmarks compared to other generalist models and benchmark-specific approaches.
We examine the capability of Multimodal Large Language Models (MLLMs) to tackle diverse domains that extend beyond the traditional language … (voir plus)and vision tasks these models are typically trained on. Specifically, our focus lies in areas such as Embodied AI, Games, UI Control, and Planning. To this end, we introduce a process of adapting an MLLM to a Generalist Embodied Agent (GEA). GEA is a single unified model capable of grounding itself across these varied domains through a multi-embodiment action tokenizer. GEA is trained with supervised learning on a large dataset of embodied experiences and with online RL in interactive simulators. We explore the data and algorithmic choices necessary to develop such a model. Our findings reveal the importance of training with cross-domain data and online RL for building generalist agents. The final GEA model achieves strong generalization performance to unseen tasks across diverse benchmarks compared to other generalist models and benchmark-specific approaches.
Multimodal Large Language Models (MLLMs) have demonstrated a wide range of capabilities across many domains including Embodied AI. In this w… (voir plus)ork, we study how to best ground a MLLM into different embodiments and their associated action spaces, including both continuous and discrete actions. For continuous actions, a set of learned tokenizations that capture an action at various resolutions allows for sufficient modeling precision, yielding the best performance on downstream tasks. For discrete actions, semantically aligning these actions with the native output token space of the MLLM leads to the strongest performance. We arrive at these lessons via a thorough study of seven action grounding approaches on five different environments, encompassing over 114 embodied tasks.
Multimodal Large Language Models (MLLMs) have demonstrated a wide range of capabilities across many domains, including Embodied AI. In this … (voir plus)work, we study how to best ground a MLLM into different embodiments and their associated action spaces, with the goal of leveraging the multimodal world knowledge of the MLLM. We first generalize a number of methods through a unified architecture and the lens of action space adaptors. For continuous actions, we show that a learned tokenization allows for sufficient modeling precision, yielding the best performance on downstream tasks. For discrete actions, we demonstrate that semantically aligning these actions with the native output token space of the MLLM leads to the strongest performance. We arrive at these lessons via a thorough study of seven action space adapters on five different environments, encompassing over 114 embodied tasks.
Generative Artificial Intelligence (AI) has made significant advancements in recent years, particularly with the development of large langua… (voir plus)ge and diffusion models. These generative models have demonstrated impressive capabilities in various tasks, such as text generation and image and audio synthesis. Concurrently, Reinforcement Learning (RL) has made significant strides in solving complex sequential decision-making problems with the help of external knowledge sources . However, there remains untapped potential in combining generative models with RL algorithms to tackle real-world challenges, particularly to improve sample efficiency of tabula rasa training by introducing priors from related domains such as visual question-answering, image captioning and image generation.
This workshop aims to bring together researchers and practitioners from the fields of generative AI and reinforcement learning to explore the latest advances, methodologies, and applications. By fostering collaborations between these two domains, we intend to unlock new opportunities for addressing complex problems that lie at the intersection of both fields.