Publications

The Software Documentor Mindset

Deeksha M. Arya

Jin Guo

Martin P. Robillard

Software technologies are used by programmers with diverse backgrounds. To fulfill programmers' need for information, enthusiasts contribute numerous learning resources that vary in style and content, which act as documentation for the corresponding technology. We interviewed 26 volunteer documentation contributors, i.e., documentors, to understand why and how they create such documentation. From a qualitative analysis of our interviews, we identified a total of sixteen considerations that documentors have during the documentation contribution process, along three dimensions, namely motivations, topic selection techniques, and styling objectives. We grouped related considerations based on common underlying themes, to elicit five software documentor mindsets that occur during documentation contribution activities. We propose a structure of mindsets, and their associated considerations across the three dimensions, as a framework for reasoning about the documentation contribution process. This framework can inform information seeking as well as documentation creation tools about the context in which documentation was contributed.

2024-12-12

ArXiv (preprint)

doi.org

arxiv.org

Effects of gene dosage on cognitive ability: A function-based association study across brain and non-brain processes

Guillaume Huguet

Thomas Renne

Cécile Poulain

Alma Dubuc

Kuldeep Kumar

Sayeh Kazem

Worrawat Engchuan

Omar Shanta

Elise Douard

Catherine Proulx

Martineau Jean-Louis

Zohra Saci

Josephine Mollon

Laura Schultz

Emma E M Knowles

Simon R. Cox

David Porteous

Gail Davies

Paul Redmond

Sarah E. Harris

Gunter Schumann

Guillaume Dumas

Aurélie Labbe

Zdenka Pausova

Tomas Paus

Stephen W Scherer

Jonathan Sebat

Laura Almasy

David C. Glahn

Sébastien Jacquemont

2024-12-11

Cell Genomics (published)

doi.org

From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons

Andrew Szot

Bogdan Mazoure

Omar Attia

Aleksei Timofeev

Harsh Agrawal

(Rex) Devon Hjelm

Zhe Gan

Zsolt Kira

Alexander T Toshev

We examine the capability of Multimodal Large Language Models (MLLMs) to tackle diverse domains that extend beyond the traditional language and vision tasks these models are typically trained on. Specifically, our focus lies in areas such as Embodied AI, Games, UI Control, and Planning. To this end, we introduce a process of adapting an MLLM to a Generalist Embodied Agent (GEA). GEA is a single unified model capable of grounding itself across these varied domains through a multi-embodiment action tokenizer. GEA is trained with supervised learning on a large dataset of embodied experiences and with online RL in interactive simulators. We explore the data and algorithmic choices necessary to develop such a model. Our findings reveal the importance of training with cross-domain data and online RL for building generalist agents. The final GEA model achieves strong generalization performance to unseen tasks across diverse benchmarks compared to other generalist models and benchmark-specific approaches.

2024-12-11

ArXiv (preprint)

arxiv.org

Harnessing pre-trained generalist agents for software engineering tasks

Paulina Stevia Nouwou Mindom

Amin Nikanjam

Foutse Khomh

2024-12-11

Empirical Software Engineering (published)

doi.org

Harnessing Pre-trained Generalist Agents for Software Engineering Tasks

Paulina Stevia Nouwou Mindom

Amin Nikanjam

Foutse Khomh

Nowadays, we are witnessing an increasing adoption of Artificial Intelligence (AI) to develop techniques aimed at improving the reliability, effectiveness, and overall quality of software systems. Deep reinforcement learning (DRL) has recently been successfully used for automation in complex tasks such as game testing and solving the job-shop scheduling problem. However, these specialized DRL agents, trained from scratch on specific tasks, suffer from a lack of generalizability to other tasks and they need substantial time to be developed and re-trained effectively. Recently, DRL researchers have begun to develop generalist agents, able to learn a policy from various environments and capable of achieving performances similar to or better than specialist agents in new tasks. In the Natural Language Processing or Computer Vision domain, these generalist agents are showing promising adaptation capabilities to never-before-seen tasks after a light fine-tuning phase and achieving high performance. This paper investigates the potential of generalist agents for solving SE tasks. Specifically, we conduct an empirical study aimed at assessing the performance of two generalist agents on two important SE tasks: the detection of bugs in games (for two games) and the minimization of makespan in a scheduling task, to solve the job-shop scheduling problem (for two instances). Our results show that the generalist agents outperform the specialist agents with very little effort for fine-tuning, achieving a 20% reduction of the makespan over specialized agent performance on task-based scheduling. In the context of game testing, some generalist agent configurations detect 85% more bugs than the specialist agents. Building on our analysis, we provide recommendations for researchers and practitioners looking to select generalist agents for SE tasks, to ensure that they perform effectively.

2024-12-11

Empirical Software Engineering (published)

doi.org

arxiv.org

MaestroMotif: Skill Design from Artificial Intelligence Feedback

Martin Klissarov

Mikael Henaff

Roberta Raileanu

Marlos C. Machado

Describing skills in natural language has the potential to provide an accessible way to inject human knowledge about decision-making into an AI system. We present MaestroMotif, a method for AI-assisted skill design, which yields high-performing and adaptable agents. MaestroMotif leverages the capabilities of Large Language Models (LLMs) to effectively create and reuse skills. It first uses an LLM's feedback to automatically design rewards corresponding to each skill, starting from their natural language description. Then, it employs an LLM's code generation abilities, together with reinforcement learning, for training the skills and combining them to implement complex behaviors specified in language. We evaluate MaestroMotif using a suite of complex tasks in the NetHack Learning Environment (NLE), demonstrating that it surpasses existing approaches in both performance and usability.

2024-12-11

ArXiv (preprint)

arxiv.org

TapeAgents: a Holistic Framework for Agent Development and Optimization

Dzmitry Bahdanau

Nicolas Gontier

Gabriel Huang

Ehsan Kamalloo

Rafael Pardinas

Alexandre Piché

Torsten Scholak

Oleh Shliazhko

Jordan Prince Tremblay

Karam Ghanem

Soham Parikh

Mitul Tiwari

Quaizar Vohra

We present TapeAgents, an agent framework built around a granular, structured log tape of the agent session that also plays the role of the session's resumable state. In TapeAgents we leverage tapes to facilitate all stages of the LLM Agent development lifecycle. The agent reasons by processing the tape and the LLM output to produce new thought and action steps and append them to the tape. The environment then reacts to the agent's actions by likewise appending observation steps to the tape. By virtue of this tape-centred design, TapeAgents can provide AI practitioners with holistic end-to-end support. At the development stage, tapes facilitate session persistence, agent auditing, and step-by-step debugging. Post-deployment, one can reuse tapes for evaluation, fine-tuning, and prompt-tuning; crucially, one can adapt tapes from other agents or use revised historical tapes. In this report, we explain the TapeAgents design in detail. We demonstrate possible applications of TapeAgents with several concrete examples of building monolithic agents and multi-agent teams, of optimizing agent prompts and finetuning the agent's LLM. We present tooling prototypes and report a case study where we use TapeAgents to finetune a Llama-3.1-8B form-filling assistant to perform as well as GPT-4o while being orders of magnitude cheaper. Lastly, our comparative analysis shows that TapeAgents's advantages over prior frameworks stem from our novel design of the LLM agent as a resumable, modular state machine with a structured configuration, that generates granular, structured logs and that can transform these logs into training text -- a unique combination of features absent in previous work.

2024-12-11

ArXiv (preprint)

doi.org

arxiv.org
