Jingyue Zhang

PhD - UdeM

Principal supervisor

Damien Masson

Co-supervisor

Ian Arawjo

Research topics

AI alignment

Human-AI interaction

Publications

ChainBuddy: An AI Agent System for Generating LLM Pipelines

Jingyue Zhang

Ian Arawjo

As large language models (LLMs) advance, their potential applications have grown significantly. However, it remains difficult to evaluate LLM behavior on user-specific tasks and craft effective pipelines to do so. Many users struggle with where to start, a difficulty often referred to as the "blank page" problem. ChainBuddy, an AI assistant for generating evaluative LLM pipelines built into the ChainForge platform, aims to tackle this issue. ChainBuddy offers a straightforward and user-friendly way to plan and evaluate LLM behavior, making the process less daunting and more accessible across a wide range of possible tasks and use cases. We report a within-subjects user study comparing ChainBuddy to the baseline interface. We find that when using AI assistance, participants reported a less demanding workload and felt more confident setting up evaluation pipelines of LLM behavior. We derive insights for the future of interfaces that assist users in the open-ended evaluation of AI.

2024-09-20

arXiv (preprint)

doi.org

arxiv.org

ChainBuddy: An AI-assisted Agent System for Generating LLM Pipelines

Jingyue Zhang

Ian Arawjo

2024-09-20

arXiv (preprint)

doi.org

arxiv.org
