
Ian Arawjo

Associate Academic Member
Assistant Professor, Université de Montréal, Department of Computer Science and Operations Research

Biography

Ian Arawjo is an assistant professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal. He holds a PhD in information science from Cornell University, completed under the supervision of Professor Tapan Parikh.

His dissertation examined the intersection of computer programming and culture, exploring programming as a social and cultural practice. He has experience applying a wide range of human-computer interaction (HCI) methods, from ethnographic fieldwork and archival research to developing novel systems (used by thousands of people) and running usability studies. He currently works on projects at the intersection of programming, AI, and HCI, including how new AI capabilities can help us reimagine the practice of programming.

He also works on the evaluation of large language models (LLMs) through high-visibility open-source projects such as ChainForge. His first-authored papers have won awards at major HCI conferences, including the Conference on Human Factors in Computing Systems (CHI), the Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), and the Symposium on User Interface Software and Technology (UIST).

Current Students

PhD - UdeM
PhD - UdeM
Principal supervisor:

Publications

Reporting and Reviewing LLM-Integrated Systems in HCI: Challenges and Considerations
What should HCI scholars consider when reporting and reviewing papers that involve LLM-integrated systems? We interview 18 authors of LLM-integrated system papers on their authoring and reviewing experiences. We find that norms of trust-building between authors and reviewers appear to be eroded by the uncertainty of LLM behavior and hyperbolic rhetoric surrounding AI. Authors perceive that reviewers apply uniquely skeptical and inconsistent standards towards papers that report LLM-integrated systems, and mitigate mistrust by adding technical evaluations, justifying usage, and de-emphasizing LLM presence. Authors' views challenge blanket directives to report all prompts and use open models, arguing that prompt reporting is context-dependent and justifying proprietary model usage despite ethical concerns. Finally, some tensions in peer review appear to stem from clashes between the norms and values of HCI and ML/NLP communities, particularly around what constitutes a contribution and an appropriate level of technical rigor. Based on our findings and additional feedback from six expert HCI researchers, we present a set of guidelines and considerations for authors, reviewers, and HCI communities around reporting and reviewing papers that involve LLM-integrated systems.
How Notations Evolve: A Historical Analysis with Implications for Supporting User-Defined Abstractions
J.D. Zamfirescu-Pereira
Elena L. Glassman
Damien Masson
Democratizing Game Modding with GenAI: A Case Study of StarCharM, a Stardew Valley Character Maker
Hamid Zand Miralvand
Mohammad Ronagh Nikghalb
Mohammad Darandeh
Abidullah Khan
Jinghui Cheng
Game modding offers unique and personalized gaming experiences, but the technical complexity of creating mods often limits participation to skilled users. We envision a future where every player can create personalized mods for their games. To explore this space, we designed StarCharM, a GenAI-based non-player character (NPC) creator for Stardew Valley. Our tool enables players to iteratively create new NPC mods, requiring minimal user input while allowing for fine-grained adjustments through user control. We conducted a user study with ten Stardew Valley players who had varied mod usage experiences to understand the impacts of StarCharM and provide insights into how GenAI tools may reshape modding, particularly in NPC creation. Participants expressed excitement in bringing their character ideas to life, although they noted challenges in generating rich content to fulfill complex visions. While they believed GenAI tools like StarCharM can foster a more diverse modding community, some voiced concerns about diminished originality and community engagement that may come with such technology. Our findings provided implications and guidelines for the future of GenAI-powered modding tools and co-creative modding practices.
Semantic Commit: Helping Users Update Intent Specifications for AI Memory at Scale
Priyan Vaithilingam
Daniel Lee
Elena L. Glassman
Dynamic Abstractions: Building the Next Generation of Cognitive Tools and Interfaces
Sangho Suh
Hai Dang
Ryan Yen
Josh M. Pollock
Rubaiat Habib Kazi
Hariharan Subramonyam
Jingyi Li
Nazmus Saquib
Arvind Satyanarayan
ChainBuddy: An AI Agent System for Generating LLM Pipelines
Imagining a Future of Designing with AI: Dynamic Grounding, Constructive Negotiation, and Sustainable Motivation
Priyan Vaithilingam
Elena L. Glassman
An AI-Resilient Text Rendering Technique for Reading and Skimming Documents
Ziwei Gu
Kenneth Li
Jonathan K. Kummerfeld
Elena L. Glassman
ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing
Chelse Swoopes
Priyan Vaithilingam
Martin Wattenberg
Elena L. Glassman
Evaluating outputs of large language models (LLMs) is challenging, requiring making -- and making sense of -- many responses. Yet tools that go beyond basic prompting tend to require knowledge of programming APIs, focus on narrow domains, or are closed-source. We present ChainForge, an open-source visual toolkit for prompt engineering and on-demand hypothesis testing of text generation LLMs. ChainForge provides a graphical interface for comparison of responses across models and prompt variations. Our system was designed to support three tasks: model selection, prompt template design, and hypothesis testing (e.g., auditing). We released ChainForge early in its development and iterated on its design with academics and online users. Through in-lab and interview studies, we find that a range of people could use ChainForge to investigate hypotheses that matter to them, including in real-world settings. We identify three modes of prompt engineering and LLM hypothesis testing: opportunistic exploration, limited evaluation, and iterative refinement.
Schrödinger's Update: User Perceptions of Uncertainties in Proprietary Large Language Model Updates
Zilin Ma
Yiyang Mei
Krzysztof Z. Gajos
Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences
Shreya Shankar
J.D. Zamfirescu-Pereira
Bjorn Hartmann
Aditya G Parameswaran
Antagonistic AI
Alice Cai
Elena L. Glassman
The vast majority of discourse around AI development assumes that subservient, "moral" models aligned with "human values" are universally beneficial -- in short, that good AI is sycophantic AI. We explore the shadow of the sycophantic paradigm, a design space we term antagonistic AI: AI systems that are disagreeable, rude, interrupting, confrontational, challenging, etc. -- embedding opposite behaviors or values. Far from being "bad" or "immoral," we consider whether antagonistic AI systems may sometimes have benefits to users, such as forcing users to confront their assumptions, build resilience, or develop healthier relational boundaries. Drawing from formative explorations and a speculative design workshop where participants designed fictional AI technologies that employ antagonism, we lay out a design space for antagonistic AI, articulating potential benefits, design techniques, and methods of embedding antagonistic elements into user experience. Finally, we discuss the many ethical challenges of this space and identify three dimensions for the responsible design of antagonistic AI -- consent, context, and framing.