Publications

Deep Learning Model Reuse in the HuggingFace Community: Challenges, Benefit and Trends
Mina Taraghi
Gianolli Dorcelus
Armstrong Foundjem
Florian Tambon
The ubiquity of large-scale Pre-Trained Models (PTMs) is on the rise, sparking interest in model hubs, and dedicated platforms for hosting P… (see more)TMs. Despite this trend, a comprehensive exploration of the challenges that users encounter and how the community leverages PTMs remains lacking. To address this gap, we conducted an extensive mixed-methods empirical study by focusing on discussion forums and the model hub of HuggingFace, the largest public model hub. Based on our qualitative analysis, we present a taxonomy of the challenges and benefits associated with PTM reuse within this community. We then conduct a quantitative study to track model-type trends and model documentation evolution over time. Our findings highlight prevalent challenges such as limited guidance for beginner users, struggles with model output comprehensibility in training or inference, and a lack of model understanding. We also identified interesting trends among models where some models maintain high upload rates despite a decline in topics related to them. Additionally, we found that despite the introduction of model documentation tools, its quantity has not increased over time, leading to difficulties in model comprehension and selection among users. Our study sheds light on new challenges in reusing PTMs that were not reported before and we provide recommendations for various stakeholders involved in PTM reuse.
Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons
Simon Dufort-Labbé
Pierluca D'Oro
Evgenii Nikishin
Razvan Pascanu
Aristide Baratin
Refining GPT-3 Embeddings with a Siamese Structure for Technical Post Duplicate Detection
Xingfang Wu
Heng Li
Nobukazu Yoshioka
Hironori Washizaki
Rethinking Machine Learning Benchmarks in the Context of Professional Codes of Conduct
Peter Henderson
Jieru Hu
Mona Diab
Simulating Weighted Automata over Sequences and Trees with Transformers
Michael Rizvi
Maude Lizaire
Clara Lacroce
Guillaume Rabusseau
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?
Massimo Caccia
Issam Hadj Laradji
Manuel Del Verme
Tom Marty
Léo Boisvert
Megh Thakkar
David Vazquez
Alexandre Lacoste
We study the use of large language model-based agents for interacting with software via web browsers. Unlike prior work, we focus on measuri… (see more)ng the agents' ability to perform tasks that span the typical daily work of knowledge workers utilizing enterprise software systems. To this end, we propose WorkArena, a remote-hosted benchmark of 29 tasks based on the widely-used ServiceNow platform. We also introduce BrowserGym, an environment for the design and evaluation of such agents, offering a rich set of actions as well as multimodal observations. Our empirical evaluation reveals that while current agents show promise on WorkArena, there remains a considerable gap towards achieving full task automation. Notably, our analysis uncovers a significant performance disparity between open and closed-source LLMs, highlighting a critical area for future exploration and development in the field.
Ant Colony Sampling with GFlowNets for Combinatorial Optimization
Minsu Kim
Sanghyeok Choi
Jiwoo Son
Hyeon-Seob Kim
Jinkyoo Park
Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport
Alexander Tong
Kilian FATRAS
Nikolay Malkin
Guillaume Huguet
Yanlei Zhang
Jarrid Rector-Brooks
Guy Wolf
Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have been held back by limitations in their si… (see more)mulation-based maximum likelihood training. We introduce the generalized \textit{conditional flow matching} (CFM) technique, a family of simulation-free training objectives for CNFs. CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models. In contrast to both diffusion models and prior CNF training algorithms, CFM does not require the source distribution to be Gaussian or require evaluation of its density. A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference, as evaluated in our experiments. Furthermore, OT-CFM is the first method to compute dynamic OT in a simulation-free way. Training CNFs with CFM improves results on a variety of conditional and unconditional generation tasks, such as inferring single cell dynamics, unsupervised image translation, and Schrödinger bridge inference.
IntentGPT: Few-shot Intent Discovery with Large Language Models
Juan A. Rodriguez
Nicholas Botzer
David Vazquez
Marco Pedersoli
Issam Hadj Laradji
Investigating Robot Influence on Human Behaviour By Leveraging Entrainment Effects
Lixiao Zhu
Language-guided Skill Learning with Temporal Variational Inference
Haotian Fu
Pratyusha Sharma
Elias Stengel-Eskin
George Konidaris
Marc-Alexandre Côté
Xingdi Yuan
We present an algorithm for skill discovery from expert demonstrations. The algorithm first utilizes Large Language Models (LLMs) to propose… (see more) an initial segmentation of the trajectories. Following that, a hierarchical variational inference framework incorporates the LLM-generated segmentation information to discover reusable skills by merging trajectory segments. To further control the trade-off between compression and reusability, we introduce a novel auxiliary objective based on the Minimum Description Length principle that helps guide this skill discovery process. We test our system on BabyAI, a grid world navigation environment, as well as ALFRED, a household simulation environment.Our results demonstrate that agents equipped with our method can discover skills that help accelerate learning and outperform baseline skill learning approaches on new long-horizon tasks.
Perspectives on Robotic Systems for the Visually Impaired
Christopher Yee Wong
Rahatul Amin Ananto
Tanaka Akiyama
Joseph Paul Nemargut