Publications

Symmetry-Aware Generative Modeling through Learned Canonicalization

Kusha Sareen

Daniel Levy

Arnab Kumar Mondal

Sékou-Oumar Kaba

Tara Akhound-Sadegh

Siamak Ravanbakhsh

Generative modeling of symmetric densities has a range of applications in AI for science, from drug discovery to physics simulations. The existing generative modeling paradigm for invariant densities combines an invariant prior with an equivariant generative process. However, we observe that this technique is not necessary and has several drawbacks resulting from the limitations of equivariant networks. Instead, we propose to model a learned slice of the density so that only one representative element per orbit is learned. To accomplish this, we learn a group-equivariant canonicalization network that maps training samples to a canonical pose and train a non-equivariant generative model over these canonicalized samples. We implement this idea in the context of diffusion models. Our preliminary experimental results on molecular modeling are promising, demonstrating improved sample quality and faster inference time.

2024-10-23

NeurIPS.cc/2024/Workshop/NeurReps (poster)

FairLoRA: Unpacking Bias Mitigation in Vision Models with Fairness-Driven Low-Rank Adaptation

Rohan Sukumaran

Aarash Feizi

Adriana Romero-Soriano

Golnoosh Farnadi

2024-10-22

ArXiv (preprint)

Fine-Tuning Web Agents: It Works, But It's Trickier Than You Think

Massimo Caccia

Megh Thakkar

Léo Boisvert

Thibault Le Sellier de Chezelles

Alexandre Piché

Nicolas Chapados

Alexandre Drouin

Maxime Gasse

Alexandre Lacoste

Recent advancements in large language models (LLMs) have sparked interest in developing autonomous web agents capable of performing digital tasks through web interfaces in a human-like manner. However, even the strongest closed-source models often struggle to achieve robust results on several benchmarks, while a notable performance gap exists between them and open-source counterparts. This study investigates the potential of fine-tuning to enhance the performance of a smaller, lower-performing but cost-efficient LLM by leveraging successful traces from stronger LLMs, referred to as experts. We outline a comprehensive pipeline for data collection, filtering, and supervised fine-tuning and explore various behavior cloning parameters. Our experiments provide key insights into the challenges of fine-tuning LLMs into web agents on benchmarks like MiniWoB and WorkArena. Notably, we find that the fine-tuned agents' ability to predict expert trajectories does not consistently lead to improved downstream task performance. This raises issues such as off-policy bias and the loss of reasoning abilities during fine-tuning. We discuss potential solutions to these challenges and make both the codebase and a dataset of 140M tokens open-source for the community to build upon.

2024-10-22

NeurIPS.cc/2024/Workshop/OWA (poster)

Graph Knowledge Distillation to Mixture of Experts

Pavel Rumiantsev

Mark Coates

2024-10-22

TMLR (accepted)

Health satisfaction outcome from integrated autonomous mobile clinics

Yuzhang Huang

Shaoshan Liu

Zhongying Pan

Carl Wu

Herng-Chia Chiu

Xue (Steve) Liu

Leiyu Shi

2024-10-22

Scientific Reports (published)

Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination

Jerry Huang

Prasanna Parthasarathi

Mehdi Rezagholizadeh

Boxing Chen

Sarath Chandar

The growth in prominence of large language models (LLMs) in everyday life can be largely attributed to their generative abilities, yet some of this is also owed to the risks and costs associated with their use. On one front is their tendency to hallucinate false or misleading information, limiting their reliability. On another is the increasing focus on the computational limitations associated with traditional self-attention based LLMs, which has brought about new alternatives, in particular recurrent models, meant to overcome them. Yet it remains uncommon to consider these two concerns simultaneously. Do changes in architecture exacerbate/alleviate existing concerns about hallucinations? Do they affect how and where they occur? Through an extensive evaluation, we study how these architecture-based inductive biases affect the propensity to hallucinate. While hallucination remains a general phenomenon not limited to specific architectures, the situations in which they occur and the ease with which specific types of hallucinations can be induced can significantly differ based on the model architecture. These findings highlight the need for better understanding both these problems in conjunction with each other, as well as consider how to design more universal techniques for handling hallucinations.

2024-10-22

ArXiv (preprint)

GFlowNets for Hamiltonian decomposition in groups of compatible operators

Isaac L. Huidobro-Meezs

Jun Dai

Guillaume Rabusseau

R. A. Vargas-Hernández

Quantum computing presents a promising alternative for the direct simulation of quantum systems with the potential to explore chemical problems beyond the capabilities of classical methods. However, current quantum algorithms are constrained by hardware limitations and the increased number of measurements required to achieve chemical accuracy. To address the measurement challenge, techniques for grouping commuting and anti-commuting terms, driven by heuristics, have been developed to reduce the number of measurements needed in quantum algorithms on near-term quantum devices. In this work, we propose a probabilistic framework using GFlowNets to group fully (FC) or qubit-wise commuting (QWC) terms within a given Hamiltonian. The significance of this approach is demonstrated by the reduced number of measurements for the found groupings; 51% and 67% reduction factors respectively for FC and QWC partitionings with respect to greedy coloring algorithms, highlighting the potential of GFlowNets for future applications in the measurement problem. Furthermore, the flexibility of our algorithm extends its applicability to other resource optimization problems in Hamiltonian simulation, such as circuit design.

2024-10-21

ArXiv (preprint)

Generating Tabular Data Using Heterogeneous Sequential Feature Forest Flow Matching

Ange-Clément Akazan

Alexia Jolicoeur-Martineau

Ioannis Mitliagkas

2024-10-20

ArXiv (preprint)

Hallucination Detox: Sensitive Neuron Dropout (SeND) for Large Language Model Training

Shahrad Mohammadzadeh

Juan David Guerra

Marco Bonizzato

Reihaneh Rabbany

Golnoosh Farnadi

As large language models (LLMs) are increasingly deployed across various industries, concerns regarding their reliability, particularly due to hallucinations - outputs that are factually inaccurate or irrelevant to user input - have grown. Our research investigates the relationship between the training process and the emergence of hallucinations to address a key gap in existing research that focuses primarily on post hoc detection and mitigation strategies. Using models from the Pythia suite (70M - 12B parameters) and several hallucination detection metrics, we analyze hallucination trends throughout training and explore LLM internal dynamics. We introduce Sensitivity Dropout (SenD), a novel training protocol designed to mitigate hallucinations by reducing variance during training. SenD achieves this by deterministically dropping embedding indices with significant variability, referred to as Sensitive Embedding Indices. In addition, we develop an unsupervised hallucination detection metric, Efficient EigenScore (EES), which approximates the traditional EigenScore at 2x speed. This efficient metric is integrated into our protocol, allowing SenD to be both computationally scalable and effective at reducing hallucinations. Our empirical evaluation demonstrates that our approach improves LLM reliability at test time by up to 40% compared to normal training while also providing an efficient method to improve factual accuracy when adapting LLMs to Wikipedia, Medical, and LegalBench domains.

2024-10-20

ArXiv (preprint)