Publications

GFlowNets for Hamiltonian decomposition in groups of compatible operators
Isaac L. Huidobro-Meezs
R. A. Vargas-Hern'andez
Quantum computing presents a promising alternative for the direct simulation of quantum systems with the potential to explore chemical probl… (voir plus)ems beyond the capabilities of classical methods. However, current quantum algorithms are constrained by hardware limitations and the increased number of measurements required to achieve chemical accuracy. To address the measurement challenge, techniques for grouping commuting and anti-commuting terms, driven by heuristics, have been developed to reduce the number of measurements needed in quantum algorithms on near-term quantum devices. In this work, we propose a probabilistic framework using GFlowNets to group fully (FC) or qubit-wise commuting (QWC) terms within a given Hamiltonian. The significance of this approach is demonstrated by the reduced number of measurements for the found groupings; 51% and 67% reduction factors respectively for FC and QWC partitionings with respect to greedy coloring algorithms, highlighting the potential of GFlowNets for future applications in the measurement problem. Furthermore, the flexibility of our algorithm extends its applicability to other resource optimization problems in Hamiltonian simulation, such as circuit design.
GFlowNets for Hamiltonian decomposition in groups of compatible operators
Isaac L. Huidobro-Meezs
R. A. Vargas-Hern'andez
Quantum computing presents a promising alternative for the direct simulation of quantum systems with the potential to explore chemical probl… (voir plus)ems beyond the capabilities of classical methods. However, current quantum algorithms are constrained by hardware limitations and the increased number of measurements required to achieve chemical accuracy. To address the measurement challenge, techniques for grouping commuting and anti-commuting terms, driven by heuristics, have been developed to reduce the number of measurements needed in quantum algorithms on near-term quantum devices. In this work, we propose a probabilistic framework using GFlowNets to group fully (FC) or qubit-wise commuting (QWC) terms within a given Hamiltonian. The significance of this approach is demonstrated by the reduced number of measurements for the found groupings; 51% and 67% reduction factors respectively for FC and QWC partitionings with respect to greedy coloring algorithms, highlighting the potential of GFlowNets for future applications in the measurement problem. Furthermore, the flexibility of our algorithm extends its applicability to other resource optimization problems in Hamiltonian simulation, such as circuit design.
Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
As large language models (LLMs) are increasingly deployed across various industries, concerns regarding their reliability, particularly due … (voir plus)to hallucinations - outputs that are factually inaccurate or irrelevant to user input - have grown. Our research investigates the relationship between the training process and the emergence of hallucinations to address a key gap in existing research that focuses primarily on post hoc detection and mitigation strategies. Using models from the Pythia suite (70M - 12B parameters) and several hallucination detection metrics, we analyze hallucination trends throughout training and explore LLM internal dynamics. We introduce Sensitivity Dropout (SenD), a novel training protocol designed to mitigate hallucinations by reducing variance during training. SenD achieves this by deterministically dropping embedding indices with significant variability, referred to as Sensitive Embedding Indices. In addition, we develop an unsupervised hallucination detection metric, Efficient EigenScore (EES), which approximates the traditional EigenScore at 2x speed. This efficient metric is integrated into our protocol, allowing SenD to be both computationally scalable and effective at reducing hallucinations. Our empirical evaluation demonstrates that our approach improves LLM reliability at test time by up to 40% compared to normal training while also providing an efficient method to improve factual accuracy when adapting LLMs to Wikipedia, Medical, and LegalBench domains.
Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
Convergence of Manifold Filter-Combine Networks
David R. Johnson
Joyce Chew
Siddharth Viswanath
Edward De Brouwer
Deanna Needell
Michael Perlmutter
In order to better understand manifold neural networks (MNNs), we introduce Manifold Filter-Combine Networks (MFCNs). The filter-combine fra… (voir plus)mework parallels the popular aggregate-combine paradigm for graph neural networks (GNNs) and naturally suggests many interesting families of MNNs which can be interpreted as the manifold analog of various popular GNNs. We then propose a method for implementing MFCNs on high-dimensional point clouds that relies on approximating the manifold by a sparse graph. We prove that our method is consistent in the sense that it converges to a continuum limit as the number of data points tends to infinity.
Convergence of Manifold Filter-Combine Networks
David R. Johnson
Joyce A. Chew
Siddharth Viswanath
Edward De Brouwer
Deanna Needell
Michael Perlmutter
In order to better understand manifold neural networks (MNNs), we introduce Manifold Filter-Combine Networks (MFCNs). The filter-combine fra… (voir plus)mework parallels the popular aggregate-combine paradigm for graph neural networks (GNNs) and naturally suggests many interesting families of MNNs which can be interpreted as the manifold analog of various popular GNNs. We then propose a method for implementing MFCNs on high-dimensional point clouds that relies on approximating the manifold by a sparse graph. We prove that our method is consistent in the sense that it converges to a continuum limit as the number of data points tends to infinity.
Assessment of the Climate Trace global powerplant CO<sub>2</sub> emissions
Kevin R Gurney
Bilal Aslam
Pawlok Dass
Lech Gawuc
Jarrett J Barber
Anna Kato
A Simulation System Towards Solving Societal-Scale Manipulation
Maximilian Puelma Touzel
Austin Welch
Gayatri K
Dan Zhao
Hao Yu
Ethan Kosak-Hine
Tom Gibbs
Busra Tugce Gurbuz
The rise of AI-driven manipulation poses significant risks to societal trust and democratic processes. Yet, studying these effects in real-w… (voir plus)orld settings at scale is ethically and logistically impractical, highlighting a need for simulation tools that can model these dynamics in controlled settings to enable experimentation with possible defenses. We present a simulation environment designed to address this. We elaborate upon the Concordia framework that simulates offline, `real life' activity by adding online interactions to the simulation through social media with the integration of a Mastodon server. We improve simulation efficiency and information flow, and add a set of measurement tools, particularly longitudinal surveys. We demonstrate the simulator with a tailored example in which we track agents' political positions and show how partisan manipulation of agents can affect election results.
A Simulation System Towards Solving Societal-Scale Manipulation
Maximilian Puelma Touzel
Austin Welch
Gayatri Krishnakumar
Dan Zhao
Hao Yu
Ethan Kosak-Hine
Tom Gibbs
Busra Tugce Gurbuz
The rise of AI-driven manipulation poses significant risks to societal trust and democratic processes. Yet, studying these effects in real-w… (voir plus)orld settings at scale is ethically and logistically impractical, highlighting a need for simulation tools that can model these dynamics in controlled settings to enable experimentation with possible defenses. We present a simulation environment designed to address this. We elaborate upon the Concordia framework that simulates offline, `real life' activity by adding online interactions to the simulation through social media with the integration of a Mastodon server. We improve simulation efficiency and information flow, and add a set of measurement tools, particularly longitudinal surveys. We demonstrate the simulator with a tailored example in which we track agents' political positions and show how partisan manipulation of agents can affect election results.
The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse
Ekansh Sharma
Daniel M. Roy
Model merging aims to efficiently combine the weights of multiple expert models, each trained on a specific task, into a single multi-task m… (voir plus)odel, with strong performance across all tasks. When applied to all but the last layer of weights, existing methods -- such as Task Arithmetic, TIES-merging, and TALL mask merging -- work well to combine expert models obtained by fine-tuning a common foundation model, operating within a"local"neighborhood of the foundation model. This work explores the more challenging scenario of"non-local"merging, which we find arises when an expert model changes significantly during pretraining or where the expert models do not even share a common foundation model. We observe that standard merging techniques often fail to generalize effectively in this non-local setting, even when accounting for permutation symmetries using standard techniques. We identify that this failure is, in part, due to"variance collapse", a phenomenon identified also in the setting of linear mode connectivity by Jordan et al. (2023). To address this, we propose a multi-task technique to re-scale and shift the output activations of the merged model for each task, aligning its output statistics with those of the corresponding task-specific expert models. Our experiments demonstrate that this correction significantly improves the performance of various model merging approaches in non-local settings, providing a strong baseline for future research on this problem.
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
Genta Indra Winata
Frederikus Hudi
Patrick Amadeus Irawan
David Anugraha
Rifki Afina Putri
Yutong Wang
Adam Nohejl
Ubaidillah Ariq Prathama
Nedjma OUSIDHOUM
Afifa Amriani
Anar Rzayev
Anirban Das
Ashmari Pramodya
Aulia Adila
Bryan Wilie
Candy Olivia Mawalim
Ching Lam Cheng
Daud Abolade
Emmanuele Chersoni
Enrico Santus … (voir 31 de plus)
Fariz Ikhwantri
Garry Kuwanto
Hanyang Zhao
Haryo Akbarianto Wibowo
Holy Lovenia
Jan Christian Blaise Cruz
Jan Wira Gotama Putra
Junho Myung
Lucky Susanto
Maria Angelica Riera Machin
Marina Zhukova
Michael Anugraha
Muhammad Farid Adilazuarda
Natasha Santosa
Peerat Limkonchotiwat
Raj Dabre
Rio Alexander Audino
Samuel Cahyawijaya
Shi-Xiong Zhang
Stephanie Yulia Salim
Yi Zhou
Yinxuan Gui
En-Shiun Annie Lee
Shogo Okada
Ayu Purwarianti
Alham Fikri Aji
Taro Watanabe
Derry Tanti Wijaya
Alice Oh
Chong-Wah Ngo
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
Genta Indra Winata
Frederikus Hudi
Patrick Amadeus Irawan
David Anugraha
Rifki Afina Putri
Yutong Wang
Adam Nohejl
Ubaidillah Ariq Prathama
Nedjma OUSIDHOUM
Afifa Amriani
Anar Rzayev
Anirban Das
Ashmari Pramodya
Aulia Adila
Bryan Wilie
Candy Olivia Mawalim
Ching Lam Cheng
Daud Abolade
Emmanuele Chersoni
Enrico Santus … (voir 31 de plus)
Fariz Ikhwantri
Garry Kuwanto
Hanyang Zhao
Haryo Akbarianto Wibowo
Holy Lovenia
Jan Christian Blaise Cruz
Jan Wira Gotama Putra
Junho Myung
Lucky Susanto
Maria Angelica Riera Machin
Marina Zhukova
Michael Anugraha
Muhammad Farid Adilazuarda
Natasha Santosa
Peerat Limkonchotiwat
Raj Dabre
Rio Alexander Audino
Samuel Cahyawijaya
Shi-Xiong Zhang
Stephanie Yulia Salim
Yi Zhou
Yinxuan Gui
En-Shiun Annie Lee
Shogo Okada
Ayu Purwarianti
Alham Fikri Aji
Taro Watanabe
Derry Tanti Wijaya
Alice Oh
Chong-Wah Ngo