Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
Jerry Huang
Prasanna Parthasarathi
Mehdi Rezagholizadeh
Boxing Chen
The growth in prominence of large language models (LLMs) in everyday life can be largely attributed to their generative abilities, yet some of this is also owed to the risks and costs associated with their use. On one front is their tendency to hallucinate false or misleading information, limiting their reliability. On another is the increasing focus on the computational limitations associated with traditional self-attention based LLMs, which has brought about new alternatives, in particular recurrent models, meant to overcome them. Yet it remains uncommon to consider these two concerns simultaneously. Do changes in architecture exacerbate or alleviate existing concerns about hallucinations? Do they affect how and where they occur? Through an extensive evaluation, we study how these architecture-based inductive biases affect the propensity to hallucinate. While hallucination remains a general phenomenon not limited to specific architectures, the situations in which hallucinations occur and the ease with which specific types of hallucinations can be induced can differ significantly based on the model architecture. These findings highlight the need to better understand both problems in conjunction with each other, as well as to consider how to design more universal techniques for handling hallucinations.
GFlowNets for Hamiltonian decomposition in groups of compatible operators
Isaac L. Huidobro-Meezs
Jun Dai
R. A. Vargas-Hernández
Quantum computing presents a promising alternative for the direct simulation of quantum systems with the potential to explore chemical problems beyond the capabilities of classical methods. However, current quantum algorithms are constrained by hardware limitations and the increased number of measurements required to achieve chemical accuracy. To address the measurement challenge, techniques for grouping commuting and anti-commuting terms, driven by heuristics, have been developed to reduce the number of measurements needed in quantum algorithms on near-term quantum devices. In this work, we propose a probabilistic framework using GFlowNets to group fully (FC) or qubit-wise commuting (QWC) terms within a given Hamiltonian. The significance of this approach is demonstrated by the reduced number of measurements for the found groupings; 51% and 67% reduction factors respectively for FC and QWC partitionings with respect to greedy coloring algorithms, highlighting the potential of GFlowNets for future applications in the measurement problem. Furthermore, the flexibility of our algorithm extends its applicability to other resource optimization problems in Hamiltonian simulation, such as circuit design.
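To make the grouping problem concrete, here is a minimal sketch of the qubit-wise commutativity (QWC) test together with a simple first-fit greedy grouping, the kind of heuristic baseline the abstract compares against. It is not the authors' GFlowNet method. The function names, the representation of Pauli terms as strings over `I`, `X`, `Y`, `Z`, and the example terms are all assumptions made for illustration.

```python
def qubit_wise_commute(p: str, q: str) -> bool:
    """Two Pauli strings qubit-wise commute if, on every qubit,
    the single-qubit operators are equal or one of them is I."""
    return all(a == b or a == "I" or b == "I" for a, b in zip(p, q))


def greedy_qwc_groups(terms: list[str]) -> list[list[str]]:
    """First-fit greedy grouping (illustrative baseline, hypothetical helper):
    put each term into the first group whose members all qubit-wise
    commute with it, otherwise open a new group."""
    groups: list[list[str]] = []
    for term in terms:
        for group in groups:
            if all(qubit_wise_commute(term, other) for other in group):
                group.append(term)
                break
        else:
            # No compatible group was found for this term.
            groups.append([term])
    return groups


# Hypothetical two-qubit Hamiltonian terms (coefficients omitted).
terms = ["ZZ", "ZI", "IZ", "XX", "XI", "YY"]
groups = greedy_qwc_groups(terms)
print(groups)
```

Each resulting group can be measured in a single shared basis, so fewer groups means fewer distinct measurement settings. Note that `XX` and `ZZ` commute as full operators but do not qubit-wise commute, which is why FC groupings can be coarser than QWC groupings.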
Generating Tabular Data Using Heterogeneous Sequential Feature Forest Flow Matching
Ange-Clément Akazan
Alexia Jolicoeur-Martineau
Hallucination Detox: Sensitive Neuron Dropout (SeND) for Large Language Model Training
Shahrad Mohammadzadeh
Juan David Guerra
As large language models (LLMs) are increasingly deployed across various industries, concerns regarding their reliability, particularly due to hallucinations - outputs that are factually inaccurate or irrelevant to user input - have grown. Our research investigates the relationship between the training process and the emergence of hallucinations to address a key gap in existing research that focuses primarily on post hoc detection and mitigation strategies. Using models from the Pythia suite (70M - 12B parameters) and several hallucination detection metrics, we analyze hallucination trends throughout training and explore LLM internal dynamics. We introduce Sensitivity Dropout (SenD), a novel training protocol designed to mitigate hallucinations by reducing variance during training. SenD achieves this by deterministically dropping embedding indices with significant variability, referred to as Sensitive Embedding Indices. In addition, we develop an unsupervised hallucination detection metric, Efficient EigenScore (EES), which approximates the traditional EigenScore at 2x speed. This efficient metric is integrated into our protocol, allowing SenD to be both computationally scalable and effective at reducing hallucinations. Our empirical evaluation demonstrates that our approach improves LLM reliability at test time by up to 40% compared to normal training while also providing an efficient method to improve factual accuracy when adapting LLMs to Wikipedia, Medical, and LegalBench domains.
Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
Shahrad Mohammadzadeh
Juan David Guerra
Beyond Causal Discovery for Astronomy: Learning Meaningful Representations with Independent Component Analysis
Zehao Jin
Mario Pasquato
Benjamin L. Davis
A. Macciò
Circulating IL-17F, but not IL-17A, is elevated in severe COVID-19 and leads to an ERK1/2 and p38 MAPK-dependent increase in ICAM-1 cell surface expression and neutrophil adhesion on endothelial cells
Jérôme Bédard-Matteau
Antoine Soulé
Katelyn Yixiu Liu
Lyvia Fourcade
Douglas D. Fraser
Simon Rousseau
Background: Severe COVID-19 is associated with neutrophilic inflammation and immunothrombosis. Several members of the IL-17 cytokine family have been associated with neutrophilic inflammation and activation of the endothelium. Therefore, we investigated whether these cytokines were associated with COVID-19. Methods: We investigated the association between COVID-19 and circulating plasma levels of IL-17 cytokine family members in participants in the Biobanque québécoise de la COVID-19 (BQC19), a prospective observational cohort, and in an independent cohort from Western University (London, Ontario). We measured the in vitro impact of IL-17F on intercellular adhesion molecule 1 (ICAM-1) cell surface expression and neutrophil adhesion on endothelial cells in culture. The contribution of two Mitogen Activated Protein Kinase (MAPK) pathways was determined using small molecule inhibitors PD184352 (a MKK1/MKK2 inhibitor) and BIRB0796 (a p38 MAPK inhibitor). Results: We found increased IL-17D and IL-17F plasma levels when comparing SARS-CoV-2-positive vs negative hospitalized participants. Moreover, increased plasma levels of IL-17D, IL-17E and IL-17F were noted when comparing severe versus mild COVID-19. IL-17F, but not IL-17A, was significantly elevated in people with COVID-19 compared to healthy controls and with more severe disease. In vitro work on endothelial cells treated with IL-17F for 24h showed an increased cell surface expression of ICAM-1 accompanied by neutrophil adhesion. The introduction of two MAPK inhibitors significantly reduced the binding of neutrophils while also reducing ICAM-1 expression at the surface level of endothelial cells, but not its intracellular expression. Discussion: Overall, these results have identified an association between two cytokines of the IL-17 family (IL-17D and IL-17F) with COVID-19 and disease severity.
Considering that IL-17F stimulation promotes neutrophil adhesion to the endothelium in a MAPK-dependent manner, it is attractive to speculate that this pathway may contribute to pathogenic immunothrombosis in concert with other molecular effectors.
A Complexity-Based Theory of Compositionality
Eric Elmoznino
Thomas Jiralerspong
Compositionality is believed to be fundamental to intelligence. In humans, it underlies the structure of thought, language, and higher-level reasoning. In AI, compositional representations can enable a powerful form of out-of-distribution generalization, in which a model systematically adapts to novel combinations of known concepts. However, while we have strong intuitions about what compositionality is, there currently exists no formal definition for it that is measurable and mathematical. Here, we propose such a definition, which we call representational compositionality, that accounts for and extends our intuitions about compositionality. The definition is conceptually simple, quantitative, grounded in algorithmic information theory, and applicable to any representation. Intuitively, representational compositionality states that a compositional representation satisfies three properties. First, it must be expressive. Second, it must be possible to re-describe the representation as a function of discrete symbolic sequences with re-combinable parts, analogous to sentences in natural language. Third, the function that relates these symbolic sequences to the representation, analogous to semantics in natural language, must be simple. Through experiments on both synthetic and real-world data, we validate our definition of compositionality and show how it unifies disparate intuitions from across the literature in both AI and cognitive science. We also show that representational compositionality, while theoretically intractable, can be readily estimated using standard deep learning tools. Our definition has the potential to inspire the design of novel, theoretically-driven models that better capture the mechanisms of compositional thought.
Convergence of Manifold Filter-Combine Networks
David R. Johnson
Joyce Chew
Siddharth Viswanath
Edward De Brouwer
Deanna Needell
Michael Perlmutter
In order to better understand manifold neural networks (MNNs), we introduce Manifold Filter-Combine Networks (MFCNs). The filter-combine framework parallels the popular aggregate-combine paradigm for graph neural networks (GNNs) and naturally suggests many interesting families of MNNs which can be interpreted as the manifold analog of various popular GNNs. We then propose a method for implementing MFCNs on high-dimensional point clouds that relies on approximating the manifold by a sparse graph. We prove that our method is consistent in the sense that it converges to a continuum limit as the number of data points tends to infinity.