Generating Tabular Data Using Heterogeneous Sequential Feature Forest Flow Matching
Ange-Clément Akazan
Alexia Jolicoeur-Martineau
Hallucination Detox: Sensitive Neuron Dropout (SeND) for Large Language Model Training
Shahrad Mohammadzadeh
Juan David Guerra
As large language models (LLMs) are increasingly deployed across various industries, concerns regarding their reliability, particularly due to hallucinations - outputs that are factually inaccurate or irrelevant to user input - have grown. Our research investigates the relationship between the training process and the emergence of hallucinations to address a key gap in existing research that focuses primarily on post hoc detection and mitigation strategies. Using models from the Pythia suite (70M - 12B parameters) and several hallucination detection metrics, we analyze hallucination trends throughout training and explore LLM internal dynamics. We introduce Sensitivity Dropout (SenD), a novel training protocol designed to mitigate hallucinations by reducing variance during training. SenD achieves this by deterministically dropping embedding indices with significant variability, referred to as Sensitive Embedding Indices. In addition, we develop an unsupervised hallucination detection metric, Efficient EigenScore (EES), which approximates the traditional EigenScore at 2x speed. This efficient metric is integrated into our protocol, allowing SenD to be both computationally scalable and effective at reducing hallucinations. Our empirical evaluation demonstrates that our approach improves LLM reliability at test time by up to 40% compared to normal training while also providing an efficient method to improve factual accuracy when adapting LLMs to Wikipedia, Medical, and LegalBench domains.
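The idea of deterministically dropping high-variability embedding indices can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how variability is measured or where in the model the drop is applied, so using the variance across saved checkpoints is an assumption, and the function names `sensitive_indices` and `drop_indices` are hypothetical.

```python
import numpy as np

def sensitive_indices(checkpoint_embeddings, k):
    """Return the k embedding indices whose values vary most across checkpoints.

    checkpoint_embeddings: array of shape (n_checkpoints, embedding_dim).
    Variance across checkpoints is an assumed stand-in for "variability".
    """
    emb = np.asarray(checkpoint_embeddings, dtype=float)
    variance = emb.var(axis=0)
    # argsort is ascending, so the last k entries have the highest variance.
    return np.sort(np.argsort(variance)[-k:])

def drop_indices(embedding, indices):
    """Zero the selected indices deterministically (no random mask)."""
    out = np.array(embedding, dtype=float, copy=True)
    out[..., indices] = 0.0
    return out

# Toy example: index 1 swings widely between checkpoints, the others are stable.
checkpoints = np.array([
    [1.0,  0.0, 5.0, 2.0],
    [1.1,  4.0, 5.0, 2.0],
    [0.9, -4.0, 5.0, 2.1],
])
idx = sensitive_indices(checkpoints, k=1)
masked = drop_indices(checkpoints[-1], idx)
```

Unlike standard dropout, the mask here is chosen from observed statistics rather than sampled at random, which is the sense in which the abstract calls the drop deterministic.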
Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
Shahrad Mohammadzadeh
Juan David Guerra
Beyond Causal Discovery for Astronomy: Learning Meaningful Representations with Independent Component Analysis
Zehao Jin
Mario Pasquato
Benjamin L. Davis
A. Macciò
Circulating IL-17F, but not IL-17A, is elevated in severe COVID-19 and leads to an ERK1/2 and p38 MAPK-dependent increase in ICAM-1 cell surface expression and neutrophil adhesion on endothelial cells
Jérôme Bédard-Matteau
Antoine Soulé
Katelyn Yixiu Liu
Lyvia Fourcade
Douglas D. Fraser
Simon Rousseau
Severe COVID-19 is associated with neutrophilic inflammation and immunothrombosis. Several members of the IL-17 cytokine family have been associated with neutrophilic inflammation and activation of the endothelium. Therefore, we investigated whether these cytokines were associated with COVID-19. We investigated the association between COVID-19 and circulating plasma levels of IL-17 cytokine family members in participants in the Biobanque québécoise de la COVID-19 (BQC19), a prospective observational cohort, and in an independent cohort from Western University (London, Ontario). We measured the in vitro impact of IL-17F on intercellular adhesion molecule 1 (ICAM-1) cell surface expression and neutrophil adhesion on endothelial cells in culture. The contribution of two Mitogen Activated Protein Kinase (MAPK) pathways was determined using small molecule inhibitors PD184352 (a MKK1/MKK2 inhibitor) and BIRB0796 (a p38 MAPK inhibitor). We found increased IL-17D and IL-17F plasma levels when comparing SARS-CoV-2-positive vs negative hospitalized participants. Moreover, increased plasma levels of IL-17D, IL-17E and IL-17F were noted when comparing severe versus mild COVID-19. IL-17F, but not IL-17A, was significantly elevated in people with COVID-19 compared to healthy controls and with more severe disease. In vitro work on endothelial cells treated with IL-17F for 24h showed increased cell surface expression of ICAM-1 accompanied by neutrophil adhesion. The introduction of two MAPK inhibitors significantly reduced the binding of neutrophils while also reducing ICAM-1 expression at the surface level of endothelial cells, but not its intracellular expression. Overall, these results have identified an association between two cytokines of the IL-17 family (IL-17D and IL-17F) with COVID-19 and disease severity.
Considering that IL-17F stimulation promotes neutrophil adhesion to the endothelium in a MAPK-dependent manner, it is attractive to speculate that this pathway may contribute to pathogenic immunothrombosis in concert with other molecular effectors.
Circulating IL-17F, but not IL-17A, is elevated in severe COVID-19 and leads to an ERK1/2 and p38 MAPK-dependent increase in ICAM-1 cell surface expression and neutrophil adhesion on endothelial cells
Jérôme Bédard-Matteau
Antoine Soulé
Katelyn Yixiu Liu
Lyvia Fourcade
Douglas D. Fraser
Simon Rousseau
Background: Severe COVID-19 is associated with neutrophilic inflammation and immunothrombosis. Several members of the IL-17 cytokine family have been associated with neutrophilic inflammation and activation of the endothelium. Therefore, we investigated whether these cytokines were associated with COVID-19. Methods: We investigated the association between COVID-19 and circulating plasma levels of IL-17 cytokine family members in participants in the Biobanque québécoise de la COVID-19 (BQC19), a prospective observational cohort, and in an independent cohort from Western University (London, Ontario). We measured the in vitro impact of IL-17F on intercellular adhesion molecule 1 (ICAM-1) cell surface expression and neutrophil adhesion on endothelial cells in culture. The contribution of two Mitogen Activated Protein Kinase (MAPK) pathways was determined using small molecule inhibitors PD184352 (a MKK1/MKK2 inhibitor) and BIRB0796 (a p38 MAPK inhibitor). Results: We found increased IL-17D and IL-17F plasma levels when comparing SARS-CoV-2-positive vs negative hospitalized participants. Moreover, increased plasma levels of IL-17D, IL-17E and IL-17F were noted when comparing severe versus mild COVID-19. IL-17F, but not IL-17A, was significantly elevated in people with COVID-19 compared to healthy controls and with more severe disease. In vitro work on endothelial cells treated with IL-17F for 24h showed increased cell surface expression of ICAM-1 accompanied by neutrophil adhesion. The introduction of two MAPK inhibitors significantly reduced the binding of neutrophils while also reducing ICAM-1 expression at the surface level of endothelial cells, but not its intracellular expression. Discussion: Overall, these results have identified an association between two cytokines of the IL-17 family (IL-17D and IL-17F) with COVID-19 and disease severity.
Considering that IL-17F stimulation promotes neutrophil adhesion to the endothelium in a MAPK-dependent manner, it is attractive to speculate that this pathway may contribute to pathogenic immunothrombosis in concert with other molecular effectors.
A Complexity-Based Theory of Compositionality
Eric Elmoznino
Thomas Jiralerspong
Compositionality is believed to be fundamental to intelligence. In humans, it underlies the structure of thought, language, and higher-level reasoning. In AI, compositional representations can enable a powerful form of out-of-distribution generalization, in which a model systematically adapts to novel combinations of known concepts. However, while we have strong intuitions about what compositionality is, there currently exists no formal definition for it that is measurable and mathematical. Here, we propose such a definition, which we call representational compositionality, that accounts for and extends our intuitions about compositionality. The definition is conceptually simple, quantitative, grounded in algorithmic information theory, and applicable to any representation. Intuitively, representational compositionality states that a compositional representation satisfies three properties. First, it must be expressive. Second, it must be possible to re-describe the representation as a function of discrete symbolic sequences with re-combinable parts, analogous to sentences in natural language. Third, the function that relates these symbolic sequences to the representation, analogous to semantics in natural language, must be simple. Through experiments on both synthetic and real world data, we validate our definition of compositionality and show how it unifies disparate intuitions from across the literature in both AI and cognitive science. We also show that representational compositionality, while theoretically intractable, can be readily estimated using standard deep learning tools. Our definition has the potential to inspire the design of novel, theoretically-driven models that better capture the mechanisms of compositional thought.
Convergence of Manifold Filter-Combine Networks
David R. Johnson
Joyce Chew
Siddharth Viswanath
Edward De Brouwer
Deanna Needell
Michael Perlmutter
In order to better understand manifold neural networks (MNNs), we introduce Manifold Filter-Combine Networks (MFCNs). The filter-combine framework parallels the popular aggregate-combine paradigm for graph neural networks (GNNs) and naturally suggests many interesting families of MNNs which can be interpreted as the manifold analog of various popular GNNs. We then propose a method for implementing MFCNs on high-dimensional point clouds that relies on approximating the manifold by a sparse graph. We prove that our method is consistent in the sense that it converges to a continuum limit as the number of data points tends to infinity.
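A filter-combine layer on a point cloud can be sketched roughly as follows. The abstract does not specify the graph construction or the filter, so the symmetrized k-nearest-neighbor graph, the heat-kernel spectral filter, and the ReLU after the channel-mixing step are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def knn_graph_laplacian(points, k):
    """Unnormalized Laplacian of a symmetrized k-nearest-neighbor graph."""
    n = points.shape[0]
    sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dist, np.inf)           # a point is not its own neighbor
    neighbors = np.argsort(sq_dist, axis=1)[:, :k]
    adjacency = np.zeros((n, n))
    adjacency[np.repeat(np.arange(n), k), neighbors.ravel()] = 1.0
    adjacency = np.maximum(adjacency, adjacency.T)  # make the graph undirected
    return np.diag(adjacency.sum(axis=1)) - adjacency

def filter_combine_layer(laplacian, features, weights, t=1.0):
    """Filter each channel spectrally, then combine channels and apply ReLU.

    The filter is exp(-t * L) (a heat kernel), an assumed choice.
    features: (n_points, in_channels); weights: (in_channels, out_channels).
    """
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    response = np.exp(-t * eigvals)
    filtered = eigvecs @ (response[:, None] * (eigvecs.T @ features))
    return np.maximum(filtered @ weights, 0.0)

# Toy example: 50 points sampled from a circle embedded in the plane.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=50)
points = np.stack([np.cos(angles), np.sin(angles)], axis=1)
L = knn_graph_laplacian(points, k=4)
features = np.ones((50, 1))
out = filter_combine_layer(L, features, np.array([[1.0]]))
```

A constant signal lies in the null space of the Laplacian, so the heat-kernel filter leaves it unchanged; this gives a quick sanity check on the layer. The dense eigendecomposition is only practical for small point clouds.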
Convergence of Manifold Filter-Combine Networks
David R. Johnson
Joyce A. Chew
Siddharth Viswanath
Edward De Brouwer
Deanna Needell
Michael Perlmutter
In order to better understand manifold neural networks (MNNs), we introduce Manifold Filter-Combine Networks (MFCNs). The filter-combine framework parallels the popular aggregate-combine paradigm for graph neural networks (GNNs) and naturally suggests many interesting families of MNNs which can be interpreted as the manifold analog of various popular GNNs. We then propose a method for implementing MFCNs on high-dimensional point clouds that relies on approximating the manifold by a sparse graph. We prove that our method is consistent in the sense that it converges to a continuum limit as the number of data points tends to infinity.
In-context learning and Occam's razor
Eric Elmoznino
Tom Marty
Tejas Kasetty
Leo Gagnon
Sarthak Mittal
Mahan Fathi
A central goal of machine learning is generalization. While the No Free Lunch Theorem states that we cannot obtain theoretical guarantees for generalization without further assumptions, in practice we observe that simple models which explain the training data generalize best: a principle called Occam's razor. Despite the need for simple models, most current approaches in machine learning only minimize the training error, and at best indirectly promote simplicity through regularization or architecture design. Here, we draw a connection between Occam's razor and in-context learning: an emergent ability of certain sequence models like Transformers to learn at inference time from past observations in a sequence. In particular, we show that the next-token prediction loss used to train in-context learners is directly equivalent to a data compression technique called prequential coding, and that minimizing this loss amounts to jointly minimizing both the training error and the complexity of the model that was implicitly learned from context. Our theory and the empirical experiments we use to support it not only provide a normative account of in-context learning, but also elucidate the shortcomings of current in-context learning methods, suggesting ways in which they can be improved. We make our code available at https://github.com/3rdCore/PrequentialCode.
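Prequential coding can be illustrated with a small sketch: each symbol is encoded using a predictor fitted only to the symbols before it, so the total code length is the sum of next-symbol log losses. The Laplace (add-one) estimator below is an assumed stand-in for the in-context learner, not the model used in the paper, and `prequential_code_length` is a hypothetical name.

```python
import math

def prequential_code_length(bits):
    """Code length in bits of a binary sequence under prequential coding.

    Each symbol is predicted from the preceding symbols only, using a
    Laplace (add-one) estimate of P(next symbol = 1).
    """
    ones = 0
    total = 0.0
    for t, x in enumerate(bits):
        p_one = (ones + 1) / (t + 2)      # estimate from the first t symbols
        p = p_one if x == 1 else 1.0 - p_one
        total += -math.log2(p)            # next-symbol log loss
        ones += x                         # update the learner after coding x
    return total

constant = prequential_code_length([1, 1, 1, 1])
mixed = prequential_code_length([1, 0, 1, 0])
```

The all-ones sequence costs log2(5) bits while the balanced one costs log2(30) bits: data that a simple model explains quickly yields a shorter code, which is the link to Occam's razor drawn in the abstract.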
MixEHR-Nest: Identifying Subphenotypes within Electronic Health Records through Hierarchical Guided-Topic Modeling
Ruohan Wang
Zilong Wang
Ziyang Song
Automatic subphenotyping from electronic health records (EHRs) provides numerous opportunities to understand diseases with unique subgroups and enhance personalized medicine for patients. However, existing machine learning algorithms either focus on specific diseases for better interpretability or produce coarse-grained phenotype topics without considering nuanced disease patterns. In this study, we propose a guided topic model, MixEHR-Nest, to infer subphenotype topics from thousands of diseases using multi-modal EHR data. Specifically, MixEHR-Nest detects multiple subtopics from each phenotype topic, whose prior is guided by the expert-curated phenotype concepts such as Phenotype Codes (PheCodes) or Clinical Classification Software (CCS) codes. We evaluated MixEHR-Nest on two EHR datasets: (1) the MIMIC-III dataset consisting of over 38 thousand patients from the intensive care unit (ICU) of Beth Israel Deaconess Medical Center (BIDMC) in Boston, USA; (2) the healthcare administrative database PopHR, comprising 1.3 million patients from Montreal, Canada. Experimental results demonstrate that MixEHR-Nest can identify subphenotypes with distinct patterns within each phenotype, which are predictive for disease progression and severity. Consequently, MixEHR-Nest distinguishes between type 1 and type 2 diabetes by inferring subphenotypes using CCS codes, which do not differentiate these two subtype concepts. Additionally, MixEHR-Nest not only improved the prediction accuracy of short-term mortality of ICU patients and initial insulin treatment in diabetic patients but also revealed the contributions of subphenotypes. For longitudinal analysis, MixEHR-Nest identified subphenotypes of distinct age prevalence under the same phenotypes, such as asthma, leukemia, epilepsy, and depression. The MixEHR-Nest software is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-Nest.