Publications

The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation

Kushal Arora

Timothy O'Donnell

Doina Precup

Jason Aaron Edward Weston

Jackie C.K.Cheung

State-of-the-art language generation models can degenerate when applied to open-ended generation problems such as text completion, story gen… (voir plus)eration, or dialog modeling. This degeneration usually shows up in the form of incoherence, lack of vocabulary diversity, and self-repetition or copying from the context. In this paper, we postulate that ``human-like'' generations usually lie in a narrow and nearly flat entropy band, and violation of these entropy bounds correlates with degenerate behavior. Our experiments show that this stable narrow entropy zone exists across models, tasks, and domains and confirm the hypothesis that violations of this zone correlate with degeneration. We then use this insight to propose an entropy-aware decoding algorithm that respects these entropy bounds resulting in less degenerate, more contextual, and"human-like"language generation in open-ended text generation settings.

2023-02-14

ArXiv (prépublication)

DEUP: Direct Epistemic Uncertainty Prediction

Moksh J. Jain

Salem Lahlou

Hadi Nekoei

Victor I Butoi

Paul Bertin

Jarrid Rector-Brooks

Maksym Korablyov

Epistemic Uncertainty is a measure of the lack of knowledge of a learner which diminishes with more evidence. While existing work focuses on… (voir plus) using the variance of the Bayesian posterior due to parameter uncertainty as a measure of epistemic uncertainty, we argue that this does not capture the part of lack of knowledge induced by model misspecification. We discuss how the excess risk, which is the gap between the generalization error of a predictor and the Bayes predictor, is a sound measure of epistemic uncertainty which captures the effect of model misspecification. We thus propose a principled framework for directly estimating the excess risk by learning a secondary predictor for the generalization error and subtracting an estimate of aleatoric uncertainty, i.e., intrinsic unpredictability. We discuss the merits of this novel measure of epistemic uncertainty, and highlight how it differs from variance-based measures of epistemic uncertainty and addresses its major pitfall. Our framework, Direct Epistemic Uncertainty Prediction (DEUP) is particularly interesting in interactive learning environments, where the learner is allowed to acquire novel examples in each round. Through a wide set of experiments, we illustrate how existing methods in sequential model optimization can be improved with epistemic uncertainty estimates from DEUP, and how DEUP can be used to drive exploration in reinforcement learning. We also evaluate the quality of uncertainty estimates from DEUP for probabilistic image classification and predicting synergies of drug combinations.

2023-02-13

TMLR (accepté)

openreview.net

Interpersonal attunement in social interactions: from collective psychophysiology to inter-personalized psychiatry and beyond

Dimitris Bolis

Leonhard Schilbach

In this article, we analyse social interactions, drawing on diverse points of views, ranging from dialectics, second-person neuroscience and… (voir plus) enactivism to dynamical systems, active inference and machine learning. To this end, we define interpersonal attunement as a set of multi-scale processes of building up and materializing social expectations—put simply, anticipating and interacting with others and ourselves. While cultivating and negotiating common ground, via communication and culture-building activities, are indispensable for the survival of the individual, the relevant multi-scale mechanisms have been largely considered in isolation. Here, collective psychophysiology, we argue, can lend itself to the fine-tuned analysis of social interactions, without neglecting the individual. On the other hand, an interpersonal mismatch of expectations can lead to a breakdown of communication and social isolation known to negatively affect mental health. In this regard, we review psychopathology in terms of interpersonal misattunement, conceptualizing psychiatric disorders as disorders of social interaction, to describe how individual mental health is inextricably linked to social interaction. By doing so, we foresee avenues for an inter-personalized psychiatry, which moves from a static spectrum of disorders to a dynamic relational space, focusing on how the multi-faceted processes of social interaction can help to promote mental health. This article is part of the theme issue ‘Concepts in interaction: social engagement and inner experiences’.

2023-02-13

Philosophical Transactions of the Royal Society B: Biological Sciences (publié)

Gintare Karolina Dziugaite

Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization

MAHDI HAGHIFAM

Borja Rodr'iguez-G'alvez

Ragnar Thobaben

Mikael Skoglund

Daniel M. Roy

2023-02-13

Proceedings of The 34th International Conference on Algorithmic Learning Theory (publié)

Sources of richness and ineffability for phenomenally conscious states

Xu Ji

Eric Elmoznino

George Deane

Axel Constant

Guillaume Lajoie

Jonathan Simon

Abstract Conscious states—state that there is something it is like to be in—seem both rich or full of detail and ineffable or hard to fu… (voir plus)lly describe or recall. The problem of ineffability, in particular, is a longstanding issue in philosophy that partly motivates the explanatory gap: the belief that consciousness cannot be reduced to underlying physical processes. Here, we provide an information theoretic dynamical systems perspective on the richness and ineffability of consciousness. In our framework, the richness of conscious experience corresponds to the amount of information in a conscious state and ineffability corresponds to the amount of information lost at different stages of processing. We describe how attractor dynamics in working memory would induce impoverished recollections of our original experiences, how the discrete symbolic nature of language is insufficient for describing the rich and high-dimensional structure of experiences, and how similarity in the cognitive function of two individuals relates to improved communicability of their experiences to each other. While our model may not settle all questions relating to the explanatory gap, it makes progress toward a fully physicalist explanation of the richness and ineffability of conscious experience—two important aspects that seem to be part of what makes qualitative character so puzzling.

2023-02-13

ArXiv (prépublication)

Sources of richness and ineffability for phenomenally conscious states

Xu Ji

Eric Elmoznino

George Deane

Axel Constant

Guillaume Lajoie

Jonathan Simon

Abstract Conscious states—state that there is something it is like to be in—seem both rich or full of detail and ineffable or hard to fu… (voir plus)lly describe or recall. The problem of ineffability, in particular, is a longstanding issue in philosophy that partly motivates the explanatory gap: the belief that consciousness cannot be reduced to underlying physical processes. Here, we provide an information theoretic dynamical systems perspective on the richness and ineffability of consciousness. In our framework, the richness of conscious experience corresponds to the amount of information in a conscious state and ineffability corresponds to the amount of information lost at different stages of processing. We describe how attractor dynamics in working memory would induce impoverished recollections of our original experiences, how the discrete symbolic nature of language is insufficient for describing the rich and high-dimensional structure of experiences, and how similarity in the cognitive function of two individuals relates to improved communicability of their experiences to each other. While our model may not settle all questions relating to the explanatory gap, it makes progress toward a fully physicalist explanation of the richness and ineffability of conscious experience—two important aspects that seem to be part of what makes qualitative character so puzzling.

2023-02-13

ArXiv (prépublication)

Distributional GFlowNets with Quantile Flows

Dinghuai Zhang

Ling Pan

Ricky T. Q. Chen

Aaron Courville

Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a stochastic policy for generating com… (voir plus)plex combinatorial structure through a series of decision-making steps. Despite being inspired from reinforcement learning, the current GFlowNet framework is relatively limited in its applicability and cannot handle stochasticity in the reward function. In this work, we adopt a distributional paradigm for GFlowNets, turning each flow function into a distribution, thus providing more informative learning signals during training. By parameterizing each edge flow through their quantile functions, our proposed \textit{quantile matching} GFlowNet learning algorithm is able to learn a risk-sensitive policy, an essential component for handling scenarios with risk uncertainty. Moreover, we find that the distributional approach can achieve substantial improvement on existing benchmarks compared to prior methods due to our enhanced training algorithm, even in settings with deterministic rewards.

2023-02-11

ArXiv (prépublication)

Characterization Of Inpaint Residuals In Interferometric Measurements of the Epoch Of Reionization

Michael Pagano

Jing Liu

Adrian Liu

Nicholas S Kern

Aaron Ewall-Wice

Philip Bull

Robert Pascua

Siamak Ravanbakhsh

Zara Abdurashidova

Tyrone Adams

James E Aguirre

Paul Alexander

Zaki S Ali

Rushelle Baartman

Yanga Balfour

Adam P Beardsley

Gianni Bernardi

Tashalee S Billings

Judd D Bowman

Richard F Bradley … (voir 58 de plus)

Jacob Burba

Steven Carey

Chris L Carilli

Carina Cheng

David R DeBoer

Eloy de Lera Acedo

Matt Dexter

Joshua S Dillon

Nico Eksteen

John Ely

Nicolas Fagnoni

Randall Fritz

Steven R Furlanetto

Kingsley Gale-Sides

Brian Glendenning

Deepthi Gorthi

Bradley Greig

Jasper Grobbelaar

Ziyaad Halday

Bryna J Hazelton

Jacqueline N Hewitt

Jack Hickish

Daniel C Jacobs

Austin Julius

MacCalvin Kariseb

Joshua Kerrigan

Piyanat Kittiwisit

Saul A Kohn

Matthew Kolopanis

Adam Lanman

Paul La Plante

Anita Loots

David Harold Edward MacMahon

Lourence Malan

Cresshim Malgas

Keith Malgas

Bradley Marero

Zachary E Martinot

Andrei Mesinger

Mathakane Molewa

Miguel F Morales

Tshegofalang Mosiane

Abraham R Neben

Bojan Nikolic

Hans Nuwegeld

Aaron R Parsons

Nipanjana Patra

Samantha Pieterse

Nima Razavi-Ghods

James Robnett

Kathryn Rosie

Peter Sims

Craig Smith

Hilton Swarts

Nithyanandan Thyagarajan

Pieter van Wyngaarden

Peter K G Williams

Haoxuan Zheng

To mitigate the effects of Radio Frequency Interference (RFI) on the data analysis pipelines of 21cm interferometric instruments, numerous i… (voir plus)npaint techniques have been developed. In this paper we examine the qualitative and quantitative errors introduced into the visibilities and power spectrum due to inpainting. We perform our analysis on simulated data as well as real data from the Hydrogen Epoch of Reionization Array (HERA) Phase 1 upper limits. We also introduce a convolutional neural network that is capable of inpainting RFI corrupted data. We train our network on simulated data and show that our network is capable at inpainting real data without requiring to be retrained. We find that techniques that incorporate high wavenumbers in delay space in their modeling are best suited for inpainting over narrowband RFI. We show that with our fiducial parameters Discrete Prolate Spheroidal Sequences (DPSS) and CLEAN provide the best performance for intermittent RFI while Gaussian Progress Regression (GPR) and Least Squares Spectral Analysis (LSSA) provide the best performance for larger RFI gaps. However we caution that these qualitative conclusions are sensitive to the chosen hyperparameters of each inpainting technique. We show that all inpainting techniques reliably reproduce foreground dominated modes in the power spectrum. Since the inpainting techniques should not be capable of reproducing noise realizations, we find that the largest errors occur in the noise dominated delay modes. We show that as the noise level of the data comes down, CLEAN and DPSS are most capable of reproducing the fine frequency structure in the visibilities.

2023-02-10

Monthly Notices of the Royal Astronomical Society (publié)

8-channel Tx dipole and 20-channel Rx loop coil array for MRI of the cervical spinal cord at 7 Tesla

Nibardo Lopez Rios

Kyle M. Gilbert

Daniel Papp

Gaspard Cereza

Alexandru Foias

D. Rangaprakash

Markus W. May

Bastien Guerin

Lawrence L. Wald

Boris Keil

Jason P. Stockmann

Robert L. Barry

Julien Cohen-Adad

2023-02-09

bioRxiv (prépublication)

Restoring the missing person to personalized medicine and precision psychiatry

Ana Gómez-Carrillo

Vincent Paquin

Laurence J. Kirmayer

2023-02-09

Frontiers in Neuroscience (publié)

DynGFN: Towards Bayesian Inference of Gene Regulatory Networks with GFlowNets

Lazar Atanackovic

Alexander Tong

Jason Hartford

Leo J Lee

Bo Wang

One of the grand challenges of cell biology is inferring the gene regulatory network (GRN) which describes interactions between genes and th… (voir plus)eir products that control gene expression and cellular function. We can treat this as a causal discovery problem but with two non-standard challenges: (1) regulatory networks are inherently cyclic so we should not model a GRN as a directed acyclic graph (DAG), and (2) observations have significant measurement noise, so for typical sample sizes there will always be a large equivalence class of graphs that are likely given the data, and we want methods that capture this uncertainty. Existing methods either focus on challenge (1), identifying cyclic structure from dynamics, or on challenge (2) learning complex Bayesian posteriors over DAGs, but not both. In this paper we leverage the fact that it is possible to estimate the"velocity"of gene expression with RNA velocity techniques to develop an approach that addresses both challenges. Because we have access to velocity information, we can treat the Bayesian structure learning problem as a problem of sparse identification of a dynamical system, capturing cyclic feedback loops through time. Since our objective is to model uncertainty over discrete structures, we leverage Generative Flow Networks (GFlowNets) to estimate the posterior distribution over the combinatorial space of possible sparse dependencies. Our results indicate that our method learns posteriors that better encapsulate the distributions of cyclic structures compared to counterpart state-of-the-art Bayesian structure learning approaches.

2023-02-08

ArXiv (prépublication)