Conditions for indexability of restless bandits and an algorithm to compute whittle index – CORRIGENDUM
Nima Akbarzadeh
Distinctive whole-brain cell types predict tissue damage patterns in thirteen neurodegenerative conditions
Veronika Pak
Quadri Adewale
Mahsa Dadar
Yashar Zeighami
Yasser Iturria-Medina
For over a century, brain research narrative has mainly centered on neuron cells. Accordingly, most neurodegenerative studies focus on neuro… (voir plus)nal dysfunction and their selective vulnerability, while we lack comprehensive analyses of other major cell types’ contribution. By unifying spatial gene expression, structural MRI, and cell deconvolution, here we describe how the human brain distribution of canonical cell types extensively predicts tissue damage in thirteen neurodegenerative conditions, including early-and late-onset Alzheimer’s disease, Parkinson’s disease, dementia with Lewy bodies, amyotrophic lateral sclerosis, mutations in presenilin-1, and three clinical variants of frontotemporal lobar degeneration (behavioural variant, semantic and non-fluent primary progressive aphasia) along with associated 3-repeat and 4-repeat tauopathies and TDP43 proteinopathies types A and C. We reconstructed comprehensive whole-brain reference maps of cellular abundance for six major cell types and identified characteristic axes of spatial overlapping with atrophy. Our results support the strong mediating role of non-neuronal cells, primarily microglia and astrocytes, in spatial vulnerability to tissue loss in neurodegeneration, with distinct and shared across-disorders pathomechanisms. These observations provide critical insights into the multicellular pathophysiology underlying spatiotemporal advance in neurodegeneration. Notably, they also emphasize the need to exceed the current neuro-centric view of brain diseases, supporting the imperative for cell-specific therapeutic targets in neurodegeneration.
Robust Data-driven Prescriptiveness Optimization
Mehran Poursoltani
Angelos Georghiou
The abundance of data has led to the emergence of a variety of optimization techniques that attempt to leverage available side information t… (voir plus)o provide more anticipative decisions. The wide range of methods and contexts of application have motivated the design of a universal unitless measure of performance known as the coefficient of prescriptiveness. This coefficient was designed to quantify both the quality of contextual decisions compared to a reference one and the prescriptive power of side information. To identify policies that maximize the former in a data-driven context, this paper introduces a distributionally robust contextual optimization model where the coefficient of prescriptiveness substitutes for the classical empirical risk minimization objective. We present a bisection algorithm to solve this model, which relies on solving a series of linear programs when the distributional ambiguity set has an appropriate nested form and polyhedral structure. Studying a contextual shortest path problem, we evaluate the robustness of the resulting policies against alternative methods when the out-of-sample dataset is subject to varying amounts of distribution shift.
Value function estimation using conditional diffusion models for control
Walter Talbott
Miguel Ángel Bautista
Alexander T Toshev
Joshua M. Susskind
Dynamic Routing and Wavelength Assignment with Reinforcement Learning
Peyman Kafaei
Hamed Pouya
Louis-Martin Rousseau
With the rapid developments in communication systems, and considering their dynamic nature, all-optical networks are becoming increasingly c… (voir plus)omplex. This study proposes a novel method based on deep reinforcement learning for the routing and wavelength assignment problem in all-optical wavelength-decision-multiplexing networks. We consider dynamic incoming requests, in which their arrival and holding times are not known in advance. The objective is to devise a strategy that minimizes the number of rejected packages due to the lack of resources in the long term. We use graph neural networks to capture crucial latent information from the graph-structured input to develop the optimal strategy. The proposed deep reinforcement learning algorithm selects a route and a wavelength simultaneously for each incoming traffic connection as they arrive. The results demonstrate that the learned agent outperforms the methods used in practice and can be generalized on network topologies that did not participate in training.
Invariant Causal Set Covering Machines
Thibaud Godon
Baptiste Bauvin
Beyond Gaussian Noise: A Generalized Approach to Likelihood Analysis with Non-Gaussian Noise
Ronan Legin
Alexandre Adam
A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection
Eduardo Dadalto Câmara Gomes
Pierre Colombo
Guillaume Staerman
Nathan Noiry
GEO-Bench: Toward Foundation Models for Earth Monitoring
Alexandre Lacoste
Nils Lehmann
Pau Rodriguez
Evan David Sherwin
Hannah Kerner
Björn Lütjens
Jeremy Andrew Irvin
David Dao
Hamed Alemohammad
Mehmet Gunturkun
Gabriel Huang
David Vazquez
Dava Newman
Stefano Ermon
Xiao Xiang Zhu
Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to subst… (voir plus)antial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote sensing tasks is limited. To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprised of six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation. We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress. Finally, we report results for 20 baselines to gain information about the performance of existing models. We believe that this benchmark will be a driver of progress across a variety of Earth monitoring tasks.
GEO-Bench: Toward Foundation Models for Earth Monitoring
Alexandre Lacoste
Nils Lehmann
Pau Rodriguez
Evan David Sherwin
Hannah Kerner
Björn Lütjens
Jeremy Andrew Irvin
David Dao
Hamed Alemohammad
Mehmet Gunturkun
Gabriel Huang
David Vazquez
Dava Newman
Stefano Ermon
Xiao Xiang Zhu
Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to subst… (voir plus)antial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote sensing tasks is limited. To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprised of six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation. We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress. Finally, we report results for 20 baselines to gain information about the performance of existing models. We believe that this benchmark will be a driver of progress across a variety of Earth monitoring tasks.
GEO-Bench: Toward Foundation Models for Earth Monitoring
Alexandre Lacoste
Nils Lehmann
Pau Rodriguez
Evan David Sherwin
Hannah Kerner
Björn Lütjens
Jeremy Andrew Irvin
David Dao
Hamed Alemohammad
Mehmet Gunturkun
Gabriel Huang
David Vazquez
Dava Newman
Stefano Ermon
Xiao Xiang Zhu
Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to subst… (voir plus)antial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote sensing tasks is limited. To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprised of six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation. We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress. Finally, we report results for 20 baselines to gain information about the performance of existing models. We believe that this benchmark will be a driver of progress across a variety of Earth monitoring tasks.
The Stack: 3 TB of permissively licensed source code
Denis Kocetkov
Raymond Li
Loubna Ben allal
Jia LI
Chenghao Mou
Carlos Muñoz Ferrandis
Yacine Jernite
Margaret Mitchell
Sean Hughes
Thomas Wolf
Leandro Von Werra
Harm de Vries
Large Language Models (LLMs) play an ever-increasing role in the field of Artificial Intelligence (AI)--not only for natural language proces… (voir plus)sing but also for code understanding and generation. To stimulate open and responsible research on LLMs for code, we introduce The Stack, a 3.1 TB dataset consisting of permissively licensed source code in 30 programming languages. We describe how we collect the full dataset, construct a permissively licensed subset, present a data governance plan, discuss limitations, and show promising results on text2code benchmarks by training 350M-parameter decoders on different Python subsets. We find that (1) near-deduplicating the data significantly boosts performance across all experiments, and (2) it is possible to match previously reported HumanEval and MBPP performance using only permissively licensed data. We make the dataset available at https://hf.co/BigCode, provide a tool called"Am I in The Stack"(https://hf.co/spaces/bigcode/in-the-stack) for developers to search The Stack for copies of their code, and provide a process for code to be removed from the dataset by following the instructions at https://www.bigcode-project.org/docs/about/the-stack/.