Publications

Thinker: Learning to Plan and Act
Stephen Chung
Ivan Anokhin
We propose the Thinker algorithm, a novel approach that enables reinforcement learning agents to autonomously interact with and utilize a learned world model. The Thinker algorithm wraps the environment with a world model and introduces new actions designed for interacting with the world model. These model-interaction actions enable agents to perform planning by proposing alternative plans to the world model before selecting a final action to execute in the environment. This approach eliminates the need for handcrafted planning algorithms by enabling the agent to learn how to plan autonomously and allows for easy interpretation of the agent's plan with visualization. We demonstrate the algorithm's effectiveness through experimental results in the game of Sokoban and the Atari 2600 benchmark, where the Thinker algorithm achieves state-of-the-art performance and competitive results, respectively. Visualizations of agents trained with the Thinker algorithm demonstrate that they have learned to plan effectively with the world model to select better actions. Thinker is the first work showing that an RL agent can learn to plan with a learned world model in complex environments.
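As a rough illustration of the wrapping idea described in this abstract, the sketch below augments a toy environment with two kinds of actions: "imagine" actions that only query a world model, and "act" actions that are executed in the real environment. This is not the authors' implementation. `ToyEnv`, `ToyModel`, `ModelWrappedEnv`, the action names, and the observation layout are all hypothetical, and the stand-in model is hand-coded rather than learned.

```python
class ToyEnv:
    """Hypothetical 1-D chain: start at 0, reward 1.0 on reaching `goal`."""

    def __init__(self, goal=2):
        self.goal = goal
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action is -1 (left) or +1 (right)
        self.pos += action
        done = self.pos == self.goal
        return self.pos, (1.0 if done else 0.0), done


class ToyModel:
    """Stand-in for a learned world model (assumption: exact dynamics)."""

    def __init__(self, goal=2):
        self.goal = goal

    def predict(self, state, action):
        nxt = state + action
        return nxt, (1.0 if nxt == self.goal else 0.0)


class ModelWrappedEnv:
    """Wraps a real env with a model and exposes model-interaction actions.

    Observations are (real_state, imagined_state, predicted_reward).
    """

    def __init__(self, env, model):
        self.env = env
        self.model = model
        self.real_state = None
        self.imag_state = None

    def reset(self):
        self.real_state = self.env.reset()
        self.imag_state = self.real_state
        return (self.real_state, self.imag_state, 0.0)

    def step(self, kind, action):
        if kind == "imagine":
            # Roll the model forward; the real environment is untouched.
            self.imag_state, pred_reward = self.model.predict(
                self.imag_state, action
            )
            return (self.real_state, self.imag_state, pred_reward), 0.0, False
        if kind == "act":
            # Execute in the real environment and restart imagination there.
            self.real_state, reward, done = self.env.step(action)
            self.imag_state = self.real_state
            return (self.real_state, self.imag_state, 0.0), reward, done
        raise ValueError(f"unknown action kind: {kind!r}")


if __name__ == "__main__":
    wrapped = ModelWrappedEnv(ToyEnv(goal=2), ToyModel(goal=2))
    wrapped.reset()
    print(wrapped.step("imagine", +1))  # probe a plan in the model only
    print(wrapped.step("act", +1))      # then commit to a real action
```

In this toy version the policy that chooses between "imagine" and "act" is left out; the abstract's point is that an RL agent learns that choice itself instead of following a handcrafted planner.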
Towards Hybrid-grained Feature Interaction Selection for Deep Sparse Network
Fuyuan Lyu
Xing Tang
Dugang Liu
Chen Ma
Weihong Luo
Liang Chen
Xiuqiang He
Deep sparse networks are widely investigated as a neural network architecture for prediction tasks with high-dimensional sparse features, in which feature interaction selection is a critical component. While previous methods primarily focus on how to search feature interactions in a coarse-grained space, less attention has been given to a finer granularity. In this work, we introduce a hybrid-grained feature interaction selection approach that targets both feature field and feature value for deep sparse networks. To explore such an expansive space, we propose a decomposed space which is calculated on the fly. We then develop a selection algorithm called OptFeature, which efficiently selects the feature interaction from both the feature field and the feature value simultaneously. Results from experiments on three large real-world benchmark datasets demonstrate that OptFeature performs well in terms of accuracy and efficiency. Additional studies support the feasibility of our method. All source code is publicly available at https://anonymous.4open.science/r/OptFeature-Anonymous.
A Unified, Scalable Framework for Neural Population Decoding
Mehdi Azabou
Vinam Arora
Venkataramana Ganesh
Ximeng Mao
Santosh B Nachimuthu
Michael Jacob Mendelson
Eva L Dyer
Our ability to use deep learning approaches to decipher neural activity would likely benefit from greater scale, in terms of both the model size and the datasets. However, the integration of many neural recordings into one unified model is challenging, as each recording contains the activity of different neurons from different individual animals. In this paper, we introduce a training framework and architecture designed to model the population dynamics of neural activity across diverse, large-scale neural recordings. Our method first tokenizes individual spikes within the dataset to build an efficient representation of neural events that captures the fine temporal structure of neural activity. We then employ cross-attention and a PerceiverIO backbone to further construct a latent tokenization of neural population activities. Utilizing this architecture and training framework, we construct a large-scale multi-session model trained on large datasets from seven nonhuman primates, spanning over 158 different sessions of recording from over 27,373 neural units and over 100 hours of recordings. In a number of different tasks, we demonstrate that our pretrained model can be rapidly adapted to new, unseen sessions with unspecified neuron correspondence, enabling few-shot performance with minimal labels. This work presents a powerful new approach for building deep learning tools to analyze neural data and stakes out a clear path to training at scale for neural decoding models.
Versatile Energy-Based Probabilistic Models for High Energy Physics
Taoli Cheng
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment
Tianwei Ni
Michel Ma
Benjamin Eysenbach
Reinforcement learning (RL) algorithms face two distinct challenges: learning effective representations of past and present observations, and determining how actions influence future returns. Both challenges involve modeling long-term dependencies. The Transformer architecture has been very successful at solving problems that involve long-term dependencies, including in the RL domain. However, the underlying reason for the strong performance of Transformer-based RL methods remains unclear: is it because they learn effective memory, or because they perform effective credit assignment? After introducing formal definitions of memory length and credit assignment length, we design simple configurable tasks to measure these distinct quantities. Our empirical results reveal that Transformers can enhance the memory capability of RL algorithms, scaling up to tasks that require memorizing observations
Conserving avian evolutionary history can effectively safeguard future benefits for people
Rikki Gumbs
Claudia L. Gray
Michael Hoffmann
Rafael Molina-Venegas
Nisha Owen
Phylogenetic diversity (PD), the evolutionary history of a set of species, is conceptually linked to the maintenance of yet-to-be-discovered benefits from biodiversity or “option value.” We used global phylogenetic and utilization data for birds to test the PD option value link, under the assumption that the performance of sets of PD-maximizing species at capturing known benefits is analogous to selecting the same species at a point in human history before these benefits were realized. PD performed better than random at capturing utilized bird species across 60% of tests, with performance linked to the phylogenetic dispersion and prevalence of each utilization category. Prioritizing threatened species for conservation by the PD they encapsulate performs comparably to prioritizing by their functional distinctiveness. However, species selected by each metric show low overlap, indicating that we should conserve both components of biodiversity to effectively conserve a variety of uses. Our findings provide empirical support for the link between evolutionary history and benefits for future generations.
In-Context Learning for Text Classification with Many Labels
Aristides Milios
M-TAG: A modular teaching-aid for Geant4
Liam Carroll
Graph topological property recovery with heat and wave dynamics-based features on graphs
Dhananjay Bhaskar
Yanlei Zhang
Charles Xu
Xingzhi Sun
Oluwadamilola Fasina
Maximilian Nickel
Michael Perlmutter
Smita Krishnaswamy
Estimating the population effectiveness of interventions against COVID-19 in France: a modelling study
Iris Ganser
Jane M. Heffernan
M. Prague
Rodolphe Thiébaut
Background: Non-pharmaceutical interventions (NPIs) and vaccines have been widely used to manage the COVID-19 pandemic. However, uncertainty persists regarding the effectiveness of these interventions due to data quality issues, methodological challenges, and differing contextual factors. Accurate estimation of their effects is crucial for future epidemic preparedness. Methods: To address this, we developed a population-based mechanistic model that includes the impact of NPIs and vaccines on SARS-CoV-2 transmission and hospitalization rates. Our statistical approach estimated all parameters in one step, accurately propagating uncertainty. We fitted the model to comprehensive epidemiological data in France from March 2020 to October 2021. With the same model, we simulated scenarios of vaccine rollout. Results: The first lockdown was the most effective, reducing transmission by 84% (95% confidence interval (CI) 83-85). Subsequent lockdowns had diminished effectiveness (reduction of 74% (69-77) and 11% (9-18), respectively). A 6 pm curfew was more effective than one at 8 pm (68% (66-69) vs. 48% (45-49) reduction), while school closures reduced transmission by 15% (12-18). In a scenario without vaccines before November 2021, we predicted 159,000 or 194% (95% prediction interval (PI) 74-424) more deaths and 1,488,000 or 340% (136-689) more hospitalizations. If a vaccine had been available after 100 days, over 71,000 deaths (16,507-204,249) and 384,000 (88,579-1,020,386) hospitalizations could have been averted. Conclusion: Our results highlight the substantial impact of NPIs, including lockdowns and curfews, in controlling the COVID-19 pandemic. We also demonstrate the value of the 100 days objective of the CEPI initiative for vaccine availability.
Addressing uncertainty when projecting marine species' distributions under climate change
Sarah C. Davies
Patrick L. Thompson
Catalina Gómez
Jessica Nephin
Anders Knudby
Ashley E. Park
Sarah K. Friesen
Emily M. Rubidge
Sean C. Anderson
Josephine C. Iacarella
Devin A. Lyons
Andrew MacDonald
Andrew McMillan
Eric J. Ward
Amber M. Holdsworth
Neil Swart
Jeff Price
Karen L. Hunter