Towards Climate Variable Prediction with Conditioned Spatio-Temporal Normalizing Flows
Christina Winkler
Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies
Shiva Kanth Sujit
Pedro Braga
Jorg Bornschein
Reinforcement learning (RL) has shown great promise with algorithms learning in environments with large state and action spaces purely from scalar reward signals. A crucial challenge for current deep RL algorithms is that they require a tremendous amount of environment interactions for learning. This can be infeasible in situations where such interactions are expensive, such as in robotics. Offline RL algorithms try to address this issue by bootstrapping the learning process from existing logged data without needing to interact with the environment from the very beginning. While online RL algorithms are typically evaluated as a function of the number of environment interactions, there isn't a single established protocol for evaluating offline RL methods. In this paper, we propose a sequential approach to evaluate offline RL algorithms as a function of the training set size and thus by their data efficiency. Sequential evaluation provides valuable insights into the data efficiency of the learning process and the robustness of algorithms to distribution changes in the dataset while also harmonizing the visualization of the offline and online learning phases. Our approach is generally applicable and easy to implement. We compare several existing offline RL algorithms using this approach and present insights from a variety of tasks and offline datasets.
CD3ζ ITAMs enable ligand discrimination and antagonism by inhibiting TCR signaling in response to low-affinity peptides
Guillaume Gaud
Sooraj R. Achar
François X. P. Bourassa
John S. Davies
Teri Hatzihristidis
Seeyoung Choi
Taisuke Kondo
Selamawit Gossa
Jan Lee
Paul Juneau
Naomi Taylor
Christian S. Hinrichs
Dorian B. McGavern
Grégoire Altan-Bonnet
Paul E. Love
Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize
Ryan D'Orazio
Nicolas Loizou
Issam Hadj Laradji
We investigate the convergence of stochastic mirror descent (SMD) under interpolation in relatively smooth and smooth convex optimization. In relatively smooth convex optimization we provide new convergence guarantees for SMD with a constant stepsize. For smooth convex optimization we propose a new adaptive stepsize scheme, the mirror stochastic Polyak stepsize (mSPS). Notably, our convergence results in both settings do not make bounded gradient assumptions or bounded variance assumptions, and we show convergence to a neighborhood that vanishes under interpolation. Consequently, these results correspond to the first convergence guarantees under interpolation for the exponentiated gradient algorithm for fixed or adaptive stepsizes. mSPS generalizes the recently proposed stochastic Polyak stepsize (SPS) (Loizou et al. 2021) to mirror descent and remains both practical and efficient for modern machine learning applications while inheriting the benefits of mirror descent. We complement our results with experiments across various supervised learning tasks and different instances of SMD, demonstrating the effectiveness of mSPS.
Capture the Flag: Uncovering Data Insights with Large Language Models
Issam Hadj Laradji
Perouz Taslakian
Sai Rajeswar
Valentina Zantedeschi
Alexandre Lacoste
David Vazquez
Multi-resolution Time-Series Transformer for Long-term Forecasting
Yitian Zhang
Liheng Ma
Soumyasundar Pal
Yingxue Zhang
The Unsolved Challenges of LLMs as Generalist Web Agents: A Case Study
Rim Assouel
Tom Marty
Massimo Caccia
Issam Hadj Laradji
Sai Rajeswar
Hector Palacios
David Vazquez
Alexandre Lacoste
30×30 biodiversity gains rely on national coordination
Isaac Eckert
Andrea Brown
Dominique Caron
Federico Riva
Knowledge by omission: the significance of omissions in the 5-choice serial reaction time task
Caroline Vouillac-Mendoza
Serge H. Ahmed
Karine Guillem
The 5-choice serial reaction time task (5-CSRTT) is commonly used to assess attention in rodents. Manipulation of this task by decreasing the light stimulus duration is often used to probe attentional capacity and causes a decrease in accuracy and an increase in omissions. However, although a decrease in response accuracy is commonly interpreted as a decrease in attention, it is more difficult to interpret an increase in omissions in terms of attentional performance. Here we present a series of experiments in rats that seeks to investigate the origins of these key behavioral measures of attention in the 5-CSRTT. After an initial training in the 5-CSRTT, rats were tested in a variable stimulus duration procedure to increase task difficulty and probe visual attentional capacity under several specific controlled conditions. We found that response accuracy reflects visuospatial sustained attentional processing, as commonly interpreted, while response omission reflects rats’ ignorance about the stimulus location, presumably due to failure to pay attention to the curved wall during its presentation. Moreover, when rats lack relevant information, they choose not to respond instead of responding randomly. Overall, our results indicate that response accuracy and response omission correspond to two distinct attentional states.
Coordination among leaf and fine root traits across a strong natural soil fertility gradient
Xavier Guilbeault-Mayers
Hans Lambers
Player-Guided AI outperforms standard AI in Sequence Alignment Puzzles
Renata Mutalova
Roman Sarrazin-Gendron
Parham Ghasemloo Gheidari
Eddie Cai
Gabriel Richard
Sébastien Caisse
Rob Knight
Attila Szantner
Jérôme Waldispühl
The feature landscape of visual cortex
Rudi Tong
Ronan da Silva
Dongyan Lin
Arna Ghosh
James Wilsenach
Erica Cianfarano
Stuart Trenholm
Understanding computations in the visual system requires a characterization of the distinct feature preferences of neurons in different visual cortical areas. However, we know little about how feature preferences of neurons within a given area relate to that area’s role within the global organization of visual cortex. To address this, we recorded from thousands of neurons across six visual cortical areas in mouse and leveraged generative AI methods combined with closed-loop neuronal recordings to identify each neuron’s visual feature preference. First, we discovered that the mouse’s visual system is globally organized to encode features in a manner invariant to the types of image transformations induced by self-motion. Second, we found differences in the visual feature preferences of each area and that these differences generalized across animals. Finally, we observed that a given area’s collection of preferred stimuli (‘own-stimuli’) drive neurons from the same area more effectively through their dynamic range compared to preferred stimuli from other areas (‘other-stimuli’). As a result, feature preferences of neurons within an area are organized to maximally encode differences among own-stimuli while remaining insensitive to differences among other-stimuli. These results reveal how visual areas work together to efficiently encode information about the external world.