Publications

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

Jesse Farebrother

Jordi Orbay

Quan Vuong

Adrien Ali Taiga

Yevgen Chebotar

Ted Xiao

Alex Irpan

Sergey Levine

Pablo Samuel Castro

Aleksandra Faust

Aviral Kumar

Rishabh Agarwal

Value functions are a central component of deep reinforcement learning (RL). These functions, parameterized by neural networks, are trained using a mean squared error regression objective to match bootstrapped target values. However, scaling value-based RL methods that use regression to large networks, such as high-capacity Transformers, has proven challenging. This difficulty is in stark contrast to supervised learning: by leveraging a cross-entropy classification loss, supervised methods have scaled reliably to massive networks. Observing this discrepancy, in this paper, we investigate whether the scalability of deep RL can also be improved simply by using classification in place of regression for training value functions. We demonstrate that training value functions with categorical cross-entropy significantly improves performance and scalability in a variety of domains. These include: single-task RL on Atari 2600 games with SoftMoEs, multi-task RL on Atari with large-scale ResNets, robotic manipulation with Q-transformers, playing Chess without search, and a language-agent Wordle task with high-capacity Transformers, achieving state-of-the-art results on these domains. Through careful analysis, we show that the benefits of categorical cross-entropy primarily stem from its ability to mitigate issues inherent to value-based RL, such as noisy targets and non-stationarity. Overall, we argue that a simple shift to training value functions with categorical cross-entropy can yield substantial improvements in the scalability of deep RL at little-to-no cost.

2024-03-06

ArXiv (preprint)

doi.org

arxiv.org
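
As a rough illustration of the idea in the abstract above (replacing mean squared error regression with a categorical cross-entropy loss), the sketch below turns a scalar value target into a distribution over fixed bins and scores predicted logits against it. It uses a simple "two-hot" encoding chosen here for brevity; it is not taken from the paper and is not the authors' implementation. The function names, bin range, and bin count are illustrative assumptions.

```python
import numpy as np

def two_hot(target, v_min, v_max, num_bins):
    """Spread a scalar target over its two nearest bin centers (assumed scheme)."""
    centers = np.linspace(v_min, v_max, num_bins)
    target = float(np.clip(target, v_min, v_max))
    width = centers[1] - centers[0]
    pos = (target - v_min) / width              # fractional bin index
    lower = min(int(np.floor(pos)), num_bins - 1)
    upper = min(lower + 1, num_bins - 1)
    upper_weight = pos - lower                  # share of mass on the upper bin
    probs = np.zeros(num_bins)
    probs[lower] += 1.0 - upper_weight
    probs[upper] += upper_weight
    return probs, centers

def cross_entropy(logits, target_probs):
    """Categorical cross-entropy between softmax(logits) and target_probs."""
    shifted = logits - logits.max()             # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-(target_probs * log_probs).sum())

# Hypothetical usage: a bootstrapped target of 0.3 on a 5-bin support over [0, 1].
probs, centers = two_hot(0.3, 0.0, 1.0, 5)
loss = cross_entropy(np.zeros(5), probs)        # loss for uniform predictions
value_estimate = float(probs @ centers)         # expectation recovers the scalar
```

The scalar value is read back as the expectation of the bin centers under the predicted distribution, so the network still outputs a usable value estimate while being trained with a classification loss.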

Efficient Causal Graph Discovery Using Large Language Models

Thomas Jiralerspong

Xiaoyin Chen

Yash More

Vedant Shah

Yoshua Bengio

2024-03-05

ICLR.cc/2024/Workshop/AGI (poster)

doi.org

openreview.net

Explicit Knowledge Factorization Meets In-Context Learning: What Do We Gain?

Sarthak Mittal

Eric Elmoznino

Leo Gagnon

Sangnie Bhardwaj

Dhanya Sridhar

Guillaume Lajoie

2024-03-05

ICLR.cc/2024/Workshop/R2-FM (poster)

openreview.net

Optimisation of quantitative brain diffusion-relaxation MRI acquisition protocols with physics-informed machine learning.

Álvaro Planchuelo-Gómez

Maxime Descoteaux

Hugo Larochelle

Jana Hutter

Derek K. Jones

C. Tax

2024-03-05

Medical Image Analysis (published)

doi.org

Plant invasion in Mediterranean Europe: current hotspots and future scenarios

Luigi Cao Pinna

Laure Gallien

Laura J. Pollock

Irena Axmanová

Milan Chytrý

Marco Malavasi

Alicia T. R. Acosta

Juan Antonio Campos

Marta Carboni

The Mediterranean Basin has historically been subject to alien plant invasions that threaten its unique biodiversity. This seasonally dry and densely populated region is undergoing severe climatic and socioeconomic changes, and it is unclear whether these changes will worsen or mitigate plant invasions. Predictions are often biased, as species may not be in equilibrium in the invaded environment, depending on their invasion stage and ecological characteristics. To address uncertainty in future predictions, we identified invasion hotspots across multiple biased modelling scenarios and ecological characteristics of successful invaders. We selected 92 alien plant species widespread in Mediterranean Europe and compiled data on their distribution in the Mediterranean and worldwide. We combined these data with environmental and propagule pressure variables to model global and regional species niches, and map their current and future habitat suitability. We identified invasion hotspots, examined their potential future shifts, and compared the results of different modelling strategies. Finally, we generalised our findings by using linear models to determine the traits and biogeographic features of invaders most likely to benefit from global change. Currently, invasion hotspots are found near ports and coastlines throughout Mediterranean Europe. However, many species occupy only a small portion of the environmental conditions to which they are preadapted, suggesting that their invasion is still an ongoing process. Future conditions will lead to declines in many currently widespread aliens, which will tend to move to higher elevations and latitudes. Our trait models indicate that future climates will generally favour species with conservative ecological strategies that can cope with reduced water availability, such as those with short stature and low specific leaf area.
Taken together, our results suggest that in future environments, these conservative aliens will move farther from the introduction areas and upslope, threatening mountain ecosystems that have been spared from invasions so far.

2024-03-05

Ecography (published)

doi.org

Smoothness-Adaptive Sharpness-Aware Minimization for Finding Flatter Minima

Hiroki Naganuma

Junhyung Lyle Kim

Anastasios Kyrillidis

Ioannis Mitliagkas

The sharpness-aware minimization (SAM) procedure has recently gained increasing attention due to its favorable generalization to unseen data. SAM aims to find flatter (local) minima, utilizing a minimax objective. An immediate challenge in the application of SAM is the adjustment of two pivotal step sizes, which significantly influence its effectiveness. We introduce a novel, straightforward approach for adjusting step sizes that adapts to the smoothness of the objective function, thereby reducing the necessity for manual tuning. This method, termed Smoothness-Adaptive SAM (SA-SAM), not only simplifies the optimization process but also promotes the method's inherent tendency to converge towards flatter minima, enhancing performance in specific models.

2024-03-05

ICLR.cc/2024/Workshop/PML4LRS (poster)

openreview.net
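
For context on the "two pivotal step sizes" mentioned in the SA-SAM abstract above, here is a minimal sketch of one plain SAM update: an ascent step of radius `rho` toward higher loss, then a descent step of size `lr` using the gradient taken at the perturbed point. This shows standard SAM only; the smoothness-adaptive rule of SA-SAM is not described in the listing and is not reproduced here. The names `sam_step`, `grad_fn`, `lr`, and `rho` are illustrative assumptions.

```python
import numpy as np

def sam_step(w, grad_fn, lr, rho):
    """One vanilla SAM update with fixed step sizes (not the SA-SAM rule)."""
    g = grad_fn(w)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return w                      # already at a stationary point
    w_adv = w + rho * g / norm        # ascent step: worst-case nearby point
    return w - lr * grad_fn(w_adv)    # descent step with the perturbed gradient

# Hypothetical usage on f(w) = 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([3.0, 4.0])
w_next = sam_step(w, grad_fn=lambda x: x, lr=0.1, rho=0.1)
```

Both `lr` and `rho` are held fixed here and would normally be tuned by hand, which is the tuning burden the abstract says SA-SAM aims to reduce.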

The World Health Organization as an engine of ideational robustness

Jean-Louis Denis

Gaelle Foucault

Pierre Larouche

Catherine Régis

Miriam Cohen

Marie-Andree Girard

2024-03-05

Policy and Society (published)

doi.org

F$^3$low: Frame-to-Frame Coarse-grained Molecular Dynamics with SE(3) Guided Flow Matching

Shaoning Li

Yusong Wang

Mingyu Li

Bin Shao

Nanning Zheng

Zhang Jian

Jian Tang

2024-03-04

ICLR.cc/2024/Workshop/GEM (poster)

openreview.net

Enhancing and Evaluating Logical Reasoning Abilities of Large Language Models

Shujie Deng

Honghua Dong

Xujie Si

2024-03-04

ICLR.cc/2024/Workshop/SeT_LLM (published)

openreview.net

Fusing Neural and Physical: Augment Protein Conformation Sampling with Tractable Simulations

Jiarui Lu

Zuobai Zhang

Bozitao Zhong

Chence Shi

Jian Tang

Protein dynamics are common and important for biological functions and properties, and their study usually involves time-consuming molecular dynamics (MD) simulations *in silico*. Recently, generative models have been leveraged as surrogate samplers to obtain conformation ensembles orders of magnitude faster and without requiring any simulation data (a "zero-shot" inference). However, being agnostic of the underlying energy landscape, the accuracy of such generative models may still be limited. In this work, we explore the few-shot setting of such a pre-trained generative sampler, which incorporates MD simulations in a tractable manner. Specifically, given a target protein of interest, we first acquire some seeding conformations from the pre-trained sampler, followed by a number of physical simulations run in parallel starting from these seeding samples. Then we fine-tune the generative model using the simulation trajectories above to become a target-specific sampler. Experimental results demonstrate the superior performance of such a few-shot conformation sampler at a tractable computational cost.

2024-03-04

ICLR.cc/2024/Workshop/GEM (poster)

openreview.net
