Faisal Mohamed

Curiosity-Driven Exploration via Temporal Contrastive Learning

Catherine Ji

Benjamin Eysenbach

Exploration remains a key challenge in reinforcement learning (RL), especially in long-horizon tasks and environments with high-dimensional observations. A common strategy for effective exploration is to promote state coverage or novelty, which often involves estimating the agent's state visitation distribution. In this paper, we propose Curiosity-Driven Exploration via Temporal Contrastive Learning (C-TeC), an exploration method based on temporal contrastive learning that rewards agents for reaching states with unexpected futures. This incentivizes uncovering meaningful, less-visited states. C-TeC is simple and does not require explicit density or uncertainty estimation, while learning representations aligned with the RL objective. It consistently outperforms standard baselines in complex mazes using different embodiments (Ant and Humanoid) and robotic manipulation tasks, while also yielding more diverse behaviors in Craftax without requiring task-specific information.

2025-07-01

rl-conference.cc/RLC/2025/Workshop/RLBrew (published)

openreview.net

Curiosity-Driven Exploration via Temporal Contrastive Learning

Faisal Mohamed

Catherine Ji

Benjamin Eysenbach

Glen Berseth

Effective exploration in reinforcement learning requires keeping track not just of where the agent has been, but also of how the agent thinks about and represents the world: an agent should explore states that enable it to learn powerful representations. Temporal representations can include the information required to solve any potential task while avoiding the computational cost of reconstruction. In this paper, we propose an exploration method that uses temporal contrastive representations to drive exploration, maximizing coverage as seen through the lens of these temporal representations. We demonstrate complex exploration behaviors in locomotion, manipulation, and embodied-AI tasks, revealing previously unknown capabilities and behaviors once achievable only via extrinsic rewards.

2025-07-01

rl-conference.cc/RLC/2025/Workshop/RLBrew (published)

openreview.net

Learning Robust Representations for Transfer in Reinforcement Learning

Faisal Mohamed

Roger Creus Castanyer

Hongyao Tang

Zahra Sheikhbahaee

Glen Berseth

Learning transferable representations for deep reinforcement learning (RL) is a challenging problem due to the inherent non-stationarity, distribution shift, and unstable training dynamics. To be useful, a transferable representation needs to be robust to such factors. In this work, we introduce a new architecture and training strategy for learning robust representations for transfer learning in RL. We propose leveraging multiple CNN encoders and training them not to specialize in areas of the state space but instead to match each other's representation. We find that learned representations transfer well across many Atari tasks, resulting in better transfer learning performance and data efficiency than training from scratch.

2024-10-10

NeurIPS.cc/2024/Workshop/FITML (poster)

openreview.net

Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning

Adriana Hugessen

Roger Creus Castanyer

Faisal Mohamed

Glen Berseth

Both entropy-minimizing and entropy-maximizing (curiosity) objectives for unsupervised reinforcement learning (RL) have been shown to be effective in different environments, depending on the environment's level of natural entropy. However, neither method alone results in an agent that will consistently learn intelligent behavior across environments. In an effort to find a single entropy-based method that will encourage emergent behaviors in any environment, we propose an agent that can adapt its objective online, depending on the entropy conditions, by framing the choice as a multi-armed bandit problem. We devise a novel intrinsic feedback signal for the bandit, which captures the agent's ability to control the entropy in its environment. We demonstrate that such agents can learn to control entropy and exhibit emergent behaviors in both high- and low-entropy regimes and can learn skillful behaviors in benchmark tasks. Videos of the trained agents and summarized findings can be found on our project page: https://sites.google.com/view/surprise-adaptive-agents

2024-05-14

rl-conference.cc/RLC/2024/Conference (published)

doi.org

openreview.net
