
Homayoun Honari

Collaborating researcher - Université de Montréal
Research Topics
AGI (Artificial General Intelligence)
Brain-inspired AI
Causality
Causality-Inspired Methods
Cognition
Consciousness
Generalization
Machine Learning Theory
Reasoning
Reinforcement Learning
Representation Learning
Robotics

Publications

Align and Filter: Improving Performance in Asynchronous On-Policy RL
Distributed training and increasing the gradient update frequency are practical strategies to accelerate learning and improve performance, but both exacerbate a central challenge: policy lag, the mismatch between the behavior policy generating the data and the learning policy being updated. Policy lag can hinder the scaling of on-policy learning algorithms to larger problems. In this paper, we identify the sources of policy lag caused by distributed learning and high update frequency. We use these findings to propose total Variation-based Advantage-aligned Constrained policy Optimization as a practical approach to mitigating policy lag. We empirically validate our method and show that it offers better robustness to policy lag on classic RL tasks and on a modern RL-for-LLM math reasoning task.
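To make the notion of policy lag concrete, here is a minimal sketch (not the paper's method) of measuring lag as the total variation distance between the behavior policy that generated a batch and the current learning policy, and filtering stale samples before an update. The function names, the `max_lag` threshold, and the batch layout are illustrative assumptions.

```python
import torch

def total_variation_lag(behavior_probs: torch.Tensor,
                        learner_probs: torch.Tensor) -> torch.Tensor:
    """Per-state total variation distance between the behavior policy
    (which generated the data) and the current learning policy.

    behavior_probs, learner_probs: [batch, num_actions] action distributions.
    Returns: [batch] total variation distances in [0, 1].
    """
    return 0.5 * (behavior_probs - learner_probs).abs().sum(dim=-1)

def filter_stale_samples(batch: dict, behavior_probs: torch.Tensor,
                         learner_probs: torch.Tensor, max_lag: float = 0.2):
    """Drop samples whose policy lag exceeds a threshold before the update.

    `batch` is any dict of [batch, ...] tensors (observations, actions,
    advantages, ...); `max_lag` is an illustrative cutoff, not a value
    from the paper.
    """
    lag = total_variation_lag(behavior_probs, learner_probs)
    keep = lag <= max_lag
    return {k: v[keep] for k, v in batch.items()}, lag
```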
Training PPO-Clip with Parallelized Data Generation: A Case of Fixed-Point Convergence
In recent years, with the increase in GPU compute power, parallelized data collection has become the dominant approach for training reinforcement learning (RL) agents. Proximal Policy Optimization (PPO) is one of the most widely used on-policy methods for training RL agents. In this paper, we focus on the training behavior of PPO-Clip as the number of parallel environments grows. In particular, we show that as we increase the amount of data used to train PPO-Clip, the optimized policy converges to a fixed distribution. We use this result to study the behavior of PPO-Clip in two case studies: the effect of changing the minibatch size, and the effect of increasing the number of parallel environments versus increasing the rollout length. The experiments show that high-return PPO runs exhibit slower convergence to the fixed distribution and larger consecutive KL divergence changes. Our results aim to offer a better understanding of how PPO's performance can be predicted as the number of parallel environments scales.
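For reference, below is a minimal sketch of the standard PPO-Clip surrogate loss applied to a batch collected from parallel environments, the setting this abstract studies. The function name and the default `clip_eps` value are illustrative assumptions, not details taken from the paper.

```python
import torch

def ppo_clip_loss(log_probs_new: torch.Tensor,
                  log_probs_old: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_eps: float = 0.2) -> torch.Tensor:
    """Standard PPO-Clip surrogate loss over a batch of transitions
    (batch size = num_parallel_envs * rollout_length).

    log_probs_new:  log pi_theta(a|s) under the current learning policy.
    log_probs_old:  log pi_old(a|s) under the policy that collected the data.
    advantages:     advantage estimates for the sampled actions.
    """
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Maximize the clipped surrogate, i.e. minimize its negation.
    return -torch.min(unclipped, clipped).mean()
```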