Carolyne Pelletier

Alumni

Publications

Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs