Carolyne Pelletier

Alumni

Publications

Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs