Éric Thibodeau-Laufer

Alumni

Publications

Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs