Ivan Anokhin

PhD - UdeM

Principal supervisor

Irina Rish

Co-supervisor

Samira Ebrahimi Kahou

Research topics

Reinforcement learning

Google Scholar

GitHub

Blog posts

Two robots in a kitchen, preparing dinner. One is chopping vegetables and the other is making an omelette.

June 20, 2025

Real-Time Reinforcement Learning

by

Ivan Anokhin

Matthew Riemer

Rishav Rishav

Gopeshh Subbaraj

Glen Berseth

Read the article

Publications

Learning From the Past with Cascading Eligibility Traces

Tokiniaina Raharison Ralambomihanta

Ivan Anokhin

Roman Pogodin

Samira Ebrahimi Kahou

Jonathan Cornford

Blake A. Richards

2025-06-16

arXiv (preprint)

doi.org

openreview.net

AIF-GEN: Open-Source Platform and Synthetic Dataset Suite for Reinforcement Learning on Large Language Models

Jacob Chmura

Shahrad Mohammadzadeh

Taz Scott-Talib

Nishanth Anand

2025-06-08

CODEML @ International Conference on Machine Learning (published)

openreview.net

Handling Delay in Reinforcement Learning Caused by Parallel Computations of Neurons

Ivan Anokhin

Rishav

Stephen Chung

Irina Rish

S Ebrahimi Kahou

Biological neural networks operate in parallel, a feature that sets them apart from artificial neural networks and can significantly enhance inference speed. However, this parallelism introduces challenges: when each neuron operates asynchronously with a fixed execution time, an

2024-06-18

ICML.cc/2024/Workshop/ARLET (poster)

openreview.net

Thinker: Learning to Plan and Act

Stephen Chung

Ivan Anokhin

David Krueger

We propose the Thinker algorithm, a novel approach that enables reinforcement learning agents to autonomously interact with and utilize a learned world model. The Thinker algorithm wraps the environment with a world model and introduces new actions designed for interacting with the world model. These model-interaction actions enable agents to perform planning by proposing alternative plans to the world model before selecting a final action to execute in the environment. This approach eliminates the need for handcrafted planning algorithms by enabling the agent to learn how to plan autonomously and allows for easy interpretation of the agent's plan with visualization. We demonstrate the algorithm's effectiveness through experimental results in the game of Sokoban and the Atari 2600 benchmark, where the Thinker algorithm achieves state-of-the-art performance and competitive results, respectively. Visualizations of agents trained with the Thinker algorithm demonstrate that they have learned to plan effectively with the world model to select better actions. Thinker is the first work showing that an RL agent can learn to plan with a learned world model in complex environments.

2023-09-20

NeurIPS.cc/2023/Conference (poster)

doi.org

openreview.net
