Publications

Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs

Nicolas Le Roux

Marc Gendron-Bellemare

Jonathan Lebensold

Arnaud Bergeron

Joshua Greaves

Alex Fréchette

Carolyne Pelletier

Eric Thibodeau-Laufer

Sándor Toth

Sam Work

2025-03-18

ArXiv (preprint)

arxiv.org

doi.org

Meta-learning Optimizers for Communication-Efficient Learning

Charles-Étienne Joseph

Benjamin Thérien

Abhinav Moudgil

Boris Knyazev

Eugene Belilovsky

2025-03-17

TMLR (accepted)

openreview.net

Sparse Decomposition of Graph Neural Networks

Yaochen Hu

Mai Zeng

Ge Zhang

Pavel Rumiantsev

Liheng Ma

Yingxue Zhang

Mark Coates

2025-03-17

TMLR (accepted)

doi.org

openreview.net

Negotiative Alignment: Embracing Disagreement to Achieve Fairer Outcomes -- Insights from Urban Studies

Rashid A. Mushkani

Hugo Berard

Shin (Alexandre) Koseki

2025-03-16

ArXiv (preprint)

arxiv.org

Sample Compression for Continual Learning

Jacob Comeau

Mathieu Bazinet

Pascal Germain

Cem Subakan

2025-03-13

ArXiv (preprint)

doi.org

arxiv.org

Exploiting Instruction-Following Retrievers for Malicious Information Retrieval

Parishad BehnamGhader

Nicholas Meade

Siva Reddy

2025-03-11

ArXiv (preprint)

arxiv.org

Learning Decision Trees as Amortized Structure Inference

Mohammed Mahfoud

Ghait Boukachab

Michał Koziarski

Alex Hernandez-Garcia

Stefan Bauer

Yoshua Bengio

Nikolay Malkin

2025-03-10

ArXiv (preprint)

arxiv.org

Relative biological effectiveness of 31 meV thermal neutrons in peripheral blood lymphocytes

Laura C Paterson

Fawaz Ali

Mohsen Naseri

David Perez Loureiro

Amy Festarini

Marilyne Stuart

Chad Boyer

Ronald Rogge

Christie Costello

Norma Ybarra

John Kildea

Richard B Richardson

2025-03-10

Radiation Protection Dosimetry (published)

doi.org

SemEval-2025 Task 11: Bridging the Gap in Text-Based Emotion Detection

Shamsuddeen Hassan Muhammad

Nedjma Ousidhoum

Idris Abdulmumin

Seid Muhie Yimam

Jan Philip Wahle

Terry Lima Ruas

Meriem Beloucif

Christine de Kock

Tadesse Belay

Ibrahim Ahmad

Nirmal Surange

Daniela Teodorescu

David Ifeoluwa Adelani

Alham Fikri Aji

Felermino Ali

Vladimir Araujo

Abinew Ayele

Oana Ignat

Alexander Panchenko

Yi Zhou … (see 1 more)

Saif M. Mohammad

2025-03-10

ArXiv (preprint)

arxiv.org

Understanding the impact of IoT security patterns on CPU usage and energy consumption: a dynamic approach for selecting patterns with deep reinforcement learning

Saeid Jamshidi

Amin Nikanjam

Kawser Wazed Nafi

Foutse Khomh

2025-03-10

International Journal of Information Security (published)

doi.org
