Sarath Chandar

Biography

Sarath Chandar is an associate professor at Polytechnique Montreal's Department of Computer and Software Engineering, where he leads the Chandar Research Lab. He is also a Core Academic Member at Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning.

Chandar’s research interests include lifelong learning, deep learning, optimization, reinforcement learning and natural language processing. To promote research in lifelong learning, Chandar created the Conference on Lifelong Learning Agents (CoLLAs) in 2022, for which he served as program chair in 2022 and 2023.

He has a PhD from Université de Montréal and an MSc (By Research) from the Indian Institute of Technology Madras.

Current Students

Ista Abbes

Master's Research - Université de Montréal

Alex Aselstyne

Research Intern - Polytechnique Montréal

Davide Baldelli

PhD - Polytechnique Montréal

Co-supervisor :

joe Ben

Research Intern - Polytechnique Montréal

joumenbensaid@gmail.com

Milan Bhan

Collaborating researcher

Antoine Clavaud

Master's Research - Polytechnique Montréal

Naga Karthik Enamundram

PhD - Polytechnique Montréal

Principal supervisor :

Julien Cohen-Adad

emvnagakarthik@gmail.com

Prashant Govindarajan

PhD - Polytechnique Montréal

Simon Guiroy

PhD - Université de Montréal

Principal supervisor :

Collaborating researcher - Université de Montréal

Principal supervisor :

Liam Paull

Maryam Hashemzadeh

PhD - Université de Montréal

David Heurtel--Depeiges

PhD - Polytechnique Montréal

Amir Ardalan Kalantari Dehaghi

Jerry Huang

PhD - Université de Montréal

Collaborating Alumni

Lola Le Breton

Master's Research - Polytechnique Montréal

Ekaterina Lobacheva

Postdoctorate - Université de Montréal

PhD - Polytechnique Montréal

Roshan Munirathinam Sankaran Balaji

Mohamed Amine Merzouk

Postdoctorate - Polytechnique Montréal

Principal supervisor :

Research Intern - Polytechnique Montréal

Hadi NekoeiQachkanloo

PhD - Université de Montréal

Darshan Patil

PhD - Université de Montréal

Gabriele Prato

PhD - Université de Montréal

Postdoctorate

Independent visiting researcher

Mohammad R. Samsami

Master's Research - Université de Montréal

Master's Research - Polytechnique Montréal

Arjun Vaithilingam Sudhakar

Megh Thakkar

Master's Research - Université de Montréal

PhD - Polytechnique Montréal

Kowen Woo

Research Intern - Polytechnique Montréal

Abdelrahman Zayed

PhD - Polytechnique Montréal

Xutong Zhao

PhD - Polytechnique Montréal

Artem Zholus

PhD - Polytechnique Montréal

NeoBERT: A New Frontier for Open-Source Encoder Language Models

Blog Posts

A digital picture of Bert from Sesame street, wering black trench coat and sunglasses

March 3, 2025

Lola Le Breton

Quentin Fournier

Sarath Chandar

Read the article

October 1, 2024

How Do We Explain AI and Ensure the Explanation Is True? Faithfulness Measurable Models Tell You How

Andrea Madsen

Siva Reddy

Sarath Chandar

Read the article

Publications

V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

Mahmoud Assran

Adrien Bardes

David Fan

Quentin Garrido

Russell Howes

Mojtaba Komeili

Matthew J. Muckley

Ammar Rizvi

Claire Roberts

Koustuv Sinha

Artem Zholus

Sergio Arnaud

Abha Gejji

Ada Martin

Francois Robert Hogan

Daniel Dugas

Piotr Bojanowski

Vasil Khalidov

Patrick Labatut

Francisco Massa … (see 13 more)

Marc Szafraniec

K. Krishnakumar

Yong Li

Xiaodong Ma

Franziska Meier

Yann LeCun

Michael Rabbat

Nicolas Ballas

Fair at Meta

Mila - Québec

AI Institute

Polytechnique Montréal

A major challenge for modern AI is to learn to understand the world and learn to act largely by observation. This paper explores a self-supe… (see more)rvised approach that combines internet-scale video data with a small amount of interaction data (robot trajectories), to develop models capable of understanding, predicting, and planning in the physical world. We first pre-train an action-free joint-embedding-predictive architecture, V-JEPA 2, on a video and image dataset comprising over 1 million hours of internet video. V-JEPA 2 achieves strong performance on motion understanding (77.3 top-1 accuracy on Something-Something v2) and state-of-the-art performance on human action anticipation (39.7 recall-at-5 on Epic-Kitchens-100) surpassing previous task-specific models. Additionally, after aligning V-JEPA 2 with a large language model, we demonstrate state-of-the-art performance on multiple video question-answering tasks at the 8 billion parameter scale (e.g., 84.0 on PerceptionTest, 76.9 on TempCompass). Finally, we show how self-supervised learning can be applied to robotic planning tasks by post-training a latent action-conditioned world model, V-JEPA 2-AC, using less than 62 hours of unlabeled robot videos from the Droid dataset. We deploy V-JEPA 2-AC zero-shot on Franka arms in two different labs and enable picking and placing of objects using planning with image goals. Notably, this is achieved without collecting any data from the robots in these environments, and without any task-specific training or reward. This work demonstrates how self-supervised learning from web-scale data and a small amount of robot interaction data can yield a world model capable of planning in the physical world.

2025-06-11

ArXiv (preprint)

V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

Mahmoud Assran

Adrien Bardes

David Fan

Quentin Garrido

Russell Howes

Mojtaba Komeili

Matthew J. Muckley

Ammar Rizvi

Claire Roberts

Koustuv Sinha

Artem Zholus

Sergio Arnaud

Abha Gejji

Ada Martin

Francois Robert Hogan

Daniel Dugas

Piotr Bojanowski

Vasil Khalidov

Patrick Labatut

Francisco Massa … (see 13 more)

Marc Szafraniec

K. Krishnakumar

Yong Li

Xiaodong Ma

Franziska Meier

Yann LeCun

Michael Rabbat

Nicolas Ballas

Fair at Meta

Mila - Québec

AI Institute

Polytechnique Montréal

2025-06-11

ArXiv (preprint)

Did I Faithfully Say What I Thought? Bridging the Gap Between Neural Activity and Self-Explanations in Large Language Models

Milan Bhan

Jean-Noël Vittaut

Nicolas Chesneau

Marie-Jeanne Lesot

Large Language Models (LLM) have demonstrated the capability of generating free text self Natural Language Explanation (self-NLE) to justify… (see more) their answers. Despite their logical appearance, self-NLE do not necessarily reflect the LLM actual decision-making process, making such explanations unfaithful. While existing methods for measuring self-NLE faithfulness mostly rely on behavioral tests or computational block identification, none of them examines the neural activity underlying the model's reasoning. This work introduces a novel flexible framework for quantitatively measuring the faithfulness of LLM-generated self-NLE by directly comparing the latter with interpretations of the model's internal hidden states. The proposed framework is versatile and provides deep insights into self-NLE faithfulness by establishing a direct connection between self-NLE and model reasoning. This approach advances the understanding of self-NLE faithfulness and provides building blocks for generating more faithful self-NLE.

2025-06-10

ArXiv (preprint)

Revisiting the Goldilocks Zone in Inhomogeneous Networks

Zacharie Garnier Cuchet

Ekaterina Lobacheva

We investigate how architectural inhomogeneities—such as biases, layer normalization, and residual connections—affect the curvature of t… (see more)he loss landscape at initialization and its link to trainability. We focus on the Goldilocks zone, a region in parameter space with excess positive curvature, previously associated with improved optimization in homogeneous networks. To extend this analysis, we compare two scaling strategies: weight scaling and softmax temperature scaling. Our results show that in networks with biases or residual connections, both strategies identify a Goldilocks zone aligned with better training. In contrast, layer normalization leads to lower or negative curvature, yet stable optimization—revealing a disconnect between curvature and trainability. Softmax temperature scaling behaves more consistently across models, making it a more robust probe. Overall, the Goldilocks zone remains relevant in inhomogeneous networks, but its geometry and predictive power depend on architectural choices, particularly normalization.

2025-06-09

ICML.cc/2025/Workshop/HiLD (poster)

openreview.net

Boosting LLM Reasoning via Spontaneous Self-Correction

Xutong Zhao

Tengyu Xu

Xuewei Wang

Zhengxing Chen

Di Jin

Liang Tan

Zishun Yu

Zhuokai Zhao

Yun He

Sinong Wang

Si-Yuan Wang

Han Fang

Chen Zhu

MetaAI

Mila - Québec

AI Institute

Polytechnique Montréal

While large language models (LLMs) have demonstrated remarkable success on a broad range of tasks, math reasoning remains a challenging one.… (see more) One of the approaches for improving math reasoning is self-correction, which designs self-improving loops to let the model correct its own mistakes. However, existing self-correction approaches treat corrections as standalone post-generation refinements, relying on extra prompt and system designs to elicit self-corrections, instead of performing real-time, spontaneous self-corrections in a single pass. To address this, we propose SPOC, a spontaneous self-correction approach that enables LLMs to generate interleaved solutions and verifications in a single inference pass, with generation dynamically terminated based on verification outcomes, thereby effectively scaling inference time compute. SPOC considers a multi-agent perspective by assigning dual roles -- solution proposer and verifier -- to the same model. We adopt a simple yet effective approach to generate synthetic data for fine-tuning, enabling the model to develop capabilities for self-verification and multi-agent collaboration. We further improve its solution proposal and verification accuracy through online reinforcement learning. Experiments on mathematical reasoning benchmarks show that SPOC significantly improves performance. Notably, SPOC boosts the accuracy of Llama-3.1-8B and 70B Instruct models, achieving gains of 8.8% and 11.6% on MATH500, 10.0% and 20.0% on AMC23, and 3.3% and 6.7% on AIME24, respectively.

2025-06-07

ArXiv (preprint)

Boosting LLM Reasoning via Spontaneous Self-Correction

Xutong Zhao

Tengyu Xu

Xuewei Wang

Zhengxing Chen

Di Jin

Liang Tan

Zishun Yu

Zhuokai Zhao

Yun He

Sinong Wang

Han Fang

Chen Zhu

MetaAI

Mila - Québec

AI Institute

Polytechnique Montréal

2025-06-07

ArXiv (preprint)

Monitoring morphometric drift in lifelong learning segmentation of the spinal cord

Enamundram Naga Karthik

Sandrine B'edard

Jan Valovsek

Christoph Aigner

Elise Bannier

Josef Bednavr'ik

Virginie Callot

Anna Combes

Armin Curt

Gergely David

Falk Eippert

Lynn Farner

M. G. Fehlings

Patrick Freund

Tobias Granberg

Cristina Granziera

Rhscir Network Imaging Group

Ulrike Horn

Tom'avs Hor'ak

Suzanne Humphreys … (see 36 more)

Markus Hupp

Anne Kerbrat

Nawal Kinany

Shannon Kolind

Petr Kudlivcka

Anna Lebret

Lisa Eunyoung Lee

Caterina Mainero

Allan R. Martin

Megan McGrath

Govind Nair

Kristin P. O’Grady

Jiwon Oh

Russell Ouellette

Nikolai Pfender

Dario Pfyffer

P. Pradat

Alexandre Prat

Emanuele Pravatà

D. S. Reich

Ilaria Ricchi

Naama Rotem-Kohavi

Simon Schading-Sassenhausen

Maryam Seif

Andrew C. Smith

Seth Aaron Smith

Grace Sweeney

Roger Tam

Anthony Traboulsee

Constantina A. Treaba

Charidimos Tsagkas

Zachary Vavasour

Dimitri Van De Ville

Kenneth A. Weber

Julien Cohen-Adad

2025-05-02

ArXiv (preprint)

Monitoring morphometric drift in lifelong learning segmentation of the spinal cord

Enamundram Naga Karthik

Sandrine B'edard

Jan Valošek

Christoph Aigner

Elise Bannier

Josef Bednavr'ik

Virginie Callot

Anna Combes

Armin Curt

Gergely David

Falk Eippert

Lynn Farner

M. G. Fehlings

Patrick Freund

Tobias Granberg

Cristina Granziera

Rhscir Network Imaging Group

Ulrike Horn

Tom'avs Hor'ak

Suzanne Humphreys … (see 36 more)

Markus Hupp

Anne Kerbrat

Nawal Kinany

Shannon Kolind

Petr Kudlivcka

Anna Lebret

Lisa Eunyoung Lee

Caterina Mainero

Allan R. Martin

Megan McGrath

Govind Nair

Kristin P. O’Grady

Jiwon Oh

Russell Ouellette

Nikolai Pfender

Dario Pfyffer

P. Pradat

Alexandre Prat

Emanuele Pravatà

D. S. Reich

Ilaria Ricchi

Naama Rotem-Kohavi

Simon Schading-Sassenhausen

Maryam Seif

Andrew C. Smith

Seth Aaron Smith

Grace Sweeney

Roger Tam

Anthony Traboulsee

Constantina A. Treaba

Charidimos Tsagkas

Zachary Vavasour

Dimitri Van De Ville

Kenneth A. Weber

Julien Cohen-Adad

2025-05-01

arXiv (published)

doi.org

BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning

Artem Zholus

Maksim Kuznetsov

Roman Schutski

Shayakhmetov Rim

Daniil Polykovskiy

Alex Zhavoronkov

Generating novel active molecules for a given protein is an extremely challenging task for generative models that requires an understanding … (see more)of the complex physical interactions between the molecule and its environment. In this paper, we present a novel generative model, BindGPT which uses a conceptually simple but powerful approach to create 3D molecules within the protein's binding site. Our model produces molecular graphs and conformations jointly, eliminating the need for an extra graph reconstruction step. We pretrain BindGPT on a large-scale dataset and fine-tune it with reinforcement learning using scores from external simulation software. We demonstrate how a single pretrained language model can serve at the same time as a 3D molecular generative model, conformer generator conditioned on the molecular graph, and a pocket-conditioned 3D molecule generator. Notably, the model does not make any representational equivariance assumptions about the domain of generation. We show how such simple conceptual approach combined with pretraining and scaling can perform on par or better than the current best specialized diffusion models, language models, and graph neural networks while being two orders of magnitude cheaper to sample.

2025-04-11

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

TAPNext: Tracking Any Point (TAP) as Next Token Prediction

Artem Zholus

Carl Doersch

Yi Yang

Skanda Koppula

Viorica Patraucean

Xu Owen He

Ignacio Rocco

Mehdi S. M. Sajjadi

Ross Goroshin

2025-04-08

ArXiv (preprint)

TAPNext: Tracking Any Point (TAP) as Next Token Prediction

Artem Zholus

Carl Doersch

Yi Yang

Skanda Koppula

Viorica Patraucean

Xu Owen He

Ignacio Rocco

Mehdi S. M. Sajjadi

Ross Goroshin

2025-04-01

arXiv (published)

doi.org

CrystalGym: A New Benchmark for Materials Discovery Using Reinforcement Learning

Prashant Govindarajan

Mathieu Reymond

Antoine Clavaud

Mariano Phielipp

Santiago Miret