Hafez Ghaemi

PhD - Université de Montréal

Supervisor

Eilif B. Muller

Co-supervisor

Shahab Bakhtiari

Research Topics

Computer Vision

Deep Learning

Reinforcement Learning

Representation Learning

Website

Google Scholar

GitHub

Publications

Context-Aware World Models for Task-Agnostic Control

Busra Tugce Gurbuz

Hafez Ghaemi

Christopher C. Pack

Shahab Bakhtiari

Eilif Benjamin Muller

2025-09-22

NeurIPS.cc/2025/Workshop/UniReps (published)

openreview.net

Self-Supervised Learning from Structural Invariance

2025-09-22

UniReps @ Neural Information Processing Systems (oral)

doi.org

openreview.net

seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models

Hafez Ghaemi

Eilif B. Muller

Shahab Bakhtiari

Current self-supervised algorithms commonly rely on transformations such as data augmentation and masking to learn visual representations. This is achieved by enforcing invariance or equivariance with respect to these transformations after encoding two views of an image. This dominant two-view paradigm often limits the flexibility of learned representations for downstream adaptation by creating performance trade-offs between high-level invariance-demanding tasks such as image classification and more fine-grained equivariance-related tasks. In this work, we propose seq-JEPA, a world modeling framework that introduces architectural inductive biases into joint-embedding predictive architectures to resolve this trade-off. Without relying on dual equivariance predictors or loss terms, seq-JEPA simultaneously learns two architecturally segregated representations: one equivariant to specified transformations and another invariant to them. To do so, our model processes short sequences of different views (observations) of inputs. Each encoded view is concatenated with an embedding of the relative transformation (action) that produces the next observation in the sequence. These view-action pairs are passed through a transformer encoder that outputs an aggregate representation. A predictor head then conditions this aggregate representation on the upcoming action to predict the representation of the next observation. Empirically, seq-JEPA demonstrates strong performance on both equivariant and invariant benchmarks without sacrificing one for the other. Furthermore, it excels at tasks that inherently require aggregating a sequence of observations, such as path integration across actions and predictive learning across eye movements.
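
The data flow described in this abstract can be illustrated with a rough, dependency-free sketch. It is not the authors' implementation: the random linear maps stand in for trained networks, a single attention-pooling step stands in for the transformer encoder, and every name and dimension below (`linear`, `aggregate`, `predict_next`, `VIEW_DIM`, and so on) is a hypothetical choice made for illustration. How the final view in a sequence is paired with an action is not stated in the abstract, so the sketch simply takes the view-action pairs as given.

```python
import math
import random


def linear(rng, n_in, n_out):
    """Random weight matrix (n_out x n_in) standing in for a trained layer."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]


def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(tokens):
    """Toy stand-in for the transformer encoder (assumption): attention
    pooling over the tokens, using the mean token as the query."""
    dim = len(tokens[0])
    query = [sum(t[i] for t in tokens) / len(tokens) for i in range(dim)]
    scores = [sum(q * k for q, k in zip(query, t)) / math.sqrt(dim) for t in tokens]
    weights = softmax(scores)
    return [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]


def predict_next(pairs, next_action, params):
    """Predict the representation of the next observation.

    pairs: list of (view, action) vectors; next_action: the upcoming action.
    """
    # Encode each view and concatenate it with its action embedding.
    tokens = [
        matvec(params["encoder"], view) + matvec(params["action"], action)
        for view, action in pairs
    ]
    # Aggregate the view-action tokens into one representation.
    aggregate_repr = aggregate(tokens)
    # Condition the aggregate on the upcoming action, then predict.
    conditioned = aggregate_repr + matvec(params["action"], next_action)
    return matvec(params["predictor"], conditioned)


# Hypothetical sizes chosen only so the shapes line up.
VIEW_DIM, ACTION_DIM, REPR_DIM, ACTION_EMB_DIM = 8, 2, 4, 3

rng = random.Random(0)
params = {
    "encoder": linear(rng, VIEW_DIM, REPR_DIM),
    "action": linear(rng, ACTION_DIM, ACTION_EMB_DIM),
    "predictor": linear(rng, REPR_DIM + 2 * ACTION_EMB_DIM, REPR_DIM),
}

# Three observed views, each paired with a relative-transformation vector.
pairs = [
    ([rng.random() for _ in range(VIEW_DIM)], [rng.random() for _ in range(ACTION_DIM)])
    for _ in range(3)
]
next_action = [rng.random() for _ in range(ACTION_DIM)]

prediction = predict_next(pairs, next_action, params)
print(len(prediction))
```

In training, the prediction would be compared against the encoder's representation of the actual next view; the loss and any target-network details are left out here because the abstract does not specify them.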

2025-09-17

NeurIPS.cc/2025/Conference (poster)

doi.org

openreview.net
