Anirudh Goyal

A. Hartshorn

Aobo Yang

Archi Mitra

Archie Sravankumar

Artem Korenev

Arthur Hinsvark

Arun Rao

Aston Zhang

Aurelien Ro-driguez

Austen Gregerson

Ava Spataru

Baptiste Rozière

Bethany Biron

Binh Tang

Bobbie Chern

Charlotte Caucheteux

Chaya Nayak

Chloe Bi

Chris Marra

Chris McConnell

Christian Keller

Christophe Touret

Chunyang Wu

Corinne Wong

Cris-tian Cantón Ferrer

Cyrus Nikolaidis

Damien Al-lonsius

Daniel Song

Danielle Pintz

Danny Livshits

Danny Wyatt

David Esiobu

Dhruv Choudhary

Dhruv Mahajan 0001

Diego Garcia-Olano

Diego Perino

Dieuwke Hupkes

Egor Lakomkin

Ehab A. AlBadawy

Elina Lobanova

Emily Dinan

Eric Michael Smith

Filip Radenovic

Francisco Guzmán

Frank Zhang

Gabriele Synnaeve

Gabrielle Lee

Georgia Lewis

G. Thattai

Graeme Nail

Gregoire Mi-alon

Guan Pang

Guillem Cucurell

Hailey Nguyen

Han-nah Korevaar

Hu Xu

Hugo Touvron

Imanol Iliyan Zarov

Arrieta Ibarra

Is-abel Kloumann

Ishan Misra

Ivan Evtimov

Jack Zhang

Jade Copet

Jaewon Lee

Jan Geffert

Jana Vranes

Jason Park

Jay Mahadeokar

Jeet Shah

Jelmer van der Linde

Jennifer Billock

Jenny Hong

Jenya Lee

Jeremy Fu

J. Fu

Jianfeng Chi

Jianyu Huang

Jiawen Liu

Jie Wang

Jiecao Yu

Joanna Bitton

Joe Spisak

Jongsoo Park

Joseph Rocca

J. Johnstun

Joshua Saxe

Junteng Jia

Kalyan Vasuden Alwala

Karthik Prasad

Kartikeya Upasani

Kate Plawiak

Keqian Li

Kenneth Heafield

Kevin R. Stone

Khalid El-Arini

Krithika Iyer

Kshitiz Malik

Kuen-ley Chiu

Kunal Bhalla

Kushal Lakhotia

Lauren Rantala-Yeary

Laurens van der Maaten

Lawrence Chen

Liang Tan

Liz Jenkins

Louis Martin

Lovish Madaan

Lubo Malo

Lukas Blecher

Lukas Landzaat

Luke de Oliveira

Madeline Muzzi

Mahesh Pasupuleti

Mannat Singh

Manohar Paluri

Marcin Kardas

Maria Tsimpoukelli

Mathew Oldham

Mathieu Rita

Maya Pavlova

Melanie Kam-badur

Mike Lewis

Mitesh Min Si

Kumar Singh

Mona Hassan

Naman Goyal

Narjes Torabi

Niko-lay Bashlykov

Nikolay Bogoychev

Niladri S. Chatterji

Ning Zhang

Olivier Duchenne

Onur Çelebi

Patrick Alrassy

Petar Pengwei Li

Peter Weng

Prajjwal Bhargava

Pratik Dubal

Punit Praveen Krishnan

Singh Koura

Puxin Xu

Qing He

Qingxiao Dong

Ragavan Srinivasan

Raj Ganapathy

Ramon Calderer

Ricardo Silveira Cabral

Robert Stojnic

Roberta Raileanu

Rohan Maheswari

Rohit Girdhar

Rohit Patel

Ro-main Sauvestre

Ron-nie Polidoro

Roshan Sumbaly

Ross Taylor

Ruan Silva

Rui Hou

Rui Wang

S. Hosseini

Sa-hana Chennabasappa

Sanjay Singh

Sean Bell

Seo-hyun Sonia Kim

Sergey Edunov

Shaoliang Nie

Sharan Narang

Sharath Chandra Raparthy

Sheng Shen

Shengye Wan

Shruti Bhosale

Shun Zhang

Simon Van-denhende

Soumya Batra

Spencer Whitman

Sten Sootla

Stephane Collot

Suchin Gururangan

S. Borodinsky

Tamar Herman

Tara Fowler

Tarek Sheasha

Thomas Georgiou

Thomas Scialom

Tobias Speckbacher

Todor Mihaylov

Tong Xiao

Ujjwal Karn

Vedanuj Goswami

Vibhor Gupta

Vignesh Ramanathan

Viktor Kerkez

Vincent Gonguet

Vir-ginie Do

Vish Vogeti

Vitor Albiero

Vladan Petro-vic

Weiwei Chu

Wenhan Xiong

Wenyin Fu

2025-05-27

ArXiv (preprint)

arxiv.org

On the Transfer of Object-Centric Representation Learning.

Aniket Rajiv Didolkar

Andrii Zadaianchuk

Michael Curtis Mozer

Georg Martius

Maximilian Seitzer

2025-01-21

ICLR.cc/2025/Conference (poster)

Object-Centric Temporal Consistency via Conditional Autoregressive Inductive Biases

Cristian Meo

Akihiro Nakano

Mircea Tudor Lică

Aniket Rajiv Didolkar

Masahiro Suzuki

Mengmi Zhang

Justin Dauwels

Yutaka Matsuo

2024-10-09

NeurIPS.cc/2024/Workshop/Compositional_Learning (poster)

AI-Assisted Generation of Difficult Math Questions

Vedant Shah

Dingli Yu

Kaifeng Lyu

Simon Park

Nan Rosemary Ke

Jiatong Yu

Yinghui He

Michael Curtis Mozer

James Lloyd McClelland

Sanjeev Arora

Current LLM training positions mathematical reasoning as a core capability. With publicly available sources fully tapped, there is unmet dem… (see more)and for diverse and challenging math questions. Relying solely on human experts is both time-consuming and costly, while LLM-generated questions often lack the requisite diversity and difficulty. We present a design framework that combines the strengths of LLMs with a human-in-the-loop approach to generate a diverse array of challenging math questions. We leverage LLM metacognition skills [Didolkar et al., 2024] of a strong LLM to extract core"skills"from existing math datasets. These skills serve as the basis for generating novel and difficult questions by prompting the LLM with random pairs of core skills. The use of two different skills within each question makes finding such questions an"out of distribution"task for both LLMs and humans. Our pipeline employs LLMs to iteratively generate and refine questions and solutions through multiturn prompting. Human annotators then verify and further refine the questions, with their efficiency enhanced via further LLM interactions. Applying this pipeline on skills extracted from the MATH dataset [Hendrycks et al., 2021] resulted in MATH

2024-10-08

NeurIPS.cc/2024/Workshop/MATH-AI (accepted)

Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving

Aniket Didolkar

Nan Rosemary Ke

Siyuan Guo

Michal Valko

Timothy Lillicrap

Danilo Rezende

Michael Mozer

Sanjeev Arora

Metacognitive knowledge refers to humans' intuitive knowledge of their own thinking and reasoning processes. Today's best LLMs clearly posse… (see more)ss some reasoning processes. The paper gives evidence that they also have metacognitive knowledge, including ability to name skills and procedures to apply given a task. We explore this primarily in context of math reasoning, developing a prompt-guided interaction procedure to get a powerful LLM to assign sensible skill labels to math questions, followed by having it perform semantic clustering to obtain coarser families of skill labels. These coarse skill labels look interpretable to humans. To validate that these skill labels are meaningful and relevant to the LLM's reasoning processes we perform the following experiments. (a) We ask GPT-4 to assign skill labels to training questions in math datasets GSM8K and MATH. (b) When using an LLM to solve the test questions, we present it with the full list of skill labels and ask it to identify the skill needed. Then it is presented with randomly selected exemplar solved questions associated with that skill label. This improves accuracy on GSM8k and MATH for several strong LLMs, including code-assisted models. The methodology presented is domain-agnostic, even though this article applies it to math problems.

2024-09-24

NeurIPS.cc/2024/Conference (poster)

Zero-Shot Object-Centric Representation Learning

Aniket Rajiv Didolkar

Andrii Zadaianchuk

Michael Curtis Mozer

Georg Martius

Maximilian Seitzer

The goal of object-centric representation learning is to decompose visual scenes into a structured representation that isolates the entities… (see more). Recent successes have shown that object-centric representation learning can be scaled to real-world scenes by utilizing pre-trained self-supervised features. However, so far, object-centric methods have mostly been applied in-distribution, with models trained and evaluated on the same dataset. This is in contrast to the wider trend in machine learning towards general-purpose models directly applicable to unseen data and tasks. Thus, in this work, we study current object-centric methods through the lens of zero-shot generalization by introducing a benchmark comprising eight different synthetic and real-world datasets. We analyze the factors influencing zero-shot performance and find that training on diverse real-world images improves transferability to unseen scenarios. Furthermore, inspired by the success of task-specific fine-tuning in foundation models, we introduce a novel fine-tuning strategy to adapt pre-trained vision encoders for the task of object discovery. We find that the proposed approach results in state-of-the-art performance for unsupervised object discovery, exhibiting strong zero-shot transfer to unseen datasets.

2024-08-16

ArXiv (preprint)

arxiv.org

Cycle Consistency Driven Object Discovery

Aniket Didolkar

Developing deep learning models that effectively learn object-centric representations, akin to human cognition, remains a challenging task. … (see more)Existing approaches facilitate object discovery by representing objects as fixed-size vectors, called ``slots'' or ``object files''. While these approaches have shown promise in certain scenarios, they still exhibit certain limitations. First, they rely on architectural priors which can be unreliable and usually require meticulous engineering to identify the correct objects. Second, there has been a notable gap in investigating the practical utility of these representations in downstream tasks. To address the first limitation, we introduce a method that explicitly optimizes the constraint that each object in a scene should be associated with a distinct slot. We formalize this constraint by introducing consistency objectives which are cyclic in nature. By integrating these consistency objectives into various existing slot-based object-centric methods, we showcase substantial improvements in object-discovery performance. These enhancements consistently hold true across both synthetic and real-world scenes, underscoring the effectiveness and adaptability of the proposed approach. To tackle the second limitation, we apply the learned object-centric representations from the proposed method to two downstream reinforcement learning tasks, demonstrating considerable performance enhancements compared to conventional slot-based and monolithic representation learning methods. Our results suggest that the proposed approach not only improves object discovery, but also provides richer features for downstream tasks.

2024-01-15

ICLR.cc/2024/Conference (poster)

Gemini: A Family of Highly Capable Multimodal Models

Gemini Team Google Rohan Anil

Sebastian Borgeaud

Yonghui Wu

Jean-Baptiste Alayrac

Jiahui Yu

Radu Soricut

J. Schalkwyk

Andrew M. Dai

Anja Hauth

Katie Millican

David Silver

Slav Petrov

Melvin Johnson

Ioannis Antonoglou

Julian Schrittwieser

Amelia Glaese

Jilin Chen

Emily Pitler

Timothy P Lillicrap

Angeliki Lazaridou … (see 480 more)

Orhan Firat

James L. Molloy

Michael Acheson Isard

Paul R. Barham

Tom Hennigan

Benjamin Lee

Fabio Viola

Malcolm Reynolds

Yuanzhong Xu

Ryan Doherty

Eli Collins

Clemens Meyer

Eliza Rutherford

Erica Moreira

Kareem W. Ayoub

Megha Goel

George Tucker

Enrique Piqueras

M. Krikun

Iain Barr

Nikolay Savinov

Ivo Danihelka

Becca Roelofs

Anais White

Anders Johan Andreassen

Tamara von Glehn

Laksh-man Yagati

Mehran Kazemi

Lucas Gonzalez

Misha Khalman

Jakub Sygnowski

Alexandre Fréchette

Charlotte Smith

Laura Culp

Lev Proleev

Yi Luan

X. T. Chen

James Lottes

Nathan Schucher

Federico Lebron

Alban Rrustemi

Natalie Clay

Phil Crone

Tomas Kocisky

Jeffrey Zhao

Bartek Perz

Dian Yu

Heidi Howard

Adam E. Bloniarz

Jack W. Rae

Han Lu

Laurent Sifre

Marcello Maggioni

Fred Alcober

Dan Garrette

Megan Barnes

Shantanu Thakoor

Jacob Austin

Gabriel Barth-Maron

William Wong

Rishabh Joshi

Rahma Chaabouni

Deeni Fatiha

Arun Ahuja

Ruibo Liu

Yunxuan Li

Sarah Cogan

Jeremy Chen

Chao Jia

Chenjie Gu

Qiao Zhang

Jordan Grimstad

Ale Jakse Hartman

Martin J. Chadwick

Gaurav Singh Tomar

Xavier Garcia

Evan Senter

Emanuel Taropa

Thanumalayan Sankaranarayana Pillai

Jacob Devlin

Michael Laskin

Diego de Las Casas

Dasha Valter

Connie Tao

Lorenzo Blanco

Adrià Puigdomènech Badia

David Reitter

Mianna Chen

Jenny Brennan

Clara E. Rivera

Sergey Brin

Shariq Iqbal

Gabriela Surita

Jane Labanowski

Abhishek Rao

Stephanie Winkler

Emilio Parisotto

Yiming Gu

Kate Olszewska

Yujing Zhang

Ravichandra Addanki

Antoine Miech

Annie Louis

Laurent El Shafey

Denis Teplyashin

Geoff Brown

Elliot Catt

Nithya Attaluri

Jan Balaguer

Jackie Xiang

Pidong Wang

Zoe Ashwood

Anton Briukhov

Alex Webson

Sanjay Ganapathy

Smit Sanghavi

Ajay Kannan

Ming-Wei Chang

Axel Stjerngren

Josip Djolonga

Yuting Sun

Ankur Bapna

Matthew Aitchison

Pedram Pejman

Henryk Michalewski

Tianhe Yu

Cindy Wang

J Christopher Love

Junwhan Ahn

Dawn Bloxwich

Kehang Han

Peter Conway Humphreys

Thibault Sellam

James Bradbury

Varun Godbole

Sina Samangooei

Bogdan Damoc

Alex Kaskasoli

S'ebastien M. R. Arnold

Vijay Vasudevan

Shubham Agrawal

Jason Riesa

Dmitry Lepikhin

Richard Tanburn

Srivatsan Srinivasan

Hyeontaek Lim

Sarah Hodkinson

Pranav Shyam

Johan Ferret

Steven Hand

Ankush Garg

T. Paine

Jian Li

Yujia Li

Minh Giang

Alexander Neitz

Zaheer Abbas

Sarah York

Machel Reid

Elizabeth Cole

Aakanksha Chowdhery

Dipanjan Das

Dominika Rogozi'nska

Vitaly Nikolaev

Pablo G. Sprechmann

Zachary Nado

Lukáš Žilka

Flavien Prost

Luheng He

Marianne Monteiro

Gaurav Mishra

Christoper A. Welty

Joshua Newlan

Dawei Jia

Miltiadis Allamanis

Clara Huiyi Hu

Raoul de Liedekerke

Justin Gilmer

Carl Saroufim

Shruti Rijhwani

Shaobo Hou

Disha Shrivastava

Anirudh Baddepudi

Alex Goldin

Adnan Ozturel

Albin Cassirer

Yunhan Xu

Daniel Sohn

Devendra Singh Sachan

Reinald Kim Amplayo

Craig Swanson

Dessie Petrova

Shashi Narayan

Arthur Guez

Siddhartha Brahma

Jessica Landon

Miteyan Patel

Ruizhe Zhao

Kevin Villela

Luyu Wang

Wenhao Jia

Matthew Rahtz

Mai Gim'enez

Legg Yeung

Hanzhao Lin

James Keeling

Petko Georgiev

Diana Mincu

Boxi Wu

Salem Haykal

Rachel Saputro

Kiran N. Vodrahalli

James Qin

Zeynep Cankara

Abhanshu Sharma

Nicholas Fernando

Will Hawkins

Behnam Neyshabur

Solomon Kim

Adrian Hutter

Priyanka Agrawal

Alex Castro-Ros

George van den Driessche

Tao Wang

Fan Yang

Shuo-yiin Chang

Paul Komarek

Ross McIlroy

Mario Luvci'c

Guodong Zhang

Wael Farhan

Michael Sharman

Paul Natsev

Paul Michel

Yong Cheng

Yamini Bansal

Siyuan Qiao

Kris Cao

Siamak Shakeri

Christina Butterfield

Justin Chung

Paul Kishan Rubenstein

Shivani Agrawal

Arthur Mensch

Kedar Soparkar

Karel Lenc

Timothy Chung

Aedan Pope

Lorenzo Maggiore

Jackie Kay

Priya Jhakra

Shibo Wang

Joshua Maynez

Mary Phuong

Taylor Tobin

Andrea Tacchetti

Maja Trebacz

Kevin Robinson

Yash Katariya

Sebastian Riedel

Paige Bailey

Kefan Xiao

Nimesh Ghelani

Lora Aroyo

Ambrose Slone

Neil Houlsby

Xuehan Xiong

Zhen Yang

Elena Gribovskaya

Jonas Adler

Mateo Wirth

Lisa Lee

Music Li

Thais Kagohara

Jay Pavagadhi

Sophie Bridgers

Anna Bortsova

Sanjay Ghemawat

Zafarali Ahmed

Tianqi Liu

Richard Powell

Vijay Bolina

Mariko Iinuma

Polina Zablotskaia

James Besley

Da-Woon Chung

Timothy Dozat

Ramona Comanescu

Xiance Si

Jeremy Greer

Guolong Su

M. Polacek

Raphael Lopez Kaufman

Simon Tokumine

Hexiang Hu

Elena Buchatskaya

Yingjie Miao

Mohamed Elhawaty

Aditya Siddhant

Nenad Tomasev

Jinwei Xing

Christina Greer

Helen Miller

Shereen Ashraf

Aurko Roy

Zizhao Zhang

Ada Ma

Angelos Filos

Milos Besta

Rory Blevins

Ted Klimenko

Chih-Kuan Yeh

Soravit Changpinyo

Jiaqi Mu

Oscar Chang

Mantas Pajarskas

Carrie Muir

Vered Cohen

Charline Le Lan

Krishna S Haridasan

Amit Marathe

Steven Stenberg Hansen

Sholto Douglas

Rajkumar Samuel

Mingqiu Wang

Sophia Austin

Chang Lan

Jiepu Jiang

Justin Chiu

Jaime Alonso Lorenzo

Lars Lowe Sjosund

S'ebastien Cevey

Zach Gleicher

Thi Avrahami

Anudhyan Boral

Hansa Srinivasan

Vittorio Selo

Rhys May

Konstantinos Aisopos

L'eonard Hussenot

Livio Baldini Soares

Kate Baumli

Michael B. Chang

Adria Recasens

Benjamin Caine

Alexander Pritzel

Filip Pavetic

Fabio Pardo

Anita Gergely

Justin Frye

Vinay Venkatesh Ramasesh

Dan Horgan

Kartikeya Badola

Nora Kassner

Subhrajit Roy

Ethan Dyer

V'ictor Campos

Alex Tomala

Yunhao Tang

Dalia El Badawy

Elspeth White

Basil Mustafa

Oran Lang

Abhishek Jindal

Sharad Mandyam Vikram

Zhitao Gong

Sergi Caelles

Ross Hemsley

Gregory Thornton

Fangxiaoyu Feng

Wojciech Stokowiec

Ce Zheng

Phoebe Thacker

cCauglar Unlu

Zhishuai Zhang

Mohammad Saleh

James Svensson

Maxwell L. Bileschi

Piyush Patil

Ankesh Anand

Roman Ring

Katerina Tsihlas

Arpi Vezer

Marco Selvi

Toby Shevlane

Mikel Rodriguez

Tom Kwiatkowski

Samira Daruki

Keran Rong

Allan Dafoe

Nicholas Fitzgerald

Keren Gu-Lemberg

Mina Khan

Lisa Anne Hendricks

Marie Pellat

Vladimir Feinberg

James Cobon-Kerr

Tara N. Sainath

Maribeth Rauh

Sayed Hadi Hashemi

Richard Ives

Yana Hasson

YaGuang Li

Eric Noland

Yuan Cao

Nathan Byrd

Le Hou

Qingze Wang

Thibault Sottiaux

Michela Paganini

Jean-Baptiste Lespiau

Alexandre Moufarek

Samer Hassan

Kaushik Shivakumar

Joost Van Amersfoort

Amol Mandhane

Pratik M. Joshi

Matthew Tung

Andy Brock

Hannah Rachel Sheahan

Vedant Misra

Cheng Li

Nemanja Raki'cevi'c

Mostafa Dehghani

Fangyu Liu

Sid Mittal

Junhyuk Oh

Seb Noury

Eren Sezener

Fantine Huot

Matthew Lamm

Nicola De Cao

Charlie Chen

Gamaleldin Elsayed

Ed Huai-hsin Chi

Mahdis Mahdieh

Ian F. Tenney

Nan Hua

Ivan Petrychenko

Patrick Kane

Dylan Scandinaro

Rishub Jain

Jonathan Uesato

Romina Datta

Adam Sadovsky

Oskar Bunyan

Dominik Rabiej

Shimu Wu

John Zhang

Gautam Vasudevan

Edouard Leurent

Mahmoud Alnahlawi

Ionut-Razvan Georgescu

Nan Wei

Ivy Zheng

Betty Chan

Pam G Rabinovitch

Piotr Stańczyk

Ye Zhang

David Steiner

Subhajit Naskar

Michael Azzam

Matthew Johnson

Adam Paszke

Chung-Cheng Chiu

Jaume Sanchez Elias

Afroz Mohiuddin

Faizan Muhammad

Jin Miao

Andrew Lee

Nino Vieillard

Sahitya Potluri

Jane Park

Elnaz Davoodi

Jiageng Zhang

Jeff Stanway

Drew Garmon

Abhijit Karmarkar

Zhe Dong

2023-12-18

ArXiv (preprint)

arxiv.org

Low Compute Unlearning via Sparse Representations

Vedant Shah

Frederik Träuble

Ashish Malik

Hugo Larochelle

Michael Curtis Mozer

Sanjeev Arora

Machine unlearning, which involves erasing knowledge about a \emph{forget set} from a trained model, can prove to be costly and infeasible … (see more)using existing techniques. We propose a low-compute unlearning technique based on a discrete representational bottleneck. We show that the proposed technique efficiently unlearns the forget set and incurs negligible damage to the model's performance on the rest of the dataset. We evaluate the proposed technique on the problem of class unlearning using four datasets: CIFAR-10, CIFAR-100, LACUNA-100 and ImageNet-1k. We compare the proposed technique to SCRUB, a state-of-the-art approach which uses knowledge distillation for unlearning. Across all four datasets, the proposed technique performs as well as, if not better than SCRUB while incurring almost no computational cost.

2023-11-25

arXiv (preprint)

Neural Causal Structure Discovery from Interventions

Nan Rosemary Ke

Bernhard Schölkopf

Michael Curtis Mozer

Christopher Pal

Recent promising results have generated a surge of interest in continuous optimization methods for causal discovery from observational data.… (see more) However, there are theoretical limitations on the identifiability of underlying structures obtained solely from observational data. Interventional data, on the other hand, provides richer information about the underlying data-generating process. Nevertheless, extending and applying methods designed for observational data to include interventions is a challenging problem. To address this issue, we propose a general framework based on neural networks to develop models that incorporate both observational and interventional data. Notably, our method can handle the challenging and realistic scenario where the identity of the intervened upon variable is unknown. We evaluate our proposed approach in the context of graph recovery, both de novo and from a partially-known edge set. Our method achieves strong benchmark results on various structure learning tasks, including structure recovery of synthetic graphs as well as standard graphs from the Bayesian Network Repository.

2023-09-09

TMLR (accepted)

Discrete Key-Value Bottleneck

Frederik Träuble

Nasim Rahaman

Michael Mozer

Kenji Kawaguchi

Bernhard Schölkopf

Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant. Challenges emerge with… (see more) non-stationary training data streams such as continual learning. One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning. Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks. In the present work, we propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes. Our paradigm will be to encode; process the representation via a discrete bottleneck; and decode. Here, the input is fed to the pre-trained encoder, the output of the encoder is used to select the nearest keys, and the corresponding values are fed to the decoder to solve the current task. The model can only fetch and re-use a sparse number of these key-value pairs during inference, enabling localized and context-dependent model updates. We theoretically investigate the ability of the discrete key-value bottleneck to minimize the effect of learning under distribution shifts and show that it reduces the complexity of the hypothesis class. We empirically verify the proposed method under challenging class-incremental learning scenarios and show that the proposed model - without any task boundaries - reduces catastrophic forgetting across a wide variety of pre-trained models, outperforming relevant baselines on this task.

2023-07-02

Proceedings of the 40th International Conference on Machine Learning (published)

proceedings.mlr.press

Spotlight Attention: Robust Object-Centric Learning With a Spatial Locality Prior

Ayush K Chakravarthy

Trang M. Nguyen