
Ross Goroshin

Core Industry Member
Adjunct professor, Université de Montréal, Department of Computer Science and Operations Research
Google DeepMind
Research Topics
Applied AI
Computer Vision
Deep Learning
Dynamical Systems
Representation Learning

Biography

Ross Goroshin is a Research Scientist at Google DeepMind in Montreal and a Core Industry Member at Mila - Quebec Artificial Intelligence Institute. He holds a PhD in Computer Science from NYU, where he was advised by Yann LeCun. He also earned a B.Eng. in Electrical Engineering from Concordia University and an M.S. in Electrical Engineering from Georgia Tech. His research focuses on computer vision, self-supervised learning, and optimal control.

In addition to his roles at Google DeepMind and Mila, Ross serves as an adjunct professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal.

Publications

Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks
Jesse Farebrother
Joshua Greaves
Charline Le Lan
Auxiliary tasks improve the representations learned by deep reinforcement learning agents. Analytically, their effect is reasonably well-understood; in practice, however, their primary use remains in support of a main learning objective, rather than as a method for learning representations. This is perhaps surprising given that many auxiliary tasks are defined procedurally, and hence can be treated as an essentially infinite source of information about the environment. Based on this observation, we study the effectiveness of auxiliary tasks for learning rich representations, focusing on the setting where the number of tasks and the size of the agent’s network are simultaneously increased. For this purpose, we derive a new family of auxiliary tasks based on the successor measure. These tasks are easy to implement and have appealing theoretical properties. Combined with a suitable off-policy learning rule, the result is a representation learning algorithm that can be understood as extending Mahadevan & Maggioni (2007)’s proto-value functions to deep reinforcement learning – accordingly, we call the resulting object proto-value networks. Through a series of experiments on the Arcade Learning Environment, we demonstrate that proto-value networks produce rich features that may be used to obtain performance comparable to established algorithms, using only linear approximation and a small number (~4M) of interactions with the environment’s reward function.
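As a rough illustration of the auxiliary-task setup the abstract describes, the sketch below pairs a shared encoder with one linear value head per auxiliary task and trains all heads with a one-step TD loss. All names, layer sizes, and the way auxiliary rewards are produced are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class PVNSketch(nn.Module):
    """Minimal sketch of auxiliary-task representation learning in the
    spirit of proto-value networks. Sizes and architecture are
    illustrative assumptions."""

    def __init__(self, obs_dim: int, feat_dim: int, num_tasks: int):
        super().__init__()
        self.torso = nn.Sequential(          # shared representation
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim), nn.ReLU(),
        )
        # One linear value head per auxiliary task.
        self.heads = nn.Linear(feat_dim, num_tasks)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.heads(self.torso(obs))   # (batch, num_tasks) values


def td_loss(model, obs, next_obs, aux_rewards, gamma=0.99):
    """One-step TD loss summed over auxiliary tasks. `aux_rewards` stands
    in for the procedurally defined auxiliary rewards (e.g. indicator
    functions tied to the successor measure); how they are generated is
    left out of this sketch."""
    with torch.no_grad():
        target = aux_rewards + gamma * model(next_obs)
    return ((model(obs) - target) ** 2).mean()
```

The representation of interest is the output of the shared torso; the many value heads exist only to shape it, which is why a single linear layer per task suffices in this sketch.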
Block-State Transformers
Jonathan Pilault
Mahan Fathi
Orhan Firat
Learned Image Compression for Machine Perception
Felipe Codevilla
Jean Gabriel Simard
Impact of Aliasing on Generalization in Deep Convolutional Networks
Cristina Vasconcelos
Vincent Dumoulin
Rob Romijnders
We investigate the impact of aliasing on generalization in Deep Convolutional Networks and show that data augmentation schemes alone are unable to prevent it due to structural limitations in widely used architectures. Drawing insights from frequency analysis theory, we take a closer look at ResNet and EfficientNet architectures and review the trade-off between aliasing and information loss in each of their major components. We show how to mitigate aliasing by inserting non-trainable low-pass filters at key locations, particularly where networks lack the capacity to learn them. These simple architectural changes lead to substantial improvements in generalization on i.i.d. and even more on out-of-distribution conditions, such as image classification under natural corruptions on ImageNet-C [11] and few-shot learning on Meta-Dataset [26]. State-of-the-art results are achieved on both datasets without introducing additional trainable parameters and using the default hyper-parameters of open source codebases.
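To make the architectural change concrete, here is a minimal sketch of a fixed (non-trainable) low-pass filter applied before subsampling, in the spirit of the abstract above. The 3x3 binomial kernel and the PyTorch framing are assumptions; the paper's exact filters and placement may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurPool2d(nn.Module):
    """Fixed low-pass (binomial) filter followed by strided subsampling.

    Anti-aliased downsampling sketch: blur first so high frequencies are
    attenuated before the signal is subsampled. The kernel is registered
    as a buffer, so no trainable parameters are added."""

    def __init__(self, channels: int, stride: int = 2):
        super().__init__()
        k = torch.tensor([1.0, 2.0, 1.0])
        kernel = torch.outer(k, k)           # 3x3 binomial kernel
        kernel = kernel / kernel.sum()       # normalize to preserve mean
        # One copy of the kernel per channel for depthwise filtering.
        self.register_buffer(
            "kernel", kernel[None, None].repeat(channels, 1, 1, 1)
        )
        self.stride = stride
        self.channels = channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Depthwise convolution: blur each channel independently,
        # then subsample via the stride.
        return F.conv2d(x, self.kernel, stride=self.stride,
                        padding=1, groups=self.channels)
```

In a ResNet-style block, a stride-2 convolution would then be replaced by the same convolution at stride 1 followed by BlurPool2d, leaving the overall downsampling factor unchanged.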
Comparing Transfer and Meta Learning Approaches on a Unified Few-Shot Classification Benchmark
Vincent Dumoulin
Neil Houlsby
Utku Evci
Xiaohua Zhai
Sylvain Gelly
Meta and transfer learning are two successful families of approaches to few-shot learning. Despite highly related goals, state-of-the-art advances in each family are measured largely in isolation of each other. As a result of diverging evaluation norms, a direct or thorough comparison of different approaches is challenging. To bridge this gap, we perform a cross-family study of the best transfer and meta learners on both a large-scale meta-learning benchmark (Meta-Dataset, MD), and a transfer learning benchmark (Visual Task Adaptation Benchmark, VTAB). We find that, on average, large-scale transfer methods (Big Transfer, BiT) outperform competing approaches on MD, even when trained only on ImageNet. In contrast, meta-learning approaches struggle to compete on VTAB when trained and validated on MD. However, BiT is not without limitations, and pushing for scale does not improve performance on highly out-of-distribution MD tasks. In performing this study, we reveal a number of discrepancies in evaluation norms and study some of these in light of the performance gap. We hope that this work facilitates sharing of insights from each community, and accelerates progress on few-shot learning.
A Unified Few-Shot Classification Benchmark to Compare Transfer and Meta Learning Approaches
Vincent Dumoulin
Neil Houlsby
Utku Evci
Xiaohua Zhai
Sylvain Gelly
Meta and transfer learning are two successful families of approaches to few-shot learning. Despite highly related goals, state-of-the-art advances in each family are measured largely in isolation of each other. As a result of diverging evaluation norms, a direct or thorough comparison of different approaches is challenging. To bridge this gap, we introduce a few-shot classification evaluation protocol named VTAB+MD with the explicit goal of facilitating sharing of insights from each community. We demonstrate its accessibility in practice by performing a cross-family study of the best transfer and meta learners which report on both a large-scale meta-learning benchmark (Meta-Dataset, MD), and a transfer learning benchmark (Visual Task Adaptation Benchmark, VTAB). We find that, on average, large-scale transfer methods (Big Transfer, BiT) outperform competing approaches on MD, even when trained only on ImageNet. In contrast, meta-learning approaches struggle to compete on VTAB when trained and validated on MD. However, BiT is not without limitations, and pushing for scale does not improve performance on highly out-of-distribution MD tasks. We hope that this work contributes to accelerating progress on few-shot learning research.
An Effective Anti-Aliasing Approach for Residual Networks
Cristina Vasconcelos
Vincent Dumoulin
Image pre-processing in the frequency domain has traditionally played a vital role in computer vision and was even part of the standard pipeline in the early days of deep learning. However, with the advent of large datasets, many practitioners concluded that this was unnecessary due to the belief that these priors can be learned from the data itself. Frequency aliasing is a phenomenon that may occur when sub-sampling any signal, such as an image or feature map, causing distortion in the sub-sampled output. We show that we can mitigate this effect by placing non-trainable blur filters and using smooth activation functions at key locations, particularly where networks lack the capacity to learn them. These simple architectural changes lead to substantial improvements in out-of-distribution generalization on both image classification under natural corruptions on ImageNet-C [10] and few-shot learning on Meta-Dataset [17], without introducing additional trainable parameters and using the default hyper-parameters of open source codebases.
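A minimal sketch of how the two changes combine, reusing the BlurPool2d sketch from the earlier entry. The choice of SiLU as the smooth activation is an assumption; the abstract only specifies "smooth activation functions".

```python
import torch.nn as nn

def anti_aliased_downsample(channels: int) -> nn.Sequential:
    """Sketch: stride-1 conv, smooth activation, then a fixed blur plus
    stride-2 subsampling (BlurPool2d as defined in the earlier sketch).
    SiLU stands in for "a smooth activation"; this is an assumption."""
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
        nn.SiLU(),                # smooth replacement for ReLU
        BlurPool2d(channels),     # non-trainable low-pass filter + stride 2
    )
```

Both ingredients target the same failure mode: non-smooth activations and strided operations each introduce high-frequency content that aliases under subsampling, so smoothing before every downsampling step keeps the feature maps closer to band-limited.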