Simon Guiroy

Neural Coherence : Find higher performance to out-of-distribution tasks from few samples

Simon Guiroy

Mats Richter

A. Chandar

Christopher Pal

2025-11-30

arXiv (published)

doi.org

arxiv.org

Improving Meta-Learning Generalization with Activation-Based Early-Stopping

Simon Guiroy

Christopher Pal

Goncalo Mordido

A. Chandar

2022-11-27

Proceedings of The 1st Conference on Lifelong Learning Agents (published)

doi.org

proceedings.mlr.press

Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers

A. Chandar

Empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-E. Accurately predicting neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows a focus on the best-scaling, and thus most promising, approaches. In this work, we consider the challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase differs from the source (training) data distribution, in the sense that it includes new image classes not encountered during training. Our current main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well-approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies both when the target data comes from the same domain as the training data and when it comes from a different domain (i.e., new classes), and (3) few-shot performance on new classes converges at a faster rate than the standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization.
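
As an illustration of observation (1), a power law appears as a straight line on a log-log plot, so its parameters can be estimated with an ordinary least-squares fit on the logarithms. The sketch below is not taken from the paper: the function name `fit_power_law`, the functional form `error = a * n**(-b)`, and the synthetic data are all assumptions made for the example.

```python
import math

def fit_power_law(sizes, errors):
    """Fit errors ~ a * sizes**(-b) by least squares in log-log space.

    Hypothetical helper for illustration; returns the pair (a, b).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Slope and intercept of the best-fit line y = intercept + slope * x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(error) = log(a) - b * log(n), so a = exp(intercept) and b = -slope.
    return math.exp(intercept), -slope

# Synthetic, noise-free data generated from an assumed power law (a=2.0, b=0.25).
sizes = [1_000, 10_000, 100_000, 1_000_000]
errors = [2.0 * n ** -0.25 for n in sizes]
a, b = fit_power_law(sizes, errors)
```

On real measurements the fit would be noisy, and the exponent `b` would be the quantity to compare across methods or across seen versus new classes.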

2021-10-12

ArXiv (preprint)

openreview.net

Towards an Unsupervised Method for Model Selection in Few-Shot Learning

Simon Guiroy

Vikas Verma

Christopher Pal

The study of generalization of neural networks in gradient-based meta-learning has recently attracted great research interest. Previous work on the study of the objective landscapes within the scope of few-shot classification empirically demonstrated that generalization to new tasks might be linked to the average inner product between their respective gradient vectors (Guiroy et al., 2019). Following that work, we study the effect that meta-training has on the learned space of representation of the network. Notably, we demonstrate that the global similarity in the space of representation, measured by the average inner product between the embeddings of meta-test examples, also correlates with generalization. Based on these observations, we propose a novel model-selection criterion for gradient-based meta-learning and experimentally validate its effectiveness.
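
The similarity measure described above, an average inner product between embeddings, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `average_inner_product`, the choice to average over distinct pairs only, and the toy embeddings are assumptions.

```python
from itertools import combinations

def average_inner_product(vectors):
    """Mean inner product over all distinct pairs of vectors.

    Hypothetical helper; excluding self-pairs is an assumption of this sketch.
    """
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    total = sum(sum(a * b for a, b in zip(u, v)) for u, v in pairs)
    return total / len(pairs)

# Toy 2-D embeddings standing in for the network's representations of examples.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
similarity = average_inner_product(embeddings)
```

Under the abstract's claim, a score of this kind computed on meta-test examples would be tracked across checkpoints and used as a signal for model selection.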

2020-07-12

ICML.cc/2020/Workshop/LifelongML (unknown)

openreview.net

Towards Understanding Generalization in Gradient-Based Meta-Learning

Simon Guiroy

Vikas Verma

Christopher Pal

In this work we study generalization of neural networks in gradient-based meta-learning by analyzing various properties of the objective landscapes. We experimentally demonstrate that as meta-training progresses, the meta-test solutions, obtained by adapting the meta-train solution of the model to new tasks via a few steps of gradient-based fine-tuning, become flatter, lower in loss, and further away from the meta-train solution. We also show that those meta-test solutions become flatter even as generalization starts to degrade, thus providing experimental evidence against the correlation between generalization and flat minima in the paradigm of gradient-based meta-learning. Furthermore, we provide empirical evidence that generalization to new tasks is correlated with the coherence between their adaptation trajectories in parameter space, measured by the average cosine similarity between task-specific trajectory directions, starting from the same meta-train solution. We also show that coherence of meta-test gradients, measured by the average inner product between the task-specific gradient vectors evaluated at the meta-train solution, is also correlated with generalization. Based on these observations, we propose a novel regularizer for MAML and provide experimental evidence for its effectiveness.
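
The trajectory-coherence measure mentioned above can be sketched as an average pairwise cosine similarity between the displacement vectors from a shared starting point to each task-adapted solution. The code is an illustrative assumption, not the paper's implementation: the names `cosine` and `trajectory_coherence`, the use of end-to-end displacements as "trajectory directions", and the toy parameter vectors are all hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def trajectory_coherence(start, adapted):
    """Mean pairwise cosine similarity of the directions adapted[i] - start.

    `start` plays the role of the meta-train solution and `adapted` the
    task-specific solutions reached from it (an assumption of this sketch).
    """
    directions = [[a - s for a, s in zip(theta, start)] for theta in adapted]
    pairs = list(combinations(directions, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Toy example: two tasks move along the same axis, a third moves orthogonally.
start = [0.0, 0.0]
adapted = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
coherence = trajectory_coherence(start, adapted)
```

Because cosine similarity ignores vector length, this score depends only on the directions in which tasks pull the parameters, not on how far each adaptation moves.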

2019-07-15

ArXiv (preprint)

openreview.net

