Boris Knyazev

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

Pretraining a neural network on a large dataset is becoming a cornerstone in machine learning that is within the reach of only a few communities with large resources. We aim at the ambitious goal of democratizing pretraining. Towards that goal, we train and release a single neural network that can predict high-quality ImageNet parameters of other neural networks. By using predicted parameters for initialization, we are able to boost training of diverse ImageNet models available in PyTorch. When transferred to other datasets, models initialized with predicted parameters also converge faster and reach competitive final performance.

2023-07-02

Proceedings of the 40th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

Learning to Optimize with Recurrent Hierarchical Transformers

2023-06-18

ICML.cc/2023/Workshop/Frontiers4LCD (published)

openreview.net

Pretrained Language Models to Solve Graph Tasks in Natural Language

Frederik Wenkel

Guy Wolf

Boris Knyazev

Pretrained large language models (LLMs) are powerful learners in a variety of language tasks. We explore whether LLMs can learn from graph-structured data when the graphs are described using natural language. We explore data augmentation and pretraining specific to the graph domain and show that LLMs such as GPT-2 and GPT-3 are promising alternatives to graph neural networks.

2023-06-18

ICML.cc/2023/Workshop/SPIGM (poster)

openreview.net
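
The abstract above rests on describing a graph in natural language so that a language model can read it. The listing does not give the authors' textual format, so the following is only a minimal sketch under assumptions: the function name `describe_graph` and the sentence template are hypothetical, and the graph is taken to be an undirected edge list over integer node ids.

```python
def describe_graph(edges):
    """Turn an edge list into a plain-English description.

    Hypothetical template for illustration only; it is not the
    format used in the paper.
    """
    # Collect the distinct node ids that appear in any edge.
    nodes = sorted({node for edge in edges for node in edge})
    sentences = [f"The graph has {len(nodes)} nodes and {len(edges)} edges."]
    # One sentence per edge, in the order the edges were given.
    for u, v in edges:
        sentences.append(f"Node {u} is connected to node {v}.")
    return " ".join(sentences)


if __name__ == "__main__":
    # A path graph on three nodes: 0 - 1 - 2.
    print(describe_graph([(0, 1), (1, 2)]))
```

A string like this could then be placed in a prompt or used as fine-tuning text; how the authors actually do this is not stated in the listing.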

Generative Compositional Augmentations for Scene Graph Prediction

Boris Knyazev

Harm de Vries

Cătălina Cangea

Graham W. Taylor

Aaron Courville

Eugene Belilovsky

Inferring objects and their relationships from an image in the form of a scene graph is useful in many applications at the intersection of vision and language. We consider a challenging problem of compositional generalization that emerges in this task due to a long-tail data distribution. Current scene graph generation models are trained on a tiny fraction of the distribution corresponding to the most frequent compositions. However, test images might contain zero- and few-shot compositions of objects and relationships. Despite each of the object categories and the predicate (e.g. 'on') being frequent in the training data, the models often fail to properly understand such unseen or rare compositions. To improve generalization, it is natural to attempt increasing the diversity of the training distribution. However, in the graph domain this is non-trivial. To that end, we propose a method to synthesize rare yet plausible scene graphs by perturbing real ones. We then propose and empirically study a model based on conditional generative adversarial networks (GANs) that allows us to generate visual features of perturbed scene graphs and learn from them in a joint fashion. When evaluated on the Visual Genome dataset, our approach yields marginal but consistent improvements in zero- and few-shot metrics. We analyze the limitations of our approach, indicating promising directions for future research.

2021-09-30

2021 IEEE/CVF International Conference on Computer Vision (ICCV) (published)

doi.org

arxiv.org

Graph Density-Aware Losses for Novel Compositions in Scene Graph Generation

Boris Knyazev

Harm de Vries

Cătălina Cangea

Graham W. Taylor

Aaron Courville

Eugene Belilovsky

Scene graph generation (SGG) aims to predict graph-structured descriptions of input images, in the form of objects and relationships between them. This task is becoming increasingly useful for progress at the interface of vision and language. Here, it is important, yet challenging, to perform well on novel (zero-shot) or rare (few-shot) compositions of objects and relationships. In this paper, we identify two key issues that limit such generalization. Firstly, we show that the standard loss used in this task is unintentionally a function of scene graph density. This leads to the neglect of individual edges in large sparse graphs during training, even though these contain diverse few-shot examples that are important for generalization. Secondly, the frequency of relationships can create a strong bias in this task, such that a blind model predicting the most frequent relationship achieves good performance. Consequently, some state-of-the-art models exploit this bias to improve results. We show that such models can suffer the most in their ability to generalize to rare compositions, evaluating two different models on the Visual Genome dataset and its more recent, improved version, GQA. To address these issues, we introduce a density-normalized edge loss, which provides more than a two-fold improvement in certain generalization metrics. Compared to other works in this direction, our enhancements require only a few lines of code and no added computational cost. We also highlight the difficulty of accurately evaluating models using existing metrics, especially on zero/few shots, and introduce a novel weighted metric.

2019-12-31

Proceedings of the British Machine Vision Conference 2020 (published)

doi.org

arxiv.org
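
The density argument in the abstract above can be illustrated numerically. This is a rough sketch under stated assumptions, not the paper's formulation: it assumes the "standard" loss averages per-edge losses over all ordered node pairs, that unannotated pairs contribute zero, and that the normalized variant averages over annotated edges only. Both function names are hypothetical.

```python
def pair_averaged_loss(edge_losses, num_nodes):
    """Average annotated-edge losses over ALL ordered node pairs.

    Assumed stand-in for a loss that depends on graph density:
    the sparser the graph, the smaller each edge's contribution.
    """
    num_pairs = num_nodes * (num_nodes - 1)
    if num_pairs == 0:
        return 0.0
    return sum(edge_losses) / num_pairs


def density_normalized_loss(edge_losses):
    """Average losses over the annotated edges only (assumed variant)."""
    if not edge_losses:
        return 0.0
    return sum(edge_losses) / len(edge_losses)


if __name__ == "__main__":
    # Small dense graph: 3 nodes, all 6 ordered pairs annotated.
    dense = [1.0] * 6
    # Large sparse graph: 20 nodes (380 ordered pairs), 4 annotated edges.
    sparse = [1.0] * 4

    print(pair_averaged_loss(dense, 3), density_normalized_loss(dense))
    print(pair_averaged_loss(sparse, 20), density_normalized_loss(sparse))
```

Under these assumptions, each edge in the sparse graph carries a weight of 1/380 in the pair-averaged loss but 1/4 in the normalized one, which is the kind of neglect of individual edges in large sparse graphs that the abstract describes.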

