David Vázquez

GEO-Bench: Toward Foundation Models for Earth Monitoring

Alexandre Lacoste

Nils Lehmann

Pau Rodríguez

Evan David Sherwin

Hannah Kerner

Björn Lütjens

Jeremy Irvin

David Dao

Hamed Alemohammad

Mehmet Gunturkun

Dava Newman

Stefano Ermon

Xiao Xiang Zhu

Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to substantial increases in generalization to downstream tasks. Such models, recently coined foundation models, have been transformational to the field of natural language processing. Variants have also been proposed for image data, but their applicability to remote sensing tasks is limited. To stimulate the development of foundation models for Earth monitoring, we propose a benchmark comprised of six classification and six segmentation tasks, which were carefully curated and adapted to be both relevant to the field and well-suited for model evaluation. We accompany this benchmark with a robust methodology for evaluating models and reporting aggregated results to enable a reliable assessment of progress. Finally, we report results for 20 baselines to gain information about the performance of existing models. We believe that this benchmark will be a driver of progress across a variety of Earth monitoring tasks.

2023-09-24

NeurIPS.cc/2023/Track/Datasets_and_Benchmarks (poster)

doi.org

openreview.net

CADet: Fully Self-Supervised Anomaly Detection With Contrastive Learning

Charles Guille-escuret

Pau Rodríguez

David Vázquez

Ioannis Mitliagkas

Joao Monteiro

Handling out-of-distribution (OOD) samples has become a major stake in the real-world deployment of machine learning systems. This work explores the use of self-supervised contrastive learning for the simultaneous detection of two types of OOD samples: unseen classes and adversarial perturbations. First, we pair self-supervised contrastive learning with the maximum mean discrepancy (MMD) two-sample test. This approach enables us to robustly test whether two independent sets of samples originate from the same distribution, and we demonstrate its effectiveness by discriminating between CIFAR-10 and CIFAR-10.1 with higher confidence than previous work. Motivated by this success, we introduce CADet (Contrastive Anomaly Detection), a novel method for OOD detection of single samples. CADet draws inspiration from MMD, but leverages the similarity between contrastive transformations of the same sample. CADet outperforms existing adversarial detection methods in identifying adversarially perturbed samples on ImageNet and achieves comparable performance to unseen label detection methods on two challenging benchmarks: ImageNet-O and iNaturalist. Significantly, CADet is fully self-supervised and requires neither labels for in-distribution samples nor access to OOD examples.

2023-09-20

NeurIPS.cc/2023/Conference (poster)

doi.org

openreview.net

Group Robust Classification Without Any Group Information

Christos Tsirigotis

Joao Monteiro

Pau Rodríguez

David Vázquez

Aaron Courville

Empirical risk minimization (ERM) is sensitive to spurious correlations in the training data, which poses a significant risk when deploying systems trained under this paradigm in high-stakes applications. While the existing literature focuses on maximizing group-balanced or worst-group accuracy, estimating these accuracies is hindered by costly bias annotations. This study contends that current bias-unsupervised approaches to group robustness continue to rely on group information to achieve optimal performance. Firstly, these methods implicitly assume that all group combinations are represented during training. To illustrate this, we introduce a systematic generalization task on the MPI3D dataset and discover that current algorithms fail to improve the ERM baseline when combinations of observed attribute values are missing. Secondly, bias labels are still crucial for effective model selection, restricting the practicality of these methods in real-world scenarios. To address these limitations, we propose a revised methodology for training and validating debiased models in an entirely bias-unsupervised manner. We achieve this by employing pretrained self-supervised models to reliably extract bias information, which enables the integration of a logit adjustment training loss with our validation criterion. Our empirical analysis on synthetic and real-world tasks provides evidence that our approach overcomes the identified challenges and consistently enhances robust accuracy, attaining performance which is competitive with or outperforms that of state-of-the-art methods, which, conversely, rely on bias labels for validation.

2023-09-20

NeurIPS.cc/2023/Conference (poster)

doi.org

openreview.net

Improving Generalization in Task-oriented Dialogues with Workflows and Action Plans

Stefania Raimondo

Christopher Pal

Xiaotian Liu

David Vázquez

Hector Palacios

2023-06-01

ArXiv (preprint)

doi.org

arxiv.org

A Survey of Self-Supervised and Few-Shot Object Detection

Gabriel Huang

Issam Hadj Laradji

David Vázquez

Simon Lacoste-Julien

Pau Rodríguez

Labeling data is often expensive and time-consuming, especially for tasks such as object detection and instance segmentation, which require dense labeling of the image. While few-shot object detection is about training a model on novel (unseen) object classes with little data, it still requires prior training on many labeled examples of base (seen) classes. On the other hand, self-supervised methods aim at learning representations from unlabeled data which transfer well to downstream tasks such as object detection. Combining few-shot and self-supervised object detection is a promising research direction. In this survey, we review and characterize the most recent approaches on few-shot and self-supervised object detection. Then, we give our main takeaways and discuss future research directions. Project page: https://gabrielhuang.github.io/fsod-survey/.

2023-03-31

IEEE Transactions on Pattern Analysis and Machine Intelligence (published)

doi.org

arxiv.org

Language Decision Transformers with Exponential Tilt for Interactive Text Environments

Nicolas Gontier

Pau Rodríguez

Issam Hadj Laradji

David Vázquez

Christopher Pal

2023-02-09

ArXiv (preprint)

openreview.net

Workflow Discovery from Dialogues in the Low Data Regime

Amine El hattami

Stefania Raimondo

Issam Hadj Laradji

David Vázquez

Pau Rodríguez

Christopher Pal

Text-based dialogues are now widely used to solve real-world problems. In cases where solution strategies are already known, they can sometimes be codified into workflows and used to guide humans or artificial agents through the task of helping clients. We introduce a new problem formulation that we call Workflow Discovery (WD) in which we are interested in the situation where a formal workflow may not yet exist. Still, we wish to discover the set of actions that have been taken to resolve a particular problem. We also examine a sequence-to-sequence (Seq2Seq) approach for this novel task. We present experiments where we extract workflows from dialogues in the Action-Based Conversations Dataset (ABCD). Since the ABCD dialogues follow known workflows to guide agents, we can evaluate our ability to extract such workflows using ground truth sequences of actions. We propose and evaluate an approach that conditions models on the set of possible actions, and we show that using this strategy, we can improve WD performance. Our conditioning approach also improves zero-shot and few-shot WD performance when transferring learned models to unseen domains within and across datasets. Further, on ABCD a modified variant of our Seq2Seq method achieves state-of-the-art performance on related but different problems of Action State Tracking (AST) and Cascading Dialogue Success (CDS) across many evaluation metrics.

2022-12-31

Trans. Mach. Learn. Res. (published)

doi.org

openreview.net

Implicit Offline Reinforcement Learning via Supervised Learning

Alexandre Piché

Rafael Pardinas

David Vázquez

Igor Mordatch

Christopher Pal

Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset of varied behaviors. It is as simple as supervised learning and Behavior Cloning (BC) but takes advantage of the return information. On BC tasks, implicit models have been shown to match or outperform explicit ones. Despite the benefits of using implicit models to learn robotic skills via BC, Offline RL via Supervised Learning algorithms have been limited to explicit models. We show how implicit models leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets. Furthermore, we show how closely related our implicit methods are to other popular RL via Supervised Learning algorithms.

2022-12-08

NeurIPS.cc/2022/Workshop/DeepRL (unknown)

doi.org

openreview.net

Flaky Performances when Pretraining on Relational Databases

Shengchao Liu

David Vázquez

Jian Tang

Pierre-Andre Noel

2022-11-08

ArXiv (preprint)

doi.org

arxiv.org

Flaky Performances when Pre-Training on Relational Databases with a Plan for Future Characterization Efforts

Shengchao Liu

David Vázquez

Jian Tang

Pierre-Andre Noel

We explore the downstream task performances for graph neural network (GNN) self-supervised learning (SSL) methods trained on subgraphs extracted from relational databases (RDBs). Intuitively, this joint use of SSL and GNNs allows us to leverage more of the available data, which could translate to better results. However, while we observe positive transfer in some cases, others showed systematic performance degradation, including some spectacular ones. We hypothesize a mechanism that could explain this behaviour and draft the plan for future work testing it by characterizing how much relevant information different strategies can (theoretically and/or empirically) extract from (synthetic and/or real) RDBs.

2022-07-21

ICML.cc/2022/Workshop/Pre-Training (accepted)

openreview.net

Multi-label Iterated Learning for Image Classification with Label Ambiguity

Sai Rajeswar

Pau Rodríguez

Soumye Singhal

David Vázquez

Aaron Courville

Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown that datasets like ImageNet are weakly labeled since images with multiple object classes present are assigned a single label. This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data. Inspired by language emergence literature, we propose multi-label iterated learning (MILe) to incorporate the inductive biases of multi-label learning from single labels using the framework of iterated learning. MILe is a simple yet effective procedure that builds a multi-label description of the image by propagating binary predictions through successive generations of teacher and student networks with a learning bottleneck. Experiments show that our approach exhibits systematic benefits on ImageNet accuracy as well as ReaL F1 score, which indicates that MILe deals better with label ambiguity than the standard training procedure, even when fine-tuning from self-supervised weights. We also show that MILe is effective at reducing label noise, achieving state-of-the-art performance on real-world large-scale noisy data such as WebVision. Furthermore, MILe improves performance in class incremental settings such as IIRC and it is robust to distribution shifts. Code: https://github.com/rajeswar18/MILe

2022-06-17

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

doi.org

arxiv.org

A Probabilistic Perspective on Reinforcement Learning via Supervised Learning

Alexandre Piché

Rafael Pardinas

David Vázquez

Christopher Pal

2022-04-26

ICLR.cc/2022/Workshop/GPL (poster)

openreview.net
