
David Vázquez

Associate Industry Member
Adjunct Professor, Polytechnique Montréal, Department of Computer Science and Software Engineering
ServiceNow
Research Topics
Representation Learning
Multimodal Learning
Deep Learning
Large Language Models (LLM)
Conversational AI
Generative Models
Computer Vision

Publications

Group Robust Classification Without Any Group Information
Joao Monteiro
Pau Rodríguez
Empirical risk minimization (ERM) is sensitive to spurious correlations in the training data, which poses a significant risk when deploying systems trained under this paradigm in high-stake applications. While the existing literature focuses on maximizing group-balanced or worst-group accuracy, estimating these accuracies is hindered by costly bias annotations. This study contends that current bias-unsupervised approaches to group robustness continue to rely on group information to achieve optimal performance. Firstly, these methods implicitly assume that all group combinations are represented during training. To illustrate this, we introduce a systematic generalization task on the MPI3D dataset and discover that current algorithms fail to improve the ERM baseline when combinations of observed attribute values are missing. Secondly, bias labels are still crucial for effective model selection, restricting the practicality of these methods in real-world scenarios. To address these limitations, we propose a revised methodology for training and validating debiased models in an entirely bias-unsupervised manner. We achieve this by employing pretrained self-supervised models to reliably extract bias information, which enables the integration of a logit adjustment training loss with our validation criterion. Our empirical analysis on synthetic and real-world tasks provides evidence that our approach overcomes the identified challenges and consistently enhances robust accuracy, attaining performance which is competitive with or outperforms that of state-of-the-art methods, which, conversely, rely on bias labels for validation.
Improving Generalization in Task-oriented Dialogues with Workflows and Action Plans
Stefania Raimondo
Christopher Pal
Xiaotian Liu
Hector Palacios
A Survey of Self-Supervised and Few-Shot Object Detection
Issam Hadj Laradji
Pau Rodríguez
Labeling data is often expensive and time-consuming, especially for tasks such as object detection and instance segmentation, which require dense labeling of the image. While few-shot object detection is about training a model on novel (unseen) object classes with little data, it still requires prior training on many labeled examples of base (seen) classes. On the other hand, self-supervised methods aim at learning representations from unlabeled data which transfer well to downstream tasks such as object detection. Combining few-shot and self-supervised object detection is a promising research direction. In this survey, we review and characterize the most recent approaches on few-shot and self-supervised object detection. Then, we give our main takeaways and discuss future research directions. Project page: https://gabrielhuang.github.io/fsod-survey/.
Language Decision Transformers with Exponential Tilt for Interactive Text Environments
Nicolas Gontier
Pau Rodríguez
Issam Hadj Laradji
Christopher Pal
Workflow Discovery from Dialogues in the Low Data Regime
Amine El Hattami
Stefania Raimondo
Issam Hadj Laradji
Pau Rodríguez
Christopher Pal
Text-based dialogues are now widely used to solve real-world problems. In cases where solution strategies are already known, they can sometimes be codified into workflows and used to guide humans or artificial agents through the task of helping clients. We introduce a new problem formulation that we call Workflow Discovery (WD) in which we are interested in the situation where a formal workflow may not yet exist. Still, we wish to discover the set of actions that have been taken to resolve a particular problem. We also examine a sequence-to-sequence (Seq2Seq) approach for this novel task. We present experiments where we extract workflows from dialogues in the Action-Based Conversations Dataset (ABCD). Since the ABCD dialogues follow known workflows to guide agents, we can evaluate our ability to extract such workflows using ground truth sequences of actions. We propose and evaluate an approach that conditions models on the set of possible actions, and we show that using this strategy, we can improve WD performance. Our conditioning approach also improves zero-shot and few-shot WD performance when transferring learned models to unseen domains within and across datasets. Further, on ABCD a modified variant of our Seq2Seq method achieves state-of-the-art performance on related but different problems of Action State Tracking (AST) and Cascading Dialogue Success (CDS) across many evaluation metrics.
Implicit Offline Reinforcement Learning via Supervised Learning
Rafael Pardinas
Igor Mordatch
Christopher Pal
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset of varied behaviors. It is as simple as supervised learning and Behavior Cloning (BC) but takes advantage of the return information. On BC tasks, implicit models have been shown to match or outperform explicit ones. Despite the benefits of using implicit models to learn robotic skills via BC, Offline RL via Supervised Learning algorithms have been limited to explicit models. We show how implicit models leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets. Furthermore, we show how closely related our implicit methods are to other popular RL via Supervised Learning algorithms.
Flaky Performances when Pretraining on Relational Databases
Flaky Performances when Pre-Training on Relational Databases with a Plan for Future Characterization Efforts
We explore the downstream task performances for graph neural network (GNN) self-supervised learning (SSL) methods trained on subgraphs extracted from relational databases (RDBs). Intuitively, this joint use of SSL and GNNs allows us to leverage more of the available data, which could translate to better results. However, while we observe positive transfer in some cases, others showed systematic performance degradation, including some spectacular ones. We hypothesize a mechanism that could explain this behaviour and draft the plan for future work testing it by characterizing how much relevant information different strategies can (theoretically and/or empirically) extract from (synthetic and/or real) RDBs.
Multi-label Iterated Learning for Image Classification with Label Ambiguity
Sai Rajeswar
Pau Rodríguez
Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown that datasets like ImageNet are weakly labeled since images with multiple object classes present are assigned a single label. This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data. Inspired by language emergence literature, we propose multi-label iterated learning (MILe) to incorporate the inductive biases of multi-label learning from single labels using the framework of iterated learning. MILe is a simple yet effective procedure that builds a multi-label description of the image by propagating binary predictions through successive generations of teacher and student networks with a learning bottleneck. Experiments show that our approach exhibits systematic benefits on ImageNet accuracy as well as ReaL F1 score, which indicates that MILe deals better with label ambiguity than the standard training procedure, even when fine-tuning from self-supervised weights. We also show that MILe is effective at reducing label noise, achieving state-of-the-art performance on real-world large-scale noisy data such as WebVision. Furthermore, MILe improves performance in class incremental settings such as IIRC and it is robust to distribution shifts. Code: https://github.com/rajeswar18/MILe
A Probabilistic Perspective on Reinforcement Learning via Supervised Learning
Rafael Pardinas
Christopher Pal
Object-centric Compositional Imagination for Visual Abstract Reasoning
Like humans devoid of imagination, current machine learning systems lack the ability to adapt to new, unexpected situations by foreseeing them, which makes them unable to solve new tasks by analogical reasoning. In this work, we introduce a new compositional imagination framework that improves a model's ability to generalize. One of the key components of our framework is object-centric inductive biases that enable models to perceive the environment as a series of objects, properties, and transformations. By composing these key ingredients, it is possible to generate new unseen tasks that, when used to train the model, improve generalization. Experiments on a simplified version of the Abstraction and Reasoning Corpus (ARC) demonstrate the effectiveness of our framework.
Consistency-CAM: Towards Improved Weakly Supervised Semantic Segmentation
Sai Rajeswar
Issam Hadj Laradji
Pau Rodríguez