Publications

Simulated Annealing in Early Layers Leads to Better Generalization

Amir M. Sarfi

Zahra Karimpour

Muawiz Chaudhary

Nasir M. Khalid

Sudhir Mudur

Recently, a number of iterative learning methods have been introduced to improve generalization. These typically rely on training for longer periods of time in exchange for improved generalization. LLF (later-layer-forgetting) is a state-of-the-art method in this category. It strengthens learning in early layers by periodically re-initializing the last few layers of the network. Our principal innovation in this work is to use Simulated annealing in EArly Layers (SEAL) of the network in place of re-initialization of later layers. Essentially, later layers go through the normal gradient descent process, while the early layers go through short stints of gradient ascent followed by gradient descent. Extensive experiments on the popular Tiny-ImageNet dataset benchmark and a series of transfer learning and few-shot learning tasks show that we outperform LLF by a significant margin. We further show that, compared to normal training, LLF features, although improving on the target task, degrade the transfer learning performance across all datasets we explored. In comparison, our method outperforms LLF across the same target datasets by a large margin. We also show that the prediction depth of our method is significantly lower than that of LLF and normal training, indicating on average better prediction performance. The code to reproduce our results is publicly available at: https://github.com/amiiir-sarfi/SEAL

2023-06-17

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

doi.org

arxiv.org
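
The SEAL abstract above describes later layers following ordinary gradient descent while early layers alternate short stints of gradient ascent with gradient descent. The following is a minimal sketch of that update rule on a toy two-parameter model, not the authors' implementation (see their repository for that). The function names, the toy model, and the schedule (`period`, `ascent_steps`, and skipping ascent during the first period) are all assumptions made for illustration.

```python
def early_layer_sign(step, period, ascent_steps):
    """Return +1.0 for a gradient-ascent step on the early layer, -1.0 for descent.

    Hypothetical schedule: after the first period, each period opens with
    `ascent_steps` ascent steps followed by ordinary descent.
    """
    in_first_period = step < period
    in_ascent_window = (step % period) < ascent_steps
    return 1.0 if (in_ascent_window and not in_first_period) else -1.0


def train(steps=300, period=100, ascent_steps=5, lr=0.05):
    """Fit the toy model y = w_late * (w_early * x) to the target y = 2 * x."""
    data = [(x / 10.0, 2.0 * x / 10.0) for x in range(-10, 11)]
    w_early, w_late = 0.5, 0.5
    for step in range(steps):
        # Mean-squared-error gradients for both parameters.
        g_early = g_late = 0.0
        for x, y in data:
            err = w_late * w_early * x - y
            g_early += 2.0 * err * w_late * x / len(data)
            g_late += 2.0 * err * w_early * x / len(data)
        # Later layer: always plain gradient descent.
        w_late -= lr * g_late
        # Early layer: ascent during the short window, descent otherwise.
        w_early += early_layer_sign(step, period, ascent_steps) * lr * g_early
    loss = sum((w_late * w_early * x - y) ** 2 for x, y in data) / len(data)
    return w_early, w_late, loss


if __name__ == "__main__":
    print(train())
```

The only difference between the two parameters is the sign applied to the early-layer step. In a real network the same idea would apply per parameter group rather than per scalar.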

A Survey of Contextual Optimization Methods for Decision Making under Uncertainty

Utsav Sadana

Abhilash Reddy Chenreddy

Érick Delage

Alexandre Forel

Emma Frejinger

Thibaut Vidal

2023-06-17

ArXiv (preprint)

doi.org

arxiv.org

DA Wand: Distortion-Aware Selection Using Neural Mesh Parameterization

Richard Liu

Noam Aigerman

Vladimir Kim

Rana Hanocka

We present a neural technique for learning to select a local sub-region around a point which can be used for mesh parameterization. The motivation for our framework is driven by interactive workflows used for decaling, texturing, or painting on surfaces. Our key idea is to incorporate segmentation probabilities as weights of a classical parameterization method, implemented as a novel differentiable parameterization layer within a neural network framework. We train a segmentation network to select 3D regions that are parameterized into 2D and penalized by the resulting distortion, giving rise to segmentations which are distortion-aware. Following training, a user can use our system to interactively select a point on the mesh and obtain a large, meaningful region around the selection which induces a low-distortion parameterization. Our code (https://github.com/threedle/DA-Wand) and project (https://threedle.github.io/DA-Wand/) are publicly available.

2023-06-17

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

doi.org

arxiv.org

Empirical Study on Optimizer Selection for Out-of-Distribution Generalization

Hiroki Naganuma

Kartik Ahuja

Shiro Takagi

Tetsuya Motokawa

Rio Yokota

Kohta Ishikawa

Ikuro Sato

Ioannis Mitliagkas

Modern deep learning systems do not generalize well when the test data distribution is slightly different to the training data distribution. While much promising work has been accomplished to address this fragility, a systematic study of the role of optimizers and their out-of-distribution generalization performance has not been undertaken. In this study, we examine the performance of popular first-order optimizers for different classes of distributional shift under empirical risk minimization and invariant risk minimization. We address this question for image and text classification using DomainBed, WILDS, and Backgrounds Challenge as testbeds for studying different types of shifts, namely correlation and diversity shift. We search over a wide range of hyperparameters and examine classification accuracy (in-distribution and out-of-distribution) for over 20,000 models. We arrive at the following findings, which we expect to be helpful for practitioners: i) adaptive optimizers (e.g., Adam) perform worse than non-adaptive optimizers (e.g., SGD, momentum SGD) on out-of-distribution performance. In particular, even though there is no significant difference in in-distribution performance, we show a measurable difference in out-of-distribution performance. ii) in-distribution performance and out-of-distribution performance exhibit three types of behavior depending on the dataset: linear returns, increasing returns, and diminishing returns. For example, in the training of natural language data using Adam, fine-tuning the performance of in-distribution performance does not significantly contribute to the out-of-distribution generalization performance.

2023-06-16

TMLR (accepted)

doi.org

openreview.net

Interpretable deep learning architectures for improving drug response prediction performance: myth or reality?

Yihui Li

David Earl Hostallero

Amin Emad

Motivation: Recent advances in deep learning model development have enabled more accurate prediction of drug response in cancer. However, the black-box nature of these models still remains a hurdle in their adoption for precision cancer medicine. Recent efforts have focused on making these models interpretable by incorporating signaling pathway information in model architecture. While these models improve interpretability, it is unclear whether this higher interpretability comes at the cost of less accurate predictions, or a prediction improvement can also be obtained. Results: In this study, we comprehensively and systematically assessed four state-of-the-art interpretable models developed for drug response prediction to answer this question using three pathway collections. Our results showed that models that explicitly incorporate pathway information in the form of a latent layer perform worse compared to models that incorporate this information implicitly. Moreover, in most evaluation setups the best performance is achieved using a simple black-box model. In addition, replacing the signaling pathways with randomly generated pathways shows a comparable performance for the majority of these interpretable models. Our results suggest that new interpretable models are necessary to improve the drug response prediction performance. In addition, the current study provides different baseline models and evaluation setups necessary for such new models to demonstrate their superior prediction performance. Availability and Implementation: Implementations of all methods are provided at https://github.com/Emad-COMBINE-lab/InterpretableAI_for_DRP. Generated uniform datasets are available at https://zenodo.org/record/7101665#.YzS79HbMKUk. Contact: amin.emad@mcgill.ca Supplementary Information: Online-only supplementary data is available at the journal’s website.

2023-06-16

Bioinformatics (published)

doi.org

Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering

Md. Rabiul Awal

Le Zhang

Aishwarya Agrawal

In this paper, we explore effective prompting techniques to enhance zero- and few-shot Visual Question Answering (VQA) performance in contemporary Vision-Language Models (VLMs). Central to our investigation is the role of question templates in guiding VLMs to generate accurate answers. We identify that specific templates significantly influence VQA outcomes, underscoring the need for strategic template selection. Another pivotal aspect of our study is augmenting VLMs with image captions, providing them with additional visual cues alongside direct image features in VQA tasks. Surprisingly, this augmentation significantly improves the VLMs' performance in many cases, even though VLMs "see" the image directly! We explore chain-of-thought (CoT) reasoning and find that while standard CoT reasoning causes drops in performance, advanced methods like self-consistency can help recover it. Furthermore, we find that text-only few-shot examples enhance VLMs' alignment with the task format, particularly benefiting models prone to verbose zero-shot answers. Lastly, to mitigate the challenges associated with evaluating free-form open-ended VQA responses using string-matching based VQA metrics, we introduce a straightforward LLM-guided pre-processing technique to adapt the model responses to the expected ground-truth answer distribution. In summary, our research sheds light on the intricacies of prompting strategies in VLMs for VQA, emphasizing the synergistic use of captions, templates, and pre-processing to enhance model efficacy.

2023-06-16

ArXiv (preprint)

doi.org

arxiv.org

Preventing Dimensional Collapse in Contrastive Local Learning with Subsampling

Louis Fournier

Adeetya Patel

Michael Eickenberg

Edouard Oyallon

Eugene Belilovsky

2023-06-16

ICML.cc/2023/Workshop/LLW (published)

openreview.net

Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding

Le Zhang

Rabiul Awal

Aishwarya Agrawal

Vision-Language Models (VLMs), such as CLIP, exhibit strong image-text comprehension abilities, facilitating advances in several downstream tasks such as zero-shot image classification, image-text retrieval, and text-to-image generation. However, the compositional reasoning abilities of existing VLMs remain subpar. The root of this limitation lies in the inadequate alignment between the images and captions in the pretraining datasets. Additionally, the current contrastive learning objective fails to focus on fine-grained grounding components like relations, actions, and attributes, resulting in "bag-of-words" representations. We introduce a simple and effective method to improve compositional reasoning in VLMs. Our method better leverages available datasets by refining and expanding the standard image-text contrastive learning framework. Our approach does not require specific annotations and does not incur extra parameters. When integrated with CLIP, our technique yields notable improvement over state-of-the-art baselines across five vision-language compositional benchmarks. We open-source our code at https://github.com/lezhang7/Enhance-FineGrained.

2023-06-15

ArXiv (preprint)

arxiv.org

GEANT4-DNA simulation of temperature-dependent and pH-dependent yields of chemical radiolytic species

Jingyi Bian

Juan Duran

Wook-Geun Shin

Jose Ramos-Méndez

Jack C Sankey

Lilian Childress

Jan Seuntjens

Shirin A. Enger

2023-06-15

Physics in Medicine & Biology (published)

doi.org

LEAD: Min-Max Optimization from a Physical Perspective

Reyhane Askari Hemmat

Amartya Mitra

Guillaume Lajoie

Ioannis Mitliagkas

Adversarial formulations have rekindled interest in two-player min-max games. A central obstacle in the optimization of such games is the rotational dynamics that hinder their convergence. In this paper, we show that game optimization shares dynamic properties with particle systems subject to multiple forces, and one can leverage tools from physics to improve optimization dynamics. Inspired by the physical framework, we propose LEAD, an optimizer for min-max games. Next, using Lyapunov stability theory from dynamical systems as well as spectral analysis, we study LEAD’s convergence properties in continuous and discrete time settings for a class of quadratic min-max games to demonstrate linear convergence to the Nash equilibrium. Finally, we empirically evaluate our method on synthetic setups and CIFAR-10 image generation to demonstrate improvements in GAN training.

2023-06-15

TMLR (accepted)

openreview.net

A solution algorithm for chance-constrained problems with integer second-stage recourse decisions

Andrea Lodi

Enrico Malaguti

Michele Monaci

Giacomo Nannicini

Paolo Paronuzzi

2023-06-15

Mathematical programming (published)

doi.org

A2CiD2: Accelerating Asynchronous Communication in Decentralized Deep Learning

Adel Nabli

Eugene Belilovsky

Edouard Oyallon

2023-06-14

ArXiv (preprint)

doi.org

arxiv.org
