Publications

Supplementary Material for MixupE

Yingtian Zou

Vikas Verma

Sarthak Mittal

Wai Hoh Tang

Hieu Pham

Juho Kannala

Yoshua Bengio

Arno Solin

Kenji Kawaguchi

We denote by z = (x,y) the input and output pair where x ∈ X ⊆ R and y ∈ Y ⊆ R . Let fθ(x) ∈ R be the output of the logits (i.e.,… (see more) the last layer before the softmax or sigmoid) of the model parameterized by θ. We use l(θ, z) = h(fθ(x)) − yfθ(x) to denote the loss function. Let g(·) be the activation function. We use x(i) to index i-th element of the vector x and xj to represent j-th variable in a set. The notation list is:

2023-01-01

(published)

A Survey of Diversification Metrics and Approaches in Retrieval Systems: From the Perspective of Search and Recommendation

Haolun Wu

Yansen Zhang

Chen Ma

Fuyuan Lyu

Fernando Diaz

Xue (Steve) Liu

Diversifying search results is an important research topic in retrieval systems in order to satisfy both the various interests of customers … (see more)and the equal market exposure of providers. There has been a growing attention on diversity-aware research during recent years, accompanied by a proliferation of literature on methods to promote diversity in search and recommendation. However, the diversity-aware studies in retrieval systems lack a systematic organization and are rather fragmented. In this survey, we are the first to propose a unified taxonomy for classifying the metrics and approaches of diversification in both search and recommendation, which are two of the most extensively researched fields of retrieval systems. We begin the survey with a brief discussion of why diversity is important in retrieval systems

2023-01-01

(published)

Survey of Scientific Rigor Studied in Machine Learning

D. Sculley

Gary R. Holt

Daniel R. Golovin

Eugene V. Davydov

Todd Phillips

Dietmar Ebner

Michael Young

Jean-francois Crespo

Dan Dennison

Emily Fox

Hugo Larochelle

The concern that Artificial Intelligence (AI) and Machine Learning (ML) are entering a “reproducibility crisis” has spurred significant … (see more)research in the past few years. Yet with each paper, it is often unclear what someone means by “reproducibility” and where it fits in the larger scope of what we will call the “scientific rigor” literature. Ultimately, the lack of clear rigor standards can affect the manner in which businesses seeking to adopt AI/ML implement such capabilities. In this survey, we will use 66 papers published since 2017 to construct a proposed set of 8 high-level categories of scientific rigor, what they are, and the history of work conducted in each. Our proposal is that these eight rigor types are not mutually exclusive and present a model for how they influence each other. To encourage more to study these questions, we map these rigors to the adoption process in real-world business use cases. In doing so, we can quantify gaps in the literature that suggest an under focus on the issues necessary for scientific rigor research to transition to practice

2023-01-01

(published)

Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning

Sébastien Lachapelle

Tristan Deleu

Divyat Mahajan

Ioannis Mitliagkas

Yoshua Bengio

Simon Lacoste-Julien

Quentin Bertrand

Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding … (see more)is limited. In this work, we provide evidence that disentangled representations coupled with sparse base-predictors improve generalization. In the context of multi-task learning, we prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations. Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem. Finally, we explore a meta-learning version of this algorithm based on group Lasso multiclass SVM base-predictors, for which we derive a tractable dual formulation. It obtains competitive results on standard few-shot classification benchmarks, while each task is using only a fraction of the learned representations.

2023-01-01

ICML (published)

Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning

Sébastien Lachapelle

Tristan Deleu

Divyat Mahajan

Ioannis Mitliagkas

Yoshua Bengio

Simon Lacoste-Julien

Quentin Bertrand

Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding … (see more)is limited. In this work, we provide evidence that disentangled representations coupled with sparse task-specific predictors improve generalization. In the context of multi-task learning, we prove a new identifiability result that provides conditions under which maximally sparse predictors yield disentangled representations. Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem. Finally, we explore a meta-learning version of this algorithm based on group Lasso multiclass SVM predictors, for which we derive a tractable dual formulation. It obtains competitive results on standard few-shot classification benchmarks, while each task is using only a fraction of the learned representations.

2023-01-01

ICML (published)

proceedings.mlr.press

Syntactic Substitutability as Unsupervised Dependency Syntax

Jasper Jian

Siva Reddy

2023-01-01

EMNLP (published)

Technology-enhanced trauma training in low-resource settings: A scoping review and feasibility analysis of educational technologies.

Minahil Khan

Fabio Botelho

Laura Pinkham

Elena Guadagno

Dan Poenaru

2023-01-01

Journal of Pediatric Surgery (published)

Temporal Abstraction in Reinforcement Learning with the Successor Representation

Marlos C. Machado

Andre Barreto

Doina Precup

Michael Bowling

arxiv.org

Test-time Defense against Adversarial Attacks: Detection and Reconstruction of Adversarial Examples via Masked Autoencoder

Yun-Yun Tsai

Ju-Chin Chao

Albert Wen

Zhaoyuan Yang

Chengzhi Mao

Tapan Shah

Junfeng Yang

Existing defense methods against adversarial attacks can be categorized into training time and test time defenses. Training time defense, i.… (see more)e., adversarial training, requires a signiﬁcant amount of extra time for training and is often not able to be generalized to unseen attacks. On the other hand, test time defense by test time weight adaptation requires access to perform gradient descent on (part of) the model weights, which could be infeasible for models with frozen weights. To address these challenges, we propose DRAM, a novel defense method to Detect and Reconstruct the multiple types of Adversarial attacks via Masked autoencoder (MAE). We demonstrate how to use MAE losses to build a KS-test to detect adversarial attacks. Moreover, the MAE losses can be used to repair adversarial samples from unseen attack types. In this sense, DRAM neither requires model weight updates in test time nor augments the training set with more adversarial samples. Evaluating DRAM on the large-scale ImageNet data, we achieve the best detection rate of 82% on average on eight types of adversarial attacks compared with other detection baselines. For reconstruction, DRAM improves the robust accuracy by 6% ∼ 41% for Standard ResNet50 and 3% ∼ 8% for Robust ResNet50 compared with other self-supervision tasks, such as rotation prediction and contrastive learning.

2023-01-01

(published)

The Age of Ransomware: A Survey on the Evolution, Taxonomy, and Research Directions

Salwa Razaulla

Claude Fachkha

Christine Markarian

Amjad Gawanmeh

Wathiq Mansoor

Benjamin Fung

Chadi Assi

The proliferation of ransomware has become a significant threat to cybersecurity in recent years, causing significant financial, reputationa… (see more)l, and operational damage to individuals and organizations. This paper aims to provide a comprehensive overview of the evolution of ransomware, its taxonomy, and its state-of-the-art research contributions. We begin by tracing the origins of ransomware and its evolution over time, highlighting the key milestones and major trends. Next, we propose a taxonomy of ransomware that categorizes different types of ransomware based on their characteristics and behavior. Subsequently, we review the existing research over several years in regard to detection, prevention, mitigation, and prediction techniques. Our extensive analysis, based on more than 150 references, has revealed that significant research, specifically 72.8%, has focused on detecting ransomware. However, a lack of emphasis has been placed on predicting ransomware. Additionally, of the studies focused on ransomware detection, a significant portion, 70%, have utilized Machine Learning methods. This study uncovers a range of shortcomings in research pertaining to real-time protection and identifying zero-day ransomware, and two issues specific to Machine Learning models. Adversarial machine learning exploitation and concept drift have been identified as under-researched areas in the field. This survey is a constructive roadmap for researchers interested in ransomware research matters.

2023-01-01

IEEE Access (published)

The Dormant Neuron Phenomenon in Deep Reinforcement Learning

Ghada Sokar

Rishabh Agarwal

Pablo Samuel Castro

Utku Evci

In this work we identify the dormant neuron phenomenon in deep reinforcement learning, where an agent's network suffers from an increasing n… (see more)umber of inactive neurons, thereby affecting network expressivity. We demonstrate the presence of this phenomenon across a variety of algorithms and environments, and highlight its effect on learning. To address this issue, we propose a simple and effective method (ReDo) that Recycles Dormant neurons throughout training. Our experiments demonstrate that ReDo maintains the expressive power of networks by reducing the number of dormant neurons and results in improved performance.

2023-01-01

ICML (published)