Publications

Evolving Domain Generalization
Wei Wang
Gezheng Xu
Ruizhi Pu
Jiaqi Li
Fan Zhou
Charles Ling
Boyu Wang
JARV1S: Phenotype Clone Search for Rapid Zero-Day Malware Triage and Functional Decomposition for Cyber Threat Intelligence
Christopher Molloy
Philippe Charland
Steven H. H. Ding
Benjamin C. M. Fung
Cyber threat intelligence (CTI) has become a critical component of the defense of organizations against the steady surge of cyber attacks. Malware is one of the most challenging problems for CTI, due to its prevalence, the massive number of variants, and the constantly changing threat actor behaviors. Currently, Malpedia has indexed 2,390 unique malware families, while the AVTEST Institute has recorded more than 166 million new unique malware samples in 2021. There exists a vast number of variants per malware family. Consequently, the signature-based representation of patterns and knowledge of legacy systems can no longer be generalized to detect future malware attacks. Machine learning-based solutions can match more variants. However, as a black-box approach, they lack the explainability and maintainability required by incident response teams. There is thus an urgent need for a data-driven system that can abstract a future-proof, human-friendly, systematic, actionable, and dependable knowledge representation from software artifacts from the past for more effective and insightful malware triage. In this paper, we present the first phenotype-based malware decomposition system for quick malware triage that is effective against malware variants. We define phenotypes as directly observable characteristics such as code fragments, constants, functions, and strings. Malware development rarely starts from scratch, and there are many reused components and code fragments. The target under investigation is decomposed into known phenotypes that are mapped to known malware families, malware behaviors, and Advanced Persistent Threat (APT) groups. The implemented system provides visualizable phenotypes through an interactive tree map, helping the cyber analysts to navigate through the decomposition results. We evaluated our system on 200,000 malware samples, 100,000 benign samples, and a malware family with over 27,284 variants. The results indicate our system is scalable, efficient, and effective against zero-day malware and new variants of known families.
Agnostic Physics-Driven Deep Learning
Siddhartha Mishra
Yann Ollivier
Works for Me! Cannot Reproduce – A Large Scale Empirical Study of Non-reproducible Bugs
Mohammad Masudur Rahman
Marco Castelluccio
Adaptive Confidence Calibration
Jonathan W. Pearce
Contextual bandit optimization of super-resolution microscopy
Anthony Bilodeau
Renaud Bernatchez
Albert Michaud-Gagnon
Efficient Fine-Tuning of BERT Models on the Edge
Mohammadreza Tayaranian
Maryam Ziaeefard
James J. Clark
Brett H. Meyer
Warren J. Gross
Resource-constrained devices are increasingly the deployment targets of machine learning applications. Static models, however, do not always suffice for dynamic environments. On-device training of models allows for quick adaptability to new scenarios. With the increasing size of deep neural networks, as noted with the likes of BERT and other natural language processing models, comes increased resource requirements, namely memory, computation, energy, and time. Furthermore, training is far more resource intensive than inference. Resource-constrained on-device learning is thus doubly difficult, especially with large BERT-like models. By reducing the memory usage of fine-tuning, pre-trained BERT models can become efficient enough to fine-tune on resource-constrained devices. We propose Freeze And Reconfigure (FAR), a memory-efficient training regime for BERT-like models that reduces the memory usage of activation maps during fine-tuning by avoiding unnecessary parameter updates. FAR reduces fine-tuning time on the DistilBERT model and CoLA dataset by 30%, and time spent on memory operations by 47%. More broadly, reductions in metric performance on the GLUE and SQuAD datasets are around 1% on average.
Evaluating Multimodal Interactive Agents
Josh Abramson
Arun Ahuja
Federico Carnevale
Petko Georgiev
Alex Goldin
Alden Hung
Jessica Landon
Timothy P Lillicrap
Alistair M. Muldal
Blake Aaron Richards
Adam Santoro
Tamara von Glehn
Greg Wayne
Nathaniel Wong
Chen Yan
Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research. However, evaluating these interactions is challenging: collecting online human-agent interactions is slow and expensive, yet faster proxy metrics often do not correlate well with interactive evaluation. In this paper, we assess the merits of these existing evaluation metrics and present a novel approach to evaluation called the Standardised Test Suite (STS). The STS uses behavioural scenarios mined from real human interaction data. Agents see replayed scenario context, receive an instruction, and are then given control to complete the interaction offline. These agent continuations are recorded and sent to human annotators to mark as success or failure, and agents are ranked according to the proportion of continuations in which they succeed. The resulting STS is fast, controlled, interpretable, and representative of naturalistic interactions. Altogether, the STS consolidates much of what is desirable across many of our standard evaluation metrics, allowing us to accelerate research progress towards producing agents that can interact naturally with humans. A video may be found at https://youtu.be/YR1TngGORGQ.
Correlated Read Noise Reduction in Infrared Arrays Using Deep Learning
Étienne Artigaud
Laurence Perreault Levasseur
René Doyon
We present a new procedure rooted in deep learning to construct science images from data cubes collected by astronomical instruments using HxRG detectors in low-flux regimes. It improves on the drawbacks of the conventional algorithms to construct 2D images from multiple readouts by using the readout scheme of the detectors to reduce the impact of correlated readout noise. We train a convolutional recurrent neural network on simulated astrophysical scenes added to laboratory darks to estimate the flux on each pixel of science images. This method achieves a reduction of the noise on constructed science images when compared to standard flux-measurement schemes (correlated double sampling, up-the-ramp sampling), which results in a reduction of the error on the spectrum extracted from these science images. Over simulated data cubes created in a low signal-to-noise ratio regime where this method could have the largest impact, we find that the error on our constructed science images falls faster than a
MaskEval: Weighted MLM-Based Evaluation for Text Summarization and Simplification
Rachel Bawden
Thomas Scialom
Benoît Sagot
Jackie CK Cheung
AB0393 SURVIVAL ON JANUS KINASE INHIBITORS VERSUS OTHER ADVANCED THERAPIES IN RHEUMATOID ARTHRITIS
N. Bakhtiar
Leanne Gray
S. Bilgrami
Lesley Ottewell
Frank N. Wood
Mohsin Bukhari
ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning
Sean Chen
Jensen Gao
Siddharth Reddy
Anca Dragan
Sergey Levine
Building assistive interfaces for controlling robots through arbitrary, high-dimensional, noisy inputs (e.g., webcam images of eye gaze) can be challenging, especially when it involves inferring the user's desired action in the absence of a natural ‘default’ interface. Reinforcement learning from online user feedback on the system's performance presents a natural solution to this problem, and enables the interface to adapt to individual users. However, this approach tends to require a large amount of human-in-the-loop training data, especially when feedback is sparse. We propose a hierarchical solution that learns efficiently from sparse user feedback: we use offline pre-training to acquire a latent embedding space of useful, high-level robot behaviors, which, in turn, enables the system to focus on using online user feedback to learn a mapping from user inputs to desired high-level behaviors. The key insight is that access to a pre-trained policy enables the system to learn more from sparse rewards than a naïve RL algorithm: using the pre-trained policy, the system can make use of successful task executions to relabel, in hindsight, what the user actually meant to do during unsuccessful executions. We evaluate our method primarily through a user study with 12 participants who perform tasks in three simulated robotic manipulation domains using a webcam and their eye gaze: flipping light switches, opening a shelf door to reach objects inside, and rotating a valve. The results show that our method successfully learns to map 128-dimensional gaze features to 7-dimensional joint torques from sparse rewards in under 10 minutes of online training, and seamlessly helps users who employ different gaze strategies, while adapting to distributional shift in webcam inputs, tasks, and environments.