Publications

Attention for Inference Compilation
William Harvey
Andreas Munk
Atilim Güneş Baydin
Alexander Bergholm
Frank Wood
We present a new approach to automatic amortized inference in universal probabilistic programs which improves performance compared to current methods. Our approach is a variation of inference compilation (IC) which leverages deep neural networks to approximate a posterior distribution over latent variables in a probabilistic program. A challenge with existing IC network architectures is that they can fail to model long-range dependencies between latent variables. To address this, we introduce an attention mechanism that attends to the most salient variables previously sampled in the execution of a probabilistic program. We demonstrate that the addition of attention allows the proposal distributions to better match the true posterior, enhancing inference about latent variables in simulators.
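To make the idea of attending to previously sampled variables concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is not the architecture from the paper: the function names, the two-dimensional embeddings, and the choice of dot-product scoring are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention over previously sampled variables.

    query  : embedding of the variable about to be sampled (hypothetical)
    keys   : one embedding per previously sampled variable (hypothetical)
    values : one value vector per previously sampled variable (hypothetical)
    Returns the attention weights and the weighted context vector.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Toy embeddings for three earlier samples; the numbers are made up.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.7]]
query = [2.0, 0.0]

weights, context = attend(query, keys, values)
```

In this toy setup the first and third earlier samples receive equal weight and the second receives less, because only the first coordinate of each key matches the query. A proposal network could consume `context` when choosing the proposal distribution for the next latent variable.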
Predicting Adverse Radiation Effects in Brain Tumors After Stereotactic Radiotherapy With Deep Learning and Handcrafted Radiomics
Simon A. Keek
Manon Beuque
Sergey Primakov
Henry C. Woodruff
Avishek Chatterjee
Janita E. van Timmeren
Lizza E. L. Hendriks
Johannes Kraft
Nicolaus Andratschke
Steve E. Braunstein
Olivier Morin
Philippe Lambin
Introduction: There is a cumulative risk of 20–40% of developing brain metastases (BM) in solid cancers. Stereotactic radiotherapy (SRT) enables the application of high focal doses of radiation to a volume and is often used for BM treatment. However, SRT can cause adverse radiation effects (ARE), such as radiation necrosis, which sometimes cause irreversible damage to the brain. It is therefore of clinical interest to identify patients at a high risk of developing ARE. We hypothesized that models trained with radiomics features, deep learning (DL) features, and patient characteristics or their combination can predict ARE risk in patients with BM before SRT. Methods: Gadolinium-enhanced T1-weighted MRIs and characteristics from patients treated with SRT for BM were collected for a training and testing cohort (N = 1,404) and a validation cohort (N = 237) from a separate institute. From each lesion in the training set, radiomics features were extracted and used to train an extreme gradient boosting (XGBoost) model. A DL model was trained on the same cohort to make a separate prediction and to extract the last layer of features. Different models using XGBoost were built using only radiomics features, DL features, and patient characteristics or a combination of them. Evaluation was performed using the area under the curve (AUC) of the receiver operating characteristic curve on the external dataset. Predictions for individual lesions and per patient developing ARE were investigated. Results: The best-performing XGBoost model on a lesion level was trained on a combination of radiomics features and DL features (AUC of 0.71 and recall of 0.80). On a patient level, a combination of radiomics features, DL features, and patient characteristics obtained the best performance (AUC of 0.72 and recall of 0.84). The DL model achieved an AUC of 0.64 and recall of 0.85 per lesion and an AUC of 0.70 and recall of 0.60 per patient. Conclusion: Machine learning models built on radiomics features and DL features extracted from BM combined with patient characteristics show potential to predict ARE at the patient and lesion levels. These models could be used in clinical decision making, informing patients on their risk of ARE and allowing physicians to opt for different therapies.
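The two evaluation measures reported above, AUC and recall, can be sketched in a few lines of plain Python. This is an illustration only: the labels, scores, and the 0.5 threshold below are invented, and the study's actual evaluation code is not described in the abstract.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def recall(labels, scores, threshold):
    """Fraction of true positives whose score reaches the threshold."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        raise ValueError("recall needs at least one positive")
    return sum(1 for s in positives if s >= threshold) / len(positives)


# Invented per-lesion labels (1 = ARE) and model scores.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.3]

auc_value = roc_auc(labels, scores)
recall_value = recall(labels, scores, threshold=0.5)
```

AUC is threshold-free, while recall depends on the chosen cut-off, which is why a model can pair a modest AUC with a high recall, as in the per-lesion DL result above.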
Interpretable Malware Classification based on Functional Analysis
Miles Q. Li
Benjamin C. M. Fung
Transfer functions: learning about a lagged exposure-outcome association in time-series data
Hiroshi Mamiya
Alexandra M. Schmidt
Erica E. M. Moodie
David L. Buckeridge
Many population exposures in time-series analysis, including food marketing, exhibit a time-lagged association with population health outcomes such as food purchasing. A common approach to measuring patterns of associations over different time lags relies on a finite-lag model, which requires correct specification of the maximum duration over which the lagged association extends. However, the maximum lag is frequently unknown due to the lack of substantive knowledge or the geographic variation of lag length. We describe a time-series analytical approach based on an infinite lag specification under a transfer function model that avoids the specification of an arbitrary maximum lag length. We demonstrate its application to estimate the lagged exposure-outcome association in food environmental research: display promotion of sugary beverages with lagged sales.
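One common way to obtain an infinite lag without choosing a maximum lag is a first-order recursion with geometric decay. The sketch below assumes that form; the paper's model may be specified differently, and the parameter names `beta` (immediate effect) and `rho` (decay rate) are illustrative.

```python
def transfer_effect(exposure, beta, rho):
    """Cumulative lagged effect E_t = rho * E_{t-1} + beta * x_t.

    Unrolling the recursion gives E_t = sum_k beta * rho**k * x_{t-k},
    so every past exposure contributes with geometrically decaying weight.
    """
    effects = []
    previous = 0.0
    for x_t in exposure:
        previous = rho * previous + beta * x_t
        effects.append(previous)
    return effects


def explicit_lag_sum(exposure, beta, rho, t):
    """The same effect at time t, written as an explicit sum over lags."""
    return sum(beta * rho ** k * exposure[t - k] for k in range(t + 1))


# A single promotion in week 0, none afterwards (made-up data).
exposure = [1.0, 0.0, 0.0, 0.0]
effects = transfer_effect(exposure, beta=2.0, rho=0.5)
```

Here the effect of the week-0 promotion halves each week and never has to be cut off at an arbitrary lag; the decay rate is a parameter to estimate rather than a window to fix in advance.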
An Introduction to Lifelong Supervised Learning
Mojtaba Farmazi
Sanket Vaibhav Mehta
Mohamed Abdelsalam
Janarthanan Janarthanan
A. Chandar
This primer is an attempt to provide a detailed summary of the different facets of lifelong learning. We start with Chapter 2 which provides a high-level overview of lifelong learning systems. In this chapter, we discuss prominent scenarios in lifelong learning (Section 2.4), provide a high-level organization of different lifelong learning approaches (Section 2.5), enumerate the desiderata for an ideal lifelong learning system (Section 2.6), discuss how lifelong learning is related to other learning paradigms (Section 2.7), and describe common metrics used to evaluate lifelong learning systems (Section 2.8). This chapter is more useful for readers who are new to lifelong learning and want to get introduced to the field without focusing on specific approaches or benchmarks. The remaining chapters focus on specific aspects (either learning algorithms or benchmarks) and are more useful for readers who are looking for specific approaches or benchmarks. Chapter 3 focuses on regularization-based approaches that do not assume access to any data from previous tasks. Chapter 4 discusses memory-based approaches that typically use a replay buffer or an episodic memory to save a subset of data across different tasks. Chapter 5 focuses on different architecture families (and their instantiations) that have been proposed for training lifelong learning systems. Following these different classes of learning algorithms, we discuss the commonly used evaluation benchmarks and metrics for lifelong learning (Chapter 6) and wrap up with a discussion of future challenges and important research directions in Chapter 7.
Partial Disentanglement via Mechanism Sparsity
FIXME: synchronize with database! An empirical study of data access self-admitted technical debt
Biruk Asmare Muse
Csaba Nagy
Anthony Cleve
Giuliano Antoniol
Advanced MRI scan acquisition metrics improve baseline disease severity predictions compared to traditional community MRI scan metrics
Abdul Al-Shawwa
David W. Cadotte
David Anderson
Nathan Evaniew
Bradley Jacobs
Julien Cohen‐Adad
Degenerative Cervical Myelopathy (DCM) is the functional derangement of the spinal cord and is one of the most common atraumatic spinal cord injuries. Magnetic resonance imaging (MRI) scans are key in confirming the diagnosis of DCM in patients, though the utilization of higher fidelity MRI scans and their integration into machine learning models remains largely unexplored. This study looks at the predictive ability of common community MRI scans in comparison to high fidelity scans in disease diagnosis. We hypothesize that the utilization of higher fidelity "advanced" MRI scans will increase the effectiveness of machine learning models predicting DCM severity. Through the utilization of Random Forest Classifiers, we have been able to predict disease severity with 41.8% accuracy in current community MRI scans and 63.9% in the advanced MRI scans. Furthermore, across the different predictive model variations tested, the advanced MRI scans consistently produced higher prediction accuracies compared to the community MRI counterparts. These results support our hypothesis and indicate that machine learning models have the potential to predict disease severity. However, neither performed well enough to be considered for use in clinical practice, indicating that the utilization of more sophisticated machine learning models may be required for these purposes.
Joint Multisided Exposure Fairness for Recommendation
Bhaskar Mitra
Xue Liu
Prior research on exposure fairness in the context of recommender systems has focused mostly on disparities in the exposure of individual or groups of items to individual users of the system. The problem of how individual or groups of items may be systemically under- or over-exposed to groups of users, or even all users, has received relatively less attention. However, such systemic disparities in information exposure can result in observable social harms, such as withholding economic opportunities from historically marginalized groups (allocative harm) or amplifying gendered and racialized stereotypes (representational harm). Previously, Diaz et al. developed the expected exposure metric, which incorporates existing user browsing models previously developed for information retrieval, to study fairness of content exposure to individual users. We extend their proposed framework to formalize a family of exposure fairness metrics that model the problem jointly from the perspective of both the consumers and producers. Specifically, we consider group attributes for both types of stakeholders to identify and mitigate fairness concerns that go beyond individual users and items towards more systemic biases in recommendation. Furthermore, we study and discuss the relationships between the different exposure fairness dimensions proposed in this paper, as well as demonstrate how stochastic ranking policies can be optimized towards said fairness goals.
On Natural Language User Profiles for Transparent and Scrutable Recommendation
Filip Radlinski
Krisztian Balog
Lucas Dixon
Ben Wedin
Natural interaction with recommendation and personalized search systems has received tremendous attention in recent years. We focus on the challenge of supporting people's understanding and control of these systems and explore a fundamentally new way of thinking about representation of knowledge in recommendation and personalization systems. Specifically, we argue that it may be both desirable and possible for algorithms that use natural language representations of users' preferences to be developed. We make the case that this could provide significantly greater transparency, as well as affordances for practical actionable interrogation of, and control over, recommendations. Moreover, we argue that such an approach, if successfully applied, may enable a major step towards systems that rely less on noisy implicit observations while increasing portability of knowledge of one's interests.
Retrieval-Enhanced Machine Learning
Hamed Zamani
Mostafa Dehghani
Donald Metzler
Michael Bendersky
Although information access systems have long supported people in accomplishing a wide range of tasks, we propose broadening the scope of users of information access systems to include task-driven machines, such as machine learning models. In this way, the core principles of indexing, representation, retrieval, and ranking can be applied and extended to substantially improve model generalization, scalability, robustness, and interpretability. We describe a generic retrieval-enhanced machine learning (REML) framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
Offline Retrieval Evaluation Without Evaluation Metrics
Offline evaluation of information retrieval and recommendation has traditionally focused on distilling the quality of a ranking into a scalar metric such as average precision or normalized discounted cumulative gain. We can use this metric to compare the performance of multiple systems for the same request. Although evaluation metrics provide a convenient summary of system performance, they also collapse subtle differences across users into a single number and can carry assumptions about user behavior and utility not supported across retrieval scenarios. We propose recall-paired preference (RPP), a metric-free evaluation method based on directly computing a preference between ranked lists. RPP simulates multiple user subpopulations per query and compares systems across these pseudo-populations. Our results across multiple search and recommendation tasks demonstrate that RPP substantially improves discriminative power while correlating well with existing metrics and being equally robust to incomplete data.
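A simplified reading of the recall-paired idea can be sketched as follows: treat each recall level (the first relevant item, the second, and so on) as one simulated subpopulation, and for each level prefer the ranking that reaches it at an earlier position. This is an assumption-laden illustration, not the paper's exact definition; the handling of unretrieved items and the averaging over levels are choices made here.

```python
import math


def relevant_positions(ranking, relevant):
    """1-based positions of relevant items, padded with infinity.

    Relevant items that a ranking never retrieves are treated as found
    at position infinity (an assumption of this sketch). Rankings are
    assumed to contain no duplicate items.
    """
    positions = [rank for rank, doc in enumerate(ranking, start=1) if doc in relevant]
    positions += [math.inf] * (len(relevant) - len(positions))
    return positions


def recall_paired_preference(ranking_a, ranking_b, relevant):
    """Average preference for ranking_a over ranking_b across recall levels.

    Returns a value in [-1, 1]; positive values favor ranking_a.
    """
    if not relevant:
        raise ValueError("need at least one relevant item")
    pos_a = relevant_positions(ranking_a, relevant)
    pos_b = relevant_positions(ranking_b, relevant)
    total = 0
    for a, b in zip(pos_a, pos_b):
        if a < b:
            total += 1   # ranking_a reaches this recall level first
        elif a > b:
            total -= 1   # ranking_b reaches this recall level first
    return total / len(relevant)


# Made-up document identifiers; d1..d3 are the relevant items.
relevant = {"d1", "d2", "d3"}
system_a = ["d1", "x", "d2", "y", "d3"]
system_b = ["x", "d1", "d2", "y", "d3"]

preference = recall_paired_preference(system_a, system_b, relevant)
```

In this toy example system A reaches the first relevant item sooner and ties on the other two levels, so the preference is mildly positive. No scalar metric is computed for either ranking on its own; only the pairwise comparison is produced.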