Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation
Brennan Nichyporuk
Jillian L. Cardinell
Justin Szeto
Raghav Mehta
Jean-Pierre R. Falet
Douglas Arnold
Sotirios A. Tsaftaris
Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real-world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the "ground-truth" label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g. lesions), where the annotation process is much more subjective, and affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, offers a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.
When Do We Need GNN for Node Classification?
Sitao Luan
Chenqing Hua
Qincheng Lu
Jiaqi Zhu
Xiao-Wen Chang
Notational Programming for Notebook Environments: A Case Study with Quantum Circuits
Ian A. Arawjo
Anthony DeArmas
Michael Roberts
Shrutarshi Basu
Tapan S. Parikh
We articulate a vision for computer programming that includes pen-based computing, a paradigm we term notational programming. Notational programming blurs contexts: certain typewritten variables can be referenced in handwritten notation and vice-versa. To illustrate this paradigm, we developed an extension, Notate, to computational notebooks which allows users to open drawing canvases within lines of code. As a case study, we explore quantum programming and designed a notation, Qaw, that extends quantum circuit notation with abstraction features, such as variable-sized wire bundles and recursion. Results from a usability study with novices suggest that users find our core interaction of implicit cross-context references intuitive, but also indicate a need for improvements to debugging infrastructure, interface design, and recognition rates. Throughout, we discuss questions raised by the notational paradigm, including a shift from ‘recognition’ of notations to ‘reconfiguration’ of practices and values around programming, and from ‘sketching’ to writing and drawing, or what we call ‘notating.’
Low-Rank Representation of Reinforcement Learning Policies
We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy in a reproducing kernel Hilbert space (RKHS). Using RKHS-based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability and convergence guarantees. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly represented in a low-dimensional space while the embedded policy incurs almost no decrease in returns.
Segmentation of Multiple Sclerosis Lesions across Hospitals: Learn Continually or Train from Scratch?
Enamundram Naga Karthik
Anne Kerbrat
Pierre Labauge
Tobias Granberg
Jason F. Talbott
Daniel S Reich
Massimo Filippi
Rohit Bakshi
Virginie Callot
Segmentation of Multiple Sclerosis (MS) lesions is a challenging problem. Several deep-learning-based methods have been proposed in recent years. However, most methods tend to be static, that is, a single model trained on a large, specialized dataset, which does not generalize well. Instead, the model should learn across datasets arriving sequentially from different hospitals by building upon the characteristics of lesions in a continual manner. In this regard, we explore experience replay, a well-known continual learning method, in the context of MS lesion segmentation across multi-contrast data from 8 different hospitals. Our experiments show that replay is able to achieve positive backward transfer and reduce catastrophic forgetting compared to sequential fine-tuning. Furthermore, replay outperforms multi-domain training, thereby emerging as a promising solution for the segmentation of MS lesions. The code is available at this link: https://github.com/naga-karthik/continual-learning-ms
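As a rough illustration of the experience replay idea mentioned in this abstract, the sketch below mixes a few stored samples from earlier sites into each batch drawn from the current site. It is not taken from the linked repository: the names `ReplayBuffer` and `replay_batches`, the reservoir-sampling buffer policy, and all sizes are assumptions chosen for illustration.

```python
import random


class ReplayBuffer:
    """Bounded memory of samples from earlier sites (hypothetical helper).

    Uses reservoir sampling, an assumed policy, so every sample seen so far
    has the same chance of staying in the buffer.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            # Replace a stored sample with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = sample

    def draw(self, k):
        # Return up to k stored samples without repetition.
        return self.rng.sample(self.samples, min(k, len(self.samples)))


def replay_batches(datasets, batch_size=4, replay_per_batch=2, capacity=8):
    """Yield (site_index, batch) pairs for sites visited one after another.

    Each batch holds samples from the current site plus a few replayed
    samples from previously visited sites. The buffer is updated only after
    a site has been fully visited, so the first site sees no replay.
    """
    buffer = ReplayBuffer(capacity)
    for site, data in enumerate(datasets):
        for start in range(0, len(data), batch_size):
            current = list(data[start:start + batch_size])
            yield site, current + buffer.draw(replay_per_batch)
        for sample in data:
            buffer.add(sample)
```

In a training loop, each yielded batch would be passed to the segmentation model's update step; sequential fine-tuning corresponds to setting `replay_per_batch=0`.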
Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model
Yuesong Zou
Ahmad Pesaranghader
Ziyang Song
Aman Verma
Invariant representation driven neural classifier for anti-QCD jet tagging
Taoli Cheng
Learning Latent Structural Causal Models
Jithendaraa Subramanian
Yashas Annadani
Ivaxi Sheth
Nan Rosemary Ke
Tristan Deleu
Stefan Bauer
Causal learning has long concerned itself with the accurate recovery of underlying causal mechanisms. Such causal modelling enables better explanations of out-of-distribution data. Prior works on causal learning assume that the high-level causal variables are given. However, in machine learning tasks, one often operates on low-level data like image pixels or high-dimensional vectors. In such settings, the entire Structural Causal Model (SCM), comprising structure, parameters, and high-level causal variables, is unobserved and needs to be learnt from low-level data. We treat this problem as Bayesian inference of the latent SCM, given low-level data. For linear Gaussian additive noise SCMs, we present a tractable approximate inference method which performs joint inference over the causal variables, structure and parameters of the latent SCM from random, known interventions. Experiments are performed on synthetic datasets and a causally generated image dataset to demonstrate the efficacy of our approach. We also perform image generation from unseen interventions, thereby verifying out-of-distribution generalization for the proposed causal model.
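To make the term "linear Gaussian additive noise SCM" concrete, the sketch below draws one sample of the causal variables from such a model under known hard interventions. It shows only the forward sampling step, not the inference method of the paper. The function name `sample_linear_gaussian_scm`, the weight-matrix convention, and the assumption that variables are already listed in topological order are all illustrative choices.

```python
import random


def sample_linear_gaussian_scm(weights, noise_std, interventions=None, rng=None):
    """Draw one sample from a linear Gaussian additive noise SCM (sketch).

    weights[i][j] is the linear effect of variable i on variable j.
    Assumption: variables are in topological order, so only i < j matters.
    interventions maps a variable index to the value it is clamped to
    (a hard intervention that cuts the variable off from its parents).
    """
    rng = rng or random.Random(0)
    interventions = interventions or {}
    n = len(weights)
    x = [0.0] * n
    for j in range(n):
        if j in interventions:
            # Intervened variable: ignore parents and noise.
            x[j] = float(interventions[j])
            continue
        # Structural equation: weighted sum of parents plus Gaussian noise.
        mean = sum(weights[i][j] * x[i] for i in range(j))
        x[j] = mean + rng.gauss(0.0, noise_std)
    return x
```

For a chain 0 → 1 → 2 with weights 2 and 3, clamping variable 1 to a fixed value changes variable 2 but leaves variable 0 unaffected, which is the asymmetry that interventional data exposes.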
Machine learning-based incremental learning in interactive domain modelling
Rijul Saini
Gunter Mussbacher
Jörg Kienzle
scCobra: Contrastive cell embedding learning with domain-adaptation for single-cell data integration and harmonization
Bowen Zhao
Dong-Qing Wei
Yi Xiong