Publications

Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints

Jose Gallego-Posada

The performance of trained neural networks is robust to harsh levels of pruning. Coupled with the ever-growing size of deep learning models,… (voir plus) this observation has motivated extensive research on learning sparse models. In this work, we focus on the task of controlling the level of sparsity when performing sparse learning. Existing methods based on sparsity-inducing penalties involve expensive trial-and-error tuning of the penalty factor, thus lacking direct control of the resulting model sparsity. In response, we adopt a constrained formulation: using the gate mechanism proposed by Louizos et al. (2018), we formulate a constrained optimization problem where sparsification is guided by the training objective and the desired sparsity target in an end-to-end fashion. Experiments on CIFAR-{10, 100}, TinyImageNet, and ImageNet using WideResNet and ResNet{18, 50} models validate the effectiveness of our proposal and demonstrate that we can reliably achieve pre-determined sparsity targets without compromising on predictive performance.

2022-10-30

NeurIPS.cc/2022/Conference (accepté)

doi.org

openreview.net

MAgNet: Mesh Agnostic Neural PDE Solver

The computational complexity of classical numerical methods for solving Partial Differential Equations (PDE) scales significantly as the res… (voir plus)olution increases. As an important example, climate predictions require fine spatio-temporal resolutions to resolve all turbulent scales in the fluid simulations. This makes the task of accurately resolving these scales computationally out of reach even with modern supercomputers. As a result, current numerical modelers solve PDEs on grids that are too coarse (3km to 200km on each side), which hinders the accuracy and usefulness of the predictions. In this paper, we leverage the recent advances in Implicit Neural Representations (INR) to design a novel architecture that predicts the spatially continuous solution of a PDE given a spatial position query. By augmenting coordinate-based architectures with Graph Neural Networks (GNN), we enable zero-shot generalization to new non-uniform meshes and long-term predictions up to 250 frames ahead that are physically consistent. Our Mesh Agnostic Neural PDE Solver (MAgNet) is able to make accurate predictions across a variety of PDE simulation datasets and compares favorably with existing baselines. Moreover, MAgNet generalizes well to different meshes and resolutions up to four times those trained on.

2022-10-30

NeurIPS.cc/2022/Conference (accepté)

doi.org

openreview.net

Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation

Brennan Nichyporuk

Jillian Cardinell

Justin Szeto

Raghav Mehta

Jean-Pierre R. Falet

Douglas L. Arnold

Sotirios A. Tsaftaris

Tal Arbel

Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, wh… (voir plus)ere unreliable predictions can have real world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the "ground-truth" label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g. lesions), where the annotation process is much more subjective, and affected by a number underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, poses a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.

2022-10-30

ArXiv (prépublication)

doi.org

arxiv.org

When Do We Need Graph Neural Networks for Node Classification?

Sitao Luan

Chenqing Hua

Qincheng Lu

Jiaqi Zhu

Xiao-Wen Chang

Doina Precup

2022-10-29

arXiv.org (prépublication)

doi.org

arxiv.org

Notational Programming for Notebook Environments: A Case Study with Quantum Circuits

Ian Arawjo

Anthony DeArmas

Michael Roberts

Shrutarshi Basu

Tapan Parikh

We articulate a vision for computer programming that includes pen-based computing, a paradigm we term notational programming. Notational pro… (voir plus)gramming blurs contexts: certain typewritten variables can be referenced in handwritten notation and vice-versa. To illustrate this paradigm, we developed an extension, Notate, to computational notebooks which allows users to open drawing canvases within lines of code. As a case study, we explore quantum programming and designed a notation, Qaw, that extends quantum circuit notation with abstraction features, such as variable-sized wire bundles and recursion. Results from a usability study with novices suggest that users find our core interaction of implicit cross-context references intuitive, but suggests further improvements to debugging infrastructure, interface design, and recognition rates. Throughout, we discuss questions raised by the notational paradigm, including a shift from ‘recognition’ of notations to ‘reconfiguration’ of practices and values around programming, and from ‘sketching’ to writing and drawing, or what we call ‘notating.’

2022-10-27

ACM Symposium on User Interface Software and Technology (publié)

doi.org

Segmentation of Multiple Sclerosis Lesions across Hospitals: Learn Continually or Train from Scratch?

Enamundram Naga Karthik

Anne Kerbrat

Pierre Labauge

Tobias Granberg

Jason F. Talbott

Daniel S Reich

Massimo Filippi

Rohit Bakshi

Virginie Callot

A. Chandar

Julien Cohen-Adad

Segmentation of Multiple Sclerosis (MS) lesions is a challenging problem. Several deep-learning-based methods have been proposed in recent y… (voir plus)ears. However, most methods tend to be static, that is, a single model trained on a large, specialized dataset, which does not generalize well. Instead, the model should learn across datasets arriving sequentially from different hospitals by building upon the characteristics of lesions in a continual manner. In this regard, we explore experience replay, a well-known continual learning method, in the context of MS lesion segmentation across multi-contrast data from 8 different hospitals. Our experiments show that replay is able to achieve positive backward transfer and reduce catastrophic forgetting compared to sequential fine-tuning. Furthermore, replay outperforms the multi-domain training, thereby emerging as a promising solution for the segmentation of MS lesions. The code is available at this link: https://github.com/naga-karthik/continual-learning-ms

2022-10-26

ArXiv (prépublication)

doi.org

arxiv.org

Arc travel time and path choice model estimation subsumed

Sobhan Mohammadpour

Emma Frejinger

We propose a method for maximum likelihood estimation of path choice model parameters and arc travel time using data of diﬀerent levels of… (voir plus) granularity. Hitherto these two tasks have been tackled separately under strong assumptions. Using a small example, we illustrate that this can lead to biased results. Results on both real (New York yellow cab) and simulated data show strong performance of our method compared to existing baselines. models and loss functions. It is designed to estimate arc travel time and path choice model parameters simultaneously. We showed that by marginalizing the unobserved variables and using stochastic gradient estimates, we obtain a maximum likelihood estimation even for observations at diﬀerent level of granularity. We showed that we can mix diﬀerent data type when computing the MLE without needing to use a linear combination of losses as

2022-10-24

ArXiv (prépublication)

doi.org

arxiv.org

Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model

Yuesong Zou

Ahmad Pesaranghader

Ziyang Song

Aman Verma

David L. Buckeridge

Yue Li

The rapid growth of electronic health record (EHR) datasets opens up promising opportunities to understand human diseases in a systematic wa… (voir plus)y. However, effective extraction of clinical knowledge from the EHR data has been hindered by its sparsity and noisy information. We present GAT-ETM, an end-to-end knowledge graph-based multimodal embedded topic model. GAT-ETM distills latent disease topics from EHR data by learning the embedding from a constructed medical knowledge graph. We applied GAT-ETM to a large-scale EHR dataset consisting of over 1 million patients. We evaluated its performance based on EHR reconstruction and drug imputation. GAT-ETM demonstrated superior performance over the alternative methods on both tasks. Moreover, our model learned clinically meaningful graph-informed embedding of the EHR codes. In additional, our model is also able to discover interpretable and accurate patient representations for patient stratification and drug recommendations. Our code is available at Anonymous GitHub.

2022-10-24

Scientific Reports (publié)

doi.org

Invariant representation driven neural classifier for anti-QCD jet tagging

Taoli Cheng

Aaron Courville

We leverage representation learning and the inductive bias in neural-net-based Standard Model jet classification tasks, to detect non-QCD si… (voir plus)gnal jets. In establishing the framework for classification-based anomaly detection in jet physics, we demonstrate that, with a \emph{well-calibrated} and \emph{powerful enough feature extractor}, a well-trained \emph{mass-decorrelated} supervised Standard Model neural jet classifier can serve as a strong generic anti-QCD jet tagger for effectively reducing the QCD background. Imposing \emph{data-augmented} mass-invariance (and thus decoupling the dominant factor) not only facilitates background estimation, but also induces more substructure-aware representation learning. We are able to reach excellent tagging efficiencies for all the test signals considered. In the best case, we reach a background rejection rate of 51 and a significance improvement factor of 3.6 at 50 \% signal acceptance, with the jet mass decorrelated. This study indicates that supervised Standard Model jet classifiers have great potential in general new physics searches.

2022-10-23

Journal of High Energy Physics (publié)

doi.org

arxiv.org

Learning Latent Structural Causal Models

Jithendaraa Subramanian

Yashas Annadani

Ivaxi Sheth

Nan Rosemary Ke

Tristan Deleu

Stefan Bauer

D. Nowrouzezahrai

S Ebrahimi Kahou

Causal learning has long concerned itself with the accurate recovery of underlying causal mechanisms. Such causal modelling enables better e… (voir plus)xplanations of out-of-distribution data. Prior works on causal learning assume that the high-level causal variables are given. However, in machine learning tasks, one often operates on low-level data like image pixels or high-dimensional vectors. In such settings, the entire Structural Causal Model (SCM) -- structure, parameters, \textit{and} high-level causal variables -- is unobserved and needs to be learnt from low-level data. We treat this problem as Bayesian inference of the latent SCM, given low-level data. For linear Gaussian additive noise SCMs, we present a tractable approximate inference method which performs joint inference over the causal variables, structure and parameters of the latent SCM from random, known interventions. Experiments are performed on synthetic datasets and a causally generated image dataset to demonstrate the efficacy of our approach. We also perform image generation from unseen interventions, thereby verifying out of distribution generalization for the proposed causal model.

2022-10-23

ArXiv (prépublication)

doi.org

openreview.net

Machine learning-based incremental learning in interactive domain modelling

Rijul Saini

Gunter Mussbacher

Jin L.C. Guo

Jörg Kienzle

2022-10-23

Proceedings of the 25th International Conference on Model Driven Engineering Languages and Systems (publié)

doi.org

scCobra: Contrastive cell embedding learning with domain-adaptation for single-cell data integration and harmonization

Bowen Zhao

Dong-Qing Wei

Yi Xiong

Jun Ding

The rapid development of single-cell technologies has underscored the need for more effective methods in the integration and harmonization o… (voir plus)f single-cell sequencing data. The prevalent challenge of batch effects, resulting from technical and biological variations across studies, demands accurate and reliable solutions for data integration. Traditional tools often have limitations, both due to reliance on gene expression distribution assumptions and the common issue of over-correction, particularly in methods based on anchor alignments. Here we introduce scCobra, a deep neural network tool designed specifically to address these challenges. By leveraging a deep generative model that combines a contrastive neural network with domain adaptation, scCobra effectively mitigates batch effects and minimizes over-correction without depending on gene expression distribution assumptions. Additionally, scCobra enables online label transfer across datasets with batch effects, facilitating the continuous integration of new data without retraining, and offers features for batch effect simulation and advanced multi-omic batch integration. These capabilities make scCobra a versatile data integration and harmonization tool for achieving accurate and insightful biological interpretations from complex datasets.

2022-10-23

bioRxiv (prépublication)

doi.org

Mila sur Udemy

Désinformation 2.0 : quand l’IA brouille nos ondes

Publications du Fellowship en politiques de l'IA

Publications

Mila sur Udemy

Désinformation 2.0 : quand l’IA brouille nos ondes

Publications du Fellowship en politiques de l'IA

Mots-clés populaires:

Publications