The Causal-Neural Connection: Expressiveness, Learnability, and Inference
Kevin Muyuan Xia
Kai-Zhan Lee
Elias Bareinboim
One of the central elements of any causal inference is an object called a structural causal model (SCM), which represents a collection of mechanisms and exogenous sources of random variation of the system under investigation (Pearl, 2000). An important property of many kinds of neural networks is universal approximability: the ability to approximate any function to arbitrary precision. Given this property, one may be tempted to surmise that a collection of neural nets is capable of learning any SCM by training on data generated by that SCM. In this paper, we show this is not the case by disentangling the notions of expressivity and learnability. Specifically, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020), which describes the limits of what can be learned from data, still holds for neural models. For instance, an arbitrarily complex and expressive neural net is unable to predict the effects of interventions given observational data alone. Given this result, we introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences. Building on this new class of models, we focus on solving two canonical tasks found in the literature known as causal identification and estimation. Leveraging the neural toolbox, we develop an algorithm that is both sufficient and necessary to determine whether a causal effect can be learned from data (i.e., causal identifiability); it then estimates the effect whenever identifiability holds (causal estimation). Simulations corroborate the proposed approach.
The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning
Patrick J Mineault
Timothy P. Lillicrap
Christopher C. Pack
The visual system of mammals comprises parallel, hierarchical specialized pathways. Different pathways are specialized insofar as they use representations that are more suitable for supporting specific downstream behaviours. In particular, the clearest example is the specialization of the ventral (“what”) and dorsal (“where”) pathways of the visual cortex. These two pathways support behaviours related to visual recognition and movement, respectively. To date, deep neural networks have mostly been used as models of the ventral, recognition pathway. However, it is unknown whether both pathways can be modelled with a single deep ANN. Here, we ask whether a single model with a single loss function can capture the properties of both the ventral and the dorsal pathways. We explore this question using data from mice, which, like other mammals, have specialized pathways that appear to support recognition and movement behaviours. We show that when we train a deep neural network architecture with two parallel pathways using a self-supervised predictive loss function, we can outperform other models in fitting mouse visual cortex. Moreover, we can model both the dorsal and ventral pathways. These results demonstrate that a self-supervised predictive learning approach applied to parallel pathway architectures can account for some of the functional specialization seen in mammalian visual systems.
Problèmes associés au déploiement des modèles fondés sur l’apprentissage machine en santé
Joseph Paul Cohen
Tianshi Cao
Joseph D Viviano
Chin-Wei Huang
Michael Fralick
Marzyeh Ghassemi
Muhammad Mamdani
Russell Greiner
Learned Image Compression for Machine Perception
Felipe Codevilla
Jean Gabriel Simard
On the Effectiveness of Interpretable Feedforward Neural Network
Miles Q. Li
Adel Abusitta
Deep learning models have achieved state-of-the-art performance in many classification tasks. However, most of them cannot provide an explanation for their classification results. Machine learning models that are interpretable are usually linear or piecewise linear and yield inferior performance. Non-linear models achieve much better classification performance, but it is usually hard to explain their classification results. As a counter-example, an interpretable feedforward neural network (IFFNN) is proposed to achieve both high classification performance and interpretability for malware detection. If the IFFNN can perform well in a more flexible and general form for other classification tasks while providing meaningful explanations, it may be of great interest to the applied machine learning community. In this paper, we propose a way to generalize the interpretable feedforward neural network to multi-class classification scenarios and any type of feedforward neural network, and evaluate its classification performance and interpretability on interpretable datasets. We find that the generalized IFFNNs achieve classification performance comparable to that of their normal feedforward neural network counterparts and provide meaningful explanations. Thus, this kind of neural network architecture has great practical use.
Vesicular trafficking is a key determinant of the statin response in acute myeloid leukemia
Jana K Krosl
Marie-Eve Bordeleau
Céline Moison
Tara MacRae
Isabel Boivin
Nadine Mayotte
Deanne Gracias
Irène Baccelli
Vincent-Philippe Lavallee
Richard Bisaillon
Bernhard Lehnertz
Rodrigo Mendoza-Sanchez
Réjean Ruel
Thierry Bertomeu
Jasmin Coulombe-Huntington
Geneviève Boucher
Nandita Noronha
C. Pabst
M. Tyers
Patrick Gendron … (5 more authors)
Frederic Barabe
Anne Marinier
Josée Hébert
Guy Sauvageau
Key Points: Inhibition of RAB protein function mediates the anti–acute myeloid leukemia activity of statins. Statin sensitivity is associated with enhanced vesicle-mediated traffic.
Back-Training excels Self-Training at Unsupervised Domain Adaptation of Question Generation and Passage Retrieval
Devang Kulshreshtha
Robert Belfer
Iulian V. Serban
In this work, we introduce back-training, an alternative to self-training for unsupervised domain adaptation (UDA). While self-training generates synthetic training data where natural inputs are aligned with noisy outputs, back-training results in natural outputs aligned with noisy inputs. This significantly reduces the gap between the target domain and the synthetic data distribution, and reduces model overfitting to the source domain. We run UDA experiments on question generation and passage retrieval from the Natural Questions domain to machine learning and biomedical domains. We find that back-training vastly outperforms self-training, with mean improvements of 7.8 BLEU-4 points on generation and 17.6% top-20 retrieval accuracy across both domains. We further propose consistency filters to remove low-quality synthetic data before training. We also release a new domain-adaptation dataset, MLQuestions, containing 35K unaligned questions, 50K unaligned passages, and 3K aligned question-passage pairs.
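The contrast between the two ways of building synthetic pairs can be sketched in a few lines of Python. This is only an illustration of the idea stated in the abstract, not the authors' implementation: the function names, the toy question generator, and the word-overlap retriever are all hypothetical stand-ins for models trained on the source domain.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (input passage, output question)


def self_training_pairs(inputs: List[str], forward_model: Callable[[str], str]) -> List[Pair]:
    """Natural inputs aligned with noisy (model-generated) outputs."""
    return [(x, forward_model(x)) for x in inputs]


def back_training_pairs(outputs: List[str], backward_model: Callable[[str], str]) -> List[Pair]:
    """Noisy (model-produced) inputs aligned with natural outputs."""
    return [(backward_model(y), y) for y in outputs]


# Hypothetical unaligned target-domain data.
passages = [
    "Dropout randomly zeroes activations during training.",
    "Gradient descent updates weights along the negative gradient.",
]
questions = ["How does dropout regularize a network?"]


def toy_question_generator(passage: str) -> str:
    # Stand-in for a source-domain question generation model (assumption).
    return f"What is {passage.split()[0].lower()}?"


def toy_passage_retriever(question: str) -> str:
    # Stand-in for a source-domain retriever: pick the passage with the
    # largest word overlap with the question (assumption).
    q_words = set(question.lower().split())
    return max(passages, key=lambda p: len(q_words & set(p.lower().split())))


self_pairs = self_training_pairs(passages, toy_question_generator)
back_pairs = back_training_pairs(questions, toy_passage_retriever)
```

In `self_pairs` the passages are real and the questions are synthetic; in `back_pairs` the questions are real and the passages are whatever the retriever returned. A model trained on the latter therefore learns to produce natural target-domain questions.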
Estimating treatment effect for individuals with progressive multiple sclerosis using deep learning
JR Falet
Joshua D. Durso-Finley
Brennan Nichyporuk
Jan Schroeter
Francesca Bovis
Maria-Pia Sormani
Douglas Arnold
Opioid prescribing among new users for non-cancer pain in the USA, Canada, UK, and Taiwan: A population-based cohort study
Meghna Jani
Nadyne Girard
David W. Bates
Therese Sheppard
Jack Li
Usman Iqbal
Shelly Vik
Colin Weaver
Judy Seidel
William G. Dixon
Robyn Tamblyn
Background: The opioid epidemic in North America has been driven by an increase in the use and potency of prescription opioids, with ensuing excessive opioid-related deaths. Internationally, there are lower rates of opioid-related mortality, possibly because of differences in prescribing and health system policies. Our aim was to compare opioid prescribing rates in patients without cancer, across 5 centers in 4 countries. In addition, we evaluated differences in the type, strength, and starting dose of medication and whether these characteristics changed over time.
Methods and findings: We conducted a retrospective multicenter cohort study of adults who were new users of opioids without prior cancer. Electronic health records and administrative health records from Boston (United States), Quebec and Alberta (Canada), the United Kingdom, and Taiwan were used to identify patients between 2006 and 2015. Standard dosages in morphine milligram equivalents (MMEs) were calculated according to the Centers for Disease Control and Prevention. Age- and sex-standardized opioid prescribing rates were calculated for each jurisdiction. Of the 2,542,890 patients included, 44,690 were from Boston (US), 1,420,136 from Alberta and 26,871 from Quebec (Canada), 1,012,939 from the UK, and 38,254 from Taiwan. The highest standardized opioid prescribing rates in 2014 were observed in Alberta at 66/1,000 persons compared to 52, 51, and 18/1,000 in the UK, US, and Quebec, respectively. The median MME/day (IQR) at initiation was highest in Boston at 38 (20 to 45); followed by Quebec, 27 (18 to 43); Alberta, 23 (9 to 38); UK, 12 (7 to 20); and Taiwan, 8 (4 to 11). Oxycodone was the first prescribed opioid in 65% of patients in the US cohort compared to 14% in Quebec, 4% in Alberta, 0.1% in the UK, and none in Taiwan. One of the limitations was that data were not available from all centers for the entirety of the 10-year period.
Conclusions: In this study, we observed substantial differences in opioid prescribing practices for non-cancer pain between jurisdictions. The preference to start patients on higher MME/day and more potent opioids in North America may be a contributing cause of the opioid epidemic.
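For readers unfamiliar with the MME/day figures quoted above, the arithmetic can be sketched as follows. This is a hedged illustration only: the abstract does not give the conversion table the study used, so the factors below are commonly cited CDC values for four drugs, and the function name and example prescription are hypothetical.

```python
# Commonly cited CDC oral conversion factors (assumption: the study's exact
# table is not given in the abstract, so only a few well-known drugs appear).
MME_FACTORS = {
    "morphine": 1.0,
    "hydrocodone": 1.0,
    "oxycodone": 1.5,
    "codeine": 0.15,
}


def mme_per_day(drug: str, dose_mg: float, doses_per_day: float) -> float:
    """Daily dose in mg multiplied by the drug's conversion factor."""
    try:
        factor = MME_FACTORS[drug.lower()]
    except KeyError:
        raise ValueError(f"no conversion factor listed for {drug!r}") from None
    return dose_mg * doses_per_day * factor


# Hypothetical starting prescription: oxycodone 5 mg, four times a day.
example = mme_per_day("oxycodone", 5, 4)  # 20 mg/day * 1.5 = 30.0 MME/day
```

On this scale, the hypothetical prescription above sits between the median starting doses reported for Quebec (27) and Boston (38).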
Refining BERT Embeddings for Document Hashing via Mutual Information Maximization
Zijing Ou
Qinliang Su
Jianxing Yu
Ruihui Zhao
Yefeng Zheng
Existing unsupervised document hashing methods are mostly established on generative models. Due to the difficulties of capturing long dependency structures, these methods rarely model the raw documents directly, but instead model the features extracted from them (e.g., bag-of-words (BOW), TFIDF). In this paper, we propose to learn hash codes from BERT embeddings after observing their tremendous successes on downstream tasks. As a first try, we modify existing generative hashing models to accommodate the BERT embeddings. However, little improvement is observed over the codes learned from the old BOW or TFIDF features. We attribute this to the reconstruction requirement in generative hashing, which forces irrelevant information that is abundant in the BERT embeddings to also be compressed into the codes. To remedy this issue, a new unsupervised hashing paradigm is further proposed based on the mutual information (MI) maximization principle. Specifically, the method first constructs appropriate global and local codes from the documents and then seeks to maximize their mutual information. Experimental results on three benchmark datasets demonstrate that the proposed method is able to generate hash codes that outperform existing ones learned from BOW features by a substantial margin.
The Topic Confusion Task: A Novel Evaluation Scenario for Authorship Attribution
Malik H. Altakrori