Publications

On the Effectiveness of Interpretable Feedforward Neural Network

Miles Q. Li

Adel Abusitta

Deep learning models have achieved state-of-the-art performance in many classification tasks. However, most of them cannot provide an explanation for their classification results. Machine learning models that are interpretable are usually linear or piecewise linear and yield inferior performance. Non-linear models achieve much better classification performance, but it is usually hard to explain their classification results. As a counter-example, an interpretable feedforward neural network (IFFNN) is proposed to achieve both high classification performance and interpretability for malware detection. If the IFFNN can perform well in a more flexible and general form for other classification tasks while providing meaningful explanations, it may be of great interest to the applied machine learning community. In this paper, we propose a way to generalize the interpretable feedforward neural network to multi-class classification scenarios and any type of feedforward neural networks, and evaluate its classification performance and interpretability on interpretable datasets. We conclude by finding that the generalized IFFNNs achieve comparable classification performance to their normal feedforward neural network counterparts and provide meaningful explanations. Thus, this kind of neural network architecture has great practical use.

2021-11-03

ArXiv (preprint)

doi.org

arxiv.org

Vesicular trafficking is a key determinant of the statin response in acute myeloid leukemia

Jana K Krosl

Marie-Eve Bordeleau

Céline Moison

Tara MacRae

Isabel Boivin

Nadine Mayotte

Deanne Gracias

Irène Baccelli

Vincent-Philippe Lavallee

Richard Bisaillon

Bernhard Lehnertz

Rodrigo Mendoza-Sanchez

Réjean Ruel

Thierry Bertomeu

Jasmin Coulombe-Huntington

Geneviève Boucher

Nandita Noronha

C. Pabst

M. Tyers

Patrick Gendron

Sébastien Lemieux

Frederic Barabe

Anne Marinier

Josée Hébert

Guy Sauvageau

Key Points: Inhibition of RAB protein function mediates the anti–acute myeloid leukemia activity of statins. Statin sensitivity is associated with enhanced vesicle-mediated traffic.

2021-11-03

Blood Advances (published)

doi.org

Back-Training excels Self-Training at Unsupervised Domain Adaptation of Question Generation and Passage Retrieval

Devang Kulshreshtha

Robert Belfer

Iulian V. Serban

Siva Reddy

In this work, we introduce back-training, an alternative to self-training for unsupervised domain adaptation (UDA). While self-training generates synthetic training data where natural inputs are aligned with noisy outputs, back-training results in natural outputs aligned with noisy inputs. This significantly reduces the gap between target domain and synthetic data distribution, and reduces model overfitting to source domain. We run UDA experiments on question generation and passage retrieval from the Natural Questions domain to machine learning and biomedical domains. We find that back-training vastly outperforms self-training by a mean improvement of 7.8 BLEU-4 points on generation, and 17.6% top-20 retrieval accuracy across both domains. We further propose consistency filters to remove low-quality synthetic data before training. We also release a new domain-adaptation dataset - MLQuestions containing 35K unaligned questions, 50K unaligned passages, and 3K aligned question-passage pairs.

2021-11-01

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (published)

doi.org

arxiv.org

Estimating treatment effect for individuals with progressive multiple sclerosis using deep learning

JR Falet

Joshua D. Durso-Finley

Brennan Nichyporuk

Jan Schroeter

Francesca Bovis

Maria-Pia Sormani

Doina Precup

Tal Arbel

Douglas Arnold

2021-11-01

medRxiv (preprint)

doi.org

Opioid prescribing among new users for non-cancer pain in the USA, Canada, UK, and Taiwan: A population-based cohort study

Meghna Jani

Nadyne Girard

David W. Bates

David Buckeridge

Therese Sheppard

Jack Li

Usman Iqbal

Shelly Vik

Colin Weaver

Judy Seidel

William G. Dixon

Robyn Tamblyn

Background The opioid epidemic in North America has been driven by an increase in the use and potency of prescription opioids, with ensuing excessive opioid-related deaths. Internationally, there are lower rates of opioid-related mortality, possibly because of differences in prescribing and health system policies. Our aim was to compare opioid prescribing rates in patients without cancer, across 5 centers in 4 countries. In addition, we evaluated differences in the type, strength, and starting dose of medication and whether these characteristics changed over time. Methods and findings We conducted a retrospective multicenter cohort study of adults who are new users of opioids without prior cancer. Electronic health records and administrative health records from Boston (United States), Quebec and Alberta (Canada), United Kingdom, and Taiwan were used to identify patients between 2006 and 2015. Standard dosages in morphine milligram equivalents (MMEs) were calculated according to The Centers for Disease Control and Prevention. Age- and sex-standardized opioid prescribing rates were calculated for each jurisdiction. Of the 2,542,890 patients included, 44,690 were from Boston (US), 1,420,136 Alberta, 26,871 Quebec (Canada), 1,012,939 UK, and 38,254 Taiwan. The highest standardized opioid prescribing rates in 2014 were observed in Alberta at 66/1,000 persons compared to 52, 51, and 18/1,000 in the UK, US, and Quebec, respectively. The median MME/day (IQR) at initiation was highest in Boston at 38 (20 to 45); followed by Quebec, 27 (18 to 43); Alberta, 23 (9 to 38); UK, 12 (7 to 20); and Taiwan, 8 (4 to 11). Oxycodone was the first prescribed opioid in 65% of patients in the US cohort compared to 14% in Quebec, 4% in Alberta, 0.1% in the UK, and none in Taiwan. One of the limitations was that data were not available from all centers for the entirety of the 10-year period.
Conclusions In this study, we observed substantial differences in opioid prescribing practices for non-cancer pain between jurisdictions. The preference to start patients on higher MME/day and more potent opioids in North America may be a contributing cause to the opioid epidemic.

2021-11-01

PLoS Medicine (published)

doi.org

Refining BERT Embeddings for Document Hashing via Mutual Information Maximization

Zijing Ou

Qinliang Su

Jianxing Yu

Ruihui Zhao

Yefeng Zheng

Bang Liu

Existing unsupervised document hashing methods are mostly established on generative models. Due to the difficulties of capturing long dependency structures, these methods rarely model the raw documents directly, but instead model the features extracted from them (e.g. bag-of-words (BOW), TFIDF). In this paper, we propose to learn hash codes from BERT embeddings after observing their tremendous successes on downstream tasks. As a first try, we modify existing generative hashing models to accommodate the BERT embeddings. However, little improvement is observed over the codes learned from the old BOW or TFIDF features. We attribute this to the reconstruction requirement in the generative hashing, which forces irrelevant information that is abundant in the BERT embeddings to also be compressed into the codes. To remedy this issue, a new unsupervised hashing paradigm is further proposed based on the mutual information (MI) maximization principle. Specifically, the method first constructs appropriate global and local codes from the documents and then seeks to maximize their mutual information. Experimental results on three benchmark datasets demonstrate that the proposed method is able to generate hash codes that outperform existing ones learned from BOW features by a substantial margin.

2021-11-01

Findings of the Association for Computational Linguistics: EMNLP 2021 (published)

doi.org

arxiv.org

The Topic Confusion Task: A Novel Evaluation Scenario for Authorship Attribution

Malik H. Altakrori

Jackie Cheung

Benjamin Fung

2021-11-01

Findings of the Association for Computational Linguistics: EMNLP 2021 (published)

doi.org

arxiv.org

Visually Grounded Reasoning across Languages and Cultures

Fangyu Liu

Emanuele Bugliarello

Edoardo Ponti

Siva Reddy

Nigel Collier

Desmond Elliott

The design of widespread vision-and-language datasets and pre-trained encoders directly adopts, or draws inspiration from, the concepts and images of ImageNet. While one can hardly overestimate how much this benchmark contributed to progress in computer vision, it is mostly derived from lexical databases and image queries in English, resulting in source material with a North American or Western European bias. Therefore, we devise a new protocol to construct an ImageNet-style hierarchy representative of more languages and cultures. In particular, we let the selection of both concepts and images be entirely driven by native speakers, rather than scraping them automatically. Specifically, we focus on a typologically diverse set of languages, namely, Indonesian, Mandarin Chinese, Swahili, Tamil, and Turkish. On top of the concepts and images obtained through this new protocol, we create a multilingual dataset for Multicultural Reasoning over Vision and Language (MaRVL) by eliciting statements from native speaker annotators about pairs of images. The task consists of discriminating whether each grounded statement is true or false. We establish a series of baselines using state-of-the-art models and find that their cross-lingual transfer performance lags dramatically behind supervised performance in English. These results invite us to reassess the robustness and accuracy of current state-of-the-art models beyond a narrow domain, but also open up new exciting challenges for the development of truly multilingual and multicultural systems.

2021-11-01

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (published)

doi.org

openreview.net

From Machine Learning to Robotics: Challenges and Opportunities for Embodied Intelligence

Nicholas Roy

Ingmar Posner

T. Barfoot

Philippe Beaudoin

Yoshua Bengio

Jeannette Bohg

Oliver Brock

Isabelle Depatie

Dieter Fox

D. Koditschek

Tomás Lozano-Pérez

Vikash K. Mansinghka

Chris Pal

Blake Richards

Dorsa Sadigh

Stefan Schaal

G. Sukhatme

Denis Therien

Marc Emile Toussaint

Michiel van de Panne

2021-10-28

ArXiv (preprint)

arxiv.org

How do AI systems fail socially?: an engineering risk analysis approach

Shalaleh Rismani

AJung Moon

Failure Mode and Effect Analysis (FMEA) has been used as an engineering risk assessment tool since 1949. FMEAs are effective in preemptively identifying and addressing how a device or process might fail in operation and are often used in the design of high-risk technology applications such as military, automotive industry and medical devices. In this work, we explore whether FMEAs can serve as a risk assessment tool for machine learning practitioners, especially in deploying systems for high-risk applications (e.g. algorithms for recidivism assessment). In particular, we discuss how FMEAs can be used to identify social and ethical failures of Artificial Intelligent Systems (AISs), recognizing that FMEAs have the potential to uncover a broader range of failures. We first propose a process for developing Social FMEAs (So-FMEAs) by building on the existing FMEAs framework and a recently published definition of Social Failure Modes by Millar. We then demonstrate a simple proof-of-concept So-FMEA for the COMPAS algorithm, a risk assessment tool used by judges to make recidivism-related decisions for convicted individuals. Through this preliminary investigation, we illustrate how a traditional engineering risk management tool could be adapted for analyzing social and ethical failures of AISs. Engineers and designers of AISs can use this new approach to improve their system's design and perform due diligence with respect to potential ethical and social failures.

2021-10-28

2021 IEEE International Symposium on Ethics in Engineering, Science and Technology (ETHICS) (published)

doi.org

A Survey of Self-Supervised and Few-Shot Object Detection

Gabriel Huang

Issam Hadj Laradji

David Vazquez

Simon Lacoste-Julien

Pau Rodriguez

Labeling data is often expensive and time-consuming, especially for tasks such as object detection and instance segmentation, which require dense labeling of the image. While few-shot object detection is about training a model on novel (unseen) object classes with little data, it still requires prior training on many labeled examples of base (seen) classes. On the other hand, self-supervised methods aim at learning representations from unlabeled data which transfer well to downstream tasks such as object detection. Combining few-shot and self-supervised object detection is a promising research direction. In this survey, we review and characterize the most recent approaches on few-shot and self-supervised object detection. Then, we give our main takeaways and discuss future research directions. Project page: https://gabrielhuang.github.io/fsod-survey/.

2021-10-27

ArXiv (preprint)

doi.org

arxiv.org
