Publications

Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models

Enoch Amoatey Tetteh

Joseph D Viviano

David M. Krueger

Learning models that generalize under different distribution shifts in medical imaging has been a long-standing research challenge. There have been several proposals for efficient and robust visual representation learning among vision research practitioners, especially in the sensitive and critical biomedical domain. In this paper, we propose an idea for out-of-distribution generalization of chest X-ray pathologies that uses a simple balanced batch sampling technique. We observed that balanced sampling between the multiple training datasets improves the performance over baseline models trained without balancing. Code for this work is available on GitHub.

2020-12-31

(published)

www.semanticscholar.org

Multilevel Approaches for the Critical Node Problem

Andrea Baggio

Margarida Carvalho

Andrea Lodi

Andrea Tramontani

In recent years, a lot of effort has been dedicated to developing strategies to defend networks against possible cascade failures or malicious viral attacks. In particular, many results rely on two different viewpoints. On the one hand, network safety is investigated from a preventive perspective. In this paradigm, for a given network, the goal is to modify its structure in order to minimize the propagation of failures. On the other hand, blocking models have been proposed for scenarios where the attack has already taken place. In this case, a harmful spreading process is assumed to propagate through the network with particular dynamics, allowing some time for an effective defensive reaction. In this work, we combine these two perspectives. More precisely, following the Defender-Attacker-Defender framework, we consider a model of prevention, attack, and damage containment using a three-stage sequential game. Thus, we assume that the defender is able not only to adopt preventive strategies but also to defend the network after an attack takes place. Assuming that the attacker will act optimally, we want to choose a defensive strategy for the first stage that minimizes the total damage to the network at the end of the third stage. Our contribution consists of formulating this problem as a trilevel mixed-integer program and designing an exact algorithm for it based on tools developed for multilevel programming.

2020-12-31

Operational Research (published)

doi.org

Multimodal Audio-textual Architecture for Robust Spoken Language Understanding

Dmitriy Serdyuk

Yongqiang Wang

Christian Fue-730

Anuj Kumar

Baiyang Liu

Yoshua Bengio

Edwin Simonnet

Sahar Ghannay

Nathalie Camelin

Tandem spoken language understanding (SLU) systems suffer from the so-called automatic speech recognition (ASR) error propagation problem. Additionally, as the ASR is not optimized to extract semantics, but solely the linguistic content, relevant semantic cues might be left out of its transcripts. In this work, we propose a multimodal language understanding (MLU) architecture to mitigate these problems. Our solution is based on two compact unidirectional long short-term memory (LSTM) models that encode speech and text information. A fusion layer is also used to fuse audio and text embeddings. Two fusion strategies are explored: a simple concatenation of these embeddings and a cross-modal attention mechanism that learns the contribution of each modality. The first approach proved to be the optimal solution for robustly extracting semantic information from audio-textual data. We found that attention is less effective at test time when the text modality is corrupted. Our model is evaluated on three SLU datasets, and robustness is tested using ASR outputs from three off-the-shelf ASR engines. Results show that the proposed approach effectively mitigates the ASR error propagation problem for all datasets.

2020-12-31

(published)

www.semanticscholar.org

Neural Approximate Sufficient Statistics for Implicit Models

Yanzhi Chen

Dinghuai Zhang

Michael U. Gutmann

Aaron Courville

Zhanxing Zhu

We consider the fundamental problem of how to automatically construct summary statistics for implicit generative models where the evaluation of the likelihood function is intractable, but sampling data from the model is possible. The idea is to frame the task of constructing sufficient statistics as learning mutual information maximizing representations of the data with the help of deep neural networks. The infomax learning procedure does not need to estimate any density or density ratio. We apply our approach to both traditional approximate Bayesian computation and recent neural likelihood methods, boosting their performance on a range of tasks.

2020-12-31

ICLR (published)

doi.org

openreview.net

A Novel and Dedicated Machine Learning Model for Malware Classification

Miles Q. Li

Benjamin C. M. Fung

Philippe Charland

Steven H. H. Ding

Malicious executables are composed of functions that can be represented in assembly code. In the assembly code mining literature, many software reverse engineering tools have been created to disassemble executables, search function clones, and find vulnerabilities, among others. The development of a machine learning-based malware classification model that can simultaneously achieve excellent classification performance and provide insightful interpretation for the classification results remains a hot research topic. In this paper, we propose a novel and dedicated machine learning model for the research problem of malware classification. Our proposed model generates assembly code function clusters based on function representation learning and provides excellent interpretability for the classification results. It does not require a large or balanced dataset to train, which matches real-life scenarios. Experiments show that our proposed approach outperforms previous state-of-the-art malware classification models and provides meaningful interpretation of classification results.

2020-12-31

International Conference on Software and Data Technologies (published)

doi.org

A Novel Neural Network-Based Malware Severity Classification System

Miles Q. Li

Benjamin C. M. Fung

2020-12-31

International Conference on Software and Data Technologies (published)

doi.org

On-the-Fly Attention Modularization for Neural Generation

Yue Dong

Chandra Bhagavatula

Ximing Lu

Jena D. Hwang

Antoine Bosselut

Jackie CK Cheung

Yejin Choi

Despite considerable advancements with deep neural language models (LMs), neural text generation still suffers from degeneration: generated text is repetitive, generic, self-inconsistent, and lacking commonsense. The empirical analyses on sentence-level attention patterns reveal that neural text degeneration may be associated with insufficient learning of inductive biases by the attention mechanism. Our findings motivate on-the-fly attention modularization, a simple but effective method for injecting inductive biases into attention computation during inference. The resulting text produced by the language model with attention modularization can yield enhanced diversity and commonsense reasoning while maintaining fluency and coherence.

2020-12-31

arXiv.org (preprint)

dblp.uni-trier.de

Optimal Spectral-Norm Approximate Minimization of Weighted Finite Automata

Borja Balle

We address the approximate minimization problem for weighted finite automata (WFAs) with weights in …

2020-12-31

arXiv (preprint)

doi.org

arxiv.org

Optimization of Artificial Neural Network Hyperparameters For Processing Retrospective Information

A. Rogachev

F. Scholle

Yann Lecun

Yoshua Bengio

I. L. Kashirin

M. Demchenko

Justifying the selection of the architecture and hyperparameters of artificial neural networks (ANNs) aimed at solving various classes of applied problems is a scientific and methodological problem. Optimizing the selection of ANN hyperparameters improves the quality and speed of ANN training. Various methods of optimizing the selection of ANN hyperparameters are known, such as evolutionary computation and genetic algorithms, but they require additional software. To optimize the process of selecting ANN hyperparameters, Google Research has developed the KerasTuner software tool. It is a platform for automated search for optimal combinations of hyperparameters. KerasTuner supports several search methods: random search, Bayesian optimization, and Hyperband. In the numerical experiments conducted by the author, 14 hyperparameters were varied, including the number of blocks of convolutional layers and the filters forming them, the type of activation function, the parameters of the "dropout" layers, and others. The studied tools demonstrated high efficiency while simultaneously varying more than a dozen optimized parameters of the convolutional network. The calculation time on the Colaboratory platform for the various combined ANN architectures studied, including recurrent neural networks (RNNs), was several hours, even with the use of GPU accelerators. For ANNs focused on the processing and recognition of retrospective information, recognition quality increased to 80-95%.

2020-12-31

(published)

www.semanticscholar.org

Overview of the TREC 2021 Fair Ranking Track

Asia J. Biega

Fernando Diaz

Michael D. Ekstrand

Sebastian Kohlmeier

The TREC Fair Ranking Track aims to provide a platform for participants to develop and evaluate novel retrieval algorithms that can provide fair exposure to a mixture of demographics or attributes, such as ethnicity, that are represented by relevant documents in response to a search query. For example, particular demographics or attributes can be represented by the documents' topical content or authors. The 2021 Fair Ranking Track adopted a resource allocation task. The task focused on supporting Wikipedia editors who are looking to improve the encyclopedia's coverage of topics under the purview of a WikiProject. WikiProject coordinators and/or Wikipedia editors search for Wikipedia documents that are in need of editing to improve the quality of the article. The 2021 Fair Ranking Track aimed to ensure that documents that are about, or somehow represent, certain protected characteristics receive fair exposure to the Wikipedia editors, so that the documents have a fair opportunity of being improved and, therefore, of being well represented in Wikipedia. The under-representation of particular protected characteristics in Wikipedia can result in systematic biases that can have a negative human, social, and economic impact, particularly for disadvantaged or protected societal groups.

2020-12-31

TREC (published)

doi.org

arxiv.org

Personalized Medicine for OSA Syndrome in a Nutshell: Conceptual Clarification for Integration.

Christophe Gauld

Marie Darrason

Guillaume Dumas

Jean‐Arthur Micoulaud‐Franchi

2020-12-31

Chest (published)

doi.org

PMFL: Partial Meta-Federated Learning for heterogeneous tasks and its applications on real-world medical records

Tianyi Zhang

Shirui Zhang

Ziwei Chen

Yoshua Bengio

Dianbo Liu

Federated machine learning is a versatile and flexible tool for utilizing distributed data from different sources, especially now that communication technology is developing rapidly and an unprecedented amount of data can be collected on mobile devices. Federated learning exploits not only the data but also the computational power of all devices in the network to achieve more efficient model training. Nevertheless, while most traditional federated learning methods work well for homogeneous data and tasks, adapting the method to heterogeneous data and task distributions is challenging. This limitation has constrained the applications of federated learning in real-world contexts, especially in healthcare settings. Inspired by the fundamental idea of meta-learning, in this study we propose a new algorithm, which integrates federated learning and meta-learning, to tackle this issue. In addition, owing to the advantage of transfer learning for model generalization, we further improve our algorithm by introducing partial parameter sharing to balance global and local learning. We name this method partial meta-federated learning (PMFL). Finally, we apply the algorithms to two medical datasets. We show that our algorithm could obtain the fastest training speed and achieve the best performance when dealing with heterogeneous medical datasets.

2020-12-31

IEEE International Conference on Big Data (unknown)

doi.org

arxiv.org
