Publications

Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon

Andrea Lodi

This paper surveys the recent attempts, both from the machine learning and operations research communities, at leveraging machine learning to solve combinatorial optimization problems. Given the hard nature of these problems, state-of-the-art algorithms rely on handcrafted heuristics for making decisions that are otherwise too expensive to compute or mathematically not well defined. Thus, machine learning looks like a natural candidate to make such decisions in a more principled and optimized way. We advocate for pushing further the integration of machine learning and combinatorial optimization and detail a methodology to do so. A main point of the paper is seeing generic optimization problems as data points and inquiring what is the relevant distribution of problems to use for learning on a given task.

2020-12-31

Eur. J. Oper. Res. (published)

doi.org

arxiv.org

MBAIL: Multi-Batch Best Action Imitation Learning utilizing Sample Transfer and Policy Distillation

Dingwei Wu

Tianyu Li

David Meger

M. Jenkin

Steve Liu

Gregory Dudek

Batch reinforcement learning (RL) aims to learn a good control policy from a previously collected dataset without requiring additional interactions with the environment. Unfortunately, in the real world, we may only have a limited amount of training data for tasks we are interested in. Most batch RL methods are intended to learn a policy over one fixed dataset, and are not intended to learn a policy that can perform well over other tasks. How to leverage the advantages of batch RL while dealing with limited training data is another real-world challenge. In this work, we propose to add sample transfer and policy distillation to a leading batch RL approach. The proposed methods are evaluated on multiple control tasks to showcase their effectiveness.

2020-12-31

(published)

www.semanticscholar.org

MICo: Improved representations via sampling-based state similarity for Markov decision processes

Pablo Samuel Castro

Tyler Kastner

Prakash Panangaden

Mark Rowland

We present a new behavioural distance over the state space of a Markov decision process, and demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents. While existing notions of state similarity are typically difficult to learn at scale due to high computational cost and lack of sample-based algorithms, our newly-proposed distance addresses both of these issues. In addition to providing detailed theoretical analyses, we provide empirical evidence that learning this distance alongside the value function yields structured and informative representations, including strong results on the Arcade Learning Environment benchmark.

2020-12-31

Advances in Neural Information Processing Systems 34 (NeurIPS 2021) (published)

openreview.net

MICo: Learning improved representations via sampling-based state similarity for Markov decision processes

Pablo Samuel Castro

Tyler Kastner

Prakash Panangaden

Mark Rowland

We present a new behavioural distance over the state space of a Markov decision process, and demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents. While existing notions of state similarity are typically difficult to learn at scale due to high computational cost and lack of sample-based algorithms, our newly-proposed distance addresses both of these issues. In addition to providing detailed theoretical analysis

2020-12-31

arXiv.org (preprint)

dblp.uni-trier.de

Multi-Agent Estimation and Filtering for Minimizing Team Mean-Squared Error

Mohammad Afshari

Aditya Mahajan

Motivated by estimation problems arising in autonomous vehicles and decentralized control of unmanned aerial vehicles, we consider multi-agent estimation and filtering problems in which multiple agents generate state estimates based on decentralized information and the objective is to minimize a coupled mean-squared error, which we call the team mean-squared error. We call the resulting estimates minimum team mean-squared error (MTMSE) estimates. We show that MTMSE estimates are different from minimum mean-squared error (MMSE) estimates. We derive closed-form expressions for MTMSE estimates, which are linear functions of the observations where the corresponding gain depends on the weight matrix that couples the estimation errors. We then consider a filtering problem where a linear stochastic process is monitored by multiple agents which can share their observations (with delay) over a communication graph. We derive expressions to recursively compute the MTMSE estimates. To illustrate the effectiveness of the proposed scheme, we consider an example of estimating the distances between vehicles in a platoon and show that MTMSE estimates significantly outperform MMSE estimates and consensus Kalman filtering estimates.

2020-12-31

IEEE Transactions on Signal Processing (published)

doi.org

arxiv.org

2D Multi-Class Model for Gray and White Matter Segmentation of the Cervical Spinal Cord at 7T

Nilser J. Laines Medina

Charley Gros

Julien Cohen-Adad

Virginie Callot

Arnaud Le Troter

The spinal cord (SC), which conveys information between the brain and the peripheral nervous system, plays a key role in various neurological disorders such as multiple sclerosis (MS) and amyotrophic lateral sclerosis (ALS), in which both gray matter (GM) and white matter (WM) may be impaired. While automated methods for WM/GM segmentation are now largely available, these techniques, developed for conventional systems (3T or lower), do not necessarily perform well on 7T MRI data, which feature finer details, contrasts, but also different artifacts or signal dropout. The primary goal of this study is thus to propose a new deep learning model that allows robust SC/GM multi-class segmentation based on ultra-high resolution 7T T2*-w MR images. The second objective is to highlight the relevance of implementing a specific data augmentation (DA) strategy, in particular to generate a generic model that could be used for multi-center studies at 7T.

2020-12-31

arXiv (preprint)

doi.org

arxiv.org

Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models

Enoch Amoatey Tetteh

Joseph D Viviano

Yoshua Bengio

David M. Krueger

Joseph Paul Cohen

Learning models that generalize under different distribution shifts in medical imaging has been a long-standing research challenge. There have been several proposals for efficient and robust visual representation learning among vision research practitioners, especially in the sensitive and critical biomedical domain. In this paper, we propose an idea for out-of-distribution generalization of chest X-ray pathologies that uses a simple balanced batch sampling technique. We observed that balanced sampling between the multiple training datasets improves the performance over baseline models trained without balancing. Code for this work is available on GitHub.

2020-12-31

(published)

www.semanticscholar.org
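
The balanced batch sampling idea described in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function name `balanced_batches`, the dataset names, and the choice to sample with replacement (so that small datasets can fill their share of every batch) are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence, Tuple


def balanced_batches(
    datasets: Dict[str, Sequence[int]],
    batch_size: int,
    num_batches: int,
    seed: int = 0,
) -> List[List[Tuple[str, int]]]:
    """Build batches that draw the same number of samples from every dataset.

    Hypothetical helper: each batch element is a (dataset_name, sample) pair.
    """
    if not datasets:
        raise ValueError("at least one dataset is required")
    if batch_size % len(datasets) != 0:
        raise ValueError("batch_size must be divisible by the number of datasets")

    per_dataset = batch_size // len(datasets)
    rng = random.Random(seed)
    batches = []
    for _ in range(num_batches):
        batch = []
        for name, samples in datasets.items():
            # Assumption: sample with replacement, so a small dataset is
            # oversampled rather than limiting the batch composition.
            batch.extend((name, rng.choice(samples)) for _ in range(per_dataset))
        rng.shuffle(batch)  # avoid a fixed dataset ordering inside the batch
        batches.append(batch)
    return batches


# Two hypothetical training sets of very different sizes.
example_datasets = {"site_a": list(range(1000)), "site_b": list(range(10))}
example_batches = balanced_batches(example_datasets, batch_size=8, num_batches=3)
```

Without balancing, uniform sampling over the pooled data would draw almost every example from the larger dataset; here each batch contains four samples from each source regardless of dataset size.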

Multilevel Approaches for the Critical Node Problem

Andrea Baggio

Margarida Carvalho

Andrea Lodi

Andrea Tramontani

In recent years, a lot of effort has been dedicated to developing strategies to defend networks against possible cascade failures or malicious viral attacks. In particular, many results rely on two different viewpoints. On the one hand, network safety is investigated from a preventive perspective. In this paradigm, for a given network, the goal is to modify its structure in order to minimize the propagation of failures. On the other hand, blocking models have been proposed for scenarios where the attack has already taken place. In this case, a harmful spreading process is assumed to propagate through the network with particular dynamics, allowing some time for an effective defensive reaction. In this work, we combine these two perspectives. More precisely, following the Defender-Attacker-Defender framework, we consider a model of prevention, attack, and damage containment using a three-stage, sequential game. Thus, we assume the defender is able not only to adopt preventive strategies but also to defend the network after an attack takes place. Assuming that the attacker will act optimally, we want to choose a defensive strategy for the first stage that would minimize the total damage to the network at the end of the third stage. Our contribution consists of considering this problem as a trilevel Mixed-Integer Program and designing an exact algorithm for it based on tools developed for multilevel programming.

2020-12-31

Operational Research (published)

doi.org

Multimodal Audio-textual Architecture for Robust Spoken Language Understanding

Dmitriy Serdyuk

Yongqiang Wang

Christian Fue-

Anuj Kumar

Baiyang Liu

Yoshua Bengio

Edwin Simonnet

Sahar Ghannay

Nathalie Camelin

Tandem spoken language understanding (SLU) systems suffer from the so-called automatic speech recognition (ASR) error propagation problem. Additionally, as the ASR is not optimized to extract semantics, but solely the linguistic content, relevant semantic cues might be left out of its transcripts. In this work, we propose a multimodal language understanding (MLU) architecture to mitigate these problems. Our solution is based on two compact unidirectional long short-term memory (LSTM) models that encode speech and text information. A fusion layer is also used to fuse audio and text embeddings. Two fusion strategies are explored: a simple concatenation of these embeddings and a cross-modal attention mechanism that learns the contribution of each modality. The first approach proved to be the optimal solution to robustly extract semantic information from audio-textual data. We found that attention is less effective at testing time when the text modality is corrupted. Our model is evaluated on three SLU datasets and robustness is tested using ASR outputs from three off-the-shelf ASR engines. Results show that the proposed approach effectively mitigates the ASR error propagation problem for all datasets.

2020-12-31

(published)

www.semanticscholar.org

Neural Approximate Sufficient Statistics for Implicit Models

Yanzhi Chen

Dinghuai Zhang

Michael U. Gutmann

Aaron Courville

Zhanxing Zhu

We consider the fundamental problem of how to automatically construct summary statistics for implicit generative models where the evaluation of the likelihood function is intractable, but sampling data from the model is possible. The idea is to frame the task of constructing sufficient statistics as learning mutual information maximizing representations of the data with the help of deep neural networks. The infomax learning procedure does not need to estimate any density or density ratio. We apply our approach to both traditional approximate Bayesian computation and recent neural likelihood methods, boosting their performance on a range of tasks.

2020-12-31

ICLR (published)

doi.org

openreview.net

A Novel and Dedicated Machine Learning Model for Malware Classification

Miles Q. Li

Benjamin C. M. Fung

Philippe Charland

Steven H. H. Ding

Malicious executables are composed of functions that can be represented in assembly code. In the assembly code mining literature, many software reverse engineering tools have been created to disassemble executables, search function clones, and find vulnerabilities, among others. The development of a machine learning-based malware classification model that can simultaneously achieve excellent classification performance and provide insightful interpretation for the classification results remains a hot research topic. In this paper, we propose a novel and dedicated machine learning model for the research problem of malware classification. Our proposed model generates assembly code function clusters based on function representation learning and provides excellent interpretability for the classification results. It does not require a large or balanced dataset to train, which matches real-life scenarios. Experiments show that our proposed approach outperforms previous state-of-the-art malware classification models and provides meaningful interpretation of classification results.

2020-12-31

International Conference on Software and Data Technologies (published)

doi.org

A Novel Neural Network-Based Malware Severity Classification System

Miles Q. Li

Benjamin C. M. Fung

2020-12-31

International Conference on Software and Data Technologies (published)

doi.org
