Publications

Tensor of Quantitative Equational Theories.

Giorgio Bacci

Radu Mardare

Prakash Panangaden

Gordon Plotkin

2020-12-31

CALCO (publié)

doi.org

Test Sample Accuracy Scales with Training Sample Density in Neural Networks

Xu Ji

Razvan Pascanu

Devon Hjelm

Andrea Vedaldi

Balaji Lakshminarayanan

Yoshua Bengio

Intuitively, one would expect accuracy of a trained neural network's prediction on test samples to correlate with how densely the samples ar… (voir plus)e surrounded by seen training samples in representation space. We find that a bound on empirical training error smoothed across linear activation regions scales inversely with training sample density in representation space. Empirically, we verify this bound is a strong predictor of the inaccuracy of the network's prediction on test samples. For unseen test sets, including those with out-of-distribution samples, ranking test samples by their local region's error bound and discarding samples with the highest bounds raises prediction accuracy by up to 20% in absolute terms for image classification datasets, on average over thresholds.

2020-12-31

arXiv.org (prépublication)

doi.org

openreview.net

Textual Time Travel: A Temporally Informed Approach to Theory of Mind

Akshatha Arodi

Jackie CK Cheung

Natural language processing systems such as dialogue agents should be able to reason about other people’s beliefs, intentions and desires.… (voir plus) This capability, called theory of mind (ToM), is crucial, as it allows a model to predict and interpret the needs of users based on their mental states. A recent line of research evaluates the ToM capability of existing memoryaugmented neural models through questionanswering. These models perform poorly on false belief tasks where beliefs differ from reality, especially when the dataset contains distracting sentences. In this paper, we propose a new temporally informed approach for improving the ToM capability of memory-augmented neural models. Our model incorporates priors about the entities’ minds and tracks their mental states as they evolve over time through an extended passage. It then responds to queries through textual time travel—i.e., by accessing the stored memory of an earlier time step. We evaluate our model on ToM datasets and find that this approach improves performance, particularly by correcting the predicted mental states to match the false belief.

2020-12-31

Conference on Empirical Methods in Natural Language Processing (publié)

doi.org

On the Expressivity of Markov Reward

David Abel

Will Dabney

Anna Harutyunyan

Mark K. Ho

Michael L. Littman

Doina Precup

Satinder Singh

Reward is the driving force for reinforcement-learning agents. This paper is dedicated to understanding the expressivity of reward as a way … (voir plus)to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions of"task"that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories. Our main results prove that while reward can express many of these tasks, there exist instances of each task type that no Markov reward function can capture. We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists. We conclude with an empirical study that corroborates and illustrates our theoretical findings.

2020-12-31

Advances in Neural Information Processing Systems 34 (NeurIPS 2021) (publié)

openreview.net

The Locomotive Assignment Problem with Distributed Power at the Canadian National Railway Company

Camilo Ortiz-Astorquiza

Jean-François Cordeau

Emma Frejinger

Some of the most important optimization problems faced by railway operators arise from the management of their locomotive fleet. In this pap… (voir plus)er, we study a general version of the locomotive assignment problem encountered at the tactical level by one of the largest railroads in North America: the Canadian National Railway Company (CN). We present a modeling framework with two integer linear programming formulations and contribute to the state of the art by allowing to decide each train's operating mode (distributed power or not) over the whole (weekly) planning horizon without partitioning it into smaller time windows. Given the difficulty to solve the problem, one of the formulations is enhanced through various refinements such as constraint relaxations, preprocessing and fixed cost approximations. We thus achieve a significant reduction in the required computational time to solve instances of realistic size. We also present two versions of a Benders decomposition-based algorithm to obtain feasible solutions. On average, it allows to reduce the associated computational time by two hours. Results from an extensive computational study and a case study with data provided by CN confirm the potential benefits of the model and solution approach.

2020-12-31

Transportation Science (publié)

doi.org

The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights

Maxime Gasse

Simon Bowly

Quentin Cappart

Jonas Charfreitag

Laurent Charlin

Didier Chételat

Antonia Chmiela

Justin Dumouchelle

Ambros Gleixner

Aleksandr M. Kazachkov

Elias Khalil

Pawel Lichocki

Andrea Lodi

Miles Lubin

Chris J. Maddison

Christopher Morris

Dimitri J. Papageorgiou

Augustin Parjadis

Sebastian Pokutta

Antoine Prouvost … (voir 22 de plus)

Lara Scavuzzo

Giulia Zarpellon

Linxin Yang

Sha Lai

Akang Wang

Xiaodong Luo

Shuchang Zhou

Haohan Huang

Shengcheng Shao

Yuanming Zhu

Akang Wang

Mao Kun

Zixuan Cao

Yuanming Zhu

Zhewei Huang

Shuchang Zhou

C. Binbin

He Minggui

Hao Hao

Shuchang Zhou

Mao Kun

Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused … (voir plus)on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning as a new approach for solving combinatorial problems, either directly as solvers or by enhancing exact solvers. Based on this context, the ML4CO aims at improving state-of-the-art combinatorial optimization solvers by replacing key heuristic components. The competition featured three challenging tasks: finding the best feasible solution, producing the tightest optimality certificate, and giving an appropriate solver configuration. Three realistic datasets were considered: balanced item placement, workload apportionment, and maritime inventory routing. This last dataset was kept anonymous for the contestants.

2020-12-31

NeurIPS (Competition and Demos) (publié)

doi.org

proceedings.mlr.press

The role of case importation in explaining differences in early SARS-CoV-2 transmission dynamics in Canada—A mathematical modeling study of surveillance data

Arnaud Godin

Yiqing Xia

David L Buckeridge

Sharmistha Mishra

Dirk Douwes-Schultz

Yannan Shen

Maxime Lavigne

Mélanie Drolet

Alexandra M Schmidt

Marc Brisson

Mathieu Maheu-Giroux

2020-12-31

International Journal of Infectious Diseases (publié)

doi.org

The Topic Confusion Task: A Novel Scenario for Authorship Attribution

Malik H. Altakrori

Jackie CK Cheung

Benjamin C. M. Fung

Authorship attribution is the problem of identifying the most plausible author of an anonymous text from a set of candidate authors. Researc… (voir plus)hers have investigated same-topic and cross-topic scenarios of authorship attribution, which differ according to whether unseen topics are used in the testing phase. However, neither scenario allows us to explain whether errors are caused by failure to capture authorship style, by the topic shift or by other factors. Motivated by this, we propose the topic confusion task, where we switch the author-topic conﬁg-uration between training and testing set. This setup allows us to probe errors in the attribution process. We investigate the accuracy and two error measures: one caused by the models’ confusion by the switch because the features capture the topics, and one caused by the features’ inability to capture the writing styles, leading to weaker models. By evaluating different features, we show that stylometric features with part-of-speech tags are less susceptible to topic variations and can increase the accuracy of the attribution process. We further show that combining them with word-level n - grams can outperform the state-of-the-art technique in the cross-topic scenario. Finally, we show that pretrained language models such as BERT and RoBERTa perform poorly on this task, and are outperformed by simple n -gram features.

2020-12-31

arXiv.org (prépublication)

dblp.uni-trier.de

A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix

Thang Doan

Mehdi Bennani

Bogdan Mazoure

Guillaume Rabusseau

Pierre Alquier

Continual learning (CL) is a setting in which an agent has to learn from an incoming stream of data during its entire lifetime. Although maj… (voir plus)or advances have been made in the field, one recurring problem which remains unsolved is that of Catastrophic Forgetting (CF). While the issue has been extensively studied empirically, little attention has been paid from a theoretical angle. In this paper, we show that the impact of CF increases as two tasks increasingly align. We introduce a measure of task similarity called the NTK overlap matrix which is at the core of CF. We analyze common projected gradient algorithms and demonstrate how they mitigate forgetting. Then, we propose a variant of Orthogonal Gradient Descent (OGD) which leverages structure of the data through Principal Component Analysis (PCA). Experiments support our theoretical findings and show how our method can help reduce CF on classical CL datasets.

2020-12-31

AISTATS (publié)

doi.org

proceedings.mlr.press

Toward Causal Representation Learning

Bernhard Schölkopf

Francesco Locatello

Stefan Bauer

Nan Rosemary Ke

Nal Kalchbrenner

Anirudh Goyal

Yoshua Bengio

The two fields of machine learning and graphical causality arose and are developed separately. However, there is, now, cross-pollination and… (voir plus) increasing interest in both fields to benefit from the advances of the other. In this article, we review fundamental concepts of causal inference and relate them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: we note that most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning, that is, the discovery of high-level causal variables from low-level observations. Finally, we delineate some implications of causality for machine learning and propose key research areas at the intersection of both communities.

2020-12-31

Proceedings of the IEEE (publié)

doi.org

Toward Tweet-Mining Framework for Extracting Terrorist Attack-Related Information and Reporting

Farkhund Iqbal

Rabia Batool

Benjamin C. M. Fung

Saiqa Aleem

Ahmed Abbasi

Abdul Rehman Javed

The widespread popularity of social networking is leading to the adoption of Twitter as an information dissemination tool. Existing research… (voir plus) has shown that information dissemination over Twitter has a much broader reach than traditional media and can be used for effective post-incident measures. People use informal language on Twitter, including acronyms, misspelled words, synonyms, transliteration, and ambiguous terms. This makes incident-related information extraction a non-trivial task. However, this information can be valuable for public safety organizations that need to respond in an emergency. This paper proposes an early event-related information extraction and reporting framework that monitors Twitter streams synthesizes event-specific information, e.g., a terrorist attack, and alerts law enforcement, emergency services, and media outlets. Specifically, the proposed framework, Tweet-to-Act (T2A), employs word embedding to transform tweets into a vector space model and then utilizes the Word Mover’s Distance (WMD) to cluster tweets for the identification of incidents. To extract reliable and valuable information from a large dataset of short and informal tweets, the proposed framework employs sequence labeling with bidirectional Long Short-Term Memory based Recurrent Neural Networks (bLSTM-RNN). Extensive experimental results suggest that our proposed framework, T2A, outperforms other state-of-the-art methods that use vector space modeling and distance calculation techniques, e.g., Euclidean and Cosine distance. T2A achieves an accuracy of 96% and an F1-score of 86.2% on real-life datasets.

2020-12-31

IEEE Access (publié)

doi.org

Towards a Trace-Preserving Tensor Network Representation of Quantum Channels

Siddarth Srinivasan

Sandesh M. Adhikary

Jacob Miller

Bibek Pokharel

Guillaume Rabusseau

Byron Boots

The problem of characterizing quantum channels arises in a number of contexts such as quantum process tomography and quantum error correctio… (voir plus)n. However, direct approaches to parameterizing and optimizing the Choi matrix representation of quantum channels face a curse of dimensionality: the number of parameters scales exponentially in the number of qubits. Recently, Torlai et al. [2020] proposed using locally puriﬁed density operators (LPDOs), a tensor network representation of Choi matrices, to overcome the unfavourable scaling in parameters. While the LPDO structure allows it to satisfy a ‘complete positivity’ (CP) constraint required of physically valid quantum channels, it makes no guarantees about a similarly required ‘trace preservation’ (TP) constraint. In practice, the TP constraint is violated, and the learned quantum channel may even be trace-increasing, which is non-physical. In this work, we present the problem of optimizing over TP LPDOs, discuss two approaches to characterizing the TP constraints on LPDOs, and outline the next steps for developing an optimization scheme.

2020-12-31

(publié)

www.semanticscholar.org

TRAIL : IA responsable pour les professionnels et les leaders

Fondateur en résidence Mila Ventures

Avantage IA : productivité dans la fonction publique

Publications

TRAIL : IA responsable pour les professionnels et les leaders

Fondateur en résidence Mila Ventures

Avantage IA : productivité dans la fonction publique

Mots-clés populaires:

Publications