Publications

Brain-spinal cord interaction in long-term motor sequence learning in human: An fMRI study

Ali Khatibi

Shahabeddin Vahdat

Ovidiu Lungu

Jürgen Finsterbusch

Christian Büchel

Julien Cohen-Adad

Veronique Marchand-Pauvert

Julien Doyon

2021-12-31

NeuroImage (published)

doi.org

Building Together - Towards a Roadmap for African Language Technologies

Kathleen Siminyu

Jade Abbott

Kọ́lá Túbọ̀sún

Aremu Anuoluwapo

Blessing Kudzaishe Sibanda

Kofi Yeboah

David Ifeoluwa Adelani

Masabata Mokgesi-Selinga

Frederick R. Apina

Angela Thandizwe Mthembu

Arshath Ramkilowan

Babatunde Oladimeji

2021-12-31

Social Science Research Network (published)

doi.org

Catalyzing next-generation Artificial Intelligence through NeuroAI

Anthony Zador

Blake Aaron Richards

Bence Ölveczky

Sean Escola

Yoshua Bengio

Kwabena Boahen

Matthew Botvinick

Dmitri Chklovskii

Anne Churchland

Claudia Clopath

James DiCarlo

Surya Ganguli

Jeff Hawkins

Konrad Paul Kording

Alexei Koulakov

Yann Lecun

Timothy P Lillicrap

Adam Marblestone

Bruno Olshausen

Alexandre Pouget … (see 7 more)

Cristina Savin

Terrence Sejnowski

Eero Simoncelli

Sara Solla

David Sussillo

Andreas S. Tolias

Doris Tsao

2021-12-31

arXiv.org (preprint)

doi.org

arxiv.org

Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise

Eduard Gorbunov

Marina Danilova

David Dobre

Pavel Dvurechensky

Alexander Gasnikov

Gauthier Gidel

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

Cognitive Models as Simulators: The Case of Moral Decision-Making

Ardavan S. Nobandegani

T. Shultz

Irina Rish

2021-12-31

CogSci (published)

doi.org

arxiv.org

COIL: A Deep Architecture for Column Generation

Behrouz Babaki

Laurent Charlin

Sanjay Dominik Jena

. Column generation is a popular method to solve large-scale linear programs with an exponential number of variables. Several important appl… (see more)ications, such as the vehicle routing problem, rely on this technique in order to be solved. However, in practice, column generation methods suffer from slow convergence (i.e. they require too many iterations). Stabilization techniques, which carefully select the column to add at each iteration, are commonly used to improve convergence. In this work, we frame the problem of selecting which columns to add as one of sequential decision-making. We propose a neural column generation architecture that iteratively selects columns to be added to the problem. Our architecture is inspired by stabilization techniques and predicts the optimal duals, which are then used to select the columns to add. We proposed architecture, trained using imitation learning. Exemplified on the Vehicle Routing Problem, we show that several machine learning models yield good performance in predicting the optimal duals and that our architecture outperforms them as well as a popular state-of-the-art stabilization technique. Further, the architecture approach can generalize to instances larger than those observed during training.

2021-12-31

(published)

www.semanticscholar.org

Combining Reinforcement Learning and Constraint Programming for Sequence-Generation Tasks with Hard Constraints

Daphne Lafleur

A. Chandar

Gilles Pesant

While Machine Learning (ML) techniques are good at generating data similar to a dataset, they lack the capacity to enforce constraints. On t… (see more)he other hand, any solution to a Constraint Programming (CP) model satisfies its constraints but has no obligation to imitate a dataset. Yet, we sometimes need both. In this paper we borrow RL-Tuner, a Reinforcement Learning (RL) algorithm introduced to tune neural networks, as our enabling architecture to exploit the respective strengths of ML and CP. RL-Tuner maximizes the sum of a pretrained network’s learned probabilities and of manually-tuned penalties for each violated constraint. We replace the latter with outputs of a CP model representing the marginal probabilities of each value and the number of constraint violations. As was the case for the original RL-Tuner, we apply our algorithm to music generation since it is a highly-constrained domain for which CP is especially suited. We show that combining ML and CP, as opposed to using them individually, allows the agent to reflect the pretrained network while taking into account constraints, leading to melodic lines that respect both the corpus’ style and the music theory constraints.

2021-12-31

International Conference on Principles and Practice of Constraint Programming (published)

doi.org

Computing Nash Equilibria for Integer Programming Games

Margarida Carvalho

Andrea Lodi

João Pedro Pedroso

The recently defined class of integer programming games (IPG) models situations where multiple self-interested decision makers interact, wit… (see more)h their strategy sets represented by a finite set of linear constraints together with integer requirements. Many real-world problems can suitably be fit in this class, and hence anticipating IPG outcomes is of crucial value for policy makers and regulators. Nash equilibria have been widely accepted as the solution concept of a game. Consequently, their computation provides a reasonable prediction of the games outcome. In this paper, we start by showing the computational complexity of deciding the existence of a Nash equilibrium for an IPG. Then, using sufficient conditions for their existence, we develop two general algorithmic approaches that are guaranteed to approximate an equilibrium under mild conditions. We also showcase how our methodology can be changed to determine other equilibria definitions. The performance of our methods is analyzed through computational experiments in a knapsack game, a competitive lot-sizing game, and a kidney exchange game. To the best of our knowledge, this is the first time that equilibria computation methods for general integer programming games have been designed and computationally tested.

2021-12-31

European Journal of Operational Research (published)

doi.org

arxiv.org

A Conceptual Framework for Representing Events Under Public Health Surveillance.

Anya Okhmatovskaia

Yannan Shen

Iris Ganser

Nigel Collier

Nicholas B. King

Zaiqiao Meng

David L. Buckeridge

Information integration across multiple event-based surveillance (EBS) systems has been shown to improve global disease surveillance in expe… (see more)rimental settings. In practice, however, integration does not occur due to the lack of a common conceptual framework for encoding data within EBS systems. We aim to address this gap by proposing a candidate conceptual framework for representing events and related concepts in the domain of public health surveillance.

2021-12-31

Medical Informatics Europe (published)

doi.org

Consistency-CAM: Towards Improved Weakly Supervised Semantic Segmentation.

Sai Rajeswar

Issam Hadj Laradji

Pau Rodríguez

David Vázquez

Aaron Courville

2021-12-31

BMVC (published)

dblp.uni-trier.de

Continual Learning In Environments With Polynomial Mixing Times

Matthew Riemer

Sharath Chandra Raparthy

Ignacio Cases

Gopeshh Subbaraj

Maximilian Puelma Touzel

Irina Rish

The mixing time of the Markov chain induced by a policy limits performance in real-world continual learning scenarios. Yet, the effect of mi… (see more)xing times on learning in continual reinforcement learning (RL) remains underexplored. In this paper, we characterize problems that are of long-term interest to the development of continual RL, which we call scalable MDPs, through the lens of mixing times. In particular, we theoretically establish that scalable MDPs have mixing times that scale polynomially with the size of the problem. We go on to demonstrate that polynomial mixing times present significant difficulties for existing approaches, which suffer from myopic bias and stale bootstrapped estimates. To validate our theory, we study the empirical scaling behavior of mixing times with respect to the number of tasks and task duration for high performing policies deployed across multiple Atari games. Our analysis demonstrates both that polynomial mixing times do emerge in practice and how their existence may lead to unstable learning behavior like catastrophic forgetting in continual learning settings.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

Continuous MDP Homomorphisms and Homomorphic Policy Gradient

Sahand Rezaei-Shoshtari

Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms. In this pape… (see more)r, we study abstraction in the continuous-control setting. We extend the definition of MDP homomorphisms to encompass continuous actions in continuous state spaces. We derive a policy gradient theorem on the abstract MDP, which allows us to leverage approximate symmetries of the environment for policy optimization. Based on this theorem, we propose an actor-critic algorithm that is able to learn the policy and the MDP homomorphism map simultaneously, using the lax bisimulation metric. We demonstrate the effectiveness of our method on benchmark tasks in the DeepMind Control Suite. Our method's ability to utilize MDP homomorphisms for representation learning leads to improved performance when learning from pixel observations.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

doi.org

openreview.net

Mila on Udemy

AI Policy Fellowship Publications

Mila Ventures Launchpad

Publications

Mila on Udemy

AI Policy Fellowship Publications

Mila Ventures Launchpad

Popular keywords:

Publications