Publications

Nash Games Among Stackelberg Leaders

Gabriele Dragotto

Felipe Feijoo

Sriram Sankaranarayanan

We analyze Nash games played among leaders of Stackelberg games (NASP). We show it is $\Sigma^p_2$-hard to decide if the game has a mixed-strategy Nash equilibrium (MNE), even when there are only two leaders and each leader has one follower. We provide a finite-time algorithm with a running time bounded by $O(2^{2^n})$ which computes an MNE for NASP when one exists and returns infeasibility if no MNE exists. We also provide two ways to improve the algorithm, which involve constructing a series of inner approximations (alternatively, outer approximations) to the leaders' feasible region that will provably obtain the required MNE. Finally, we test our algorithms on a range of NASPs arising out of a game in the energy market, where countries act as Stackelberg leaders who play a Nash game, and the domestic producers act as the followers.

2019-10-14

arXiv.org (preprint)

dblp.uni-trier.de

Improving Pathological Structure Segmentation via Transfer Learning Across Diseases

Barleen Kaur

Paul Lemaitre

Raghav Mehta

Nazanin Mohammadi Sepahvand

Doina Precup

Douglas Arnold

Tal Arbel

2019-10-13

Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data (published)

doi.org

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation

S. Meng

Sharan Vaswani

Issam Hadj Laradji

Mark Schmidt

Simon Lacoste-Julien

We consider stochastic second-order methods for minimizing smooth and strongly-convex functions under an interpolation condition satisfied by over-parameterized models. Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size. By growing the batch size for both the subsampled gradient and Hessian, we show that R-SSN can converge at a quadratic rate in a local neighbourhood of the solution. We also show that R-SSN attains local linear convergence for the family of self-concordant functions. Furthermore, we analyze stochastic BFGS algorithms in the interpolation setting and prove their global linear convergence. We empirically evaluate stochastic L-BFGS and a "Hessian-free" implementation of R-SSN for binary classification on synthetic, linearly-separable datasets and real datasets under a kernel mapping. Our experimental results demonstrate the fast convergence of these methods, both in terms of the number of iterations and wall-clock time.
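As a rough illustration of the regularized subsampled Newton step described in this abstract, the sketch below runs a scalar version on a least-squares problem where one parameter fits every sample exactly (the interpolation condition). It is not the authors' implementation: the function name `rssn_scalar`, the fixed step size (the paper uses an adaptive one), the shared batch for gradient and Hessian, and the regularizer `tau` are all assumptions made for this example.

```python
import random


def rssn_scalar(xs, w_star, w0=0.0, tau=0.1, step=1.0, batch=2, iters=50, seed=0):
    """Scalar regularized subsampled Newton on f_i(w) = 0.5 * (w * x_i - y_i) ** 2.

    Hypothetical helper for illustration only; all parameters are assumptions.
    """
    rng = random.Random(seed)
    # Interpolation: the labels are generated so that w_star fits every sample.
    ys = [w_star * x for x in xs]
    w = w0
    for _ in range(iters):
        # Constant batch size, drawn without replacement.
        idx = rng.sample(range(len(xs)), batch)
        # Subsampled gradient and (scalar) Hessian on the same batch.
        g = sum((w * xs[i] - ys[i]) * xs[i] for i in idx) / batch
        h = sum(xs[i] ** 2 for i in idx) / batch
        # Regularized Newton step: divide by (Hessian + tau) instead of Hessian.
        w -= step * g / (h + tau)
    return w


if __name__ == "__main__":
    print(rssn_scalar([1.0, 2.0, 3.0, 4.0], w_star=2.0))
```

In this toy setting each step shrinks the error by the factor `tau / (h + tau)`, so the iterate contracts linearly toward `w_star` even though the batch size never grows.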

2019-10-11

ArXiv (preprint)

arxiv.org

Old Dog Learns New Tricks: Randomized UCB for Bandit Problems

Sharan Vaswani

Abbas Mehrabian

Audrey Durand

Branislav Kveton

We propose …

2019-10-11

ArXiv (preprint)

arxiv.org

Reinforcement Learning Models of Human Behavior: Reward Processing in Mental Disorders

Baihan Lin

Guillermo Cecchi

Djallel Bouneffouf

Jenna Reinen

Irina Rish

Drawing inspiration from behavioral studies of human decision making, we propose here a general parametric framework for a reinforcement learning problem, which extends the standard Q-learning approach to incorporate a two-stream framework of reward processing with biases biologically associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. For the AI community, the development of agents that react differently to different types of rewards can enable us to understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems. Empirically, the proposed model outperforms Q-Learning and Double Q-Learning in artificial scenarios with certain reward distributions and in real-world human decision-making gambling tasks. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions and user preferences in long-term recommendation systems.

2019-10-02

NeurIPS.cc/2019/Workshop/Neuro_AI (poster)

openreview.net

Evaluation of a web-based tool for labelling potential hospital outbreaks: a mixed methods study

B. Leclère

David Buckeridge

D. Lepelletier

2019-10-01

Journal of Hospital Infection (published)

doi.org

Patterns of autism symptoms: hidden structure in the ADOS and ADI-R instruments

Jeremy Lefort-Besnard

Kai Vogeley

Leonhard Schilbach

Gael Varoquaux

Bertrand Thirion

Guillaume Dumas

Danilo Bzdok

2019-09-27

Translational Psychiatry (published)

doi.org

Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning

Thang Doan

Bogdan Mazoure

Audrey Durand

Joelle Pineau

(Rex) Devon Hjelm

Continuous control tasks in reinforcement learning are important because they provide a framework for learning in high-dimensional state spaces with deceptive rewards, where the agent can easily become trapped in suboptimal solutions. One way to avoid local optima is to use a population of agents to ensure coverage of the policy space, yet learning a population with the "best" coverage is still an open problem. In this work, we present a novel approach to population-based RL in continuous control that leverages properties of normalizing flows to perform attractive and repulsive operations between current members of the population and previously observed policies. Empirical results on the MuJoCo suite demonstrate a high performance gain for our algorithm compared to prior work, including Soft-Actor Critic (SAC).

2019-09-17

ArXiv (preprint)

openreview.net

Neural Architecture Search for Class-incremental Learning

Shenyang Huang

Vincent Francois-Lavet

Guillaume Rabusseau

In class-incremental learning, a model learns continuously from a sequential data stream in which new classes occur. Existing methods often rely on static architectures that are manually crafted. These methods can be prone to capacity saturation because a neural network's ability to generalize to new concepts is limited by its fixed capacity. To understand how to expand a continual learner, we focus on the neural architecture design problem in the context of class-incremental learning: at each time step, the learner must optimize its performance on all classes observed so far by selecting the most competitive neural architecture. To tackle this problem, we propose Continual Neural Architecture Search (CNAS): an autoML approach that takes advantage of the sequential nature of class-incremental learning to efficiently and adaptively identify strong architectures in a continual learning setting. We employ a task network to perform the classification task and a reinforcement learning agent as the meta-controller for architecture search. In addition, we apply network transformations to transfer weights from the previous learning step and to reduce the size of the architecture search space, thus saving a large amount of computational resources. We evaluate CNAS on the CIFAR-100 dataset under varied incremental learning scenarios with limited computational power (1 GPU). Experimental results demonstrate that CNAS outperforms architectures that are optimized for the entire dataset. In addition, CNAS is at least an order of magnitude more efficient than naively using existing autoML methods.

2019-09-14

ArXiv (preprint)

arxiv.org

Recognizable series on graphs and hypergraphs

Raphael Bailly

Guillaume Rabusseau

François Denis

2019-09-01

Journal of Computer and System Sciences (published)

doi.org

Teaching Modelling Literacy: An Artificial Intelligence Approach

Rijul Saini

Gunter Mussbacher

Jin Guo

Jörg Kienzle

In Model-Driven Engineering (MDE), models are used to build and analyze complex systems. In the last decades, different modelling formalisms have been proposed for supporting software development. However, their adoption and practice strongly rely on mastering essential modelling skills to develop a complete and coherent model-based system. Moreover, it is often difficult for novice modellers to get direct and timely feedback and recommendations on their modelling strategies and decisions, particularly in large classroom settings, which hinders their learning. Certainly, there is an opportunity to apply Artificial Intelligence (AI) techniques to an MDE learning environment to empower the provisioning of automated and intelligent modelling advocacy. In this paper, we propose a framework called ModBud (a modelling buddy) to educate novice modellers about the art of abstraction. ModBud uses natural language processing (NLP) and machine learning (ML) to create modelling bots with the aim of improving the modelling skills of novice modellers and assisting other practitioners, too. These bots could be used to support teaching with automatic creation or grading of models and enhance learning beyond the traditional classroom-based MDE education with timely feedback and personalized tutoring. Research challenges for the proposed framework are discussed and a research roadmap is presented.

2019-09-01

2019 ACM/IEEE 22nd International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C) (published)

doi.org

Online Continual Learning with Maximally Interfered Retrieval

Rahaf Aljundi

Lucas Caccia

Eugene Belilovsky

Massimo Caccia

Min Lin

Laurent Charlin

Tinne Tuytelaars

Continual learning, the setting where a learning agent is faced with a never-ending stream of data, continues to be a great challenge for modern machine learning systems. In particular, the online or "single-pass through the data" setting has gained attention recently as a natural setting that is difficult to tackle. Methods based on replay, either generative or from a stored memory, have been shown to be effective approaches for continual learning, matching or exceeding the state of the art in a number of standard benchmarks. These approaches typically rely on randomly selecting samples from the replay memory or from a generative model, which is suboptimal. In this work, we consider a controlled sampling of memories for replay. We retrieve the samples which are most interfered, i.e. whose prediction will be most negatively impacted by the foreseen parameters update. We show a formulation for this sampling criterion in both the generative replay and the experience replay setting, producing consistent gains in performance and greatly reduced forgetting. We release an implementation of our method at this https URL.
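The retrieval criterion in this abstract (pick the stored samples whose loss would rise most under the foreseen parameter update) can be sketched on a toy problem. The code below is a minimal illustration for the experience replay case only, assuming a one-parameter linear model with squared error; the names `most_interfered`, `lr`, and `k` are hypothetical and do not come from the authors' released implementation.

```python
def loss(w, x, y):
    """Squared error of a one-parameter linear model y ~ w * x."""
    return (w * x - y) ** 2


def grad(w, x, y):
    """Derivative of the squared error with respect to w."""
    return 2.0 * (w * x - y) * x


def most_interfered(w, incoming, memory, lr=0.1, k=2):
    """Return the k memory samples whose loss increases most after a virtual update.

    Hypothetical helper: `incoming` and `memory` are lists of (x, y) pairs.
    """
    # Foreseen ("virtual") update: one gradient step on the incoming batch only.
    g = sum(grad(w, x, y) for x, y in incoming) / len(incoming)
    w_virtual = w - lr * g
    # Interference score: loss after the virtual update minus loss before it.
    scored = [(loss(w_virtual, x, y) - loss(w, x, y), (x, y)) for x, y in memory]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [sample for _, sample in scored[:k]]


if __name__ == "__main__":
    memory = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (0.5, 0.5)]
    incoming = [(1.0, 3.0)]
    print(most_interfered(1.0, incoming, memory))  # [(3.0, 3.0), (2.0, 2.0)]
```

In the example, the memory is fit perfectly by `w = 1.0`, and the incoming sample pulls the parameter to about 1.4; the samples with the largest inputs are hurt most, so they are the ones selected for replay instead of a uniformly random draw.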

2019-08-11

ArXiv (preprint)

arxiv.org
