Publications

Nash Games Among Stackelberg Leaders
Gabriele Dragotto
Felipe Feijoo
Sriram Sankaranarayanan
We analyze Nash games played among leaders of Stackelberg games (NASP). We show it is $\Sigma_2^p$-hard to decide if the game has a mixed-strategy Nash equilibrium (MNE), even when there are only two leaders and each leader has one follower. We provide a finite-time algorithm with a running time bounded by $O(2^{2^n})$ which computes an MNE for a NASP when one exists and returns infeasibility if no MNE exists. We also provide two ways to improve the algorithm, which involve constructing a series of inner approximations (alternatively, outer approximations) to the leaders’ feasible region that will provably obtain the required MNE. Finally, we test our algorithms on a range of NASPs arising out of a game in the energy market, where countries act as Stackelberg leaders who play a Nash game, and the domestic producers act as the followers.
Improving Pathological Structure Segmentation via Transfer Learning Across Diseases
Barleen Kaur
Paul Lemaitre
Raghav Mehta
Nazanin Mohammadi Sepahvand
Douglas Arnold
Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
S. Meng
Sharan Vaswani
Issam Hadj Laradji
Mark Schmidt
We consider stochastic second-order methods for minimizing smooth and strongly-convex functions under an interpolation condition satisfied by over-parameterized models. Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size. By growing the batch size for both the subsampled gradient and Hessian, we show that R-SSN can converge at a quadratic rate in a local neighbourhood of the solution. We also show that R-SSN attains local linear convergence for the family of self-concordant functions. Furthermore, we analyze stochastic BFGS algorithms in the interpolation setting and prove their global linear convergence. We empirically evaluate stochastic L-BFGS and a "Hessian-free" implementation of R-SSN for binary classification on synthetic, linearly-separable datasets and real datasets under a kernel mapping. Our experimental results demonstrate the fast convergence of these methods, both in terms of the number of iterations and wall-clock time.
Old Dog Learns New Tricks: Randomized UCB for Bandit Problems
Sharan Vaswani
Abbas Mehrabian
Branislav Kveton
We propose …
Reinforcement Learning Models of Human Behavior: Reward Processing in Mental Disorders
Baihan Lin
Guillermo Cecchi
Djallel Bouneffouf
Jenna Reinen
Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for a reinforcement learning problem, which extends the standard Q-learning approach to incorporate a two-stream framework of reward processing with biases biologically associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. For the AI community, the development of agents that react differently to different types of rewards can enable us to understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems. Empirically, the proposed model outperforms Q-Learning and Double Q-Learning in artificial scenarios with certain reward distributions and in real-world human decision-making gambling tasks. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions and user preferences in long-term recommendation systems.
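The two-stream idea in this abstract can be illustrated with a small tabular sketch. This is not the authors' implementation: the class name `TwoStreamQ`, the stream weights `w_pos` and `w_neg`, and the exact update rule are assumptions made for illustration. The sketch keeps one Q-table for positive rewards and one for negative rewards, scales each stream by its own weight, and acts greedily on the sum of the two tables.

```python
from collections import defaultdict


class TwoStreamQ:
    """Hypothetical two-stream Q-learner (illustrative, not the paper's code)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, w_pos=1.0, w_neg=1.0):
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma
        # Assumed bias parameters: how strongly each reward stream is weighted.
        self.w_pos, self.w_neg = w_pos, w_neg
        self.q_pos = defaultdict(float)  # values learned from positive rewards
        self.q_neg = defaultdict(float)  # values learned from negative rewards

    def value(self, s, a):
        # The agent acts on the combined estimate of both streams.
        return self.q_pos[(s, a)] + self.q_neg[(s, a)]

    def greedy(self, s):
        return max(self.actions, key=lambda a: self.value(s, a))

    def update(self, s, a, r, s_next):
        a_next = self.greedy(s_next)
        # Split the scalar reward into its positive and negative parts.
        r_pos = self.w_pos * max(r, 0.0)
        r_neg = self.w_neg * min(r, 0.0)
        for table, reward in ((self.q_pos, r_pos), (self.q_neg, r_neg)):
            target = reward + self.gamma * table[(s_next, a_next)]
            table[(s, a)] += self.alpha * (target - table[(s, a)])


# An agent that ignores punishments (w_neg = 0) versus a balanced one.
insensitive = TwoStreamQ(["left", "right"], w_neg=0.0)
balanced = TwoStreamQ(["left", "right"])
for agent in (insensitive, balanced):
    agent.update("s", "left", -1.0, "t")
    agent.update("s", "right", 1.0, "t")
```

Under these assumptions, setting `w_neg` to zero produces an agent whose value estimates never reflect losses, which is one way such weights could encode a reward-processing bias.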
Evaluation of a web-based tool for labelling potential hospital outbreaks: a mixed methods study
B. Leclère
D. Lepelletier
Patterns of autism symptoms: hidden structure in the ADOS and ADI-R instruments
Jeremy Lefort-Besnard
Kai Vogeley
Leonhard Schilbach
Gael Varoquaux
Bertrand Thirion
Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning
Thang Doan
Bogdan Mazoure
Continuous control tasks in reinforcement learning are important because they provide a framework for learning in high-dimensional state spaces with deceptive rewards, where the agent can easily become trapped in suboptimal solutions. One way to avoid local optima is to use a population of agents to ensure coverage of the policy space, yet learning a population with the "best" coverage is still an open problem. In this work, we present a novel approach to population-based RL in continuous control that leverages properties of normalizing flows to perform attractive and repulsive operations between current members of the population and previously observed policies. Empirical results on the MuJoCo suite demonstrate a high performance gain for our algorithm compared to prior work, including Soft Actor-Critic (SAC).
Neural Architecture Search for Class-incremental Learning
Shenyang Huang
Vincent Francois-Lavet
In class-incremental learning, a model learns continuously from a sequential data stream in which new classes occur. Existing methods often rely on static architectures that are manually crafted. These methods can be prone to capacity saturation because a neural network's ability to generalize to new concepts is limited by its fixed capacity. To understand how to expand a continual learner, we focus on the neural architecture design problem in the context of class-incremental learning: at each time step, the learner must optimize its performance on all classes observed so far by selecting the most competitive neural architecture. To tackle this problem, we propose Continual Neural Architecture Search (CNAS): an autoML approach that takes advantage of the sequential nature of class-incremental learning to efficiently and adaptively identify strong architectures in a continual learning setting. We employ a task network to perform the classification task and a reinforcement learning agent as the meta-controller for architecture search. In addition, we apply network transformations to transfer weights from the previous learning step and to reduce the size of the architecture search space, thus saving a large amount of computational resources. We evaluate CNAS on the CIFAR-100 dataset under varied incremental learning scenarios with limited computational power (1 GPU). Experimental results demonstrate that CNAS outperforms architectures that are optimized for the entire dataset. In addition, CNAS is at least an order of magnitude more efficient than naively using existing autoML methods.
Recognizable series on graphs and hypergraphs
Raphael Bailly
François Denis
Teaching Modelling Literacy: An Artificial Intelligence Approach
Rijul Saini
Gunter Mussbacher
Jörg Kienzle
In Model-Driven Engineering (MDE), models are used to build and analyze complex systems. In the last decades, different modelling formalisms have been proposed for supporting software development. However, their adoption and practice strongly rely on mastering essential modelling skills to develop a complete and coherent model-based system. Moreover, it is often difficult for novice modellers to get direct and timely feedback and recommendations on their modelling strategies and decisions, particularly in large classroom settings, which hinders their learning. Certainly, there is an opportunity to apply Artificial Intelligence (AI) techniques to an MDE learning environment to empower the provisioning of automated and intelligent modelling advocacy. In this paper, we propose a framework called ModBud (a modelling buddy) to educate novice modellers about the art of abstraction. ModBud uses natural language processing (NLP) and machine learning (ML) to create modelling bots with the aim of improving the modelling skills of novice modellers and assisting other practitioners, too. These bots could be used to support teaching with automatic creation or grading of models and enhance learning beyond the traditional classroom-based MDE education with timely feedback and personalized tutoring. Research challenges for the proposed framework are discussed and a research roadmap is presented.
Online Continual Learning with Maximally Interfered Retrieval
Rahaf Aljundi
Lucas Caccia
Massimo Caccia
Min Lin
Tinne Tuytelaars
Continual learning, the setting where a learning agent is faced with a never-ending stream of data, continues to be a great challenge for modern machine learning systems. In particular, the online or "single-pass through the data" setting has gained attention recently as a natural setting that is difficult to tackle. Methods based on replay, either generative or from a stored memory, have been shown to be effective approaches for continual learning, matching or exceeding the state of the art in a number of standard benchmarks. These approaches typically rely on randomly selecting samples from the replay memory or from a generative model, which is suboptimal. In this work, we consider a controlled sampling of memories for replay. We retrieve the samples which are most interfered, i.e. whose prediction will be most negatively impacted by the foreseen parameter update. We show a formulation for this sampling criterion in both the generative replay and the experience replay setting, producing consistent gains in performance and greatly reduced forgetting. We release an implementation of our method at this https URL.
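The retrieval criterion described in this abstract can be sketched for the experience replay case. The following is a minimal illustration, not the released implementation: it assumes a linear model with squared-error loss, and the function names (`virtual_update`, `most_interfered`) and the learning rate are hypothetical. It computes the parameters that one gradient step on the incoming batch would produce, then picks the stored samples whose loss would increase the most under those parameters.

```python
def predict(w, x):
    """Linear model: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def loss(w, x, y):
    """Squared error for a single sample."""
    return (predict(w, x) - y) ** 2


def virtual_update(w, batch, lr):
    """One gradient step on the incoming batch; returns new weights, w is untouched."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = predict(w, x) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(batch)
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def most_interfered(w, batch, memory, k, lr=0.1):
    """Return the k memory samples whose loss rises most after the foreseen update."""
    w_virtual = virtual_update(w, batch, lr)
    scored = [
        (loss(w_virtual, x, y) - loss(w, x, y), idx)
        for idx, (x, y) in enumerate(memory)
    ]
    scored.sort(reverse=True)  # largest loss increase first
    return [memory[idx] for _, idx in scored[:k]]


w = [1.0, 0.0]
incoming = [([1.0, 0.0], 0.0)]   # new data that pulls the first weight down
memory = [
    ([1.0, 0.0], 1.0),           # mildly affected by the update
    ([0.0, 1.0], 0.0),           # unaffected: uses a different feature
    ([2.0, 0.0], 2.0),           # most affected
]
replay = most_interfered(w, incoming, memory, k=2)
```

The selected samples would then be replayed together with the incoming batch in the actual update. Scoring every stored sample, as done here, is only practical for a small memory; scoring a random subset of the memory is a cheaper variant.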