Publications

Model-Invariant State Abstractions for Model-Based Reinforcement Learning

Manan Tomar

Amy Zhang

Roberto Calandra

Matthew E. Taylor

The accuracy and generalization of dynamics models are key to the success of model-based reinforcement learning (MBRL). As the complexity of tasks increases, so does the sample inefficiency of learning accurate dynamics models. However, many complex tasks also exhibit sparsity in the dynamics, i.e., actions have only a local effect on the system dynamics. In this paper, we exploit this property with a causal invariance perspective in the single-task setting, introducing a new type of state abstraction called model-invariance. Unlike previous forms of state abstractions, a model-invariance state abstraction leverages causal sparsity over state variables. This allows for compositional generalization to unseen states, something that non-factored forms of state abstractions cannot do. We prove that an optimal policy can be learned over this model-invariance state abstraction and show improved generalization in a simple toy domain. Next, we propose a practical method to approximately learn a model-invariant representation for complex domains and validate our approach by showing improved modelling performance over standard maximum likelihood approaches on challenging tasks, such as the MuJoCo-based Humanoid. Finally, within the MBRL setting we show strong performance gains with respect to sample efficiency across a host of other continuous control tasks.

2021-02-19

ArXiv (preprint)

openreview.net

Concurrent prescriptions for opioids and benzodiazepines and risk of opioid overdose: protocol for a retrospective cohort study using linked administrative data

Erin Y Liu

Robyn Tamblyn

Kristian B Filion

David Buckeridge

2021-02-18

BMJ Open (published)

doi.org

Smart Futures Based Resource Trading and Coalition Formation for Real-Time Mobile Data Processing

Ruitao Chen

Xianbin Wang

Xue (Steve) Liu

Collaboration among mobile devices (MDs) is becoming more important, as it could augment computing capacity at the network edge through peer-to-peer service provisioning, and directly enhance real-time computational performance in smart Internet-of-Things applications. As an important aspect of the collaboration mechanism, conventional resource trading (RT) among MDs relies on an onsite interaction process, i.e., price negotiation between service providers and requesters, which inevitably incurs excessive latency and degrades RT efficiency. To overcome this challenge, this article adopts the concept of the futures contract (FC) used in financial markets and proposes smart futures for low-latency RT. This new technique enables MDs to form trading coalitions and negotiate multilateral forward contracts applied to a collaboration term in the future. To maximize the benefits of self-interested MDs, the negotiation process of FC is modelled as a coalition formation game comprising three components executed in an iterative manner, i.e., futures resource allocation, revenue sharing and payment allocation, and distributed decision-making of individual MDs. Additionally, an FC enforcement scheme is implemented to efficiently manage the onsite resource sharing by recording resource balances of different task types and MDs. Simulation results demonstrate the superiority of smart futures in RT latency reduction and trading fairness provisioning.

2021-02-18

IEEE Transactions on Services Computing (published)

doi.org

SVRG meets AdaGrad: painless variance reduction

Benjamin Dubois-Taine

Sharan Vaswani

Reza Babanezhad Harikandeh

Mark Schmidt

Simon Lacoste-Julien

2021-02-18

ArXiv (preprint)

doi.org

arxiv.org

Bridging the Gap Between Adversarial Robustness and Optimization Bias

Fartash Faghri

Cristina Vasconcelos

David J Fleet

Fabian Pedregosa

Nicolas Le Roux

2021-02-17

ArXiv (preprint)

arxiv.org

Optimal Spectral-Norm Approximate Minimization of Weighted Finite Automata

Borja Balle

Clara Lacroce

Prakash Panangaden

Doina Precup

Guillaume Rabusseau

We address the approximate minimization problem for weighted finite automata (WFAs) with weights in …

2021-02-13

ArXiv (preprint)

doi.org

arxiv.org

Task dependent deep LDA pruning of neural networks

Qing Tian

Tal Arbel

James J. Clark

2021-02-01

Computer Vision and Image Understanding (published)

doi.org

Variational Nested Dropout

Yufei Cui

Yushun Mao

Ziquan Liu

Qiao Li

Antoni Bert Chan

Xue (Steve) Liu

Tei-Wei Kuo

Chun Jason Xue

Nested dropout is a variant of the dropout operation that is able to order network parameters or features based on a pre-defined importance during training. It has been explored for: (I) constructing nested nets (Cui et al. 2020, Cui et al. 2021): nested nets are neural networks whose architectures can be adjusted instantly at test time, e.g., based on computational constraints. Nested dropout implicitly ranks the network parameters, generating a set of sub-networks such that any smaller sub-network forms the basis of a larger one. (II) Learning ordered representations (Rippel et al. 2014): nested dropout applied to the latent representation of a generative model (e.g., an auto-encoder) ranks the features, enforcing an explicit order of the dense representation over dimensions. However, the dropout rate is fixed as a hyper-parameter during the whole training process. For nested nets, when network parameters are removed, the performance decays along a human-specified trajectory rather than a trajectory learned from data. For generative models, the importance of features is specified as a constant vector, restricting the flexibility of representation learning. To address this problem, we focus on the probabilistic counterpart of nested dropout. We propose a variational nested dropout (VND) operation that draws samples of multi-dimensional ordered masks at a low cost, providing useful gradients to the parameters of nested dropout. Based on this approach, we design a Bayesian nested neural network that learns the order knowledge of the parameter distributions. We further exploit VND under different generative models for learning ordered latent distributions. In experiments, we show that the proposed approach outperforms the nested network in terms of accuracy, calibration, and out-of-domain detection in classification tasks. It also outperforms the related generative models on data generation tasks.

2021-01-27

ArXiv (preprint)

doi.org

arxiv.org

Correction to: The patient advisor, an organizational resource as a lever for an enhanced oncology patient experience (PAROLEonco): a longitudinal multiple case study protocol

Marie-Pascale Pomey

Michèle de Guise

Mado Desforges

Karine Bouchard

Cécile Vialaron

Louise Normandin

Monica Iliescu‐Nelea

Israël Fortin

Isabelle Ganache

Catherine Régis

Zeev Rosberger

Danielle Charpentier

L. Bélanger

Michel Dorval

Djahanchah Philip Ghadiri

Mélanie Lavoie-Tremblay

A. Boivin

Jean-François Pelletier

Nicolas Fernandez

Alain M. Danino

2021-01-14

BMC Health Services Research (published)

doi.org

Assessing the Impact: Does an Improvement to a Revenue Management System Lead to an Improved Revenue?

Greta Laage

Emma Frejinger

Andrea Lodi

Guillaume Rabusseau

2021-01-13

ArXiv (preprint)

arxiv.org

Learning with Gradient Descent and Weakly Convex Losses

Dominic Richards

Michael Rabbat

We study the learning performance of gradient descent when the empirical risk is weakly convex, namely, the smallest negative eigenvalue of the empirical risk's Hessian is bounded in magnitude. By showing that this eigenvalue can control the stability of gradient descent, generalisation error bounds are proven that hold under a wider range of step sizes compared to previous work. Out-of-sample guarantees are then achieved by decomposing the test error into generalisation, optimisation and approximation errors, each of which can be bounded and traded off with respect to algorithmic parameters, sample size and the magnitude of this eigenvalue. In the case of a two-layer neural network, we demonstrate that the empirical risk can satisfy a notion of local weak convexity; specifically, the Hessian's smallest eigenvalue during training can be controlled by the normalisation of the layers, i.e., network scaling. This allows test error guarantees to then be achieved when the population risk minimiser satisfies a complexity assumption. By trading off the network complexity and scaling, insights are gained into the implicit bias of neural network scaling, which are further supported by experimental findings.

2021-01-13

ArXiv (preprint)

arxiv.org
