Publications

IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control

François-Xavier Devailly

Denis Larocque

Scaling adaptive traffic signal control involves dealing with combinatorial state and action spaces. Multi-agent reinforcement learning attempts to address this challenge by distributing control to specialized agents. However, specialization hinders generalization and transferability, and the computational graphs underlying neural-network architectures, which dominate in the multi-agent setting, do not offer the flexibility to handle an arbitrary number of entities which changes both between road networks, and over time as vehicles traverse the network. We introduce Inductive Graph Reinforcement Learning (IG-RL) based on graph-convolutional networks which adapts to the structure of any road network, to learn detailed representations of traffic signal controllers and their surroundings. Our decentralized approach enables learning of a transferable-adaptive-traffic-signal-control policy. After being trained on an arbitrary set of road networks, our model can generalize to new road networks and traffic distributions, with no additional training and a constant number of parameters, enabling greater scalability compared to prior methods. Furthermore, our approach can exploit the granularity of available data by capturing the (dynamic) demand at both the lane level and the vehicle level. The proposed method is tested on both road networks and traffic settings never experienced during training. We compare IG-RL to multi-agent reinforcement learning and domain-specific baselines. In both synthetic road networks and in a larger experiment involving the control of the 3,971 traffic signals of Manhattan, we show that different instantiations of IG-RL outperform baselines.

2022-07-01

IEEE Transactions on Intelligent Transportation Systems (published)

doi.org

arxiv.org

Leishmania parasites exchange drug-resistance genes through extracellular vesicles

Noélie Douanne

George Dong

Atia Amin

Lorena Bernardo

Mathieu Blanchette

David Langlais

Martin Olivier

Christopher Fernandez-Prada

2022-07-01

Cell Reports (published)

doi.org

Naming Autism in the Right Context

Andres Roman-Urrestarazu

Guillaume Dumas

Varun Warrier

2022-07-01

JAMA Pediatrics (published)

doi.org

R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS

Kyle Kastner

Aaron Courville

2022-06-30

arXiv (preprint)

doi.org

arxiv.org

A guided multiverse study of neuroimaging analyses

Jessica Dafflon

Pedro F. Da Costa

František Váša

Ricardo Pio Monti

Danilo Bzdok

Peter J. Hellyer

Federico Turkheimer

Jonathan Smallwood

Emily J. H. Jones

Robert Leech

2022-06-29

Nature Communications (published)

doi.org

Integrating Equity, Diversity, and Inclusion throughout the lifecycle of Artificial Intelligence in health

Milka Nyariro

Elham Emami

Samira Abbasgholizadeh-Rahimi

Health care systems are the infrastructures that are put together to deliver health and social services to the population at large. These organizations are increasingly applying Artificial Intelligence (AI) to improve the efficiency and effectiveness of health and social care. Unfortunately, both health care systems and AI are confronted with a lack of Equity, Diversity, and Inclusion (EDI). This short paper focuses on the importance of integrating EDI concepts throughout the life cycle of AI in health. We discuss the risks that the lack of EDI in the design, development and implementation of AI-based tools might have on the already marginalized communities and populations in the healthcare setting. Moreover, we argue that integrating EDI principles and practice throughout the lifecycle of AI in health has an important role in achieving health equity for all populations. Further research needs to be conducted to explore how studies in AI-health have integrated.

2022-06-29

13th Augmented Human International Conference (published)

doi.org

Annotation Cost-Sensitive Deep Active Learning with Limited Data (Student Abstract)

Renaud Bernatchez

Audrey Durand

Flavie Lavoie-Cardinal

2022-06-28

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

Biological Sequence Design with GFlowNets

Moksh J. Jain

Emmanuel Bengio

Alex Hernandez-Garcia

Jarrid Rector-Brooks

Bonaventure F. P. Dossou

Micheal Kilgour

Payel Das

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

doi.org

arxiv.org

Building Robust Ensembles via Margin Boosting

Dinghuai Zhang

Hongyang R. Zhang

Aaron Courville

Yoshua Bengio

Pradeep Ravikumar

Arun Sai Suggala

In the context of adversarial robustness, a single model does not usually have enough power to defend against all possible adversarial attacks, and as a result, has sub-optimal robustness. Consequently, an emerging line of work has focused on learning an ensemble of neural networks to defend against adversarial attacks. In this work, we take a principled approach towards building robust ensembles. We view this problem from the perspective of margin-boosting and develop an algorithm for learning an ensemble with maximum margin. Through extensive empirical evaluation on benchmark datasets, we show that our algorithm not only outperforms existing ensembling techniques, but also large models trained in an end-to-end fashion. An important byproduct of our work is a margin-maximizing cross-entropy (MCE) loss, which is a better alternative to the standard cross-entropy (CE) loss. Empirically, we show that replacing the CE loss in state-of-the-art adversarial training techniques with our MCE loss leads to significant performance improvement.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

doi.org

arxiv.org

Direct Behavior Specification via Constrained Reinforcement Learning

Chris J Pal

The standard formulation of Reinforcement Learning lacks a practical way of specifying what are admissible and forbidden behaviors. Most often, practitioners go about the task of behavior specification by manually engineering the reward function, a counter-intuitive process that requires several iterations and is prone to reward hacking by the agent. In this work, we argue that constrained RL, which has almost exclusively been used for safe RL, also has the potential to significantly reduce the amount of work spent for reward specification in applied RL projects. To this end, we propose to specify behavioral preferences in the CMDP framework and to use Lagrangian methods to automatically weigh each of these behavioral constraints. Specifically, we investigate how CMDPs can be adapted to solve goal-based tasks while adhering to several constraints simultaneously. We evaluate this framework on a set of continuous control tasks relevant to the application of Reinforcement Learning for NPC design in video games.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

proceedings.mlr.press

arxiv.org

Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning

Harley Wiltzer

David Meger

Marc Gendron-Bellemare

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

doi.org

arxiv.org

Estimating Social Influence from Observational Data

Dhanya Sridhar

Caterina De Bacco

David Blei

We consider the problem of estimating social influence, the effect that a person's behavior has on the future behavior of their peers. The key challenge is that shared behavior between friends could be equally explained by influence or by two other confounding factors: 1) latent traits that caused people to both become friends and engage in the behavior, and 2) latent preferences for the behavior. This paper addresses the challenges of estimating social influence with three contributions. First, we formalize social influence as a causal effect, one which requires inferences about hypothetical interventions. Second, we develop Poisson Influence Factorization (PIF), a method for estimating social influence from observational data. PIF fits probabilistic factor models to networks and behavior data to infer variables that serve as substitutes for the confounding latent traits. Third, we develop assumptions under which PIF recovers estimates of social influence. We empirically study PIF with semi-synthetic and real data from Last.fm, and conduct a sensitivity analysis. We find that PIF estimates social influence most accurately compared to related methods and remains robust under some violations of its assumptions.

2022-06-28

Proceedings of the First Conference on Causal Learning and Reasoning (published)

doi.org

openreview.net
