Publications

Improving the Generalizability and Robustness of Large-Scale Traffic Signal Control
Tianyu Shi
François-Xavier Devailly
Denis Larocque
A number of deep reinforcement-learning (RL) approaches propose to control traffic signals. Compared to traditional approaches, RL approache… (see more)s can learn from higher-dimensionality input road and vehicle sensors and better adapt to varying traffic conditions resulting in reduced travel times (in simulation). However, these RL methods require training from massive traffic sensor data. To offset this relative inefficiency, some recent RL methods have the ability to first learn from small-scale networks and then generalize to unseen city-scale networks without additional retraining (zero-shot transfer). In this work, we study the robustness of such methods along two axes. First, sensor failures and GPS occlusions create missing-data challenges and we show that recent methods remain brittle in the face of these missing data. Second, we provide a more systematic study of the generalization ability of RL methods to new networks with different traffic regimes. Again, we identify the limitations of recent approaches. We then propose using a combination of distributional and vanilla reinforcement learning through a policy ensemble. Building upon the state-of-the-art previous model which uses a decentralized approach for large-scale traffic signal control with graph convolutional networks (GCNs), we first learn models using a distributional reinforcement learning (DisRL) approach. In particular, we use implicit quantile networks (IQN) to model the state-action return distribution with quantile regression. For traffic signal control problems, an ensemble of standard RL and DisRL yields superior performance across different scenarios, including different levels of missing sensor data and traffic flow patterns. Furthermore, the learning scheme of the resulting model can improve zero-shot transferability to different road network structures, including both synthetic networks and real-world networks (e.g., Luxembourg, Manhattan). We conduct extensive experiments to compare our approach to multi-agent reinforcement learning and traditional transportation approaches. Results show that the proposed method improves robustness and generalizability in the face of missing data, varying road networks, and traffic flows.
Inertia-Based Indices to Determine the Number of Clusters in K-Means: An Experimental Evaluation
Andrei Rykov
Renato Cordeiro De Amorim
Boris Mirkin
This paper gives an experimentally supported review and comparison of several indices based on the conventional K-means inertia criterion fo… (see more)r determining the number of clusters,
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus
Tolúlope' Ògúnremí
Kọ́lá Túbọ̀sún
Aremu Anuoluwapo
Iroro Orife
Learning Conditional Policies for Crystal Design Using Offline Reinforcement Learning
Prashant Govindarajan
Santiago Miret
Jarrid Rector-Brooks
Mariano Phielipp
Janarthanan Rajendran
Navigating through the exponentially large chemical space to search for desirable materials is an extremely challenging task in material dis… (see more)covery. Recent developments in generative and geometric deep learning have shown...
Maximum entropy GFlowNets with soft Q-learning
Sobhan Mohammadpour
Emmanuel Bengio
Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play
Daniel Bairamian
Philippe Marcotte
Joshua Romoff
Gabriel Robert
Model-based graph reinforcement learning for inductive traffic signal control
François-Xavier Devailly
Denis Larocque
Most reinforcement learning methods for adaptive-traffic-signal-control require training from scratch to be applied on any new intersection … (see more)or after any modification to the road network, traffic distribution, or behavioral constraints experienced during training. Considering 1) the massive amount of experience required to train such methods, and 2) that experience must be gathered by interacting in an exploratory fashion with real road-network-users, such a lack of transferability limits experimentation and applicability. Recent approaches enable learning policies that generalize for unseen road-network topologies and traffic distributions, partially tackling this challenge. However, the literature remains divided between the learning of cyclic (the evolution of connectivity at an intersection must respect a cycle) and acyclic (less constrained) policies, and these transferable methods 1) are only compatible with cyclic constraints and 2) do not enable coordination. We introduce a new model-based method, MuJAM, which, on top of enabling explicit coordination at scale for the first time, pushes generalization further by allowing a generalization to the controllers' constraints. In a zero-shot transfer setting involving both road networks and traffic settings never experienced during training, and in a larger transfer experiment involving the control of 3,971 traffic signal controllers in Manhattan, we show that MuJAM, using both cyclic and acyclic constraints, outperforms domain-specific baselines as well as another transferable approach.
A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem
Paul Barde
Jakob Nicolaus Foerster
Amy Zhang
Neural Semantic Surface Maps
Luca Morreale
Vladimir Kim
Niloy J. Mitra
Operational Research: methods and applications
Fotios Petropoulos
Gilbert Laporte
Emel Aktas
Sibel A. Alumur
Claudia Archetti
Hayriye Ayhan
Maria Battarra
Julia A. Bennell
Jean-Marie Bourjolly
John E. Boylan
Michèle Breton
David Canca
Bo Chen
Cihan Tugrul Cicek
Louis Anthony Cox
Christine S.M. Currie
Erik Demeulemeester
Li Ding
Stephen M. Disney … (see 62 more)
Matthias Ehrgott
Martin J. Eppler
Güneş Erdoğan
Bernard Fortz
L. Alberto Franco
Jens Frische
Salvatore Greco
Amanda J. Gregory
Raimo P. Hämäläinen
Willy Herroelen
Mike Hewitt
Jan Holmström
John N. Hooker
Tuğçe Işık
Jill Johnes
Bahar Y. Kara
Özlem Karsu
Katherine Kent
Charlotte Köhler
Martin Kunc
Yong-Hong Kuo
Judit Lienert
Adam N. Letchford
Janny Leung
Dong Li
Haitao Li
Ivana Ljubić
Sebastián Lozano
Virginie Lurkin
Silvano Martello
Ian G. McHale
Gerald Midgley
John D.W. Morecroft
Akshay Mutha
Ceyda Oğuz
Sanja Petrovic
Ulrich Pferschy
Harilaos N. Psaraftis
Sam Rose
Lauri Saarinen
Said Salhi
Jing-Sheng Song
Dimitrios Sotiros
Kathryn E. Stecke
Arne K. Strauss
İstenç Tarhan
Clemens Thielen
Paolo Toth
Greet Vanden Berghe
Christos Vasilakis
Vikrant Vaze
Daniele Vigo
Kai Virtanen
Xun Wang
Rafał Weron
Leroy White
Tom Van Woensel
Mike Yearworth
E. Alper Yıldırım
Georges Zaccour
Xuying Zhao
Throughout its history, Operational Research has evolved to include a variety of methods, models and algorithms that have been applied to a … (see more)diverse and wide range of contexts. This encyclopedic article consists of two main sections: methods and applications. The first aims to summarise the up-to-date knowledge and provide an overview of the state-of-the-art methods and key developments in the various subdomains of the field. The second offers a wide-ranging list of areas where Operational Research has been applied. The article is meant to be read in a nonlinear fashion. It should be used as a point of reference or first-port-of-call for a diverse pool of readers: academics, researchers, students, and practitioners. The entries within the methods and applications sections are presented in alphabetical order. The authors dedicate this paper to the 2023 Turkey/Syria earthquake victims. We sincerely hope that advances in OR will play a role towards minimising the pain and suffering caused by this and future catastrophes.
Optimal Zero-Shot Detector for Multi-Armed Attacks
Federica Granese
Marco Romanelli
This paper explores a scenario in which a malicious actor employs a multi-armed attack strategy to manipulate data samples, offering them va… (see more)rious avenues to introduce noise into the dataset. Our central objective is to protect the data by detecting any alterations to the input. We approach this defensive strategy with utmost caution, operating in an environment where the defender possesses significantly less information compared to the attacker. Specifically, the defender is unable to utilize any data samples for training a defense model or verifying the integrity of the channel. Instead, the defender relies exclusively on a set of pre-existing detectors readily available"off the shelf". To tackle this challenge, we derive an innovative information-theoretic defense approach that optimally aggregates the decisions made by these detectors, eliminating the need for any training data. We further explore a practical use-case scenario for empirical evaluation, where the attacker possesses a pre-trained classifier and launches well-known adversarial attacks against it. Our experiments highlight the effectiveness of our proposed solution, even in scenarios that deviate from the optimal setup.
Policy Gradient Methods in the Presence of Symmetries and State Abstractions
Sahand Rezaei-Shoshtari
Rosie Zhao