Aditya Mahajan

Associate Academic Member

Associate Professor, McGill University, Department of Electrical and Computer Engineering

Research Topics

Reinforcement Learning

Biography

Aditya Mahajan is a professor in the Department of Electrical and Computer Engineering at McGill University and an associate academic member of Mila – Quebec Artificial Intelligence Institute.

He is also a member of the McGill Centre for Intelligent Machines (CIM), the International Laboratory for Learning Systems (ILLS), and the Group for Research in Decision Analysis (GERAD). Mahajan received his BTech degree in electrical engineering from the Indian Institute of Technology Kanpur, and his MSc and PhD degrees in electrical engineering and computer science from the University of Michigan at Ann Arbor.

He is a senior member of the U.S. Institute of Electrical and Electronics Engineers (IEEE), as well as a member of Professional Engineers Ontario. He currently serves as associate editor for IEEE Transactions on Automatic Control, IEEE Control Systems Letters, and Mathematics of Control, Signals, and Systems (Springer). He served as associate editor for the conference editorial board of the IEEE Control Systems Society from 2014 to 2017.

Mahajan’s numerous awards include the 2015 George Axelby Outstanding Paper Award, 2016 NSERC Discovery Accelerator Award, 2014 CDC Best Student Paper Award (as supervisor), and 2016 NecSys Best Student Paper Award (as supervisor). Mahajan’s principal research interests are stochastic control and reinforcement learning.

Current Students

Reza Alvandi

Master's Research - McGill University

Mohamed-Amine Azzouz

Master's Research - McGill University

Berk Bozkurt

Collaborating Alumni - McGill University

Website

Google Scholar

Arka Ian Goswami

Master's Research - McGill University

Gaspard Lambrechts

Postdoctorate - McGill University

Co-supervisor:

Master's Research - Université de Montréal

Samin Nili

PhD - McGill University

Google Scholar

Reza Pourmohammadi Najafabadi

Master's Research - McGill University

GitHub

Amit Sinha

PhD - McGill University

GitHub

Google Scholar

Tao Zhang

PhD - McGill University

Publications

Networked control of coupled subsystems: Spectral decomposition and low-dimensional solutions

Shuang Gao

Aditya Mahajan

In this paper, we investigate optimal networked control of coupled subsystems where the dynamics and the cost couplings depend on an underlying weighted graph. We use the spectral decomposition of the graph adjacency matrix to decompose the overall system into (L+1) systems with decoupled dynamics and cost, where L is the rank of the adjacency matrix. Consequently, the optimal control input at each subsystem can be computed by solving (L+1) decoupled Riccati equations. A salient feature of the result is that the solution complexity depends on the rank of the adjacency matrix rather than the size of the network (i.e., the number of nodes). Therefore, the proposed solution framework provides a scalable method for synthesizing and implementing optimal control laws for large-scale systems.

2019-12-01

IEEE Conference on Decision and Control (published)

doi.org

Restless bandits with controlled restarts: Indexability and computation of Whittle index

Nima Akbarzadeh

Aditya Mahajan

Motivated by applications in machine repair, queueing, surveillance, and clinic care, we consider a scheduling problem where a decision maker can reset m out of n Markov processes at each time. Processes that are reset restart according to a known probability distribution, and processes that are not reset evolve in a Markovian manner. Due to the high complexity of finding an optimal policy, such scheduling problems are often modeled as restless bandits. We show that the model satisfies a technical condition known as indexability. For indexable restless bandits, the Whittle index policy, which computes a function known as the Whittle index for each process and resets the m processes with the lowest index, is known to be a good heuristic. The Whittle index is computed by solving an auxiliary Markov decision problem for each arm. When the optimal policy for this auxiliary problem is threshold based, we use ideas from renewal theory to derive a closed-form expression for the Whittle index. We present detailed numerical experiments which suggest that the Whittle index policy performs close to the optimal policy and significantly better than the myopic policy, which is a commonly used heuristic.

2019-12-01

IEEE Conference on Decision and Control (published)

doi.org

Dynamic spectrum access under partial observations: A restless bandit approach

Nima Akbarzadeh

Aditya Mahajan

We consider a communication system where multiple unknown channels are available for transmission. Each channel has a state which evolves in a Markovian manner. The transmitter has to select L channels to use and also decide the resources (e.g., power, rate, etc.) to use for each of the selected channels. It observes the state of the channels it uses and receives no feedback on the state of the other channels. We model this problem as a partially observable Markov decision process and obtain a simplified belief state. We show that the optimal resource allocation policy can be identified in closed form. Once the optimal resource allocation policy is fixed, choosing the channel scheduling policy may be viewed as a restless bandit. We present an efficient algorithm to check indexability and compute the Whittle index for each channel. When the model is indexable, the Whittle index policy, which transmits over the L channels with the smallest Whittle indices, is an attractive heuristic policy.

2019-06-01

Canadian Workshop on Information Theory (published)

doi.org

Multi-Agent Estimation and Filtering for Minimizing Team Mean-Squared Error

Mohammad Afshari

Aditya Mahajan

Motivated by estimation problems arising in autonomous vehicles and decentralized control of unmanned aerial vehicles, we consider multi-agent estimation and filtering problems in which multiple agents generate state estimates based on decentralized information and the objective is to minimize a coupled mean-squared error which we call team mean-squared error. We call the resulting estimates minimum team mean-squared error (MTMSE) estimates. We show that MTMSE estimates are different from minimum mean-squared error (MMSE) estimates. We derive closed-form expressions for MTMSE estimates, which are linear functions of the observations where the corresponding gain depends on the weight matrix that couples the estimation error. We then consider a filtering problem where a linear stochastic process is monitored by multiple agents which can share their observations (with delay) over a communication graph. We derive expressions to recursively compute the MTMSE estimates. To illustrate the effectiveness of the proposed scheme, we consider an example of estimating the distances between vehicles in a platoon and show that MTMSE estimates significantly outperform MMSE estimates and consensus Kalman filtering estimates.

2019-03-28

ArXiv (preprint)

doi.org

arxiv.org

Reinforcement Learning in Stationary Mean-field Games

Jayakumar Subramanian

Aditya Mahajan

Multi-agent reinforcement learning has made significant progress in recent years, but it remains a hard problem. Hence, one often resorts to developing learning algorithms for specific classes of multi-agent systems. In this paper we study reinforcement learning in a specific class of multi-agent systems called mean-field games. In particular, we consider learning in stationary mean-field games. We identify two different solution concepts (stationary mean-field equilibrium and stationary mean-field social-welfare optimal policy) for such games based on whether the agents are non-cooperative or cooperative, respectively. We then generalize these solution concepts to their local variants using arguments based on bounded rationality. For these two local solution concepts, we present two reinforcement learning algorithms. We show that the algorithms converge to the right solution under mild technical conditions and demonstrate this using two numerical examples.

2019-03-01

Adaptive Agents and Multi-Agent Systems (published)

dblp.uni-trier.de
