Portrait de Erick Delage

Erick Delage

Membre académique associé
Professeur titulaire, HEC Montréal, Département de sciences de la décision
Sujets de recherche
Apprentissage par renforcement
Optimisation

Biographie

Erick Delage est professeur au Département de sciences de la décision à HEC Montréal, titulaire de la Chaire de recherche du Canada en prise de décision sous incertitude, et membre du Collège de nouveaux chercheurs et créateurs en art et en science de la Société royale du Canada. Ses domaines de recherche englobent l'optimisation robuste et stochastique, l'analyse de décision, l'apprentissage automatique, l'apprentissage par renforcement et la gestion des risques avec des applications en optimisation de portefeuille, en gestion des stocks, et dans les problèmes liés à l'énergie et aux transports.

Étudiants actuels

Postdoctorat - HEC
Doctorat - HEC
Postdoctorat - UdeM
Superviseur⋅e principal⋅e :
Doctorat - Concordia
Doctorat - HEC
Doctorat - HEC
Doctorat - HEC

Publications

Fair Resource Allocation in Weakly Coupled Markov Decision Processes
Xiaohui Tu
Yossiri Adulyasak
Nima Akbarzadeh
We consider fair resource allocation in sequential decision-making environments modeled as weakly coupled Markov decision processes, where r… (voir plus)esource constraints couple the action spaces of
Planning and Learning in Risk-Aware Restless Multi-Arm Bandits
Nima Akbarzadeh
Yossiri Adulyasak
In restless multi-arm bandits, a central agent is tasked with optimally distributing limited resources across several bandits (arms), with e… (voir plus)ach arm being a Markov decision process. In this work, we generalize the traditional restless multi-arm bandit problem with a risk-neutral objective by incorporating risk-awareness. We establish indexability conditions for the case of a risk-aware objective and provide a solution based on Whittle index. In addition, we address the learning problem when the true transition probabilities are unknown by proposing a Thompson sampling approach and show that it achieves bounded regret that scales sublinearly with the number of episodes and quadratically with the number of arms. The efficacy of our method in reducing risk exposure in restless multi-arm bandits is illustrated through a set of numerical experiments in the contexts of machine replacement and patient scheduling applications under both planning and learning setups.
Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis
Jia Lin Hau
Esther Derman
Mohammad Ghavamzadeh
Marek Petrik
In Markov decision processes (MDPs), quantile risk measures such as Value-at-Risk are a standard metric for modeling RL agents' preferences … (voir plus)for certain outcomes. This paper proposes a new Q-learning algorithm for quantile optimization in MDPs with strong convergence and performance guarantees. The algorithm leverages a new, simple dynamic program (DP) decomposition for quantile MDPs. Compared with prior work, our DP decomposition requires neither known transition probabilities nor solving complex saddle point equations and serves as a suitable foundation for other model-free RL algorithms. Our numerical results in tabular domains show that our Q-learning algorithm converges to its DP variant and outperforms earlier algorithms.
Conformal Inverse Optimization
Bo Lin
Timothy Chan
Inverse optimization has been increasingly used to estimate unknown parameters in an optimization model based on decision data. We show that… (voir plus) such a point estimation is insufficient in a prescriptive setting where the estimated parameters are used to prescribe new decisions. The prescribed decisions may be low-quality and misaligned with human intuition and thus are unlikely to be adopted. To tackle this challenge, we propose conformal inverse optimization, which seeks to learn an uncertainty set for the unknown parameters and then solve a robust optimization model to prescribe new decisions. Under mild assumptions, we show that our method enjoys provable guarantees on solution quality, as evaluated using both the ground-truth parameters and the decision maker's perception of the unknown parameters. Our method demonstrates strong empirical performance compared to classic inverse optimization.
End-to-end Conditional Robust Optimization
Abhilash Reddy Chenreddy
Robust Data-driven Prescriptiveness Optimization
Mehran Poursoltani
Angelos Georghiou
The abundance of data has led to the emergence of a variety of optimization techniques that attempt to leverage available side information t… (voir plus)o provide more anticipative decisions. The wide range of methods and contexts of application have motivated the design of a universal unitless measure of performance known as the coefficient of prescriptiveness. This coefficient was designed to quantify both the quality of contextual decisions compared to a reference one and the prescriptive power of side information. To identify policies that maximize the former in a data-driven context, this paper introduces a distributionally robust contextual optimization model where the coefficient of prescriptiveness substitutes for the classical empirical risk minimization objective. We present a bisection algorithm to solve this model, which relies on solving a series of linear programs when the distributional ambiguity set has an appropriate nested form and polyhedral structure. Studying a contextual shortest path problem, we evaluate the robustness of the resulting policies against alternative methods when the out-of-sample dataset is subject to varying amounts of distribution shift.
Robust Data-driven Prescriptiveness Optimization
Mehran Poursoltani
Angelos Georghiou
Crowdkeeping in Last-Mile Delivery
Xin Wang
Okan Arslan
In order to improve the efficiency of the last-mile delivery system when customers are possibly absent for deliveries, we propose the idea o… (voir plus)f employing the crowd to work as keepers and to provide storage services for their neighbors. Crowd keepers have extra flexibility, more availability, and lower costs than fixed storage options such as automated lockers, and this leads to a more efficient and a more profitable system for last-mile deliveries. We present a bilevel program that jointly determines the assignment, routing, and pricing decisions while considering customer preferences, keeper behaviors, and platform operations. We develop an equivalent single-level program, a mixed-integer linear program with subtour elimination constraints, that can be solved to optimality using a row generation algorithm. To improve the efficiency of the solution procedure, we further derive exact best response sets for both customers and keepers, and approximate optimal travel times using linear regression. We present a numerical study using a real-world data set from Amazon. The fixed-storage and no-storage systems are used as benchmarks to assess the performance of the crowdkeeping system. The results show that the crowdkeeping delivery system has the potential to generate higher profits because of its ability to consolidate deliveries and to eliminate failed deliveries. Funding: Funding provided by the Natural Sciences and Engineering Research Council of Canada [Grants 2022-04979 and 2022-05261], the Canada Research Chair program [Grant CRC-2018-00105], and the China Scholarship Council [Grant 202006190051] is gratefully acknowledged. Supplemental Material: The online appendix is available at https://doi.org/10.1287/trsc.2022.0323 .
Crowdkeeping in Last-mile Delivery
Xin Wang
Okan Arslan
A Column Generation Scheme for Distributionally Robust Multi-Item Newsvendor Problems
Shanshan Wang
This paper studies a distributionally robust multi-item newsvendor problem, where the demand distribution is unknown but specified with a ge… (voir plus)neral event-wise ambiguity set. Using the event-wise affine decision rules, we can obtain a conservative approximation formulation of the problem, which can typically be further reformulated as a linear program. In order to efficiently solve the resulting large-scale linear program, we develop a column generation-based decomposition scheme and speed up the computational efficiency by exploiting a special column selection strategy and stopping early based on a Karush-Kuhn-Tucker condition test. Focusing on the Wasserstein ambiguity set and the event-wise mean absolute deviation set, a computational study demonstrates both the computational efficiency of the proposed algorithm, which significantly outperforms a commercial solver and a Benders decomposition method, and the out-of-sample superiority of distributionally robust solutions relative to their sample average approximation counterparts. History: Accepted by Nicola Secomandi, Area Editor for Stochastic Models & Reinforcement Learning. Funding: This work was supported by the Natural Sciences and Engineering Research Council of Canada [492997-2016, RGPIN-2016-05208], the National Natural Science Foundation of China [71972012], Alliance de recherche numérique du Canada, and Canada Research Chairs [CRC-2018-00105]. It was also supported by Groupe d’études et de recherche en analyse des décisions (GERAD). Finally, this research was enabled in part by support provided by Digital Research Alliance of Canada ( https://alliancecan.ca/en ). Supplemental Material: The software that supports the findings of this study is available within the paper and its supplemental information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.0010 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2022.0010 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
Distributional Robustness and Inequity Mitigation in Disaster Preparedness of Humanitarian Operations
Hongming Li
Ning Zhu
Michael Pinedo
Shoufeng Ma
Problem definition: In this paper, we study a predisaster relief network design problem with uncertain demands. The aim is to determine the … (voir plus)prepositioning and reallocation of relief supplies. Motivated by the call of the International Federation of Red Cross and Red Crescent Societies (IFRC) to leave no one behind, we consider three important practical aspects of humanitarian operations: shortages, equity, and uncertainty. Methodology/results: We first employ a form of robust satisficing measure, which we call the shortage severity measure, to evaluate the severity of the shortage caused by uncertain demand in a context with limited distribution information. Because shortages often raise concerns about equity, we then formulate a mixed-integer lexicographic optimization problem with nonconvex objectives and design a new branch-and-bound algorithm to identify the exact solution. We also propose two approaches for identifying optimal postdisaster adaptable resource reallocation: an exact approach and a conservative approximation that is more computationally efficient. Our case study considers the 2010 Yushu earthquake, which occurred in northwestern China, and demonstrates the value of our methodology in mitigating geographical inequities and reducing shortages. Managerial implications: In our case study, we show that (i) incorporating equity in both predisaster deployment and postdisaster reallocation can produce substantially more equitable shortage prevention strategies while sacrificing only a reasonable amount of total shortage; (ii) increasing donations/budgets may not necessarily alleviate the shortage suffered by the most vulnerable individuals if equity is not fully considered; and (iii) exploiting disaster magnitude information when quantifying uncertainty can help alleviate geographical inequities caused by uncertain relief demands. Funding: This work was supported by the Natural Sciences and Engineering Research Council of Canada [Grant RGPIN-2016-05208], the National Natural Science Foundation of China [Grants 71971154, 72010107004, 72091214, and 72122015], and the Canada Research Chairs [Grant CRC-2018-00105]. Supplemental Material: The online appendices are available at https://doi.org/10.1287/msom.2023.1230 .
On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes
Jia Lin Hau
Mohammad Ghavamzadeh
Marek Petrik