Portrait of Quentin Cappart

Quentin Cappart

Affiliate Member
Associate Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Research Topics
Graph Neural Networks
Learning on Graphs
Reasoning

Biography

Quentin Cappart is an associate professor in the Department of Computer and Software Engineering at Polytechnique Montréal and an Affiliate member at Mila. He leads the CORAIL research group, which he co-founded with Louis-Martin Rousseau. Cappart obtained a BSc in engineering (2012), a MSc in computer engineering (2014), a MSc in management (2018) and a PhD (2017) at the Université catholique de Louvain (Belgium).

After his PhD, he joined Polytechnique Montréal and the International Research Centre on Enterprise Networks, Logistics and Transportation (CIRRELT) as a postdoctoral fellow (2018–2020). During these two years, he was also a research intern at ElementAI. Cappart’s main research area is the integration of machine learning with search procedures for solving combinatorial problems.

Current Students

PhD - Polytechnique Montréal
Principal supervisor :
Postdoctorate - Polytechnique Montréal
Principal supervisor :

Publications

Scaling Decision-Focused Learning to Large Problems with Lagrangian Decomposition
Stéphane Eilles-Chan Way
Hugo Percot
Tias Guns
Louis-Martin Rousseau
Decision-focused learning has shown great promise for addressing predict-then-optimize problems, particularly in the presence of under-speci… (see more)fied models. However, its practical deployment is often hindered by high computational costs and limited scalability, as it requires solving a constrained optimization problem for each training instance at every iteration. To address these challenges, we propose a novel framework that incorporates Lagrangian decomposition into the decision-focused learning paradigm. Specifically, we introduce a new surrogate objective along with two loss functions for evaluating and training the underlying prediction model. We further propose two variants of our approach, which offer different trade-offs between computational efficiency and solution quality. Our framework can be seamlessly integrated with standard decision-focused learning methods, including Smart Predict-then-Optimize (SPO+) and Implicit Maximum Likelihood Estimation (IMLE). Through experiments on two standard benchmarks, the multi-dimensional knapsack problem and quadratic portfolio optimization, we demonstrate that our approach achieves competitive performance while remaining amenable to parallelization. In particular, it consistently outperforms traditional decision-focused learning methods on large-scale instances, involving up to eight times more variables than those typically considered in related work. The implementation is available at https://github.com/corail-research/DFL-LD.
Learning Admissible Heuristics via Cost Partitioning
Hugo Barral
Marie-José Huguet
Sylvie Thiébaux
Admissible heuristics are essential for optimal planning, yet learning them remains challenging due to the risk of overestimation. Cost part… (see more)itioning combines multiple abstraction heuristics while preserving admissibility, but computing optimal partitions online is expensive. We propose a framework that learns to infer admissible cost partitions by leveraging the Lagrangian dual equivalence between cost partitioning and multiplier prediction. Planning states and patterns are encoded as labelled graphs, and an action-centric variant of the Weisfeiler-Leman algorithm extracts structural feature vectors. A deep architecture with axial self-attention and a softmax output layer maps these features to cost weights that satisfy the partition constraints by construction, ensuring admissibility. Experiments demonstrate reduced node expansions compared to suboptimal partitioning baselines while maintaining strict admissibility. To our knowledge, this is the first machine-learned heuristic guaranteed to be admissible.
An Exact Framework for Solving the Space-Time Dependent TSP
Isaac Rudich
Manuel López-Ibáñez
Michael Romer
Louis-Martin Rousseau
Many real-world scenarios involve solving bilevel optimization problems in which there is an outer discrete optimization problem and an inne… (see more)r problem involving expensive or black box computation. This arises in space-time–dependent variants of the traveling salesman problem, such as when planning space missions that visit multiple astronomical objects. Planning these missions presents significant challenges due to the constant relative motion of the objects involved. There is an outer combinatorial problem of finding the optimal order to visit the objects and an inner optimization problem that requires finding the optimal departure time and trajectory to travel between each pair of objects. The constant motion of the objects complicates the inner problem, making it computationally expensive. This paper introduces a novel framework utilizing decision diagrams (DDs) and a DD-based branch-and-bound technique, peel-and-bound, to achieve exact solutions for such bilevel optimization problems, assuming sufficient inner problem optimizer quality. The framework leverages problem-specific knowledge to expedite search processes and minimize the number of expensive evaluations required. As a case study, we apply this framework to the asteroid routing problem, a benchmark problem in global trajectory optimization. Experimental results demonstrate the framework’s scalability and ability to generate robust heuristic solutions for tested instances. Many of these solutions are exact, contingent on the assumed quality of the inner problem’s optimizer. History: Accepted by Andrea Lodi, Area Editor for Design & Analysis of Algorithms–Discrete. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2024.0866 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2024.0866 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
Code and Data Repository for An Exact Framework for Solving the Space-Time Dependent TSP
Isaac Rudich
Manuel López-Ibáñez
Michael Romer
Louis-Martin Rousseau
The software and data in this repository are a snapshot of the software and data that were used in the research reported on in the paper An … (see more)Exact Framework for Solving the Space-Time Dependent TSP by Isaac Rudich, Manuel López-Ibáñez, Michael Römer, Quentin Cappart, and Louis-Martin Rousseau.
Combining Constraint Programming and Machine Learning: From Current Progress to Future Opportunities
Tias Guns
Michele Lombardi
Gilles Pesant
Dimos Tsouros
The integration of constraint programming (CP) together with machine learning (ML) has emerged as a promising direction for tackling complex… (see more) decision-making and combinatorial optimization problems. While CP offers expressive modeling capabilities and formal guarantees, ML provides adaptive methods for learning from data and generalizing across instances. This survey presents a comprehensive overview of recent advances in combining CP and ML. We first show how ML has been used to improve the CP toolbox, both in modeling and in the efficiency of solving. Then, we examine how CP can support ML, particularly in providing structure, guarantees, and symbolic reasoning capabilities. Finally, we identify key open challenges inherent to such hybrid approaches and outline promising directions for future research. This survey provides a first conceptual and structured review of recent advancements in this emerging field, aiming to serve as a resource for practitioners and researchers in both the CP and ML communities. To keep the progress up to date, a curated list of references is hosted on an accompanying repository (https://github.com/corail-research/CPML-paper-list) and is open to community contributions.
Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain
Chandra Kiran Reddy Evuru
Nazanin Sepahvand
Alexandre Lacoste
Krishnamurthy (DJ) Dvijotham
The practice of fine-tuning AI agents on data from their own interactions--such as web browsing or tool use--, while being a strong general … (see more)recipe for improving agentic capabilities, also introduces a critical security vulnerability within the AI supply chain. In this work, we show that adversaries can easily poison the data collection pipeline to embed hard-to-detect backdoors that are triggerred by specific target phrases, such that when the agent encounters these triggers, it performs an unsafe or malicious action. We formalize and validate three realistic threat models targeting different layers of the supply chain: 1) direct poisoning of fine-tuning data, where an attacker controls a fraction of the training traces; 2) environmental poisoning, where malicious instructions are injected into webpages scraped or tools called while creating training data; and 3) supply chain poisoning, where a pre-backdoored base model is fine-tuned on clean data to improve its agentic capabilities. Our results are stark: by poisoning as few as 2% of the collected traces, an attacker can embed a backdoor causing an agent to leak confidential user information with over 80% success when a specific trigger is present. This vulnerability holds across all three threat models. Furthermore, we demonstrate that prominent safeguards, including two guardrail models and one weight-based defense, fail to detect or prevent the malicious behavior. These findings highlight an urgent threat to agentic AI development and underscore the critical need for rigorous security vetting of data collection processes and end-to-end model supply chains.
DoomArena: A framework for Testing AI Agents Against Evolving Security Threats
Mihir Bansal
Chandra Kiran Reddy Evuru
Avinandan Bose
Maryam Fazel
Jason Stanley
Alexandre Lacoste
Krishnamurthy Dj Dvijotham
We present DoomArena, a security evaluation framework for AI agents. DoomArena is designed on three principles: 1) It is a plug-in framework… (see more) and integrates easily into realistic agentic frameworks like BrowserGym (for web agents) and
Silent Sabotage: Injecting Backdoors into AI Agents Through Fine-Tuning
Chandra Kiran Reddy Evuru
Joshua Kazdan
Avinandan Bose
Maryam Fazel
Sai Rajeswar
Jason Stanley
Krishnamurthy Dj Dvijotham
The rise of AI agents that can use tools, browse the web and interact with computers on behalf of a user, has sparked strong interest in imp… (see more)roving these capabilities by explicitly fine-tuning the LLMs/VLMs that power these agents. Several researchers have proposed collecting data by letting the agents interact with their environment (e.g., a computer operating system, the web or a collection of APIs exposed as tools), and improve agent performance by fine tuning on this data. In this work, we show that such data collection can be manipulated by adversaries to insert poisoned traces. By modifying just 5% of collected traces, adversaries can embed stealthy bad behaviors into agents—like leaking confidential user information whenever the tool or webpage exposes a trigger. Our results raise important security concerns in the development of AI agents, and underscore the importance of careful scrutiny of all data collection processes used to improve agentic AI.
The BrowserGym Ecosystem for Web Agent Research
Maxime Gasse
Alexandre Lacoste
Massimo Caccia
Lawrence Keunho Jang
Ori Yoran
Dehan Kong
Frank F. Xu
Graham Neubig
Ruslan Salakhutdinov
The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents, particularly those leveraging a… (see more)utomation and Large Language Models (LLMs). Many existing benchmarks suffer from fragmentation and inconsistent evaluation methodologies, making it challenging to achieve reliable comparisons and reproducible results. In an earlier work, Drouin et al. (2024) introduced BrowserGym which aims to solve this by providing a unified, gym-like environment with well-defined observation and action spaces, facilitating standardized evaluation across diverse benchmarks. We propose an extended BrowserGym-based ecosystem for web agent research, which unifies existing benchmarks from the literature and includes AgentLab, a complementary framework that aids in agent creation, testing, and analysis. Our proposed ecosystem offers flexibility for integrating new benchmarks while ensuring consistent evaluation and comprehensive experiment management. As a supporting evidence, we conduct the first large-scale, multi-benchmark web agent experiment and compare the performance of 6 state-of-the-art LLMs across 6 popular web agent benchmarks made available in BrowserGym. Among other findings, our results highlight a large discrepancy between OpenAI and Anthropic's latests models, with Claude-3.5-Sonnet leading the way on almost all benchmarks, except on vision-related tasks where GPT-4o is superior. Despite these advancements, our results emphasize that building robust and efficient web agents remains a significant challenge, due to the inherent complexity of real-world web environments and the limitations of current models.
Learning and fine-tuning a generic value-selection heuristic inside a constraint programming solver
Tristan François
Pierre Tessier
Louis Gautier
Louis-Martin Rousseau
Constraint programming is known for being an efficient approach to solving combinatorial problems. Important design choices in a solver are … (see more)the branching heuristics, designed to lead the search to the best solutions in a minimum amount of time. However, developing these heuristics is a time-consuming process that requires problem-specific expertise. This observation has motivated many efforts to use machine learning to automatically learn efficient heuristics without expert intervention. Although several generic variable-selection heuristics are available in the literature, the options for value-selection heuristics are more scarce. We propose to tackle this issue by introducing a generic learning procedure that can be used to obtain a value-selection heuristic inside a constraint programming solver. This has been achieved thanks to the combination of a deep Q-learning algorithm, a tailored reward signal, and a heterogeneous graph neural network. Experiments on graph coloring, maximum independent set, maximum cut, and minimum vertex cover problems show that this framework competes with the well-known impact-based and activity-based search heuristics and can find solutions close to optimality without requiring a large number of backtracks. Additionally, we observe that fine-tuning a model with a different problem class can accelerate the learning process.
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks
The ability of large language models (LLMs) to mimic human-like intelligence has led to a surge in LLM-based autonomous agents. Though recen… (see more)t LLMs seem capable of planning and reasoning given user instructions, their effectiveness in applying these capabilities for autonomous task solving remains underexplored. This is especially true in enterprise settings, where automated agents hold the promise of a high impact. To fill this gap, we propose WorkArena++, a novel benchmark consisting of 682 tasks corresponding to realistic workflows routinely performed by knowledge workers. WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents. Our empirical studies across state-of-the-art LLMs and vision-language models (VLMs), as well as human workers, reveal several challenges for such models to serve as useful assistants in the workplace. In addition to the benchmark, we provide a mechanism to effortlessly generate thousands of ground-truth observation/action traces, which can be used for fine-tuning existing models. Overall, we expect this work to serve as a useful resource to help the community progress toward capable autonomous agents. The benchmark can be found at https://github.com/ServiceNow/WorkArena.
Learning Valid Dual Bounds in Constraint Programming: Boosted Lagrangian Decomposition with Self-Supervised Learning
Swann Bessa
Darius Dabert
Max Bourgeat
Louis-Martin Rousseau