Amine Mhedhbi

Associate Academic Member

Assistant Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering

Research Topics

Computer Systems

Data Science

Information Retrieval

Machine Learning Operations (MLOps)

Machine Learning Systems

Tabular Data

Website

Google Scholar

Biography

Amine Mhedhbi is an assistant professor at Polytechnique Montréal's Department of Computer Engineering and Software Engineering, where he leads the Data and AI Systems (DAIS) group. He is an Associate Academic Member at Mila – Quebec Artificial Intelligence Institute and holds an FRQ-IVADO chair in multimodal data engineering.

His research interests include all aspects of data and information management, with a focus on analytical and AI-driven data system architectures. His work addresses performance, debuggability, interface design, and data applications.

Amine received his Ph.D. from the University of Waterloo, where he was awarded the Computer Science distinguished dissertation award and was a Microsoft Ph.D. fellow.

Current Students

Anas Dorbani

PhD - Polytechnique Montréal

Website

GitHub

Google Scholar

Eyad Salama

Master's Research - Polytechnique Montréal

Website

Nour Shaheen

Master's Research - Polytechnique Montréal

Co-supervisor:

PhD - Polytechnique Montréal

Google Scholar

Publications

Factorized and Vectorized Execution: Optimizing Analytical and Semantic Queries over Relations

Sunny Yasser

Anas Dorbani

Amine Mhedhbi

Many-to-many joins are central to analytical and semantic workloads such as fraud detection, network analysis, and recommendation, where insights arise from relationships between entities. These workloads often suffer from an explosion of intermediate results, sometimes orders of magnitude larger than the inputs. Factorized representations address this problem by exploiting conditional independence among attributes to encode intermediates more compactly. In some cases, they can reduce the output size asymptotically below the worst-case output size. However, adopting factorization in modern vectorized query processors remains challenging: factorized representations are hierarchical, whereas vectorized execution is built around flat, block-oriented processing. Prior approaches either rely on full materialization or support only restricted factorization layouts, sacrificing much of the benefits of both factorization and vectorization. We present FFX, a novel engine for Fast Factorized eXecution. FFX is the first pipelined engine to support arbitrary factorization schemes while preserving full vectorization. The engine introduces packed factorized vectors and operators that maintain cache-friendly, contiguous layouts. Beyond analytics, FFX also co-optimizes semantic operators by serializing factorized intermediates into compact prompts for large language models (LLMs), substantially reducing token usage and inference cost while maintaining output quality and, in some cases, improving it. Together, these contributions enable efficient execution of join-heavy analytical queries, including queries augmented with semantic operators.

2026-05-17

Proceedings of the ACM on Management of Data (published)

doi.org

Semantic Commit: Helping Users Update Intent Specifications for AI Memory at Scale

Priyan Vaithilingam

Munyeong Kim

Frida-Cecilia Acosta-Parenteau

Daniel Lee

Amine Mhedhbi

Elena L. Glassman

Ian Arawjo

2025-09-26

Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology (published)

doi.org

arxiv.org

Towards Optimizing SQL Generation via LLM Routing

Mohammadhossein Malekpour

Nour Shaheen

Foutse Khomh

Amine Mhedhbi

Text-to-SQL enables users to interact with databases through natural language, simplifying access to structured data. Although highly capable large language models (LLMs) achieve strong accuracy for complex queries, they incur unnecessary latency and dollar cost for simpler ones. In this paper, we introduce the first LLM routing approach for Text-to-SQL, which dynamically selects the most cost-effective LLM capable of generating accurate SQL for each query. We present two routing strategies (score- and classification-based) that achieve accuracy comparable to the most capable LLM while reducing costs. We design the routers for ease of training and efficient inference. In our experiments, we highlight a practical and explainable accuracy-cost trade-off on the BIRD dataset.

2024-11-05

ArXiv (preprint)

doi.org

arxiv.org
