Foutse Khomh

Biography

Foutse Khomh is a full professor of software engineering at Polytechnique Montréal, a Canada CIFAR AI Chair – Trustworthy Machine Learning Software Systems, and an FRQ-IVADO Research Chair in Software Quality Assurance for Machine Learning Applications. Khomh completed a PhD in software engineering at Université de Montréal in 2011, for which he received an Award of Excellence. He was also awarded a CS-Can/Info-Can Outstanding Young Computer Science Researcher Prize in 2019.

His research interests include software maintenance and evolution, machine learning systems engineering, cloud engineering, and dependable and trustworthy ML/AI. His work has received four Ten-year Most Influential Paper (MIP) awards, and six Best/Distinguished Paper Awards. He has served on the steering committee of numerous organizations in software engineering, including SANER (chair), MSR, PROMISE, ICPC (chair), and ICSME (vice-chair). He initiated and co-organized Polytechnique Montréal’s Software Engineering for Machine Learning Applications (SEMLA) symposium and the RELENG (release engineering) workshop series.

Khomh co-founded the NSERC CREATE SE4AI: A Training Program on the Development, Deployment and Servicing of Artificial Intelligence-based Software Systems, and is a principal investigator for the DEpendable Explainable Learning (DEEL) project.

He also co-founded Confiance IA, a Quebec consortium focused on building trustworthy AI, and is on the editorial board of multiple international software engineering journals, including IEEE Software, EMSE and JSEP. He is a senior member of IEEE.

Current Students

Nanda Assobjio Brice Yvan

Collaborating Alumni - Polytechnique Montréal

Gabriel Laberge

PhD - Polytechnique Montréal

Forough Majidi

PhD - Polytechnique Montréal

mohammadhossein.malekpour@gmail.com

Mo Malekpour

Master's Research - Polytechnique Montréal

Elnathan Tiokou Tiokou Fangang

Mohamed Amine Merzouk

Postdoctorate - Polytechnique Montréal

Co-supervisor :

Master's Research - Polytechnique Montréal

PhD - Polytechnique Montréal

Ben Braiek Yasmine

Master's Research - Polytechnique Montréal

Publications

Fault Localization in Deep Learning-based Software: A System-level Approach

Mohammad Mehdi Morovati

Amin Nikanjam

Over the past decade, Deep Learning (DL) has become an integral part of our daily lives. This surge in DL usage has heightened the need for developing reliable DL software systems. Given that fault localization is a critical task in reliability assessment, researchers have proposed several fault localization techniques for DL-based software, primarily focusing on faults within the DL model. While the DL model is central to DL components, there are other elements that significantly impact the performance of DL components. As a result, fault localization methods that concentrate solely on the DL model overlook a large portion of the system. To address this, we introduce FL4Deep, a system-level fault localization approach considering the entire DL development pipeline to effectively localize faults across the DL-based systems. In an evaluation using 100 faulty DL scripts, FL4Deep outperformed four previous approaches in terms of accuracy for three out of six DL-related faults, including issues related to data (84%), mismatched libraries between training and deployment (100%), and loss function (69%). Additionally, FL4Deep demonstrated superior precision and recall in fault localization for five categories of faults including three mentioned fault types in terms of accuracy, plus insufficient training iteration and activation function.

2024-11-12

ArXiv (preprint)

Impact of LLM-based Review Comment Generation in Practice: A Mixed Open-/Closed-source User Study

Doriane Olewicki

Léuson M. P. Da Silva

Suhaib Mujahid

Arezou Amini

Benjamin Mah

Marco Castelluccio

Sarra Habchi

Bram Adams

We conduct a large-scale empirical user study in a live setup to evaluate the acceptance of LLM-generated comments and their impact on the review process. This user study was performed in two organizations, Mozilla (which has its codebase available as open source) and Ubisoft (fully closed-source). Inside their usual review environment, participants were given access to RevMate, an LLM-based assistive tool suggesting generated review comments using an off-the-shelf LLM with Retrieval Augmented Generation to provide extra code and review context, combined with LLM-as-a-Judge, to auto-evaluate the generated comments and discard irrelevant cases. Based on more than 587 patch reviews provided by RevMate, we observed that 8.1% and 7.2%, respectively, of LLM-generated comments were accepted by reviewers in each organization, while 14.6% and 20.5% other comments were still marked as valuable as review or development tips. Refactoring-related comments are more likely to be accepted than Functional comments (18.2% and 18.6% compared to 4.8% and 5.2%). The extra time spent by reviewers to inspect generated comments or edit accepted ones (36/119), yielding an overall median of 43s per patch, is reasonable. The accepted generated comments are as likely to yield future revisions of the revised patch as human-written comments (74% vs 73% at chunk-level).

2024-11-11

ArXiv (preprint)

Towards Enhancing the Reproducibility of Deep Learning Bugs: An Empirical Study

Mehil B. Shah

Mohammad Masudur Rahman

2024-11-09

Empirical Software Engineering (published)

Towards Optimizing SQL Generation via LLM Routing

Mohammadhossein Malekpour

Nour Shaheen

Amine Mhedhbi

Text-to-SQL enables users to interact with databases through natural language, simplifying access to structured data. Although highly capable large language models (LLMs) achieve strong accuracy for complex queries, they incur unnecessary latency and dollar cost for simpler ones. In this paper, we introduce the first LLM routing approach for Text-to-SQL, which dynamically selects the most cost-effective LLM capable of generating accurate SQL for each query. We present two routing strategies (score- and classification-based) that achieve accuracy comparable to the most capable LLM while reducing costs. We design the routers for ease of training and efficient inference. In our experiments, we highlight a practical and explainable accuracy-cost trade-off on the BIRD dataset.

2024-11-06

ArXiv (preprint)

Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code

Vahid Majdinasab

Amin Nikanjam

Code auditing ensures that the developed code adheres to standards, regulations, and copyright protection by verifying that it does not contain code from protected sources. The recent advent of Large Language Models (LLMs) as coding assistants in the software development process poses new challenges for code auditing. The dataset for training these models is mainly collected from publicly available sources. This raises the issue of intellectual property infringement as developers' codes are already included in the dataset. Therefore, auditing code developed using LLMs is challenging, as it is difficult to reliably assert if an LLM used during development has been trained on specific copyrighted codes, given that we do not have access to the training datasets of these models. Given the non-disclosure of the training datasets, traditional approaches such as code clone detection are insufficient for asserting copyright infringement. To address this challenge, we propose a new approach, TraWiC; a model-agnostic and interpretable method based on membership inference for detecting code inclusion in an LLM's training dataset. We extract syntactic and semantic identifiers unique to each program to train a classifier for detecting code inclusion. In our experiments, we observe that TraWiC is capable of detecting 83.87% of codes that were used to train an LLM. In comparison, the prevalent clone detection tool NiCad is only capable of detecting 47.64%. In addition to its remarkable performance, TraWiC has low resource overhead in contrast to pair-wise clone detection that is conducted during the auditing process of tools like CodeWhisperer reference tracker, across thousands of code snippets.

2024-11-02

ACM Transactions on Software Engineering and Methodology (published)

Impact of LLM-based Review Comment Generation in Practice: A Mixed Open-/Closed-source User Study

Doriane Olewicki

Leuson Da Silva

Suhaib Mujahid

Arezou Amini

Benjamin Mah

Marco Castelluccio

Sarra Habchi

Bram Adams

2024-11-01

arXiv (published)

Tracing Optimization for Performance Modeling and Regression Detection

Kaveh Shahedi

Heng Li

Maxime Lamothe

Software performance modeling plays a crucial role in developing and maintaining software systems. A performance model analytically describes the relationship between the performance of a system and its runtime activities. This process typically examines various aspects of a system's runtime behavior, such as the execution frequency of functions or methods, to forecast performance metrics like program execution time. By using performance models, developers can predict expected performance and thereby effectively identify and address unexpected performance regressions when actual performance deviates from the model's predictions. One common and precise method for capturing performance behavior is software tracing, which involves instrumenting the execution of a program, either at the kernel level (e.g., system calls) or application level (e.g., function calls). However, due to the nature of tracing, it can be highly resource-intensive, making it impractical for production environments where resources are limited. In this work, we propose statistical approaches to reduce tracing overhead by identifying and excluding performance-insensitive code regions, particularly application-level functions, from tracing while still building accurate performance models that can capture performance degradations. By selecting an optimal set of functions to be traced, we can construct optimized performance models that achieve an R² score of up to 99% and, sometimes, outperform full tracing models (models using non-optimized tracing data), while significantly reducing the tracing overhead by more than 80% in most cases. Our optimized performance models can also capture performance regressions in our studied programs effectively, demonstrating their usefulness in real-world scenarios. Our approach is fully automated, making it ready to be used in production environments with minimal human effort.

2024-11-01

arXiv (published)

Doctoral Symposium Committee

Anthony Cleve

Christian Lange

Silvia Breu

Manar H. Alalfi

Mario Luca Bernardi

Cornelia Boldyreff

Marco D'Ambros

Simon Denier

Natalia Dragan

Ekwa Duala-Ekoko

Fausto Fasano

Adnane Ghannem

Carmine Gravino

Maen Hammad

Imed Hammouda

Salima Hassaine

Yue Jia

Zhen Ming (Jack) Jiang

Adam Kiezun

Jay Kothari

Jonathan Memaitre

Naouel Moha

Rocco Oliveto

Denys Poshyvanyk

Michele Risi

Giuseppe Scanniello

Bonita Sharif

Andrew Sutton

Anis Yousefi

Eugenio Zimeo

Manar H. Alalfi, Mario Luca Bernardi, Cornelia Boldyreff, Anthony Cleve, Marco D'Ambros, Simon Denier, Natalia Dragan, Ekwa Duala-Ekoko, Fausto Fasano, Adnane Ghannem, Carmine Gravino, Maen Hammad, Imed Hammouda, Salima Hassaine, Yue Jia, Zhen Ming Jiang, Foutse Khomh, Adam Kiezun, Jay Kothari, Jonathan Memaitre, Naouel Moha, Rocco Oliveto, Denys Poshyvanyk, Michele Risi, Giuseppe Scanniello, Bonita Sharif, Andrew Sutton, Anis Yousefi, Eugenio Zimeo

2024-10-28

2024 IEEE 35th International Symposium on Software Reliability Engineering Workshops (ISSREW) (published)

In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators

Dmytro Humeniuk

Houssem Ben Braiek

Thomas Reid

2024-10-27

Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (published)

LIBS-Raman Multimodal Architecture for Automated Lunar Prospecting

Jérôme Pigeon

Richard Boudreault

Ahmed Ashraf

P. Maghoul

2024-10-10

Earth and Space 2024 (published)