Foutse Khomh

Associate Academic Member
Canada CIFAR AI Chair
Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering

Biography

Foutse Khomh is a full professor of software engineering at Polytechnique Montréal, a Canada CIFAR AI Chair – Trustworthy Machine Learning Software Systems, and an FRQ-IVADO Research Chair in Software Quality Assurance for Machine Learning Applications. Khomh completed a PhD in software engineering at Université de Montréal in 2011, for which he received an Award of Excellence. He was also awarded a CS-Can/Info-Can Outstanding Young Computer Science Researcher Prize in 2019.

His research interests include software maintenance and evolution, machine learning systems engineering, cloud engineering, and dependable and trustworthy ML/AI. His work has received four Ten-year Most Influential Paper (MIP) awards and six Best/Distinguished Paper Awards. He has served on the steering committee of numerous organizations in software engineering, including SANER (chair), MSR, PROMISE, ICPC (chair), and ICSME (vice-chair). He initiated and co-organized Polytechnique Montréal’s Software Engineering for Machine Learning Applications (SEMLA) symposium and the RELENG (release engineering) workshop series.

Khomh co-founded the NSERC CREATE SE4AI: A Training Program on the Development, Deployment and Servicing of Artificial Intelligence-based Software Systems, and is a principal investigator for the DEpendable Explainable Learning (DEEL) project.

He also co-founded Confiance IA, a Quebec consortium focused on building trustworthy AI, and is on the editorial board of multiple international software engineering journals, including IEEE Software, EMSE and JSEP. He is a senior member of IEEE.

Current Students

Master's Research - Polytechnique Montréal
Master's Research - Polytechnique Montréal
Master's Research - Polytechnique Montréal
Master's Research - Polytechnique Montréal

Publications

Dev2vec: Representing Domain Expertise of Developers in an Embedding Space
Arghavan Moradi Dakhel
Michel C. Desmarais
Studying the challenges of developing hardware description language programs
Fatemeh Yousefifeshki
Heng Li
Responsible Design Patterns for Machine Learning Pipelines
Saud Hakem Al Harbi
Lionel Nganyewou Tidjon
Integrating ethical practices into the AI development process for artificial intelligence (AI) is essential to ensure safe, fair, and responsible operation. AI ethics involves applying ethical principles to the entire life cycle of AI systems. This is essential to mitigate potential risks and harms associated with AI, such as algorithm biases. To achieve this goal, responsible design patterns (RDPs) are critical for Machine Learning (ML) pipelines to guarantee ethical and fair outcomes. In this paper, we propose a comprehensive framework incorporating RDPs into ML pipelines to mitigate risks and ensure the ethical development of AI systems. Our framework comprises new responsible AI design patterns for ML pipelines identified through a survey of AI ethics and data management experts and validated through real-world scenarios with expert feedback. The framework guides AI developers, data scientists, and policy-makers to implement ethical practices in AI development and deploy responsible AI systems in production.
Testing Feedforward Neural Networks Training Programs
Houssem Ben Braiek
On Codex Prompt Engineering for OCL Generation: An Empirical Study
Seif Abukhalaf
Mohammad Hamdaqa
The Object Constraint Language (OCL) is a declarative language that adds constraints and object query expressions to Meta-Object Facility (MOF) models. OCL can provide precision and conciseness to UML models. Nevertheless, the unfamiliar syntax of OCL has hindered its adoption by software practitioners. LLMs, such as GPT-3, have made significant progress in many NLP tasks, such as text generation and semantic parsing. Similarly, researchers have improved on the downstream tasks by fine-tuning LLMs for the target task. Codex, a GPT-3 descendant by OpenAI, has been fine-tuned on publicly available code from GitHub and has proven the ability to generate code in many programming languages, powering the AI-pair programmer Copilot. One way to take advantage of Codex is to engineer prompts for the target downstream task. In this paper, we investigate the reliability of the OCL constraints generated by Codex from natural language specifications. To achieve this, we compiled a dataset of 15 UML models and 168 specifications from various educational resources. We manually crafted a prompt template with slots to populate with the UML information and the target task in the prefix format to complete the template with the generated OCL constraint. We used both zero- and few-shot learning methods in the experiments. The evaluation is reported by measuring the syntactic validity and the execution accuracy metrics of the generated OCL constraints. Moreover, to get insight into how close or natural the generated OCL constraints are compared to human-written ones, we measured the cosine similarity between the sentence embedding of the correctly generated and human-written OCL constraints. Our findings suggest that by enriching the prompts with the UML information of the models and enabling few-shot learning, the reliability of the generated OCL constraints increases.
Furthermore, the results reveal a close similarity based on sentence embedding between the generated OCL constraints and the human-written ones in the ground truth, implying a level of clarity and understandability in the generated OCL constraints by Codex.
RIGAA at the SBFT 2023 Tool Competition - Cyber-Physical Systems Track
Dmytro Humeniuk
Giuliano Antoniol
Testing and verification of autonomous systems is critically important. In the context of SBFT 2023 CPS testing tool competition, we present our tool RIGAA for generating virtual roads to test an autonomous vehicle lane keeping assist system. RIGAA combines reinforcement learning as well as evolutionary search to generate test scenarios. It has achieved the second highest final score among 5 other submitted tools.
UnityLint: A Bad Smell Detector for Unity
Matteo Bosco
Pasquale Cavoto
Augusto Ungolo
Biruk Asmare Muse
Vittoria Nardone
Massimiliano Di Penta
The video game industry is particularly rewarding as it represents a large portion of the software development market. However, working in this domain may be challenging for developers, not only because of the need for heterogeneous skills (from software design to computer graphics), but also for the limited body of knowledge in terms of good and bad design and development principles, and the lack of tool support to assist them. This tool demo proposes UnityLint, a tool able to detect 18 types of bad smells in Unity video games. UnityLint builds upon a previously-defined and validated catalog of bad smells for video games. The tool, developed in C# and available both as open-source and binary releases, is composed of (i) analyzers that extract facts from video game source code and metadata, and (ii) smell detectors that leverage detection rules to identify smells on top of the extracted facts. Tool: https://github.com/mdipenta/UnityCodeSmellAnalyzer Teaser Video: https://youtu.be/HooegxZ8H6g
Leveraging Data Mining Algorithms to Recommend Source Code Changes
AmirHossein Naghshzan
Saeed Khalilazar
Pierre Poilane
Olga Baysal
Latifa Guerrouj
Ranking code clones to support maintenance activities
Osama Ehsan
Ying Zou
Dong Qiu
Bugs in machine learning-based systems: a faultload benchmark
Mohammad Mehdi Morovati
Amin Nikanjam
Z. Jiang
Can Ensembling Preprocessing Algorithms Lead to Better Machine Learning Fairness?
Khaled Badran
Pierre-Olivier Côté
Amanda Kolopanis
Rached Bouchoucha
Antonio Collante
Diego Elias Costa
Emad Shihab
In this work, we evaluate three popular fairness preprocessing algorithms and investigate the potential for combining all algorithms into a more robust preprocessing ensemble. We report on lessons learned that can help practitioners better select fairness algorithms for their models.
Machine learning application development: practitioners’ insights
Md Saidur Rahman
Alaleh Hamidi
Jinghui Cheng
Giuliano Antoniol
Hironori Washizaki