Foutse Khomh

Associate Academic Member
Canada CIFAR AI Chair
Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Research Topics
Data Mining
Deep Learning
Distributed Systems
Generative Models
Learning to Program
Natural Language Processing
Reinforcement Learning

Biography

Foutse Khomh is a full professor of software engineering at Polytechnique Montréal, a Canada CIFAR AI Chair – Trustworthy Machine Learning Software Systems, and an FRQ-IVADO Research Chair in Software Quality Assurance for Machine Learning Applications. Khomh completed a PhD in software engineering at Université de Montréal in 2011, for which he received an Award of Excellence. He was also awarded a CS-Can/Info-Can Outstanding Young Computer Science Researcher Prize in 2019.

His research interests include software maintenance and evolution, machine learning systems engineering, cloud engineering, and dependable and trustworthy ML/AI. His work has received four Ten-year Most Influential Paper (MIP) awards and six Best/Distinguished Paper Awards. He has served on the steering committees of numerous software engineering conferences, including SANER (chair), MSR, PROMISE, ICPC (chair), and ICSME (vice-chair). He initiated and co-organized Polytechnique Montréal's Software Engineering for Machine Learning Applications (SEMLA) symposium and the RELENG (release engineering) workshop series.

Khomh co-founded the NSERC CREATE SE4AI: A Training Program on the Development, Deployment and Servicing of Artificial Intelligence-based Software Systems, and is a principal investigator for the DEpendable Explainable Learning (DEEL) project.

He also co-founded Confiance IA, a Quebec consortium focused on building trustworthy AI, and is on the editorial board of multiple international software engineering journals, including IEEE Software, EMSE and JSEP. He is a senior member of IEEE.

Current Students

Master's Research - Polytechnique Montréal
Master's Research - Polytechnique Montréal
PhD - Polytechnique Montréal
PhD - Polytechnique Montréal
Postdoctorate - Polytechnique Montréal
Master's Research - Polytechnique Montréal
PhD - Polytechnique Montréal

Publications

Exploring Security Practices in Infrastructure as Code: An Empirical Study
Alexandre Verdet
Mohammad Hamdaqa
Léuson M. P. Da Silva
Cloud computing has become popular thanks to the widespread use of Infrastructure as Code (IaC) tools, allowing the community to conveniently manage and configure cloud infrastructure using scripts. However, the scripting process itself does not automatically prevent practitioners from introducing misconfigurations, vulnerabilities, or privacy risks. As a result, ensuring security relies on practitioners' understanding and adoption of explicit policies, guidelines, or best practices. To understand how practitioners deal with this problem, we perform an empirical study analyzing the adoption of IaC scripted security best practices. First, we select and categorize widely recognized Terraform security practices promulgated in the industry for popular cloud providers such as AWS, Azure, and Google Cloud. Next, we assess the adoption of these practices by each cloud provider, analyzing a sample of 812 open-source projects hosted on GitHub. For that, we scan each project's configuration files, looking for policy implementation through static analysis (checkov). Additionally, we investigate GitHub measures that might be correlated with adopting these best practices. The category Access policy emerges as the most widely adopted across all providers, while Encryption at rest policies are the most neglected. Regarding GitHub measures correlated with best practice adoption, we observe a strong positive correlation between a repository's number of stars and the adoption of practices in its cloud infrastructure. Based on our findings, we provide guidelines for cloud practitioners to limit infrastructure vulnerability and discuss further aspects associated with policies that have yet to be extensively embraced within the industry.
The Different Faces of AI Ethics Across the World: A Principle-to-Practice Gap Analysis
Lionel Nganyewou Tidjon
Artificial Intelligence (AI) is transforming our daily life with many applications in healthcare, space exploration, banking, and finance. This rapid progress in AI has brought increasing attention to the potential impacts of AI technologies on society, with ethically questionable consequences. In recent years, several ethical principles have been released by governments, national organizations, and international organizations. These principles outline high-level precepts to guide the ethical development, deployment, and governance of AI. However, the abstract nature, diversity, and context-dependence of these principles make them difficult to implement and operationalize, resulting in gaps between principles and their execution. Most recent work analyzed and summarized existing AI principles and guidelines but did not provide findings on principle-to-practice gaps or on how to mitigate them. These findings are particularly important to ensure that practical AI guidance is aligned with ethical principles and values. In this article, we provide a contextual and global evaluation of current ethical AI principles for all continents, with the aim of identifying potential principle characteristics tailored to specific countries or applicable across countries. Next, we analyze the current level of AI readiness and the current practical guidance on ethical AI principles in different countries, to identify gaps in the practical guidance of AI principles and their causes. Finally, we propose recommendations to mitigate the principle-to-practice gaps.
Intelligent Software Maintenance
Mohammad Masudur Rahman
Antoine Barbez
Chat2Code: A Chatbot for Model Specification and Code Generation, The Case of Smart Contracts
Ilham Qasse
Shailesh Mishra
Björn Þór Jónsson
Mohammad Hamdaqa
The potential of automatic code generation through Model-Driven Engineering (MDE) frameworks has yet to be realized. Beyond their ability to help software professionals write more accurate, reusable code, MDE frameworks could make programming accessible for a new class of domain experts. However, domain experts have been slow to embrace these tools, as they still need to learn how to specify their applications' requirements using the concrete syntax (i.e., textual or graphical) of the new and unified domain-specific language. Conversational interfaces (chatbots) could smooth the learning process and offer a more interactive way for domain experts to specify their application requirements and generate the desired code. If integrated with MDE frameworks, chatbots may offer domain experts a richer domain vocabulary without sacrificing the power of agnosticism that unified modelling frameworks provide. In this paper, we discuss the challenges of integrating chatbots within MDE frameworks and then examine a specific application: the auto-generation of smart contract code based on conversational syntax. We demonstrate how this can be done and evaluate our approach by conducting a user experience survey to assess the usability and functionality of the chatbot framework. The paper concludes by drawing attention to the potential benefits of leveraging Large Language Models (LLMs) in this context.
Dev2vec: Representing Domain Expertise of Developers in an Embedding Space
Arghavan Moradi Dakhel
Michel C. Desmarais
Studying the challenges of developing hardware description language programs
Fatemeh Yousefifeshki
Heng Li
Responsible Design Patterns for Machine Learning Pipelines
Saud Hakem Al Harbi
Lionel Nganyewou Tidjon
Integrating ethical practices into the development process for artificial intelligence (AI) is essential to ensure safe, fair, and responsible operation. AI ethics involves applying ethical principles to the entire life cycle of AI systems. This is essential to mitigate potential risks and harms associated with AI, such as algorithmic biases. To achieve this goal, responsible design patterns (RDPs) are critical for Machine Learning (ML) pipelines to guarantee ethical and fair outcomes. In this paper, we propose a comprehensive framework incorporating RDPs into ML pipelines to mitigate risks and ensure the ethical development of AI systems. Our framework comprises new responsible AI design patterns for ML pipelines identified through a survey of AI ethics and data management experts and validated through real-world scenarios with expert feedback. The framework guides AI developers, data scientists, and policy-makers to implement ethical practices in AI development and deploy responsible AI systems in production.
Testing Feedforward Neural Networks Training Programs
Houssem Ben Braiek
On Codex Prompt Engineering for OCL Generation: An Empirical Study
Seif Abukhalaf
Mohammad Hamdaqa
The Object Constraint Language (OCL) is a declarative language that adds constraints and object query expressions to Meta-Object Facility (MOF) models. OCL can provide precision and conciseness to UML models. Nevertheless, the unfamiliar syntax of OCL has hindered its adoption by software practitioners. LLMs, such as GPT-3, have made significant progress in many NLP tasks, such as text generation and semantic parsing. Similarly, researchers have improved on the downstream tasks by fine-tuning LLMs for the target task. Codex, a GPT-3 descendant by OpenAI, has been fine-tuned on publicly available code from GitHub and has proven able to generate code in many programming languages, powering the AI pair programmer Copilot. One way to take advantage of Codex is to engineer prompts for the target downstream task. In this paper, we investigate the reliability of the OCL constraints generated by Codex from natural language specifications. To achieve this, we compiled a dataset of 15 UML models and 168 specifications from various educational resources. We manually crafted a prompt template with slots to populate with the UML information and the target task in the prefix format to complete the template with the generated OCL constraint. We used both zero- and few-shot learning methods in the experiments. The evaluation is reported by measuring the syntactic validity and the execution accuracy metrics of the generated OCL constraints. Moreover, to get insight into how close or natural the generated OCL constraints are compared to human-written ones, we measured the cosine similarity between the sentence embeddings of the correctly generated and human-written OCL constraints. Our findings suggest that by enriching the prompts with the UML information of the models and enabling few-shot learning, the reliability of the generated OCL constraints increases. Furthermore, the results reveal a close similarity based on sentence embedding between the generated OCL constraints and the human-written ones in the ground truth, implying a level of clarity and understandability in the OCL constraints generated by Codex.
RIGAA at the SBFT 2023 Tool Competition - Cyber-Physical Systems Track
Dmytro Humeniuk
Giuliano Antoniol
Testing and verification of autonomous systems is critically important. In the context of the SBFT 2023 CPS testing tool competition, we present our tool RIGAA for generating virtual roads to test an autonomous vehicle lane-keeping assist system. RIGAA combines reinforcement learning and evolutionary search to generate test scenarios. It has achieved the second-highest final score among 5 other submitted tools.
UnityLint: A Bad Smell Detector for Unity
Matteo Bosco
Pasquale Cavoto
Augusto Ungolo
Biruk Asmare Muse
Vittoria Nardone
Massimiliano Di Penta
The video game industry is particularly rewarding as it represents a large portion of the software development market. However, working in this domain may be challenging for developers, not only because of the need for heterogeneous skills (from software design to computer graphics), but also for the limited body of knowledge in terms of good and bad design and development principles, and the lack of tool support to assist them. This tool demo proposes UnityLint, a tool able to detect 18 types of bad smells in Unity video games. UnityLint builds upon a previously-defined and validated catalog of bad smells for video games. The tool, developed in C# and available both as open-source and binary releases, is composed of (i) analyzers that extract facts from video game source code and metadata, and (ii) smell detectors that leverage detection rules to identify smells on top of the extracted facts. Tool: https://github.com/mdipenta/UnityCodeSmellAnalyzer Teaser Video: https://youtu.be/HooegxZ8H6g
Leveraging Data Mining Algorithms to Recommend Source Code Changes
AmirHossein Naghshzan
Saeed Khalilazar
Pierre Poilane
Olga Baysal
Latifa Guerrouj