
Foutse Khomh

Associate Academic Member
Canada CIFAR AI Chair
Professor, Polytechnique Montréal, Department of Computer and Software Engineering
Research Topics
Learning to Program
Reinforcement Learning
Deep Learning
Data Mining
Generative Models
Distributed Systems
Natural Language Processing

Biography

Foutse Khomh is a Full Professor of Software Engineering at Polytechnique Montréal, holder of a Canada CIFAR AI Chair in Trustworthy Machine Learning Software Systems, and holder of an FRQ-IVADO Research Chair in Software Quality Assurance for Machine Learning Applications.

He received a Ph.D. in software engineering from Université de Montréal in 2011, with an excellence award. He also received the CS-Can/Info-Can Outstanding Young Computer Science Researcher Prize in 2019. His research interests include software maintenance and evolution, machine learning systems engineering, cloud engineering, and dependable and trustworthy AI/ML.

His work has received four Ten-Year Most Influential Paper awards and six Best/Distinguished Paper awards. He has served on the steering committees of several conferences: SANER (as chair), MSR, PROMISE, ICPC (as chair), and ICSME (as vice-chair). He initiated and co-organized the Software Engineering for Machine Learning Applications (SEMLA) symposium and the Release Engineering (RELENG) workshop series.

He is a co-founder of the NSERC CREATE SE4AI project (A Training Program on the Development, Deployment, and Servicing of Artificial Intelligence-based Software Systems) and one of the principal investigators of the Dependable Explainable Learning (DEEL) project. He is also a co-founder of Confiance IA Québec, Quebec's initiative on trustworthy AI. He serves on the editorial boards of several international software engineering journals (including IEEE Software, EMSE, and JSEP) and is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE).

Current Students

Research Master's - Polytechnique
Research Master's - Polytechnique
PhD - Polytechnique
PhD - Polytechnique
Postdoctorate - Polytechnique
Research Master's - Polytechnique
PhD - Polytechnique

Publications

Exploring Security Practices in Infrastructure as Code: An Empirical Study
Alexandre Verdet
Mohammad Hamdaqa
Léuson M. P. Da Silva
Cloud computing has become popular thanks to the widespread use of Infrastructure as Code (IaC) tools, allowing the community to conveniently manage and configure cloud infrastructure using scripts. However, the scripting process itself does not automatically prevent practitioners from introducing misconfigurations, vulnerabilities, or privacy risks. As a result, ensuring security relies on practitioners' understanding and adoption of explicit policies, guidelines, or best practices. To understand how practitioners deal with this problem, in this work we perform an empirical study analyzing the adoption of scripted IaC security best practices. First, we select and categorize widely recognized Terraform security practices promulgated in the industry for popular cloud providers such as AWS, Azure, and Google Cloud. Next, we assess the adoption of these practices for each cloud provider, analyzing a sample of 812 open-source projects hosted on GitHub. To do so, we scan each project's configuration files, looking for policy implementation through static analysis (checkov). Additionally, we investigate GitHub measures that might be correlated with adopting these best practices. The Access policy category emerges as the most widely adopted across all providers, while Encryption at rest is the most neglected. Regarding GitHub measures correlated with best-practice adoption, we observe a strong positive correlation between a repository's number of stars and the adoption of practices in its cloud infrastructure. Based on our findings, we provide guidelines for cloud practitioners to limit infrastructure vulnerability and discuss further aspects associated with policies that have yet to be extensively embraced within the industry.
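The kind of rule-based scanning the study relies on can be sketched in a few lines. This is a hypothetical, highly simplified stand-in for one "encryption at rest" check; the real study uses checkov, which parses HCL properly, whereas this sketch only does a textual scan and the resource and attribute names are illustrative:

```python
import re

def bucket_blocks(terraform_text: str) -> list[str]:
    """Extract each aws_s3_bucket resource block via a brace-balanced scan."""
    blocks = []
    for match in re.finditer(r'resource\s+"aws_s3_bucket"\s+"\w+"\s*\{', terraform_text):
        depth, start = 0, match.start()
        for i in range(match.end() - 1, len(terraform_text)):
            if terraform_text[i] == "{":
                depth += 1
            elif terraform_text[i] == "}":
                depth -= 1
                if depth == 0:  # block closed: keep the full resource text
                    blocks.append(terraform_text[start:i + 1])
                    break
    return blocks

def violates_encryption_at_rest(terraform_text: str) -> list[str]:
    """Return the bucket blocks that never configure server-side encryption."""
    return [b for b in bucket_blocks(terraform_text)
            if "server_side_encryption_configuration" not in b]

config = '''
resource "aws_s3_bucket" "logs" {
  bucket = "example-logs"
}
'''
print(len(violates_encryption_at_rest(config)))  # one unencrypted bucket flagged
```

A production checker would operate on the parsed HCL tree rather than raw text, but the shape of the analysis (locate resource, test for a required attribute, report violations) is the same.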
The Different Faces of AI Ethics Across the World: A Principle-to-Practice Gap Analysis
Lionel Nganyewou Tidjon
Artificial Intelligence (AI) is transforming our daily life, with many applications in healthcare, space exploration, banking, and finance. This rapid progress in AI has brought increasing attention to the potential impacts of AI technologies on society, with ethically questionable consequences. In recent years, several ethical principles have been released by governments, national organizations, and international organizations. These principles outline high-level precepts to guide the ethical development, deployment, and governance of AI. However, the abstract nature, diversity, and context-dependence of these principles make them difficult to implement and operationalize, resulting in gaps between the principles and their execution. Most recent work has analyzed and summarized existing AI principles and guidelines, but has not reported findings on principle-to-practice gaps or how to mitigate them. Such findings are particularly important to ensure that practical AI guidance is aligned with ethical principles and values. In this article, we provide a contextual and global evaluation of current ethical AI principles across all continents, with the aim of identifying potential principle characteristics tailored to specific countries or applicable across countries. Next, we analyze the current level of AI readiness and the current practical guidance on ethical AI principles in different countries, to identify gaps in the practical guidance of AI principles and their causes. Finally, we propose recommendations to mitigate the principle-to-practice gaps.
Intelligent Software Maintenance
Mohammad Masudur Rahman
Antoine Barbez
Chat2Code: A Chatbot for Model Specification and Code Generation, The Case of Smart Contracts
Ilham Qasse
Shailesh Mishra
Björn þór Jónsson
Mohammad Hamdaqa
The potential of automatic code generation through Model-Driven Engineering (MDE) frameworks has yet to be realized. Beyond their ability to help software professionals write more accurate, reusable code, MDE frameworks could make programming accessible to a new class of domain experts. However, domain experts have been slow to embrace these tools, as they still need to learn how to specify their applications' requirements using the concrete syntax (i.e., textual or graphical) of the new, unified domain-specific language. Conversational interfaces (chatbots) could smooth the learning process and offer a more interactive way for domain experts to specify their application requirements and generate the desired code. If integrated with MDE frameworks, chatbots may offer domain experts a richer domain vocabulary without sacrificing the agnosticism that unified modelling frameworks provide. In this paper, we discuss the challenges of integrating chatbots within MDE frameworks and then examine a specific application: the auto-generation of smart contract code based on conversational syntax. We demonstrate how this can be done and evaluate our approach through a user experience survey assessing the usability and functionality of the chatbot framework. The paper concludes by drawing attention to the potential benefits of leveraging Large Language Models (LLMs) in this context.
Dev2vec: Representing Domain Expertise of Developers in an Embedding Space
Arghavan Moradi Dakhel
Michel C. Desmarais
Studying the challenges of developing hardware description language programs
Fatemeh Yousefifeshki
Heng Li
Responsible Design Patterns for Machine Learning Pipelines
Saud Hakem Al Harbi
Lionel Nganyewou Tidjon
Integrating ethical practices into the development process of artificial intelligence (AI) is essential to ensure safe, fair, and responsible operation. AI ethics involves applying ethical principles to the entire life cycle of AI systems; this is essential to mitigate potential risks and harms associated with AI, such as algorithmic bias. To achieve this goal, responsible design patterns (RDPs) are critical for Machine Learning (ML) pipelines to guarantee ethical and fair outcomes. In this paper, we propose a comprehensive framework incorporating RDPs into ML pipelines to mitigate risks and ensure the ethical development of AI systems. Our framework comprises new responsible AI design patterns for ML pipelines, identified through a survey of AI ethics and data management experts and validated through real-world scenarios with expert feedback. The framework guides AI developers, data scientists, and policy-makers in implementing ethical practices in AI development and deploying responsible AI systems in production.
Testing Feedforward Neural Networks Training Programs
Houssem Ben Braiek
On Codex Prompt Engineering for OCL Generation: An Empirical Study
Seif Abukhalaf
Mohammad Hamdaqa
The Object Constraint Language (OCL) is a declarative language that adds constraints and object query expressions to Meta-Object Facility (MOF) models. OCL can provide precision and conciseness to UML models. Nevertheless, the unfamiliar syntax of OCL has hindered its adoption by software practitioners. LLMs, such as GPT-3, have made significant progress on many NLP tasks, such as text generation and semantic parsing. Similarly, researchers have improved performance on downstream tasks by fine-tuning LLMs for the target task. Codex, a GPT-3 descendant by OpenAI, has been fine-tuned on publicly available code from GitHub and has proven able to generate code in many programming languages, powering the AI pair programmer Copilot. One way to take advantage of Codex is to engineer prompts for the target downstream task. In this paper, we investigate the reliability of the OCL constraints generated by Codex from natural-language specifications. To achieve this, we compiled a dataset of 15 UML models and 168 specifications from various educational resources. We manually crafted a prompt template with slots to be populated with the UML information and the target task, in the prefix format, so that the template is completed with the generated OCL constraint. We used both zero- and few-shot learning methods in our experiments. We report the evaluation in terms of the syntactic validity and execution accuracy of the generated OCL constraints. Moreover, to gain insight into how close or natural the generated OCL constraints are compared to human-written ones, we measured the cosine similarity between the sentence embeddings of the correctly generated and the human-written OCL constraints. Our findings suggest that enriching the prompts with the UML information of the models and enabling few-shot learning increases the reliability of the generated OCL constraints. Furthermore, the results reveal a close similarity, based on sentence embeddings, between the generated OCL constraints and the human-written ones in the ground truth, implying a level of clarity and understandability in the OCL constraints generated by Codex.
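The prefix-style, few-shot prompt construction described above can be sketched as follows. The template wording, slot names, and the example constraint are assumptions for illustration, not the authors' actual prompts or dataset:

```python
# Hypothetical few-shot prompt builder for NL -> OCL generation with a code
# LLM such as Codex. Each demonstration pairs UML context and a natural-language
# spec with its OCL constraint; the model completes the final, open-ended entry.
FEW_SHOT_EXAMPLES = [
    ("context Account",
     "an account balance is never negative",
     "context Account inv: self.balance >= 0"),
]

def build_prompt(uml_context: str, specification: str) -> str:
    """Populate a prefix-format template: demonstrations first, query last."""
    parts = []
    for ctx, spec, ocl in FEW_SHOT_EXAMPLES:  # few-shot demonstrations
        parts.append(f"-- UML: {ctx}\n-- Spec: {spec}\n{ocl}\n")
    # The completion the model produces after this line is the OCL constraint.
    parts.append(f"-- UML: {uml_context}\n-- Spec: {specification}\n")
    return "\n".join(parts)

prompt = build_prompt("context Library", "a library holds at most 500 books")
print(prompt)
```

Zero-shot prompting corresponds to leaving `FEW_SHOT_EXAMPLES` empty; the paper's finding is that populating both the UML slot and the demonstrations improves reliability.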
RIGAA at the SBFT 2023 Tool Competition - Cyber-Physical Systems Track
Dmytro Humeniuk
Giuliano Antoniol
Testing and verification of autonomous systems is critically important. In the context of the SBFT 2023 CPS testing tool competition, we present our tool RIGAA for generating virtual roads to test an autonomous vehicle's lane-keeping assist system. RIGAA combines reinforcement learning with evolutionary search to generate test scenarios. It achieved the second-highest final score among the six submitted tools.
UnityLint: A Bad Smell Detector for Unity
Matteo Bosco
Pasquale Cavoto
Augusto Ungolo
Biruk Asmare Muse
Vittoria Nardone
Massimiliano Di Penta
The video game industry is particularly rewarding, as it represents a large portion of the software development market. However, working in this domain can be challenging for developers, not only because of the need for heterogeneous skills (from software design to computer graphics), but also because of the limited body of knowledge on good and bad design and development principles, and the lack of tool support to assist them. This tool demo proposes UnityLint, a tool able to detect 18 types of bad smells in Unity video games. UnityLint builds upon a previously defined and validated catalog of bad smells for video games. The tool, developed in C# and available both as open source and as binary releases, is composed of (i) analyzers that extract facts from video game source code and metadata, and (ii) smell detectors that apply detection rules to identify smells on top of the extracted facts.
Tool: https://github.com/mdipenta/UnityCodeSmellAnalyzer
Teaser video: https://youtu.be/HooegxZ8H6g
Leveraging Data Mining Algorithms to Recommend Source Code Changes
AmirHossein Naghshzan
Saeed Khalilazar
Pierre Poilane
Olga Baysal
Latifa Guerrouj