Portrait de Foutse Khomh

Foutse Khomh

Membre académique associé
Chaire en IA Canada-CIFAR
Professeur, Polytechnique Montréal, Département de génie informatique et génie logiciel
Sujets de recherche
Apprentissage de la programmation
Apprentissage par renforcement
Apprentissage profond
Exploration des données
Modèles génératifs
Systèmes distribués
Traitement du langage naturel

Biographie

Foutse Khomh est professeur titulaire de génie logiciel à Polytechnique Montréal, titulaire d'une chaire en IA Canada-CIFAR dans le domaine des systèmes logiciels d'apprentissage automatique fiables, et titulaire d'une chaire de recherche FRQ-IVADO sur l'assurance qualité des logiciels pour les applications d'apprentissage automatique.

Il a obtenu un doctorat en génie logiciel de l'Université de Montréal en 2011, avec une bourse d'excellence. Il a également reçu le prix CS-Can/Info-Can du meilleur jeune chercheur en informatique en 2019. Ses recherches portent sur la maintenance et l'évolution des logiciels, l'ingénierie des systèmes d'apprentissage automatique, l'ingénierie en nuage et l’IA/apprentissage automatique fiable et digne de confiance.

Ses travaux ont été récompensés par quatre prix de l’article le plus important Most Influential Paper en dix ans et six prix du meilleur article ou de l’article exceptionnel (Best/Distinguished Paper). Il a également siégé au comité directeur de plusieurs conférences et rencontres : SANER (comme président), MSR, PROMISE, ICPC (comme président) et ICSME (en tant que vice-président). Il a initié et coorganisé le symposium Software Engineering for Machine Learning Applications (SEMLA) et la série d'ateliers Release Engineering (RELENG).

Il est cofondateur du projet CRSNG CREATE SE4AI : A Training Program on the Development, Deployment, and Servicing of Artificial Intelligence-based Software Systems et l'un des chercheurs principaux du projet Dependable Explainable Learning (DEEL). Il est également cofondateur de l'initiative québécoise sur l'IA digne de confiance (Confiance IA Québec). Il fait partie du comité de rédaction de plusieurs revues internationales de génie logiciel (dont IEEE Software, EMSE, JSEP) et est membre senior de l'Institute of Electrical and Electronics Engineers (IEEE).

Étudiants actuels

Maîtrise recherche - Polytechnique
Doctorat - Polytechnique
Doctorat - Polytechnique
Postdoctorat - Polytechnique
Co-superviseur⋅e :
Postdoctorat - Polytechnique
Maîtrise recherche - Polytechnique
Doctorat - Polytechnique
Maîtrise recherche - Polytechnique

Publications

AmbieGen: A Search-based Framework for Autonomous Systems Testing
Dmytro Humeniuk
Giuliano Antoniol
GitHub Copilot AI pair programmer: Asset or Liability?
Arghavan Moradi Dakhel
Vahid Majdinasab
Amin Nikanjam
Michel C. Desmarais
Z. Jiang
Automatic program synthesis is a long-lasting dream in software engineering. Recently, a promising Deep Learning (DL) based solution, called… (voir plus) Copilot, has been proposed by OpenAI and Microsoft as an industrial product. Although some studies evaluate the correctness of Copilot solutions and report its issues, more empirical evaluations are necessary to understand how developers can benefit from it effectively. In this paper, we study the capabilities of Copilot in two different programming tasks: (i) generating (and reproducing) correct and efficient solutions for fundamental algorithmic problems, and (ii) comparing Copilot's proposed solutions with those of human programmers on a set of programming tasks. For the former, we assess the performance and functionality of Copilot in solving selected fundamental problems in computer science, like sorting and implementing data structures. In the latter, a dataset of programming problems with human-provided solutions is used. The results show that Copilot is capable of providing solutions for almost all fundamental algorithmic problems, however, some solutions are buggy and non-reproducible. Moreover, Copilot has some difficulties in combining multiple methods to generate a solution. Comparing Copilot to humans, our results show that the correct ratio of humans' solutions is greater than Copilot's suggestions, while the buggy solutions generated by Copilot require less effort to be repaired.
Impact in Software Engineering Activities After One Year of COVID-19 Restrictions for Startups and Established Companies
Hosna Hooshyar
Eduardo Guerra
Jorge Melegati
Dron Khanna
Abdullah Aldaeej
Gerardo Matturro
Luciana Zaina
Des Greer
Usman Rafiq
Rafael Chanin
Xiaofeng Wang
Juan Garbajosa
Pekka Abrahamsson
Anh Nguyen-Duc
The restrictions imposed by the COVID-19 pandemic required software development teams to adapt, being forced to work remotely and adjust the… (voir plus) software engineering activities accordingly. In the studies evaluating these effects, a few have assessed the impact on software engineering activities from a broader perspective and after a period of time when teams had time to adjust to the changes. No studies have been found comparing software startups and established companies either. This paper aims to investigate the impacts of COVID-19 on software development activities after one year of the pandemic restrictions, comparing the results between startups and established companies. Our approach was to design a cross-sectional survey and distribute it online among software development companies worldwide. The participants were asked about their perception of COVID-19’s pandemic impact on different software engineering activities: requirements engineering, software architecture, user experience design, software implementation, and software quality assurance. The survey received 170 valid answers from 29 countries, and for all the software engineering activities, we found that most respondents did not observe a significant impact. The results also showed that software startups and established companies were affected differently since, in some activities, we found a negative impact in the former and a positive impact in the latter. Regarding the time spent on each software engineering activity, most of the answers reported no change, but on those that did, the result points to an increase in time. Thus, we cannot find any relation between the change in time of effort and the reported positive or negative impact.
An Intentional Forgetting-Driven Self-Healing Method for Deep Reinforcement Learning Systems
Ahmed Haj Yahmed
Rached Bouchoucha
Houssem Ben Braiek
Deep reinforcement learning (DRL) is increasingly applied in large-scale productions like Netflix and Facebook. As with most data-driven sys… (voir plus)tems, DRL systems can exhibit undesirable behaviors due to environmental drifts, which often occur in constantly-changing production settings. Continual Learning (CL) is the inherent self-healing approach for adapting the DRL agent in response to the environment's conditions shifts. However, successive shifts of considerable magnitude may cause the production environment to drift from its original state. Recent studies have shown that these environmental drifts tend to drive CL into long, or even unsuccessful, healing cycles, which arise from inefficiencies such as catastrophic forgetting, warm-starting failure, and slow convergence. In this paper, we propose Dr. DRL, an effective self-healing approach for DRL systems that integrates a novel mechanism of intentional forgetting into vanilla CL (i.e., standard CL) to overcome its main issues. Dr. DRL deliberately erases the DRL system's minor behaviors to systematically prioritize the adaptation of the key problem-solving skills. Using well-established DRL algorithms, Dr. DRL is compared with vanilla CL on various drifted environments. Dr. DRL is able to reduce, on average, the healing time and fine-tuning episodes by, respectively, 18.74% and 17.72%. Dr. DRL successfully helps agents to adapt to 19.63% of drifted environments left unsolved by vanilla CL while maintaining and even enhancing by up to 45% the obtained rewards for drifted environments that are resolved by both approaches.
Maintenance Cost of Software Ecosystem Updates
Solomon Berhe
M. Maynard
Physics-Guided Adversarial Machine Learning for Aircraft Systems Simulation
Houssem Ben Braiek
Thomas Reid
In the context of aircraft system performance assessment, deep learning technologies allow us to quickly infer models from experimental meas… (voir plus)urements, with less detailed system knowledge than usually required by physics-based modeling. However, this inexpensive model development also comes with new challenges regarding model trustworthiness. This article presents a novel approach, physics-guided adversarial machine learning (ML), which improves the confidence over the physics consistency of the model. The approach performs, first, a physics-guided adversarial testing phase to search for test inputs revealing behavioral system inconsistencies, while still falling within the range of foreseeable operational conditions. Then, it proceeds with a physics-informed adversarial training to teach the model the system-related physics domain foreknowledge through iteratively reducing the unwanted output deviations on the previously uncovered counterexamples. Empirical evaluation on two aircraft system performance models shows the effectiveness of our adversarial ML approach in exposing physical inconsistencies of both the models and in improving their propensity to be consistent with physics domain knowledge.
PaReco: patched clones and missed patches among the divergent variants of a software family
Poedjadevie Kadjel Ramkisoen
John Businge
Brent van Bladel
Alexandre Decan
Serge Demeyer
Coen De Roover
Re-using whole repositories as a starting point for new projects is often done by maintaining a variant fork parallel to the original. Howev… (voir plus)er, the common artifacts between both are not always kept up to date. As a result, patches are not optimally integrated across the two repositories, which may lead to sub-optimal maintenance between the variant and the original project. A bug existing in both repositories can be patched in one but not the other (we see this as a missed opportunity) or it can be manually patched in both probably by different developers (we see this as effort duplication). In this paper we present a tool (named PaReCo) which relies on clone detection to mine cases of missed opportunity and effort duplication from a pool of patches. We analyzed 364 (source to target) variant pairs with 8,323 patches resulting in a curated dataset containing 1,116 cases of effort duplication and 1,008 cases of missed opportunities. We achieve a precision of 91%, recall of 80%, accuracy of 88%, and F1-score of 85%. Furthermore, we investigated the time interval between patches and found out that, on average, missed patches in the target variants have been introduced in the source variants 52 weeks earlier. Consequently, PaReCo can be used to manage variability in “time” by automatically identifying interesting patches in later project releases to be backported to supported earlier releases.
Revisiting the Impact of Anti-patterns on Fault-Proneness: A Differentiated Replication
Aurel Ikama
Vincent Du
Philippe Belias
Biruk Asmare Muse
Mohammad Hamdaqa
Anti-patterns manifesting on software code through code smells have been investigated in terms of their prevalence, detection, refactoring, … (voir plus)and impact on software quality attributes. In particular, leveraging heuristics to identify fault-fixing commits, Khomh et al. have found that anti-patterns and code smells have an impact on the fault-proneness of a software system. Similarly, Saboury et al. found a relationship between anti-pattern occurrences and fault-proneness, using heuristic to identify fault-fixing commits and fault-inducing changes. However, recent studies question the accuracy of heuristics, and thus the validity of empirical studies that leverage it. Hence, in this work, we would like to investigate to what extent the results of empirical studies using heuristics to identify bug fix commits are affected by the limitations of the heuristics based approach using manually validated bug fix commits as a ground truth. In particular, we conduct a differentiated replication of the work by Khomh et al. We particularly focused on the impact of anti-patterns on fault-proneness as it is the only dependent variable that may be affected by noise in the collected faults data. In our differentiated replication study, (1) we expanded the number of subject systems from 5 to 38, (2) utilized a manually validated dataset of bug-fixing commits from the work of Herbold et al., and (3) answered research questions from Khomh et al., that are related to the relationship between anti-pattern occurrences and fault-proneness. (4) We added an additional research question to investigate if combining results from several heuristic-based approaches could help reduce the impact of noise. Our findings show that the impact of the noise generated by the automatic algorithm heuristic based is negligible for the studied subject systems; meaning that the reported relation observed on noisy data still holds on the clean data. However, we also observed that combining results from several heuristic based approaches do not reduce this noise, quite the contrary.
Video Game Bad Smells: What They Are and How Developers Perceive Them
Vittoria Nardone
Biruk Asmare Muse
Mouna Abidi
Massimiliano Di Penta
Video games represent a substantial and increasing share of the software market. However, their development is particularly challenging as i… (voir plus)t requires multi-faceted knowledge, which is not consolidated in computer science education yet. This article aims at defining a catalog of bad smells related to video game development. To achieve this goal, we mined discussions on general-purpose and video game-specific forums. After querying such a forum, we adopted an open coding strategy on a statistically significant sample of 572 discussions, stratified over different forums. As a result, we obtained a catalog of 28 bad smells, organized into five categories, covering problems related to game design and logic, physics, animation, rendering, or multiplayer. Then, we assessed the perceived relevance of such bad smells by surveying 76 game development professionals. The survey respondents agreed with the identified bad smells but also provided us with further insights about the discussed smells. Upon reporting results, we discuss bad smell examples, their consequences, as well as possible mitigation/fixing strategies and trade-offs to be pursued by developers. The catalog can be used not only as a guideline for developers and educators but also can pave the way toward better automated tool support for video game developers.
FIXME: synchronize with database! An empirical study of data access self-admitted technical debt
Biruk Asmare Muse
Csaba Nagy
Anthony Cleve
Giuliano Antoniol
Studying the Practices of Deploying Machine Learning Projects on Docker
Moses Openja
Forough Majidi
Bhagya Chembakottu
Heng Li
Works for Me! Cannot Reproduce – A Large Scale Empirical Study of Non-reproducible Bugs
Mohammad Masudur Rahman
Marco Castelluccio