Portrait of Foutse Khomh

Foutse Khomh

Associate Academic Member
Canada CIFAR AI Chair
Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Research Topics
Data Mining
Deep Learning
Distributed Systems
Generative Models
Learning to Program
Natural Language Processing
Reinforcement Learning

Biography

Foutse Khomh is a full professor of software engineering at Polytechnique Montréal, a Canada CIFAR AI Chair – Trustworthy Machine Learning Software Systems, and an FRQ-IVADO Research Chair in Software Quality Assurance for Machine Learning Applications. Khomh completed a PhD in software engineering at Université de Montréal in 2011, for which he received an Award of Excellence. He was also awarded a CS-Can/Info-Can Outstanding Young Computer Science Researcher Prize in 2019.

His research interests include software maintenance and evolution, machine learning systems engineering, cloud engineering, and dependable and trustworthy ML/AI. His work has received four Ten-year Most Influential Paper (MIP) awards, and six Best/Distinguished Paper Awards. He has served on the steering committee of numerous organizations in software engineering, including SANER (chair), MSR, PROMISE, ICPC (chair), and ICSME (vice-chair). He initiated and co-organized Polytechnique Montréal‘s Software Engineering for Machine Learning Applications (SEMLA) symposium and the RELENG (release engineering) workshop series.

Khomh co-founded the NSERC CREATE SE4AI: A Training Program on the Development, Deployment and Servicing of Artificial Intelligence-based Software Systems, and is a principal investigator for the DEpendable Explainable Learning (DEEL) project.

He also co-founded Confiance IA, a Quebec consortium focused on building trustworthy AI, and is on the editorial board of multiple international software engineering journals, including IEEE Software, EMSE and JSEP. He is a senior member of IEEE.

Current Students

Collaborating Alumni - Polytechnique Montréal
PhD - Polytechnique Montréal
PhD - Polytechnique Montréal
Master's Research - Polytechnique Montréal
Postdoctorate - Polytechnique Montréal
Co-supervisor :
Master's Research - Polytechnique Montréal
PhD - Polytechnique Montréal
Master's Research - Polytechnique Montréal

Publications

Doctoral Symposium Committee
Anthony Cleve
Christian Lange
Silvia Breu
Manar H. Alalfi
Mario Luca Bernardi
Cornelia Boldyreff
Marco D'Ambros
Simon Denier
Natalia Dragan
Ekwa Duala-Ekoko
Fausto Fasano
Adnane Ghannem
Carmine Gravino
Maen Hammad
Imed Hammouda
Salima Hassaine
Yue Jia
Zhen Ming (Jack) Jiang
Adam Kiezun … (see 11 more)
Jay Kothari
Jonathan Memaitre
Naouel Moha
Rocco Oliveto
Denys Poshyvanyk
Michele Risi
Giuseppe Scanniello
Bonita Sharif
Andrew Sutton
Anis Yousefi
Eugenio Zimeo
Manar H. Alalfi Mario Luca Bernardi Cornelia Boldyreff Anthony Cleve Marco D'Ambros Simon Denier Natalia Dragan Ekwa Duala-Ekoko Fausto Fasa… (see more)no Adnane Ghannem Carmine Gravino Maen Hammad Imed Hammouda Salima Hassaine Yue Jia Zhen Ming Jiang Foutse Khomh Adam Kiezun Jay Kothari Jonathan Memaitre Naouel Moha Rocco Oliveto Denys Poshyvanyk Michele Risi Giuseppe Scanniello Bonita Sharif Andrew Sutton Anis Yousefi Eugenio Zimeo
Doctoral Symposium Committee
Anthony Cleve
Christian Lange
Silvia Breu
Manar H. Alalfi
Mario Luca Bernardi
Cornelia Boldyreff
Marco D'Ambros
Simon Denier
Natalia Dragan
Ekwa Duala-Ekoko
Fausto Fasano
Adnane Ghannem
Carmine Gravino
Maen Hammad
Imed Hammouda
Salima Hassaine
Yue Jia
Zhen Ming Jiang
Adam Kiezun … (see 11 more)
Jay Kothari
Jonathan Memaitre
Naouel Moha
Rocco Oliveto
Denys Poshyvanyk
Michele Risi
Giuseppe Scanniello
Bonita Sharif
Andrew Sutton
Anis Yousefi
Eugenio Zimeo
Manar H. Alalfi Mario Luca Bernardi Cornelia Boldyreff Anthony Cleve Marco D'Ambros Simon Denier Natalia Dragan Ekwa Duala-Ekoko Fausto Fasa… (see more)no Adnane Ghannem Carmine Gravino Maen Hammad Imed Hammouda Salima Hassaine Yue Jia Zhen Ming Jiang Foutse Khomh Adam Kiezun Jay Kothari Jonathan Memaitre Naouel Moha Rocco Oliveto Denys Poshyvanyk Michele Risi Giuseppe Scanniello Bonita Sharif Andrew Sutton Anis Yousefi Eugenio Zimeo
In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators
Dmytro Humeniuk
Houssem Ben Braiek
Thomas Reid
LIBS-Raman Multimodal Architecture for Automated Lunar Prospecting
Jérôme Pigeon
Richard Boudreault
Ahmed Ashraf
Pooneh Maghoul
LIBS-Raman Multimodal Architecture for Automated Lunar Prospecting
Jérôme Pigeon
Richard Boudreault
Ahmed Ashraf
P. Maghoul
Toward Debugging Deep Reinforcement Learning Programs with RLExplorer
Rached Bouchoucha
Ahmed Haj Yahmed
Darshan Patil
Janarthanan Rajendran
Amin Nikanjam
Deep reinforcement learning (DRL) has shown success in diverse domains such as robotics, computer games, and recommendation systems. However… (see more), like any other software system, DRL-based software systems are susceptible to faults that pose unique challenges for debugging and diagnosing. These faults often result in unexpected behavior without explicit failures and error messages, making debugging difficult and time-consuming. Therefore, automating the monitoring and diagnosis of DRL systems is crucial to alleviate the burden on developers. In this paper, we propose RLExplorer, the first fault diagnosis approach for DRL-based software systems. RLExplorer automatically monitors training traces and runs diagnosis routines based on properties of the DRL learning dynamics to detect the occurrence of DRL-specific faults. It then logs the results of these diagnoses as warnings that cover theoretical concepts, recommended practices, and potential solutions to the identified faults. We conducted two sets of evaluations to assess RLExplorer. Our first evaluation of faulty DRL samples from Stack Overflow revealed that our approach can effectively diagnose real faults in 83% of the cases. Our second evaluation of RLExplorer with 15 DRL experts/developers showed that (1) RLExplorer could identify 3.6 times more defects than manual debugging and (2) RLExplorer is easily integrated into DRL applications.
Toward Debugging Deep Reinforcement Learning Programs with RLExplorer
Rached Bouchoucha
Ahmed Haj Yahmed
Darshan Patil
Janarthanan Rajendran
Amin Nikanjam
Deep reinforcement learning (DRL) has shown success in diverse domains such as robotics, computer games, and recommendation systems. However… (see more), like any other software system, DRL-based software systems are susceptible to faults that pose unique challenges for debugging and diagnosing. These faults often result in unexpected behavior without explicit failures and error messages, making debugging difficult and time-consuming. Therefore, automating the monitoring and diagnosis of DRL systems is crucial to alleviate the burden on developers. In this paper, we propose RLExplorer, the first fault diagnosis approach for DRL-based software systems. RLExplorer automatically monitors training traces and runs diagnosis routines based on properties of the DRL learning dynamics to detect the occurrence of DRL-specific faults. It then logs the results of these diagnoses as warnings that cover theoretical concepts, recommended practices, and potential solutions to the identified faults. We conducted two sets of evaluations to assess RLExplorer. Our first evaluation of faulty DRL samples from Stack Overflow revealed that our approach can effectively diagnose real faults in 83% of the cases. Our second evaluation of RLExplorer with 15 DRL experts/developers showed that (1) RLExplorer could identify 3.6 times more defects than manual debugging and (2) RLExplorer is easily integrated into DRL applications.
Understanding Web Application Workloads and Their Applications: Systematic Literature Review and Characterization
Roozbeh Aghili
Qiaolin Qin
Heng Li
What Information Contributes to Log-based Anomaly Detection? Insights from a Configurable Transformer-Based Approach
Xingfang Wu
Heng Li
Log data are generated from logging statements in the source code, providing insights into the execution processes of software applications … (see more)and systems. State-of-the-art log-based anomaly detection approaches typically leverage deep learning models to capture the semantic or sequential information in the log data and detect anomalous runtime behaviors. However, the impacts of these different types of information are not clear. In addition, existing approaches have not captured the timestamps in the log data, which can potentially provide more fine-grained temporal information than sequential information. In this work, we propose a configurable transformer-based anomaly detection model that can capture the semantic, sequential, and temporal information in the log data and allows us to configure the different types of information as the model's features. Additionally, we train and evaluate the proposed model using log sequences of different lengths, thus overcoming the constraint of existing methods that rely on fixed-length or time-windowed log sequences as inputs. With the proposed model, we conduct a series of experiments with different combinations of input features to evaluate the roles of different types of information in anomaly detection. When presented with log sequences of varying lengths, the model can attain competitive and consistently stable performance compared to the baselines. The results indicate that the event occurrence information plays a key role in identifying anomalies, while the impact of the sequential and temporal information is not significant for anomaly detection in the studied public datasets. On the other hand, the findings also reveal the simplicity of the studied public datasets and highlight the importance of constructing new datasets that contain different types of anomalies to better evaluate the performance of anomaly detection models.
Understanding Web Application Workloads and Their Applications: Systematic Literature Review and Characterization
Roozbeh Aghili
Qiaolin Qin
Heng Li
Web applications, accessible via web browsers over the Internet, facilitate complex functionalities without local software installation. In … (see more)the context of web applications, a workload refers to the number of user requests sent by users or applications to the underlying system. Existing studies have leveraged web application workloads to achieve various objectives, such as workload prediction and auto-scaling. However, these studies are conducted in an ad hoc manner, lacking a systematic understanding of the characteristics of web application workloads. In this study, we first conduct a systematic literature review to identify and analyze existing studies leveraging web application workloads. Our analysis sheds light on their workload utilization, analysis techniques, and high-level objectives. We further systematically analyze the characteristics of the web application workloads identified in the literature review. Our analysis centers on characterizing these workloads at two distinct temporal granularities: daily and weekly. We successfully identify and categorize three daily and three weekly patterns within the workloads. By providing a statistical characterization of these workload patterns, our study highlights the uniqueness of each pattern, paving the way for the development of realistic workload generation and resource provisioning techniques that can benefit a range of applications and research areas.
An Empirical Study of Sensitive Information in Logs
Roozbeh Aghili
Heng Li
Protecting Privacy in Software Logs: What Should Be Anonymized?
Roozbeh Aghili
Heng Li