Publications

Quality issues in Machine Learning Software Systems
Pierre-Olivier Côté
Amin Nikanjam
Rached Bouchoucha
Ilan Basta
Mouna Abidi
Context: An increasing demand is observed in various domains to employ Machine Learning (ML) for solving complex problems. ML models are implemented as software components and deployed in Machine Learning Software Systems (MLSSs). Problem: There is a strong need for ensuring the serving quality of MLSSs. False or poor decisions of such systems can lead to malfunction of other systems, significant financial losses, or even threats to human life. The quality assurance of MLSSs is considered a challenging task and is currently a hot research topic. Moreover, it is important to cover the various aspects of quality in MLSSs. Objective: This paper aims to investigate the characteristics of real quality issues in MLSSs from the viewpoint of practitioners. This empirical study aims to identify a catalog of bad practices related to poor quality in MLSSs. Method: We plan to conduct a set of interviews with practitioners/experts, believing that interviews are the best method to retrieve their experience and practices when dealing with quality issues. We expect that the catalog of issues developed at this step will also help us later to identify the severity, root causes, and possible remedies for quality issues of MLSSs, allowing us to develop efficient quality assurance tools for ML models and MLSSs.
Are Heterophily-Specific GNNs and Homophily Metrics Really Effective? Evaluation Pitfalls and New Benchmarks
Qincheng Lu
Xinyu Wang
Jiaqi Zhu
Xiao-Wen Chang
Over the past decade, Graph Neural Networks (GNNs) have achieved great success on machine learning tasks with relational data. However, recent studies have found that heterophily can cause significant performance degradation of GNNs, especially on node-level tasks. Numerous heterophilic benchmark datasets have been put forward to validate the efficacy of heterophily-specific GNNs and various homophily metrics have been designed to help people recognize these malignant datasets. Nevertheless, there still exist multiple pitfalls that severely hinder the proper evaluation of new models and metrics. In this paper, we point out three most serious pitfalls: 1) a lack of hyperparameter tuning; 2) insufficient model evaluation on the real challenging heterophilic datasets; 3) missing quantitative evaluation benchmark for homophily metrics on synthetic graphs. To overcome these challenges, we first train and fine-tune baseline models on
Correction: Economic evaluation of the effect of needle and syringe programs on skin, soft tissue, and vascular infections in people who inject drugs: a microsimulation modelling approach
Jihoon Lim
W Alton Russell
Mariam El-Sheikh
David L Buckeridge
Dimitra Panagiotoglou
Perspectives on virtual interviews and emerging technologies integration in family medicine residency programs: a cross-sectional survey study
Raymond Tolentino
Charo Rodriguez
Fanny Hersson-Edery
Julie Lane
Samira Abbasgholizadeh Rahimi
During the coronavirus disease of 2019 (COVID-19) pandemic, in-person interviews for the recruitment of family medicine residents shifted to online (virtual) interviews. The purpose of this study was twofold: (1) to gather the ideas about virtual interviews of family medicine applicants (interviewees), and faculty and staff who interviewed these applicants (interviewers), and (2) to describe interviewers’ and interviewees’ opinions of the use of emerging technologies such as artificial intelligence (AI) and virtual reality (VR) in the recruitment process as well as during clinical practice. This was a cross-sectional survey study. Participants were both interviewers and candidates who applied to the McGill University Family Medicine Residency Program for the 2020–2021 and 2021–2022 cycles. The study population was constituted by N = 132 applicants and N = 60 interviewers. The response rate was 91.7% (55/60) for interviewers and 43.2% (57/132) for interviewees. Both interviewers (43.7%) and interviewees (68.5%) were satisfied with connecting through virtual interviews. Interviewers (43.75%) and interviewees (55.5%) would prefer for both options to be available. Both interviewers (50%) and interviewees (72%) were interested in emerging technologies. Almost all interviewees (95.8%) were interested in learning about AI and VR and their application in clinical practice, with the majority (60.8%) agreeing that it should be taught within medical training. Although experience of virtual interviewing during the COVID-19 pandemic has been positive for both interviewees and interviewers, the findings of this study suggest that it is unlikely that virtual interviews will completely replace in-person interviews for selecting candidates for family medicine residency programs in the long term, as participants value aspects of in-person interviews and would want a choice in format.
Since incoming family medicine physicians seem to be eager to learn and utilize emerging technologies such as AI and VR, educators and institutions should consider family physicians’ needs due to the changing technological landscape in family medicine education. The online version contains supplementary material available at 10.1186/s12909-024-05874-5.
The Strength of Fuel Refueling Location Problem Formulations
CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning
Evaluating autonomous vehicle stacks (AVs) in simulation typically involves replaying driving logs from real-world recorded traffic. However, agents replayed from offline data do not react to the actions of the AV, and their behaviour cannot be easily controlled to simulate counterfactual scenarios. Existing approaches have attempted to address these shortcomings by proposing methods that rely on heuristics or learned generative models of real-world data but these approaches either lack realism or necessitate costly iterative sampling procedures to control the generated behaviours. In this work, we take an alternative approach and propose CtRL-Sim, a method that leverages return-conditioned offline reinforcement learning within a physics-enhanced Nocturne simulator to efficiently generate reactive and controllable traffic agents. Specifically, we process real-world driving data through the Nocturne simulator to generate a diverse offline reinforcement learning dataset, annotated with various reward terms. With this dataset, we train a return-conditioned multi-agent behaviour model that allows for fine-grained manipulation of agent behaviours by modifying the desired returns for the various reward components. This capability enables the generation of a wide range of driving behaviours beyond the scope of the initial dataset, including those representing adversarial behaviours. We demonstrate that CtRL-Sim can efficiently generate diverse and realistic safety-critical scenarios while providing fine-grained control over agent behaviours. Further, we show that fine-tuning our model on simulated safety-critical scenarios generated by our model enhances this controllability.
Towards Robust Saliency Maps
Nham Le
Arie Gurfinkel
Saliency maps are one of the most popular tools to interpret the operation of a neural network: they compute input features deemed relevant to the final prediction, which are often subsets of pixels that are easily understandable by a human being. However, it is known that relying solely on human assessment to judge a saliency map method can be misleading. In this work, we propose a new neural network verification specification called saliency-robustness, which aims to use formal methods to prove a relationship between Vanilla Gradient (VG) -- a simple yet surprisingly effective saliency map method -- and the network's prediction: given a network, if an input
Reputation Gaming in Crowd Technical Knowledge Sharing
Iren Mazloomzadeh
Gias Uddin
Ashkan Sami
Stack Overflow incentive system awards users with reputation scores to ensure quality. The decentralized nature of the forum may make the incentive system prone to manipulation. This paper offers, for the first time, a comprehensive study of the reported types of reputation manipulation scenarios that might be exercised in Stack Overflow and the prevalence of such reputation gamers by a qualitative study of 1,697 posts from meta Stack Exchange sites. We found four different types of reputation fraud scenarios, such as voting rings where communities form to upvote each other repeatedly on similar posts. We developed algorithms that enable platform managers to automatically identify these suspicious reputation gaming scenarios for review. The first algorithm identifies isolated/semi-isolated communities where probable reputation frauds may occur mostly by collaborating with each other. The second algorithm looks for sudden unusual big jumps in the reputation scores of users. We evaluated the performance of our algorithms by examining the reputation history dashboard of Stack Overflow users from the Stack Overflow website. We observed that around 60-80% of users flagged as suspicious by our algorithms experienced reductions in their reputation scores by Stack Overflow.
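As a rough illustration of what "sudden unusual big jumps" could mean in practice, the sketch below flags days on which a user's reputation gain is an outlier relative to that user's own history. It is not the algorithm from the paper: the function name `flag_reputation_jumps`, the robust z-score based on the median absolute deviation, and the default threshold are all assumptions chosen for the example.

```python
from statistics import median


def flag_reputation_jumps(daily_gains, threshold=5.0):
    """Return indices of days whose reputation gain is unusually large.

    Hypothetical helper: scores each day with a robust z-score built from
    the median and the median absolute deviation (MAD) of the user's own
    daily gains, and flags days whose score exceeds `threshold`.
    """
    if len(daily_gains) < 3:
        # Too little history to call anything unusual.
        return []
    med = median(daily_gains)
    mad = median(abs(g - med) for g in daily_gains)
    if mad == 0:
        # Flat history: any gain above the usual value counts as a jump.
        return [i for i, g in enumerate(daily_gains) if g > med]
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    return [i for i, g in enumerate(daily_gains) if (g - med) / scale > threshold]


# Example: one day with a gain far above this user's usual range.
print(flag_reputation_jumps([10, 15, 10, 20, 10, 400, 15]))
```

A median-based score is used here because a single extreme day would inflate a mean and standard deviation enough to hide itself. Flagged days would still need manual review, as the abstract describes.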
Advancing EDGE Zones to identify spatial conservation priorities of tetrapod evolutionary history
Sebastian Pipins
Jonathan E. M. Baillie
Alex Bowmer
Nisha Owen
Rikki Gumbs
Online Convex Optimization for On-Board Routing in High-Throughput Satellites
Jean-Luc Lupien
Olfa Ben Yahia
Stéphane Martel
Gunes Karabulut Kurt
The rise in low Earth orbit (LEO) satellite Internet services has led to increasing demand, often exceeding available data rates and compromising the quality of service. While deploying more satellites offers a short-term fix, designing higher-performance satellites with enhanced transmission capabilities provides a more sustainable solution. Achieving the necessary high capacity requires interconnecting multiple modem banks within a satellite payload. However, there is a notable gap in research on internal packet routing within extremely high-throughput satellites. To address this, we propose a real-time optimal flow allocation and priority queue scheduling method using online convex optimization-based model predictive control. We model the problem as a multi-commodity flow instance and employ an online interior-point method to solve the routing and scheduling optimization iteratively. This approach minimizes packet loss and supports real-time rerouting with low computational overhead. Our method is tested in simulation on a next-generation extremely high-throughput satellite model, demonstrating its effectiveness compared to a reference batch optimization and to traditional methods.
THInC: A Theory-Driven Framework for Computational Humor Detection
Victor De Marez
Thomas Winters
Humor is a fundamental aspect of human communication and cognition, as it plays a crucial role in social engagement. Although theories about humor have evolved over centuries, there is still no agreement on a single, comprehensive humor theory. Likewise, computationally recognizing humor remains a significant challenge despite recent advances in large language models. Moreover, most computational approaches to detecting humor are not based on existing humor theories. This paper contributes to bridging this long-standing gap between humor theory research and computational humor detection by creating an interpretable framework for humor classification, grounded in multiple humor theories, called THInC (Theory-driven Humor Interpretation and Classification). THInC ensembles interpretable GA2M classifiers, each representing a different humor theory. We engineered a transparent flow to actively create proxy features that quantitatively reflect different aspects of theories. An implementation of this framework achieves an F1 score of 0.85. The associative interpretability of the framework enables analysis of proxy efficacy, alignment of joke features with theories, and identification of globally contributing features. This paper marks a pioneering effort in creating a humor detection framework that is informed by diverse humor theories and offers a foundation for future advancements in theory-driven humor classification. It also serves as a first step in automatically comparing humor theories in a quantitative manner.
Audio Editing with Non-Rigid Text Prompts
Zhepei Wang
Mirco Ravanelli
Paris Smaragdis
Yusuf Cem Sübakan
In this paper, we explore audio-editing with non-rigid text edits. We show that the proposed editing pipeline is able to create audio edits that remain faithful to the input audio. We explore text prompts that perform addition, style transfer, and in-painting. We quantitatively and qualitatively show that the edits are able to obtain results which outperform Audio-LDM, a recently released text-prompted audio generation model. Qualitative inspection of the results points out that the edits given by our approach remain more faithful to the input audio in terms of keeping the original onsets and offsets of the audio events.