Publications

Unveiling Mental Imagery: Enhanced Mental Images Reconstruction using EEG and the Bubbles Method
Audrey Lamy-Proulx
Laurence Leblond
Jasper van den Bosch
Catherine Landry
Peter Brotherwood
Frédéric Gosselin
When Machines Outshine Humans in Object Recognition, Benchmarking Dilemma
Md Rifat Arefin
Jocelyn Faubert
A high-throughput phenotypic screen combined with an ultra-large-scale deep learning-based virtual screening reveals novel scaffolds of antibacterial compounds
Gabriele Scalia
Steven T. Rutherford
Ziqing Lu
Kerry R. Buchholz
Nicholas Skelton
Kangway Chuang
Nathaniel Diamant
Jan-Christian Hütter
Jerome-Maxim Luescher
Anh Miu
Jeff Blaney
Leo Gendelev
Elizabeth Skippington
Greg Zynda
Nia Dickson
Aviv Regev
Man-Wah Tan
Tommaso Biancalani
The proliferation of multi-drug-resistant bacteria underscores an urgent need for novel antibiotics. Traditional discovery methods face challenges due to limited chemical diversity, high costs, and difficulties in identifying structurally novel compounds. Here, we explore the integration of small molecule high-throughput screening with a deep learning-based virtual screening approach to uncover new antibacterial compounds. Leveraging a diverse library of nearly 2 million small molecules, we conducted comprehensive phenotypic screening against a sensitized Escherichia coli strain that, at a low hit rate, yielded thousands of hits. We trained a deep learning model, GNEprop, to predict antibacterial activity, ensuring robustness through out-of-distribution generalization techniques. Virtual screening of over 1.4 billion compounds identified potential candidates, of which 82 exhibited antibacterial activity, illustrating a 90X improved hit rate over the high-throughput screening experiment GNEprop was trained on. Importantly, a significant portion of these newly identified compounds exhibited high dissimilarity to known antibiotics, indicating promising avenues for further exploration in antibiotic discovery.
Trimming the Risk: Towards Reliable Continuous Training for Deep Learning Inspection Systems
Altaf Allah Abbassi
Thomas Reid
End-to-end Conditional Robust Optimization
Abhilash Reddy Chenreddy
Shedding Light on Large Generative Networks: Estimating Epistemic Uncertainty in Diffusion Models
Lucas Berry
Axel Brando
Generative diffusion models, notable for their large parameter count (exceeding 100 million) and operation within high-dimensional image spaces, pose significant challenges for traditional uncertainty estimation methods due to computational demands. In this work, we introduce an innovative framework, Diffusion Ensembles for Capturing Uncertainty (DECU), designed for estimating epistemic uncertainty for diffusion models. The DECU framework introduces a novel method that efficiently trains ensembles of conditional diffusion models by incorporating a static set of pre-trained parameters, drastically reducing the computational burden and the number of parameters that require training. Additionally, DECU employs Pairwise-Distance Estimators (PaiDEs) to accurately measure epistemic uncertainty by evaluating the mutual information between model outputs and weights in high-dimensional spaces. The effectiveness of this framework is demonstrated through experiments on the ImageNet dataset, highlighting its capability to capture epistemic uncertainty, specifically in under-sampled image classes.
Feedback-guided Data Synthesis for Imbalanced Classification
Current status quo in machine learning is to use static datasets of real images for training, which often come from long-tailed distributions. With the recent advances in generative models, researchers have started augmenting these static datasets with synthetic data, reporting moderate performance improvements on classification tasks. We hypothesize that these performance gains are limited by the lack of feedback from the classifier to the generative model, which would promote the usefulness of the generated samples to improve the classifier's performance. In this work, we introduce a framework for augmenting static datasets with useful synthetic samples, which leverages one-shot feedback from the classifier to drive the sampling of the generative model. In order for the framework to be effective, we find that the samples must be close to the support of the real data of the task at hand, and be sufficiently diverse. We validate three feedback criteria on a long-tailed dataset (ImageNet-LT, Places-LT) as well as a group-imbalanced dataset (NICO++). On ImageNet-LT, we achieve state-of-the-art results, with over
Quality issues in Machine Learning Software Systems
Pierre-Olivier Côté
Amin Nikanjam
Rached Bouchoucha
Ilan Basta
Mouna Abidi
Context: An increasing demand is observed in various domains to employ Machine Learning (ML) for solving complex problems. ML models are implemented as software components and deployed in Machine Learning Software Systems (MLSSs). Problem: There is a strong need for ensuring the serving quality of MLSSs. False or poor decisions of such systems can lead to malfunction of other systems, significant financial losses, or even threat to human life. The quality assurance of MLSSs is considered as a challenging task and currently is a hot research topic. Moreover, it is important to cover all various aspects of the quality in MLSSs. Objective: This paper aims to investigate the characteristics of real quality issues in MLSSs from the viewpoint of practitioners. This empirical study aims to identify a catalog of bad-practices related to poor quality in MLSSs. Method: We plan to conduct a set of interviews with practitioners/experts, believing that interviews are the best method to retrieve their experience and practices when dealing with quality issues. We expect that the catalog of issues developed at this step will also help us later to identify the severity, root causes, and possible remedy for quality issues of MLSSs, allowing us to develop efficient quality assurance tools for ML models and MLSSs.
Are Heterophily-Specific GNNs and Homophily Metrics Really Effective? Evaluation Pitfalls and New Benchmarks
Qincheng Lu
Xinyu Wang
Jiaqi Zhu
Xiao-Wen Chang
Over the past decade, Graph Neural Networks (GNNs) have achieved great success on machine learning tasks with relational data. However, recent studies have found that heterophily can cause significant performance degradation of GNNs, especially on node-level tasks. Numerous heterophilic benchmark datasets have been put forward to validate the efficacy of heterophily-specific GNNs and various homophily metrics have been designed to help people recognize these malignant datasets. Nevertheless, there still exist multiple pitfalls that severely hinder the proper evaluation of new models and metrics. In this paper, we point out three most serious pitfalls: 1) a lack of hyperparameter tuning; 2) insufficient model evaluation on the real challenging heterophilic datasets; 3) missing quantitative evaluation benchmark for homophily metrics on synthetic graphs. To overcome these challenges, we first train and fine-tune baseline models on
Correction: Economic evaluation of the effect of needle and syringe programs on skin, soft tissue, and vascular infections in people who inject drugs: a microsimulation modelling approach
Jihoon Lim
W Alton Russell
Mariam El-Sheikh
David L Buckeridge
Dimitra Panagiotoglou
Perspectives on virtual interviews and emerging technologies integration in family medicine residency programs: a cross-sectional survey study
Raymond Tolentino
Charo Rodriguez
Fanny Hersson-Edery
Julie Lane
Samira Abbasgholizadeh Rahimi
During the coronavirus disease of 2019 (COVID-19) pandemic, in-person interviews for the recruitment of family medicine residents shifted to online (virtual) interviews. The purpose of this study was twofold: (1) to gather the ideas about virtual interviews of family medicine applicants (interviewees), and faculty and staff who interviewed these applicants (interviewers), and (2) to describe interviewers’ and interviewees’ opinions of use of emerging technologies such as artificial intelligence (AI) and virtual reality (VR) in the recruitment process as well as during clinical practice. This was a cross-sectional survey study. Participants were both interviewers and candidates who applied to the McGill University Family Medicine Residency Program for the 2020–2021 and 2021–2022 cycles. The study population was constituted by N = 132 applicants and N = 60 interviewers. The response rate was 91.7% (55/60) for interviewers and 43.2% (57/132) for interviewees. Both interviewers (43.7%) and interviewees (68.5%) were satisfied with connecting through virtual interviews. Interviewers (43.75%) and interviewees (55.5%) would prefer for both options to be available. Both interviewers (50%) and interviewees (72%) were interested in emerging technologies. Almost all interviewees (95.8%) were interested in learning about AI and VR and its application in clinical practice with the majority (60.8%) agreeing that it should be taught within medical training. Although experience of virtual interviewing during the COVID-19 pandemic has been positive for both interviewees and interviewers, the findings of this study suggest that it will be unlikely that virtual interviews completely replace in-person interviews for selecting candidates for family medicine residency programs in the long term as participants value aspects of in-person interviews and would want a choice in format.
Since incoming family medicine physicians seem to be eager to learn and utilize emerging technologies such as AI and VR, educators and institutions should consider family physicians’ needs due to the changing technological landscape in family medicine education. The online version contains supplementary material available at 10.1186/s12909-024-05874-5.
The Strength of Fuel Refueling Location Problem Formulations