Publications

Improving microbial phylogeny with citizen science within a mass-market video game

Roman Sarrazin-Gendron

Parham Ghasemloo Gheidari

Alexander Butyaev

Timothy Keding

Eddie Cai

Jiayue Zheng

Renata Mutalova

Julien Mounthanyvong

Yuxue Zhu

Elena Nazarova

Chrisostomos Drogaris

Kornél Erhart

David Bélanger

Amélie Brouillette

Michael Bouffard

Gabriel Richard

Joshua Davidson

Randy Pitchford

Mathieu Falaise

Sébastien Caisse … (voir 14 de plus)

Vincent Fiset

Mathieu Blanchette

Steven Hebert

Daniel McDonald

Dan Hewitt

Rob Knight

Jonathan Huot

Attila Szantner

Seung Kim

Jérôme Waldispühl

Jonathan Moreau-Genest

David Najjab

Steve Prince

Ludger Saintélien

2024-04-15

Nature Biotechnology (publié)

doi.org

A Survey on Deep Learning for Theorem Proving

Zhaoyu Li

Jialiang Sun

Logan Murphy

Qidong Su

Zenan Li

Xian Zhang

Kaiyu Yang

Xujie Si

2024-04-15

ArXiv (prépublication)

arxiv.org

Towards Practical Tool Usage for Continually Learning LLMs

Jerry Huang

Prasanna Parthasarathi

Mehdi Rezagholizadeh

Sarath Chandar Anbil Parthipan

Large language models (LLMs) show an innate skill for solving language based tasks. But insights have suggested an inability to adjust for i… (voir plus)nformation or task-solving skills becoming outdated, as their knowledge, stored directly within their parameters, remains static in time. Tool use helps by offloading work to systems that the LLM can access through an interface, but LLMs that use them still must adapt to nonstationary environments for prolonged use, as new tools can emerge and existing tools can change. Nevertheless, tools require less specialized knowledge, therefore we hypothesize they are better suited for continual learning (CL) as they rely less on parametric memory for solving tasks and instead focus on learning when to apply pre-defined tools. To verify this, we develop a synthetic benchmark and follow this by aggregating existing NLP tasks to form a more realistic testing scenario. While we demonstrate scaling model size is not a solution, regardless of tool usage, continual learning techniques can enable tool LLMs to both adapt faster while forgetting less, highlighting their potential as continual learners.

2024-04-14

ArXiv (prépublication)

arxiv.org

Towards Causal Deep Learning for Vulnerability Detection

Md Mahbubur Rahman

Ira Ceka

Chengzhi Mao

Saikat Chakraborty

Baishakhi Ray

Wei Le

Deep learning vulnerability detection has shown promising results in recent years. However, an important challenge that still blocks it from… (voir plus) being very useful in practice is that the model is not robust under perturbation and it cannot generalize well over the out-of-distribution (OOD) data, e.g., applying a trained model to unseen projects in real world. We hypothesize that this is because the model learned non-robust features, e.g., variable names, that have spurious correlations with labels. When the perturbed and OOD datasets no longer have the same spurious features, the model prediction fails. To address the challenge, in this paper, we introduced causality into deep learning vulnerability detection. Our approach CausalVul consists of two phases. First, we designed novel perturbations to discover spurious features that the model may use to make predictions. Second, we applied the causal learning algorithms, specifically, do-calculus, on top of existing deep learning models to systematically remove the use of spurious features and thus promote causal based prediction. Our results show that CausalVul consistently improved the model accuracy, robustness and OOD performance for all the state-of-the-art models and datasets we experimented. To the best of our knowledge, this is the first work that introduces do calculus based causal learning to software engineering models and shows it's indeed useful for improving the model accuracy, robustness and generalization. Our replication package is located at https://figshare.com/s/0ffda320dcb96c249ef2.

2024-04-12

Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (publié)

doi.org

arxiv.org

Deep learning for high-resolution dose prediction in high dose rate brachytherapy for breast cancer treatment.

Sébastien Quetin

Boris Bahoric

Farhad Maleki

Shirin A. Enger

OBJECTIVE Monte Carlo (MC) simulations are the benchmark for accurate radiotherapy dose calculations, notably in patient-specific high dose … (voir plus)rate brachytherapy (HDR BT), in cases where considering tissue heterogeneities is critical. However, the lengthy computational time limits the practical application of MC simulations. Prior research used Deep Learning (DL) for dose prediction as an alternative to MC simulations. While accurate dose predictions akin to MC were attained, GPU limitations constrained these predictions to large voxels of 3mm × 3mm × 3mm. This study aimed to enable dose predictions as accurate as MC simulations in 1mm × 1mm × 1mm voxels within a clinically acceptable timeframe. Approach: Computed tomography scans of 98 breast cancer patients treated with Iridium-192-based HDR BT were used: 70 for training, 14 for validation, and 14 for testing. A new cropping strategy based on the distance to the seed was devised to reduce the volume size, enabling efficient training of 3D DL models using 1 mm × 1 mm × 1 mm dose grids. Additionally, novel DL architecture with layer-level fusion were proposed to predict MC simulated dose to medium-in-medium (Dm,m). These architectures fuse information from TG-43 dose to water-in-water (Dw,w) with patient tissue composition at the layer-level. Different inputs describing patient body composition were investigated. Main results: The proposed approach demonstrated state-of-the-art performance, on par with the MC Dm,m maps, but 300 times faster. The mean absolute percent error for dosimetric indices between the MC and DL-predicted complete treatment plans was 0.17%±0.15% for the planning target volume V100, 0.30%±0.32% for the skin D2cc, 0.82%±0.79% for the lung D2cc, 0.34%±0.29% for the chest wall D2cc and 1.08%±0.98% for the heart D2cc. Significance: Unlike the time-consuming MC simulations, the proposed novel strategy efficiently converts TG-43 Dw,w maps into precise Dm,m maps at high resolution, enabling clinical integration.

2024-04-11

Physics in Medicine and Biology (publié)

doi.org

From the Lab to the Theater: An Unconventional Field Robotics Journey

Ali Imran

Vivek Shankar Vardharajan

Rafael Gomes Braga

Yann Bouteiller

Abdalwhab Abdalwhab

Matthis Di-Giacomo

Alexandra Mercader

Giovanni Beltrame

David St-Onge

2024-04-11

ArXiv (prépublication)

arxiv.org

Scalable Hierarchical Self-Attention with Learnable Hierarchy for Long-Range Interactions

Thuan Nguyen Anh Trang

Khang Nhat Ngo

Hugo Sonnery

Thieu Vo

Siamak Ravanbakhsh

Truong Son Hy

Self-attention models have made great strides toward accurately modeling a wide array of data modalities, including, more recently, graph-st… (voir plus)ructured data. This paper demonstrates that adaptive hierarchical attention can go a long way toward successfully applying transformers to graphs. Our proposed model Sequoia provides a powerful inductive bias towards long-range interaction modeling, leading to better generalization. We propose an end-to-end mechanism for a data-dependent construction of a hierarchy which in turn guides the self-attention mechanism. Using adaptive hierarchy provides a natural pathway toward sparse attention by constraining node-to-node interactions with the immediate family of each node in the hierarchy (e.g., parent, children, and siblings). This in turn dramatically reduces the computational complexity of a self-attention layer from quadratic to log-linear in terms of the input size while maintaining or sometimes even surpassing the standard transformer's ability to model long-range dependencies across the entire input. Experimentally, we report state-of-the-art performance on long-range graph benchmarks while remaining computationally efficient. Moving beyond graphs, we also display competitive performance on long-range sequence modeling, point-clouds classification, and segmentation when using a fixed hierarchy. Our source code is publicly available at https://github.com/HySonLab/HierAttention

2024-04-11

TMLR (accepté)

openreview.net

Deep Generative Sampling in the Dual Divergence Space: A Data-efficient&Interpretative Approach for Generative AI

Sahil Garg

Anderson Schneider

Anant Raj

Kashif Rasul

Yuriy Nevmyvaka

S. Gopal

Amit Dhurandhar

Guillermo A. Cecchi

Irina Rish

Building on the remarkable achievements in generative sampling of natural images, we propose an innovative challenge, potentially overly amb… (voir plus)itious, which involves generating samples of entire multivariate time series that resemble images. However, the statistical challenge lies in the small sample size, sometimes consisting of a few hundred subjects. This issue is especially problematic for deep generative models that follow the conventional approach of generating samples from a canonical distribution and then decoding or denoising them to match the true data distribution. In contrast, our method is grounded in information theory and aims to implicitly characterize the distribution of images, particularly the (global and local) dependency structure between pixels. We achieve this by empirically estimating its KL-divergence in the dual form with respect to the respective marginal distribution. This enables us to perform generative sampling directly in the optimized 1-D dual divergence space. Specifically, in the dual space, training samples representing the data distribution are embedded in the form of various clusters between two end points. In theory, any sample embedded between those two end points is in-distribution w.r.t. the data distribution. Our key idea for generating novel samples of images is to interpolate between the clusters via a walk as per gradients of the dual function w.r.t. the data dimensions. In addition to the data efficiency gained from direct sampling, we propose an algorithm that offers a significant reduction in sample complexity for estimating the divergence of the data distribution with respect to the marginal distribution. We provide strong theoretical guarantees along with an extensive empirical evaluation using many real-world datasets from diverse domains, establishing the superiority of our approach w.r.t. state-of-the-art deep learning methods.

2024-04-10

ArXiv (prépublication)

arxiv.org

Zero-shot Logical Query Reasoning on any Knowledge Graph

Mikhail Galkin

Jincheng Zhou

Bruno Ribeiro

Jian Tang

Zhaocheng Zhu

Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional querie… (voir plus)s comprised of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on which requires substantial training time before being deployed on a new graph. Here we present UltraQuery, an inductive reasoning model that can zero-shot answer logical queries on any KG. The core idea of UltraQuery is to derive both projections and logical operations as vocabulary-independent functions which generalize to new entities and relations in any KG. With the projection operation initialized from a pre-trained inductive KG reasoning model, UltraQuery can solve CLQA on any KG even if it is only finetuned on a single dataset. Experimenting on 23 datasets, UltraQuery in the zero-shot inference mode shows competitive or better query answering performance than best available baselines and sets a new state of the art on 14 of them.

2024-04-10

ArXiv (prépublication)

arxiv.org

AI healthcare research: Pioneering iSMART Lab

Narges Armanfard

Dr Narges Armanfard, Professor, talks us through the AI healthcare research at McGill University which is spearheading a groundbreaking init… (voir plus)iative – the iSMART Lab. Access to high-quality healthcare is not just a fundamental human right; it is the bedrock of our societal wellbeing, with the crucial roles played by doctors, nurses, and hospitals. Yet, healthcare systems globally face mounting challenges, particularly from aging populations. Dr Narges Armanfard, affiliated with McGill University and Mila Quebec AI Institute in Montreal, Canada, has spearheaded a groundbreaking initiative – the iSMART Lab. This laboratory represents a revolutionary leap into the future of healthcare, with its pioneering research in AI for health applications garnering significant attention. Renowned for its innovative integration of AI across diverse domains, iSMART Lab stands at the forefront of harnessing Artificial Intelligence to elevate and streamline health services.

2024-04-09

Open Access Government (publié)

doi.org

Interpretable Machine Learning for Finding Intermediate-mass Black Holes

Mario Pasquato

Piero Trevisan

Abbas Askar

Pablo Lemos

Gaia Carenini

Michela Mapelli

Yashar Hezaveh

2024-04-09