Publications

Effects of gene dosage on cognitive ability: A function-based association study across brain and non-brain processes
Guillaume Huguet
Thomas Renne
Cécile Poulain
Alma Dubuc
Kuldeep Kumar
Sayeh Kazem
Worrawat Engchuan
Omar Shanta
Elise Douard
Catherine Proulx
Martineau Jean-Louis
Zohra Saci
Josephine Mollon
Laura Schultz
Emma E M Knowles
Simon R. Cox
David Porteous
Gail Davies
Paul Redmond
Sarah E. Harris … (see 10 more)
Gunter Schumann
Aurélie Labbe
Zdenka Pausova
Tomas Paus
Stephen W Scherer
Jonathan Sebat
Laura Almasy
David C. Glahn
Sébastien Jacquemont
Foundational Challenges in Assuring Alignment and Safety of Large Language Models
Usman Anwar
Abulhair Saparov
Javier Rando
Daniel Paleka
Miles Turpin
Peter Hase
Ekdeep Singh Lubana
Erik Jenner
Stephen Casper
Oliver Sourbut
Benjamin L. Edelman
Zhaowei Zhang
Mario Günther
Anton Korinek
Jose Hernandez-Orallo
Lewis Hammond
Eric J Bigelow
Alexander Pan
Lauro Langosco
Tomasz Korbak … (see 18 more)
Heidi Zhang
Ruiqi Zhong
Seán Ó hÉigeartaigh
Gabriel Recchia
Giulio Corsi
Alan Chan
Markus Anderljung
Lilian Edwards
Danqi Chen
Samuel Albanie
Jakob Nicolaus Foerster
Florian Tramèr
He He
Atoosa Kasirzadeh
Yejin Choi
This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose 200+ concrete research questions.
Government Interventions to Avert Future Catastrophic AI Risks
Improving microbial phylogeny with citizen science within a mass-market video game
Roman Sarrazin-Gendron
Parham Ghasemloo Gheidari
Alexander Butyaev
Timothy Keding
Eddie Cai
Jiayue Zheng
Renata Mutalova
Julien Mounthanyvong
Yuxue Zhu
Elena Nazarova
Chrisostomos Drogaris
Kornél Erhart
David Bélanger
Amélie Brouillette
Michael Bouffard
Gabriel Richard
Joshua Davidson
Randy Pitchford
Mathieu Falaise
Sébastien Caisse … (see 14 more)
Vincent Fiset
Steven Hebert
Daniel McDonald
Dan Hewitt
Rob Knight
Jonathan Huot
Attila Szantner
Seung Kim
Jérôme Waldispühl
Jonathan Moreau-Genest
David Najjab
Steve Prince
Ludger Saintélien
CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds
David Budaghyan
Charles Onu
Arsenii Gorin
This paper describes the Ubenwa CryCeleb dataset - a labeled collection of infant cries - and the accompanying CryCeleb 2023 task, which is a public speaker verification challenge based on cry sounds. We released more than 6 hours of manually segmented cry sounds from 786 newborns for academic use, aiming to encourage research in infant cry analysis. The inaugural public competition attracted 59 participants, 11 of whom improved the baseline performance. The top-performing system achieved a significant improvement, scoring a 25.8% equal error rate, which is still far from the performance of state-of-the-art adult speaker verification systems. Therefore, we believe there is room for further research on this dataset, potentially extending beyond the verification task.
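For readers unfamiliar with the metric, the equal error rate (EER) reported above is the operating point at which the false accept and false reject rates coincide. The following is a minimal sketch of that computation on synthetic similarity scores; the score distributions and the simple threshold sweep are illustrative assumptions, not the challenge's official scoring code.

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """Return the EER: the point where false accept rate == false reject rate."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        far = np.mean(impostor_scores >= t)   # impostor pairs wrongly accepted
        frr = np.mean(genuine_scores < t)     # genuine pairs wrongly rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# Toy scores: higher means "same infant" according to the verification system.
rng = np.random.default_rng(0)
genuine = rng.normal(0.7, 0.15, 500)    # scores for same-infant pairs
impostor = rng.normal(0.4, 0.15, 500)   # scores for different-infant pairs
print(f"EER = {equal_error_rate(genuine, impostor):.1%}")
```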
Resource-Efficient Separation Transformer
Luca Della Libera
Samuele Cornell
Frédéric Lepoutre
François Grondin
Transformers have recently achieved state-of-the-art performance in speech separation. These models, however, are computationally demanding and require a lot of learnable parameters. This paper explores Transformer-based speech separation with a reduced computational cost. Our main contribution is the development of the Resource-Efficient Separation Transformer (RE-SepFormer), a self-attention-based architecture that reduces the computational burden in two ways. First, it uses non-overlapping blocks in the latent space. Second, it operates on compact latent summaries calculated from each chunk. The RE-SepFormer reaches a competitive performance on the popular WSJ0-2Mix and WHAM! datasets in both causal and non-causal settings. Remarkably, it scales significantly better than the previous Transformer-based architectures in terms of memory and inference time, making it more suitable for processing long mixtures.
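As a purely illustrative sketch of the two cost-saving ideas named in the abstract (non-overlapping blocks in the latent space and attention over compact per-chunk summaries), the toy PyTorch module below chunks a latent sequence, mean-pools each chunk into a summary, and applies self-attention over the summaries only; the class name, pooling choice, and all sizes are assumptions, not the RE-SepFormer implementation.

```python
import torch
import torch.nn as nn

class ChunkSummaryAttention(nn.Module):
    """Toy module: non-overlapping chunking + attention over per-chunk summaries."""

    def __init__(self, dim=64, chunk_size=50, num_heads=4):
        super().__init__()
        self.chunk_size = chunk_size
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, latents):                        # latents: (batch, time, dim)
        b, t, d = latents.shape
        pad = (-t) % self.chunk_size                   # pad so time divides evenly
        latents = nn.functional.pad(latents, (0, 0, 0, pad))
        chunks = latents.view(b, -1, self.chunk_size, d)        # non-overlapping blocks
        summaries = chunks.mean(dim=2)                 # one compact summary per chunk
        mixed, _ = self.attn(summaries, summaries, summaries)   # attention cost grows with
        return mixed                                   # the number of chunks, not frames

x = torch.randn(2, 480, 64)                            # e.g. 2 mixtures, 480 latent frames
print(ChunkSummaryAttention()(x).shape)                # torch.Size([2, 10, 64])
```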
SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning
Luca Zampierin
Ghouthi Boukli Hacene
Bac Nguyen
Towards Practical Tool Usage for Continually Learning LLMs
Jerry Huang
Prasanna Parthasarathi
Mehdi Rezagholizadeh
Large language models (LLMs) show an innate skill for solving language-based tasks. But insights have suggested an inability to adjust for information or task-solving skills becoming outdated, as their knowledge, stored directly within their parameters, remains static in time. Tool use helps by offloading work to systems that the LLM can access through an interface, but LLMs that use them still must adapt to nonstationary environments for prolonged use, as new tools can emerge and existing tools can change. Nevertheless, tools require less specialized knowledge, so we hypothesize that they are better suited for continual learning (CL), as they rely less on parametric memory for solving tasks and instead focus on learning when to apply pre-defined tools. To verify this, we develop a synthetic benchmark and follow this by aggregating existing NLP tasks to form a more realistic testing scenario. While we demonstrate that scaling model size is not a solution regardless of tool usage, continual learning techniques can enable tool LLMs both to adapt faster and to forget less, highlighting their potential as continual learners.
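To make "learning when to apply pre-defined tools" concrete, here is a minimal, hypothetical sketch in which task knowledge lives in swappable tools rather than in model parameters, so only the routing decision has to stay current; the registry, the routing heuristic, and both tool functions are illustrative assumptions, not the paper's benchmark.

```python
from typing import Callable, Dict

# Hypothetical tool registry: tools can be added, removed, or updated over time
# without retraining, which is the property argued to suit continual learning.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy arithmetic only
    "date_lookup": lambda _query: "2024-05-01",                        # stand-in for a real API
}

def route(query: str) -> str:
    """Stand-in for the learned 'which tool, if any' decision an LLM would make."""
    if any(ch.isdigit() for ch in query) and any(op in query for op in "+-*/"):
        return TOOLS["calculator"](query)
    if "date" in query.lower():
        return TOOLS["date_lookup"](query)
    return "no tool needed"

print(route("12*(3+4)"))          # -> 84
print(route("what is the date"))  # -> 2024-05-01
```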
Why People Contribute Software Documentation
Deeksha M. Arya
Martin P. Robillard
Towards Causal Deep Learning for Vulnerability Detection
Md Mahbubur Rahman
Ira Ceka
Chengzhi Mao
Saikat Chakraborty
Baishakhi Ray
Wei Le
Deep learning vulnerability detection has shown promising results in recent years. However, an important challenge that still blocks it from being very useful in practice is that the model is not robust under perturbation and it cannot generalize well over out-of-distribution (OOD) data, e.g., applying a trained model to unseen projects in the real world. We hypothesize that this is because the model learned non-robust features, e.g., variable names, that have spurious correlations with labels. When the perturbed and OOD datasets no longer have the same spurious features, the model prediction fails. To address this challenge, in this paper, we introduce causality into deep learning vulnerability detection. Our approach, CausalVul, consists of two phases. First, we designed novel perturbations to discover spurious features that the model may use to make predictions. Second, we applied causal learning algorithms, specifically do-calculus, on top of existing deep learning models to systematically remove the use of spurious features and thus promote causal-based prediction. Our results show that CausalVul consistently improved the model accuracy, robustness and OOD performance for all the state-of-the-art models and datasets we experimented with. To the best of our knowledge, this is the first work that introduces do-calculus-based causal learning to software engineering models and shows it is indeed useful for improving model accuracy, robustness and generalization. Our replication package is located at https://figshare.com/s/0ffda320dcb96c249ef2.
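As an illustration of the kind of semantics-preserving perturbation used in the first phase (checking whether a detector leans on spurious features such as variable names), the sketch below renames identifiers in a code snippet; the regex-based rename and the example snippet are illustrative assumptions, not the CausalVul perturbation code.

```python
import re

def rename_variables(code: str, mapping: dict) -> str:
    """Swap identifier names without changing program semantics.

    If a vulnerability detector's prediction flips under such a rename, it is
    relying on the names (a spurious feature) rather than the vulnerable pattern.
    """
    for old, new in mapping.items():
        code = re.sub(rf"\b{re.escape(old)}\b", new, code)
    return code

snippet = "char buf[8];\nstrcpy(buf, user_input);  /* potential overflow */"
print(rename_variables(snippet, {"buf": "v0", "user_input": "v1"}))
```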
Deep learning for high-resolution dose prediction in high dose rate brachytherapy for breast cancer treatment.
Sébastien Quetin
Boris Bahoric
Farhad Maleki
Objective: Monte Carlo (MC) simulations are the benchmark for accurate radiotherapy dose calculations, notably in patient-specific high dose rate brachytherapy (HDR BT), in cases where considering tissue heterogeneities is critical. However, the lengthy computational time limits the practical application of MC simulations. Prior research used deep learning (DL) for dose prediction as an alternative to MC simulations. While accurate dose predictions akin to MC were attained, GPU limitations constrained these predictions to large voxels of 3 mm × 3 mm × 3 mm. This study aimed to enable dose predictions as accurate as MC simulations in 1 mm × 1 mm × 1 mm voxels within a clinically acceptable timeframe. Approach: Computed tomography scans of 98 breast cancer patients treated with Iridium-192-based HDR BT were used: 70 for training, 14 for validation, and 14 for testing. A new cropping strategy based on the distance to the seed was devised to reduce the volume size, enabling efficient training of 3D DL models using 1 mm × 1 mm × 1 mm dose grids. Additionally, novel DL architectures with layer-level fusion were proposed to predict MC-simulated dose to medium-in-medium (Dm,m). These architectures fuse information from the TG-43 dose to water-in-water (Dw,w) with patient tissue composition at the layer level. Different inputs describing patient body composition were investigated. Main results: The proposed approach demonstrated state-of-the-art performance, on par with the MC Dm,m maps, but 300 times faster. The mean absolute percent error for dosimetric indices between the MC and DL-predicted complete treatment plans was 0.17%±0.15% for the planning target volume V100, 0.30%±0.32% for the skin D2cc, 0.82%±0.79% for the lung D2cc, 0.34%±0.29% for the chest wall D2cc and 1.08%±0.98% for the heart D2cc. Significance: Unlike the time-consuming MC simulations, the proposed novel strategy efficiently converts TG-43 Dw,w maps into precise Dm,m maps at high resolution, enabling clinical integration.
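As a minimal illustration of the distance-to-seed cropping idea (keeping 1 mm voxels tractable by extracting a sub-volume centred on the source position), here is a toy sketch; the array sizes, seed coordinates, and clamp-at-the-border handling are assumptions, not the paper's preprocessing pipeline.

```python
import numpy as np

def crop_around_seed(volume, seed_ijk, half_size):
    """Extract a cube of up to (2*half_size)^3 voxels centred on the seed position."""
    lo = np.maximum(np.array(seed_ijk) - half_size, 0)
    hi = np.minimum(np.array(seed_ijk) + half_size, volume.shape)
    return volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]

ct = np.random.rand(400, 400, 300)            # toy CT volume at 1 mm resolution
patch = crop_around_seed(ct, (200, 180, 150), 64)
print(patch.shape)                            # (128, 128, 128)
```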
From the Lab to the Theater: An Unconventional Field Robotics Journey
Ali Imran
Vivek Shankar Vardharajan
Rafael Gomes Braga
Yann Bouteiller
Abdalwhab Abdalwhab
Matthis Di-Giacomo
Alexandra Mercader
David St-Onge