Deep learning for high-resolution dose prediction in high dose rate brachytherapy for breast cancer treatment.
Sébastien Quetin
Boris Bahoric
Farhad Maleki
Objective: Monte Carlo (MC) simulations are the benchmark for accurate radiotherapy dose calculations, notably in patient-specific high dose rate brachytherapy (HDR BT), in cases where considering tissue heterogeneities is critical. However, the lengthy computational time limits the practical application of MC simulations. Prior research used Deep Learning (DL) for dose prediction as an alternative to MC simulations. While accurate dose predictions akin to MC were attained, GPU limitations constrained these predictions to large voxels of 3 mm × 3 mm × 3 mm. This study aimed to enable dose predictions as accurate as MC simulations in 1 mm × 1 mm × 1 mm voxels within a clinically acceptable timeframe. Approach: Computed tomography scans of 98 breast cancer patients treated with Iridium-192-based HDR BT were used: 70 for training, 14 for validation, and 14 for testing. A new cropping strategy based on the distance to the seed was devised to reduce the volume size, enabling efficient training of 3D DL models using 1 mm × 1 mm × 1 mm dose grids. Additionally, novel DL architectures with layer-level fusion were proposed to predict MC simulated dose to medium-in-medium (Dm,m). These architectures fuse information from TG-43 dose to water-in-water (Dw,w) with patient tissue composition at the layer level. Different inputs describing patient body composition were investigated. Main results: The proposed approach demonstrated state-of-the-art performance, on par with the MC Dm,m maps, but 300 times faster. The mean absolute percent error for dosimetric indices between the MC and DL-predicted complete treatment plans was 0.17%±0.15% for the planning target volume V100, 0.30%±0.32% for the skin D2cc, 0.82%±0.79% for the lung D2cc, 0.34%±0.29% for the chest wall D2cc and 1.08%±0.98% for the heart D2cc.
Significance: Unlike the time-consuming MC simulations, the proposed novel strategy efficiently converts TG-43 Dw,w maps into precise Dm,m maps at high resolution, enabling clinical integration.
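The mean absolute percent error reported in the abstract above can be illustrated with a minimal sketch. The function name, the sample values, and the use of the population standard deviation for the "±" term are assumptions made for illustration; the abstract does not specify how the spread was computed.

```python
import numpy as np

def mean_absolute_percent_error(mc_values, dl_values):
    """Return (mean, std) of |DL - MC| / |MC| * 100 over a set of plans.

    Hypothetical helper: mc_values holds a dosimetric index (e.g. D2cc)
    from Monte Carlo, dl_values the same index from the DL prediction.
    """
    mc = np.asarray(mc_values, dtype=float)
    dl = np.asarray(dl_values, dtype=float)
    if mc.shape != dl.shape:
        raise ValueError("mc_values and dl_values must have the same shape")
    errors = np.abs(dl - mc) / np.abs(mc) * 100.0
    # Population standard deviation (ddof=0) is an assumption.
    return errors.mean(), errors.std()

# Made-up index values for three plans, for illustration only.
mc_d2cc = [10.0, 20.0, 40.0]
dl_d2cc = [10.1, 19.8, 40.0]
mape, spread = mean_absolute_percent_error(mc_d2cc, dl_d2cc)
print(f"MAPE = {mape:.2f}% ± {spread:.2f}%")
```

The per-plan errors are computed relative to the MC value, which treats MC as the reference, in line with its role as the benchmark in the abstract.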
From the Lab to the Theater: An Unconventional Field Robotics Journey
Ali Imran
Vivek Shankar Vardharajan
Rafael Gomes Braga
Yann Bouteiller
Abdalwhab Abdalwhab
Matthis Di-Giacomo
Alexandra Mercader
David St-Onge
Scalable Hierarchical Self-Attention with Learnable Hierarchy for Long-Range Interactions
Thuan Nguyen Anh Trang
Khang Nhat Ngo
Hugo Sonnery
Thieu Vo
Truong Son Hy
Self-attention models have made great strides toward accurately modeling a wide array of data modalities, including, more recently, graph-structured data. This paper demonstrates that adaptive hierarchical attention can go a long way toward successfully applying transformers to graphs. Our proposed model Sequoia provides a powerful inductive bias towards long-range interaction modeling, leading to better generalization. We propose an end-to-end mechanism for a data-dependent construction of a hierarchy which in turn guides the self-attention mechanism. Using adaptive hierarchy provides a natural pathway toward sparse attention by constraining node-to-node interactions with the immediate family of each node in the hierarchy (e.g., parent, children, and siblings). This in turn dramatically reduces the computational complexity of a self-attention layer from quadratic to log-linear in terms of the input size while maintaining or sometimes even surpassing the standard transformer's ability to model long-range dependencies across the entire input. Experimentally, we report state-of-the-art performance on long-range graph benchmarks while remaining computationally efficient. Moving beyond graphs, we also display competitive performance on long-range sequence modeling, point-clouds classification, and segmentation when using a fixed hierarchy. Our source code is publicly available at https://github.com/HySonLab/HierAttention
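The idea of restricting attention to each node's immediate family can be sketched as a boolean mask over a fixed tree. This is a hedged illustration, not the Sequoia implementation: the function `family_attention_mask`, the parent-array encoding, and the dense n × n mask are all assumptions, and an efficient implementation would avoid materialising the full matrix.

```python
import numpy as np

def family_attention_mask(parent):
    """Build a mask where node i may attend to itself, its parent,
    its children, and its siblings.

    parent[i] is the index of node i's parent, or -1 for the root
    (hypothetical encoding chosen for this sketch).
    """
    n = len(parent)
    mask = np.eye(n, dtype=bool)  # every node attends to itself

    # Group nodes by parent so siblings are found without an n^2 scan.
    children = {}
    for node, p in enumerate(parent):
        if p >= 0:
            children.setdefault(p, []).append(node)

    for p, kids in children.items():
        for i in kids:
            mask[i, p] = True  # child -> parent
            mask[p, i] = True  # parent -> child
            for j in kids:
                mask[i, j] = True  # sibling <-> sibling (and self)
    return mask

# A complete binary tree with 7 nodes: 0 is the root, 3..6 are leaves.
tree_parent = [-1, 0, 0, 1, 1, 2, 2]
tree_mask = family_attention_mask(tree_parent)
print(tree_mask.sum(), "allowed pairs out of", tree_mask.size)
```

Each node has a bounded number of allowed partners in a bounded-degree tree, so the number of allowed pairs grows far more slowly than the n² pairs of dense attention.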
Temporal trends in disparities in COVID-19 seropositivity among Canadian blood donors
Yuan Yu
Matthew J Knight
Diana Gibson
Sheila F O’Brien
W Alton Russell
Background: In Canada’s largest COVID-19 serological study, SARS-CoV-2 antibodies in blood donors have been monitored since 2020. No study has analysed changes in the association between anti-N seropositivity (a marker of recent infection) and geographic and sociodemographic characteristics over the pandemic. Methods: Using Bayesian multi-level models with spatial effects at the census division level, we analysed changes in correlates of SARS-CoV-2 anti-N seropositivity across three periods in which different variants predominated (pre-Delta, Delta and Omicron). We analysed disparities by geographic area, individual traits (age, sex, race) and neighbourhood factors (urbanicity, material deprivation and social deprivation). Data were from 420 319 blood donations across four regions (Ontario, British Columbia [BC], the Prairies and the Atlantic region) from December 2020 to November 2022. Results: Seropositivity was higher for racialized minorities, males and individuals in more materially deprived neighbourhoods in the pre-Delta and Delta waves. These subgroup differences dissipated in the Omicron wave as large swaths of the population became infected. Across all waves, seropositivity was higher in younger individuals and those with lower neighbourhood social deprivation. Rural residents had high seropositivity in the Prairies, but not other regions. Compared to generalized linear models, multi-level models with spatial effects had better fit and lower error when predicting SARS-CoV-2 anti-N seropositivity by geographic region. Conclusions: Correlates of recent COVID-19 infection have evolved over the pandemic. Many disparities lessened during the Omicron wave, but public health intervention may be warranted to address persistently higher burden among young people and those with less social deprivation.
Association between arterial oxygen and mortality across critically ill patients with hematologic malignancies: results from an international collaborative network
Idunn S. Morris
Tamishta Hensman
Sean M. Bagshaw
Alexandre Demoule
Bruno Ferreyro
Achille Kouatchet
Virginie Lemiale
Djamel Mokart
Frédéric Pène
Sangeeta Mehta
Elie Azoulay
Laveena Munshi
Laurent Argaud
François Barbier
Dominique Benoit
Naike Bigé
Fabrice Bruneel
Emmanuel Canet
Yves Cohen
Michaël Darmon
Didier Gruson
Kada Klouche
Loay Kontar
Alexandre Lautrette
Christine Lebert
Guillaume Louis
Julien Mayaux
Anne-Pascale Meert
Anne-Sophie Moreau
Martine Nyunga
Vincent Peigne
Pierre Perez
Jean Herlé Raphalen
Carole Schwebel
Jean-Marie Tonnelier
Florent Wallet
Lara Zafrani
Bram Rochwerg
Farah Shoukat
Dean Fergusson
Paul Heffernan
Margaret Herridge
Sheldon Magder
Mark Minden
Rakesh Patel
Salman Qureshi
Aaron Schimmer
Santhosh Thyagu
Han Ting Wang
Deep Generative Sampling in the Dual Divergence Space: A Data-efficient & Interpretative Approach for Generative AI
Sahil Garg
Anderson Schneider
Anant Raj
Kashif Rasul
Yuriy Nevmyvaka
S. Gopal
Amit Dhurandhar
Guillermo A. Cecchi
Building on the remarkable achievements in generative sampling of natural images, we propose an innovative challenge, potentially overly ambitious, which involves generating samples of entire multivariate time series that resemble images. However, the statistical challenge lies in the small sample size, sometimes consisting of a few hundred subjects. This issue is especially problematic for deep generative models that follow the conventional approach of generating samples from a canonical distribution and then decoding or denoising them to match the true data distribution. In contrast, our method is grounded in information theory and aims to implicitly characterize the distribution of images, particularly the (global and local) dependency structure between pixels. We achieve this by empirically estimating its KL-divergence in the dual form with respect to the respective marginal distribution. This enables us to perform generative sampling directly in the optimized 1-D dual divergence space. Specifically, in the dual space, training samples representing the data distribution are embedded in the form of various clusters between two end points. In theory, any sample embedded between those two end points is in-distribution w.r.t. the data distribution. Our key idea for generating novel samples of images is to interpolate between the clusters via a walk as per gradients of the dual function w.r.t. the data dimensions. In addition to the data efficiency gained from direct sampling, we propose an algorithm that offers a significant reduction in sample complexity for estimating the divergence of the data distribution with respect to the marginal distribution. We provide strong theoretical guarantees along with an extensive empirical evaluation using many real-world datasets from diverse domains, establishing the superiority of our approach w.r.t. state-of-the-art deep learning methods.
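For readers unfamiliar with the "dual form" of the KL-divergence mentioned above, one standard variational representation is the Donsker-Varadhan form, shown below. The abstract does not say which dual form the authors use, so this is offered only as a well-known example; here P is assumed to denote the joint data distribution and Q the product of its marginals, and f ranges over functions for which both expectations are finite.

```latex
D_{\mathrm{KL}}(P \,\|\, Q)
  \;=\; \sup_{f} \Big\{ \mathbb{E}_{x \sim P}\big[f(x)\big]
  \;-\; \log \mathbb{E}_{x \sim Q}\big[e^{f(x)}\big] \Big\}
```

Under this reading, a learned f maps each sample to a scalar, which is consistent with the abstract's description of a 1-D dual divergence space.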
AI healthcare research: Pioneering iSMART Lab
Dr Narges Armanfard, Professor, talks us through the AI healthcare research at McGill University which is spearheading a groundbreaking initiative – the iSMART Lab. Access to high-quality healthcare is not just a fundamental human right; it is the bedrock of our societal wellbeing, with the crucial roles played by doctors, nurses, and hospitals. Yet, healthcare systems globally face mounting challenges, particularly from aging populations. Dr Narges Armanfard, affiliated with McGill University and Mila Quebec AI Institute in Montreal, Canada, has spearheaded a groundbreaking initiative – the iSMART Lab. This laboratory represents a revolutionary leap into the future of healthcare, with its pioneering research in AI for health applications garnering significant attention. Renowned for its innovative integration of AI across diverse domains, iSMART Lab stands at the forefront of harnessing Artificial Intelligence to elevate and streamline health services.
Interpretable Machine Learning for Finding Intermediate-mass Black Holes
Mario Pasquato
Piero Trevisan
Abbas Askar
Pablo Lemos
Gaia Carenini
Michela Mapelli
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Parishad BehnamGhader
Vaibhav Adlakha
Marius Mosbach
Large decoder-only language models (LLMs) are the state-of-the-art models on most of today's NLP tasks and benchmarks. Yet, the community is only slowly adopting these models for text embedding tasks, which require rich contextualized representations. In this work, we introduce LLM2Vec, a simple unsupervised approach that can transform any decoder-only LLM into a strong text encoder. LLM2Vec consists of three simple steps: 1) enabling bidirectional attention, 2) masked next token prediction, and 3) unsupervised contrastive learning. We demonstrate the effectiveness of LLM2Vec by applying it to 4 popular LLMs ranging from 1.3B to 8B parameters and evaluate the transformed models on English word- and sequence-level tasks. We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embeddings Benchmark (MTEB). Moreover, when combining LLM2Vec with supervised contrastive learning, we achieve state-of-the-art performance on MTEB among models that train only on publicly available data (as of May 24, 2024). Our strong empirical results and extensive analysis demonstrate that LLMs can be effectively transformed into universal text encoders in a parameter-efficient manner without the need for expensive adaptation or synthetic GPT-4 generated data.
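The first of the three steps, enabling bidirectional attention, amounts to replacing the causal attention mask of a decoder-only model with an all-ones mask. The sketch below only contrasts the two masks; `attention_mask` is a hypothetical helper and not part of the LLM2Vec code base.

```python
import numpy as np

def attention_mask(seq_len, bidirectional):
    """Return a boolean mask where entry [i, j] is True if
    position i is allowed to attend to position j."""
    full = np.ones((seq_len, seq_len), dtype=bool)
    if bidirectional:
        return full  # every token sees every other token
    return np.tril(full)  # causal: token i sees only positions <= i

causal = attention_mask(4, bidirectional=False)
bidir = attention_mask(4, bidirectional=True)
print(causal.astype(int))
print(bidir.astype(int))
```

With the causal mask the first token cannot see any later token, which is why a further adaptation step (masked next token prediction, in the abstract's terms) is needed once the mask is opened up.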
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Parishad BehnamGhader
Vaibhav Adlakha
Marius Mosbach
Large decoder-only language models (LLMs) are the state-of-the-art models on most of today's NLP tasks and benchmarks. Yet, the community is only slowly adopting these models for text embedding tasks, which require rich contextualized representations. In this work, we introduce LLM2Vec, a simple unsupervised approach that can transform any decoder-only LLM into a strong text encoder. LLM2Vec consists of three simple steps: 1) enabling bidirectional attention, 2) masked next token prediction, and 3) unsupervised contrastive learning. We demonstrate the effectiveness of LLM2Vec by applying it to 3 popular LLMs ranging from 1.3B to 7B parameters and evaluate the transformed models on English word- and sequence-level tasks. We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embeddings Benchmark (MTEB). Moreover, when combining LLM2Vec with supervised contrastive learning, we achieve state-of-the-art performance on MTEB among models that train only on publicly available data. Our strong empirical results and extensive analysis demonstrate that LLMs can be effectively transformed into universal text encoders in a parameter-efficient manner without the need for expensive adaptation or synthetic GPT-4 generated data.