Publications

Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset

Alexandre Galashov

Michalis K. Titsias

Andr'as Gyorgy

Clare Lyle

Razvan Pascanu

Yee Whye Teh

Maneesh Sahani

Neural networks are traditionally trained under the assumption that data come from a stationary distribution. However, settings which violat… (voir plus)e this assumption are becoming more popular; examples include supervised learning under distributional shifts, reinforcement learning, continual learning and non-stationary contextual bandits. In this work we introduce a novel learning approach that automatically models and adapts to non-stationarity, via an Ornstein-Uhlenbeck process with an adaptive drift parameter. The adaptive drift tends to draw the parameters towards the initialisation distribution, so the approach can be understood as a form of soft parameter reset. We show empirically that our approach performs well in non-stationary supervised and off-policy reinforcement learning settings.

2024-11-06

ArXiv (prépublication)

SCIseg: Automatic Segmentation of Intramedullary Lesions in Spinal Cord Injury on T2-weighted MRI Scans.

Enamundram Naga Karthik

Jan Valosek

Andrew C. Smith

Dario Pfyffer

Simon Schading-Sassenhausen

Lynn Farner

KA Weber

Patrick Freund

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This ar… (voir plus)ticle will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To develop a deep learning tool for the automatic segmentation of the spinal cord and intramedullary lesions in spinal cord injury (SCI) on T2-weighted MRI scans. Materials and Methods This retrospective study included MRI data acquired between July 2002 and February 2023 from 191 patients with SCI (mean age, 48.1 years ± 17.9 [SD]; 142 males). The data consisted of T2-weighted MRI acquired using different scanner manufacturers with various image resolutions (isotropic and anisotropic) and orientations (axial and sagittal). Patients had different lesion etiologies (traumatic, ischemic, and hemorrhagic) and lesion locations across the cervical, thoracic and lumbar spine. A deep learning model, SCIseg, was trained in a three-phase process involving active learning for the automatic segmentation of intramedullary SCI lesions and the spinal cord. The segmentations from the proposed model were visually and quantitatively compared with those from three other open-source methods (PropSeg, DeepSeg and contrast-agnostic, all part of the Spinal Cord Toolbox). Wilcoxon signed-rank test was used to compare quantitative MRI biomarkers of SCI (lesion volume, lesion length, and maximal axial damage ratio) derived from the manual reference standard lesion masks and biomarkers obtained automatically with SCIseg segmentations. Results SCIseg achieved a Dice score of 0.92 ± 0.07 (mean ± SD) and 0.61 ± 0.27 for spinal cord and SCI lesion segmentation, respectively. There was no evidence of a difference between lesion length (P = .42) and maximal axial damage ratio (P = .16) computed from manually annotated lesions and the lesion segmentations obtained using SCIseg. Conclusion SCIseg accurately segmented intramedullary lesions on a diverse dataset of T2-weighted MRI scans and extracted relevant lesion biomarkers (namely, lesion volume, lesion length, and maximal axial damage ratio). SCIseg is open-source and accessible through the Spinal Cord Toolbox (v6.2 and above). Published under a CC BY 4.0 license.

2024-11-06

Radiology: Artificial Intelligence (publié)

SCIseg: Automatic Segmentation of Intramedullary Lesions in Spinal Cord Injury on T2-weighted MRI Scans.

Enamundram Naga Karthik

Jan Valosek

Andrew C. Smith

Dario Pfyffer

Simon Schading-Sassenhausen

Lynn Farner

KA Weber

Kenneth A. Weber

Patrick Freund

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This ar… (voir plus)ticle will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To develop a deep learning tool for the automatic segmentation of the spinal cord and intramedullary lesions in spinal cord injury (SCI) on T2-weighted MRI scans. Materials and Methods This retrospective study included MRI data acquired between July 2002 and February 2023 from 191 patients with SCI (mean age, 48.1 years ± 17.9 [SD]; 142 males). The data consisted of T2-weighted MRI acquired using different scanner manufacturers with various image resolutions (isotropic and anisotropic) and orientations (axial and sagittal). Patients had different lesion etiologies (traumatic, ischemic, and hemorrhagic) and lesion locations across the cervical, thoracic and lumbar spine. A deep learning model, SCIseg, was trained in a three-phase process involving active learning for the automatic segmentation of intramedullary SCI lesions and the spinal cord. The segmentations from the proposed model were visually and quantitatively compared with those from three other open-source methods (PropSeg, DeepSeg and contrast-agnostic, all part of the Spinal Cord Toolbox). Wilcoxon signed-rank test was used to compare quantitative MRI biomarkers of SCI (lesion volume, lesion length, and maximal axial damage ratio) derived from the manual reference standard lesion masks and biomarkers obtained automatically with SCIseg segmentations. Results SCIseg achieved a Dice score of 0.92 ± 0.07 (mean ± SD) and 0.61 ± 0.27 for spinal cord and SCI lesion segmentation, respectively. There was no evidence of a difference between lesion length (P = .42) and maximal axial damage ratio (P = .16) computed from manually annotated lesions and the lesion segmentations obtained using SCIseg. Conclusion SCIseg accurately segmented intramedullary lesions on a diverse dataset of T2-weighted MRI scans and extracted relevant lesion biomarkers (namely, lesion volume, lesion length, and maximal axial damage ratio). SCIseg is open-source and accessible through the Spinal Cord Toolbox (v6.2 and above). Published under a CC BY 4.0 license.

2024-11-06

Radiology: Artificial Intelligence (publié)

Spinal cord evaluation in multiple sclerosis: clinical and radiological associations, present and future

B Mark Keegan

Martina Absinta

Eoin P Flanagan

Roland G Henry

Eric C Klawiter

Shannon Kolind

Stephen Krieger

Cornelia Laule

John A Lincoln

Steven Messina

Jiwon Oh

Nico Papinutto

Seth Aaron Smith

Anthony Traboulsee

2024-11-06

Brain Communications (publié)

Spinal cord evaluation in multiple sclerosis: clinical and radiological associations, present and future

B Mark Keegan

Martina Absinta

Eoin P Flanagan

Roland G Henry

Eric C Klawiter

Shannon Kolind

Stephen Krieger

Cornelia Laule

John A Lincoln

Steven Messina

Jiwon Oh

Nico Papinutto

Seth Aaron Smith

Anthony Traboulsee

Abstract Spinal cord disease is important in most people with multiple sclerosis, but assessment remains less emphasized in patient care, ba… (voir plus)sic and clinical research and therapeutic trials. The North American Imaging in Multiple Sclerosis Spinal Cord Interest Group was formed to determine and present the contemporary landscape of multiple sclerosis spinal cord evaluation, further existing and advanced spinal cord imaging techniques, and foster collaborative work. Important themes arose: (i) multiple sclerosis spinal cord lesions (differential diagnosis, association with clinical course); (ii) spinal cord radiological–pathological associations; (iii) ‘critical’ spinal cord lesions; (iv) multiple sclerosis topographical model; (v) spinal cord atrophy; and (vi) automated and special imaging techniques. Distinguishing multiple sclerosis from other myelopathic aetiology is increasingly refined by imaging and serological studies. Post-mortem spinal cord findings and MRI pathological correlative studies demonstrate MRI’s high sensitivity in detecting microstructural demyelination and axonal loss. Spinal leptomeninges include immune inflammatory infiltrates, some in B-cell lymphoid-like structures. ‘Critical’ demyelinating lesions along spinal cord corticospinal tracts are anatomically consistent with and may be disproportionately associated with motor progression. Multiple sclerosis topographical model implicates the spinal cord as an area where threshold impairment associates with multiple sclerosis disability. Progressive spinal cord atrophy and ‘silent’ multiple sclerosis progression may be emerging as an important multiple sclerosis prognostic biomarker. Manual atrophy assessment is complicated by rater bias, while automation (e.g. Spinal Cord Toolbox), and artificial intelligence may reduce this. Collaborative research by the North American Imaging in Multiple Sclerosis and similar groups with experts combining distinct strengths is key to advancing assessment and treatment of people with multiple sclerosis spinal cord disease.

2024-11-06

Brain Communications (publié)

Towards Optimizing SQL Generation via LLM Routing

Mohammadhossein Malekpour

Nour Shaheen

Foutse Khomh

Amine Mhedhbi

Text-to-SQL enables users to interact with databases through natural language, simplifying access to structured data. Although highly capabl… (voir plus)e large language models (LLMs) achieve strong accuracy for complex queries, they incur unnecessary latency and dollar cost for simpler ones. In this paper, we introduce the first LLM routing approach for Text-to-SQL, which dynamically selects the most cost-effective LLM capable of generating accurate SQL for each query. We present two routing strategies (score- and classification-based) that achieve accuracy comparable to the most capable LLM while reducing costs. We design the routers for ease of training and efficient inference. In our experiments, we highlight a practical and explainable accuracy-cost trade-off on the BIRD dataset.

2024-11-06

ArXiv (prépublication)

Towards Optimizing SQL Generation via LLM Routing

Mohammadhossein Malekpour

Nour Shaheen

Foutse Khomh

Amine Mhedhbi

Text-to-SQL enables users to interact with databases through natural language, simplifying access to structured data. Although highly capabl… (voir plus)e large language models (LLMs) achieve strong accuracy for complex queries, they incur unnecessary latency and dollar cost for simpler ones. In this paper, we introduce the first LLM routing approach for Text-to-SQL, which dynamically selects the most cost-effective LLM capable of generating accurate SQL for each query. We present two routing strategies (score- and classification-based) that achieve accuracy comparable to the most capable LLM while reducing costs. We design the routers for ease of training and efficient inference. In our experiments, we highlight a practical and explainable accuracy-cost trade-off on the BIRD dataset.

2024-11-06

ArXiv (prépublication)

On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models

Tariq Berrada

Pietro Astolfi

Melissa Hall

Reyhane Askari Hemmat

Yohann Benchetrit

Marton Havasi

Matthew J. Muckley

Karteek Alahari

Adriana Romero Soriano

Jakob Verbeek

Michal Drozdzal

Large-scale training of latent diffusion models (LDMs) has enabled unprecedented quality in image generation. However, the key components of… (voir plus) the best performing LDM training recipes are oftentimes not available to the research community, preventing apple-to-apple comparisons and hindering the validation of progress in the field. In this work, we perform an in-depth study of LDM training recipes focusing on the performance of models and their training efficiency. To ensure apple-to-apple comparisons, we re-implement five previously published models with their corresponding recipes. Through our study, we explore the effects of (i)~the mechanisms used to condition the generative model on semantic information (e.g., text prompt) and control metadata (e.g., crop size, random flip flag, etc.) on the model performance, and (ii)~the transfer of the representations learned on smaller and lower-resolution datasets to larger ones on the training efficiency and model performance. We then propose a novel conditioning mechanism that disentangles semantic and control metadata conditionings and sets a new state-of-the-art in class-conditional generation on the ImageNet-1k dataset -- with FID improvements of 7% on 256 and 8% on 512 resolutions -- as well as text-to-image generation on the CC12M dataset -- with FID improvements of 8% on 256 and 23% on 512 resolution.

2024-11-05

ArXiv (prépublication)

Temporal Residual Jacobians For Rig-free Motion Transfer

Sanjeev Muralikrishnan

Niladri Dutt

Siddhartha Chaudhuri

Noam Aigerman

Vladimir Kim

Matthew Fisher

Niloy J. Mitra

We introduce Temporal Residual Jacobians as a novel representation to enable data-driven motion transfer. Our approach does not assume acces… (voir plus)s to any rigging or intermediate shape keyframes, produces geometrically and temporally consistent motions, and can be used to transfer long motion sequences. Central to our approach are two coupled neural networks that individually predict local geometric and temporal changes that are subsequently integrated, spatially and temporally, to produce the final animated meshes. The two networks are jointly trained, complement each other in producing spatial and temporal signals, and are supervised directly with 3D positional information. During inference, in the absence of keyframes, our method essentially solves a motion extrapolation problem. We test our setup on diverse meshes (synthetic and scanned shapes) to demonstrate its superiority in generating realistic and natural-looking animations on unseen body shapes against SoTA alternatives. Supplemental video and code are available at https://temporaljacobians.github.io/ .

2024-11-05

Lecture Notes in Computer Science (publié)

Efficient Assignment with Time Constraints for Heterogeneous DSP Systems

Jiajie Li

Christophe Dubach

Warren Gross

High-level synthesis (HLS) produces hardware au-tomatically by scheduling and assigning resources based on an input control/data-flow graph.… (voir plus) One particular aspect of HLS for the digital signal processing (DSP) architecture is the het-erogeneous assignment problem (HAP) which maps operations into different types of functional units available in the electronic design automation tools to build efficient implementations. An optimal solution to this assignment problem can be found by formulating the problem as integer linear programming (ILP) and using a solver. However, given the slow nature of this process, heuristics tend to be used instead leading to sub-optimal designs. This paper revisits the classical ILP formulation of the HAP with time constraints for the DSP architecture by identifying redundant constraints. This paper proves theoretically, and demonstrates experimentally, that removing these constraints does not affect the obtained solution. This technique achieves speedups of more than 100 × in terms of runtime and reductions of more than 50 × in terms of memory usage of the solver. Also, this work proposes an updated heuristic that keeps reducing the latency of a path instead of finding a new critical path after giving a new node assignment. Runtime reductions (more than another 10×) due to reduced numbers of critical path searches are observed while returning similar results.

2024-11-04

IEEE Workshop on Signal Processing Systems (publié)

Efficient Assignment with Time Constraints for Heterogeneous DSP Systems

Jiajie Li

Christophe Dubach

Warren Gross

High-level synthesis (HLS) produces hardware au-tomatically by scheduling and assigning resources based on an input control/data-flow graph.… (voir plus) One particular aspect of HLS for the digital signal processing (DSP) architecture is the het-erogeneous assignment problem (HAP) which maps operations into different types of functional units available in the electronic design automation tools to build efficient implementations. An optimal solution to this assignment problem can be found by formulating the problem as integer linear programming (ILP) and using a solver. However, given the slow nature of this process, heuristics tend to be used instead leading to sub-optimal designs. This paper revisits the classical ILP formulation of the HAP with time constraints for the DSP architecture by identifying redundant constraints. This paper proves theoretically, and demonstrates experimentally, that removing these constraints does not affect the obtained solution. This technique achieves speedups of more than 100 × in terms of runtime and reductions of more than 50 × in terms of memory usage of the solver. Also, this work proposes an updated heuristic that keeps reducing the latency of a path instead of finding a new critical path after giving a new node assignment. Runtime reductions (more than another 10×) due to reduced numbers of critical path searches are observed while returning similar results.

2024-11-04

IEEE Workshop on Signal Processing Systems (publié)