Towards contrast-agnostic soft segmentation of the spinal cord
Sandrine Bédard
Enamundram Naga Karthik
Charidimos Tsagkas
Emanuele Pravatà
Cristina Granziera
Andrew C. Smith
Kenneth Arnold Weber
Spinal cord segmentation is clinically relevant and is notably used to compute spinal cord cross-sectional area (CSA) for the diagnosis and monitoring of cord compression or neurodegenerative diseases such as multiple sclerosis. While several semi-automatic and automatic methods exist, one key limitation remains: the segmentation depends on the MRI contrast, resulting in different CSA values across contrasts. This is partly due to the varying appearance of the boundary between the spinal cord and the cerebrospinal fluid, which depends on the sequence and acquisition parameters. This contrast-sensitive CSA adds variability in multi-center studies where protocols can vary, reducing the sensitivity to detect subtle atrophy. Moreover, existing methods compound the CSA variability by training one model per contrast, while also producing binary masks that do not account for partial volume effects. In this work, we present a deep learning-based method that produces soft segmentations of the spinal cord. Using the Spine Generic Public Database of healthy participants (…)
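The core computation here, CSA from a soft mask, can be illustrated with a short sketch. This is a minimal illustration, not the paper's implementation: the function name and example values are hypothetical, and a soft segmentation is assumed to assign each voxel a cord fraction in [0, 1] so that boundary voxels with partial volume contribute fractionally.

```python
import numpy as np

def cross_sectional_area(soft_seg_slice: np.ndarray, pixdim: tuple) -> float:
    """Spinal cord CSA (mm^2) for one axial slice of a soft segmentation.

    Summing soft values weights boundary voxels by their partial cord
    content instead of forcing an all-or-nothing binary decision.
    """
    pixel_area = pixdim[0] * pixdim[1]  # in-plane pixel area, mm^2
    return float(soft_seg_slice.sum() * pixel_area)

# Hypothetical 3x3 patch: one pure cord voxel, four partial-volume neighbors
patch = np.array([[0.0, 0.4, 0.0],
                  [0.4, 1.0, 0.4],
                  [0.0, 0.4, 0.0]])
print(cross_sectional_area(patch, (0.8, 0.8)))  # 2.6 voxels x 0.64 mm^2 ≈ 1.66
```

A binary mask of the same patch would count each boundary voxel as fully cord or fully background, which is exactly the contrast-dependent behavior the soft output is meant to avoid.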
Training Language Models to Self-Correct via Reinforcement Learning
Aviral Kumar
Vincent Zhuang
Yi Su
John D Co-Reyes
Avi Singh
Kate Baumli
Shariq Iqbal
Colton Bishop
Rebecca Roelofs
Lei M Zhang
Kay McKinney
Disha Shrivastava
Cosmin Paduraru
George Tucker
Feryal Behbahani
Aleksandra Faust
Self-correction is a highly desirable capability of large language models (LLMs), yet it has consistently been found to be largely ineffective in modern LLMs. Existing approaches for training self-correction either require multiple models or rely on a more capable model or other forms of supervision. To address these shortcomings, we develop a multi-turn online reinforcement learning (RL) approach, SCoRe, that significantly improves an LLM's self-correction ability using entirely self-generated data. To build SCoRe, we first show that variants of supervised fine-tuning (SFT) on offline model-generated correction traces are insufficient for instilling self-correction behavior. In particular, we observe that training via SFT either suffers from a distribution mismatch between the training data and the model's own responses or implicitly prefers only a certain mode of correction behavior that is often not effective at test time. SCoRe addresses these challenges by training under the model's own distribution of self-generated correction traces and using appropriate regularization to steer the learning process toward a self-correction strategy that is effective at test time, as opposed to simply fitting high-reward responses for a given prompt. This regularization prescribes running a first phase of RL on a base model to generate a policy initialization that is less susceptible to collapse, and then using a reward bonus to amplify self-correction during training. When applied to Gemini 1.0 Pro and 1.5 Flash models, SCoRe achieves state-of-the-art self-correction performance, improving the base models' self-correction by 15.6% and 9.1% on the MATH and HumanEval benchmarks, respectively.
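The reward bonus mentioned at the end of the abstract can be sketched in a few lines. This is a hedged illustration under the assumption that per-attempt correctness rewards are available; the function name, the coefficient alpha, and the exact shaping used in SCoRe are assumptions and may differ from the paper.

```python
def shaped_reward(r_first: float, r_second: float, alpha: float = 2.0) -> float:
    """Two-turn reward with a self-correction bonus (illustrative form).

    r_first and r_second are correctness rewards (e.g. 0 or 1) for the
    model's first attempt and its revised second attempt on the same
    prompt. The bonus pays extra for improving between turns, steering
    the policy toward genuine self-correction rather than toward simply
    repeating one high-reward answer twice.
    """
    return r_second + alpha * (r_second - r_first)

# A trajectory that fixes a wrong first attempt outscores one that was
# right both times: 1 + 2*(1 - 0) = 3 versus 1 + 2*(1 - 1) = 1.
```

Under this shaping, "wrong then right" is strictly more valuable than "right then right", which is one way to counteract the collapse into non-correcting behavior described above.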
Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training
Brian R. Bartoldson
James Diffenderfer
Moksh J. Jain
Tal Ben-Nun
Minsu Kim
Bhavya Kailkhura
TransCeption: Enhancing medical image segmentation with an inception-like transformer design for efficient feature fusion
Reza Azad
Yiwei Jia
Ehsan Khodapanah Aghdam
Dorit Merhof
Variation Matters: from Mitigating to Embracing Zero-Shot NAS Ranking Function Variation
Pavel Rumiantsev
Neural Architecture Search (NAS) is a powerful automatic alternative to manual neural network design. In the zero-shot version, a fast ranking function is used to compare architectures without training them. The outputs of these ranking functions often vary significantly across different sources of randomness, including the initialization of the evaluated architecture's weights and the batch of data used for the calculation. A common approach to addressing this variation is to average the ranking function output over several evaluations. We propose handling the variation differently: we view the ranking function output as a random variable representing a proxy performance metric. During the search, we construct a stochastic ordering of these performance metrics to determine the best architecture. Our experiments show that the proposed stochastic ordering can effectively boost the performance of a search on standard benchmark search spaces.
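As one concrete reading of "stochastic ordering", two architectures can be compared by testing whether one's distribution of ranking-function scores is stochastically larger than the other's. The sketch below is an assumption-laden illustration, not the paper's procedure: it uses a one-sided Mann-Whitney U test as the ordering criterion, and the function name is hypothetical.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def prefers(scores_a: np.ndarray, scores_b: np.ndarray, alpha: float = 0.05) -> bool:
    """Return True if architecture A's proxy metric stochastically dominates B's.

    scores_a and scores_b hold repeated zero-shot ranking-function
    evaluations of two architectures under different random seeds and
    data batches. Rather than comparing averaged scores, we test whether
    A's score distribution tends to sit above B's.
    """
    _, p_value = mannwhitneyu(scores_a, scores_b, alternative="greater")
    return p_value < alpha
```

Compared with simple averaging, a distribution-level comparison uses the spread of the ranking function's outputs instead of discarding it.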
Sociodemographic characteristics of SARS-CoV-2 serosurveillance studies with diverse recruitment strategies, Canada, 2020 to 2023
Matthew J Knight
Yuan Yu
Jiacheng Chen
Sheila F O’Brien
Carmen Charlton
W Alton Russell
Background. Serological testing was a key component of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) surveillance. Social distancing interventions, resource limitations, and the need for timely data led to serosurveillance studies using a range of recruitment strategies, which likely influenced study representativeness. Characterizing representativeness in surveillance is crucial to identify gaps in sampling coverage and to assess health inequities. Methods. We retrospectively analyzed three pre-existing longitudinal cohorts, two convenience samples using residual blood, and one de novo probabilistic survey conducted in Canada between April 2020 and November 2023. We calculated study specimen counts by age, sex, urbanicity, race/ethnicity, and neighborhood deprivation quintiles. We derived a 'representation ratio' as a simple metric to assess generalizability to a target population and various sociodemographic strata. Results. The six studies included 1,321,675 specimens. When stratifying by age group and sex, 65% of racialized minority subgroups were moderately underrepresented (representation ratio < 0.75). Representation was generally higher for older Canadians, urban neighborhoods, and neighborhoods with low material deprivation. Rural representation was highest in a study that used outpatient laboratory blood specimens. Racialized minority representation was highest in a de novo probabilistic survey cohort. Conclusion. While no study had adequate representation of all subgroups, less traditional recruitment strategies were more representative along some population dimensions. Understanding demographic representativeness and the barriers to recruitment is an important consideration when designing population health surveillance studies.
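The 'representation ratio' named in the Methods is, by its name, a ratio of shares; a minimal sketch under that assumption follows (the exact stratification and denominators used in the study may differ, and the numbers below are invented for illustration).

```python
def representation_ratio(n_group_study: int, n_total_study: int,
                         n_group_population: int, n_total_population: int) -> float:
    """Subgroup's share of study specimens divided by its share of the
    target population. Values near 1 indicate proportional representation;
    values below roughly 0.75 were read above as moderate underrepresentation.
    """
    study_share = n_group_study / n_total_study
    population_share = n_group_population / n_total_population
    return study_share / population_share

# Hypothetical numbers: a subgroup is 10% of the population but 6% of specimens
print(representation_ratio(6_000, 100_000, 1_000_000, 10_000_000))  # 0.6
```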
The Romantic Historicism and The Rise of the Historical Novel in the 19th Century Romanian Literature
Medium-scale flexible integrated circuits based on 2D semiconductors
Yalin Peng
Chenyang Cui
Lu Liu
Yuchen Wang
Qinqin Wang
Jinpeng Tian
Zhiheng Huang
Biying Huang
Yangkun Zhang
Xiuzhen Li
Yanbang Chu
Wei Yang
Dongxia Shi
Luojun Du
Na Li
Guangyu Zhang
Sliding ferroelectric memories and synapses based on rhombohedral-stacked bilayer MoS2
Xiuzhen Li
Biao Qin
Yaxian Wang
Yue Xi
Zhiheng Huang
Mengze Zhao
Yalin Peng
Zitao Chen
Zitian Pan
Jundong Zhu
Chenyang Cui
Rong Yang
Wei Yang
Sheng Meng
Dongxia Shi
Xuedong Bai
Can Liu
Na Li
Kaihui Liu
Kai-Wen Liu
Luojun Du
Guangyu Zhang
AfriHG: News headline generation for African Languages
Toyib Ogunremi
Serah Akojenu
Anthony Soronnadi
Olubayo Adekanmbi
This paper introduces AfriHG, a news headline generation dataset created by combining the XLSum and MasakhaNEWS datasets, focusing on 16 languages widely spoken in Africa. We experimented with two seq2seq models (mT5-base and AfriTeVa V2) and the Aya-101 LLM. Our results show that Africa-centric seq2seq models such as AfriTeVa V2 outperform the massively multilingual mT5-base model. Finally, we show that fine-tuning AfriTeVa V2, with 313M parameters, is competitive with prompting the Aya-101 LLM, which has more than 13B parameters.
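A headline-generation setup of the kind described can be sketched with the Hugging Face transformers API. This is a minimal, hedged sketch: the mT5-base checkpoint name ("google/mt5-base") is real, but the generation settings and placeholder article are assumptions, the AfriTeVa V2 checkpoint name would need to be looked up on the Hub, and the model must first be fine-tuned on (article, headline) pairs before its outputs are meaningful.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/mt5-base"  # swap in an AfriTeVa V2 checkpoint to reproduce the comparison
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "..."  # news article body in one of the 16 covered languages
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)

# After fine-tuning on (article, headline) pairs, decode a headline:
ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```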
Exploring Compound Loss Functions for Brain Tumor Segmentation