An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration
Hiroki Naganuma
Ryuichiro Hataya
Kotaro Yoshida
Influence of scanning plane on Human Spinal Cord functional Magnetic Resonance echo planar imaging
Marta Moraschi
Silvia Tommasin
Laura Maugeri
Mauro DiNuzzo
Marco Masullo
Fabio Mangini
Lorenzo Giovannelli
Daniele Mascali
Tommaso Gili
Valerio Pisani
Ugo Nocentini
Federico Giove
Michela Fratini
BACKGROUND: Functional Magnetic Resonance Imaging (fMRI) is based on the Blood Oxygenation Level Dependent contrast and has been exploited for the indirect study of neuronal activity within both the brain and the spinal cord. However, the interpretation of spinal cord fMRI (scfMRI) is still controversial and its diffusion is rather limited because of technical limitations. Overcoming these limitations would have a beneficial effect on the assessment and follow-up of spinal injuries and neurodegenerative diseases. PURPOSE: This study aimed to systematically verify whether sagittal scanning in scfMRI using EPI readout is a viable alternative to the more common axial scanning, and to optimize a pipeline for EPI-based scfMRI data analysis based on the Spinal Cord Toolbox (SCT). METHODS: Forty-five healthy subjects underwent MRI acquisition on a Philips Achieva 3T MRI scanner. T2*-weighted fMRI data were acquired using a GE-EPI sequence along sagittal and axial planes during an isometric motor task. Differences in benchmarks were assessed via a paired two-sample t-test at p=0.05. RESULTS: We investigated the impact of the acquisition strategy by means of several metrics: Temporal Signal to Noise Ratio (tSNR), Dice coefficient (to assess geometric distortions), reproducibility, and sensitivity. tSNR was higher in axial than in sagittal scans, as was reproducibility within the whole-cord mask (t=7.4, p<0.01) and within the GM mask (t=4.2, p<0.01). The other benchmarks, associated with distortion and functional response, showed no differences.
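As a rough illustration of the benchmarks described above (not the authors' SCT-based pipeline), the sketch below computes voxel-wise temporal SNR and compares two acquisition planes with a paired two-sample t-test; the array names, shapes, and example values are assumptions.

```python
import numpy as np
from scipy import stats

def temporal_snr(timeseries):
    """Voxel-wise tSNR: temporal mean divided by temporal standard deviation.

    timeseries: array of shape (n_voxels, n_timepoints).
    """
    mean = timeseries.mean(axis=1)
    std = timeseries.std(axis=1)
    return mean / np.maximum(std, 1e-12)  # guard against zero variance

# Hypothetical per-subject mean tSNR values within a cord mask,
# one value per subject for each acquisition plane (45 subjects).
rng = np.random.default_rng(0)
tsnr_axial = rng.normal(20.0, 3.0, size=45)
tsnr_sagittal = rng.normal(17.0, 3.0, size=45)

# Paired two-sample t-test across subjects, as in the abstract (alpha = 0.05).
t_stat, p_value = stats.ttest_rel(tsnr_axial, tsnr_sagittal)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```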
Learning Penalty for Optimal Partitioning via Automatic Feature Extraction
Tung L. Nguyen
Changepoint detection identifies significant shifts in data sequences, making it important in areas like finance, genetics, and healthcare. Optimal Partitioning algorithms detect these changes efficiently, using a penalty parameter to limit the number of changepoints. Determining an appropriate value for this penalty can be challenging. Traditionally, this involved manually extracting statistical features, such as sequence length or variance, to make the prediction. This study proposes a novel approach that uses recurrent neural networks to learn the penalty directly from raw sequences by automatically extracting features. Experiments on 20 benchmark genomic datasets show that this method surpasses traditional methods in partitioning accuracy in most cases.
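The abstract describes predicting the Optimal Partitioning penalty directly from a raw sequence with a recurrent network. Below is a minimal PyTorch sketch of that idea; the architecture, layer sizes, and the choice to predict a log-penalty are assumptions, not the authors' exact model.

```python
import torch
import torch.nn as nn

class PenaltyPredictor(nn.Module):
    """Map a raw 1-D sequence to a scalar log-penalty for Optimal Partitioning."""

    def __init__(self, hidden_size: int = 32):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (batch, length); add a feature dimension for the LSTM.
        features, _ = self.rnn(seq.unsqueeze(-1))
        # Use the last hidden state as an automatically extracted summary.
        return self.head(features[:, -1, :]).squeeze(-1)

model = PenaltyPredictor()
example = torch.randn(4, 200)   # four sequences of length 200
log_penalty = model(example)    # predicted log-penalties, shape (4,)
penalty = log_penalty.exp()     # penalty handed to the partitioning algorithm
```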
OpenLex3D: A New Evaluation Benchmark for Open-Vocabulary 3D Scene Representations
Christina Kassab
Sacha Morin
Martin Büchner
Matias Mattamala
Kumaraditya Gupta
Abhinav Valada
Maurice Fallon
Ctrl-V: Higher Fidelity Autonomous Vehicle Video Generation with Bounding-Box Controlled Object Motion
Ge Ya Luo
Zhi Hao Luo
Anthony Gosselin
Alexia Jolicoeur-Martineau
Efficient Morphology-Aware Policy Transfer to New Embodiments
Michael Przystupa
Hongyao Tang
Mariano Phielipp
Santiago Miret
Martin Jägersand
Matthew E. Taylor
Morphology-aware policy learning is a means of enhancing policy sample efficiency by aggregating data from multiple agents. These types of policies have previously been shown to help generalize over dynamic, kinematic, and limb configuration variations between agent morphologies. Unfortunately, these policies still have sub-optimal zero-shot performance compared to end-to-end finetuning on morphologies at deployment. This limitation has ramifications in practical applications such as robotics, because further data collection for end-to-end finetuning can be computationally expensive. In this work, we investigate combining morphology-aware pretraining with \textit{parameter-efficient finetuning} (PEFT) techniques to reduce the learnable parameters necessary to specialize a morphology-aware policy to a target embodiment. We compare directly tuning subsets of model weights, learnable input adapters, and prefix tuning techniques for online finetuning. Our analysis reveals that PEFT techniques in conjunction with policy pretraining generally reduce the number of samples necessary to improve a policy compared to training models end-to-end from scratch. We further find that tuning fewer than 1\% of total parameters improves policy performance compared to the zero-shot performance of the base pretrained policy.
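To make the parameter-efficient idea concrete, here is a hedged sketch (in PyTorch, with invented module names and sizes) of freezing a pretrained policy and training only a small input adapter, then checking the trainable fraction of parameters; it is not the paper's exact setup.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained morphology-aware policy (shapes are hypothetical).
policy = nn.Sequential(
    nn.Linear(64, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 12),
)

# Freeze every pretrained weight.
for param in policy.parameters():
    param.requires_grad = False

# Small learnable input adapter specialized to the target embodiment.
adapter = nn.Linear(64, 64)

trainable = sum(p.numel() for p in adapter.parameters())
total = trainable + sum(p.numel() for p in policy.parameters())
print(f"trainable fraction: {trainable / total:.2%}")  # well under 1% here

# Only the adapter's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-3)
obs = torch.randn(8, 64)
action = policy(adapter(obs))
```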
Mitigating Goal Misgeneralization via Minimax Regret
Karim Ahmed Abdel Sadek
Matthew Farrugia-Roberts
Usman Anwar
Hannah Erlebach
Christian Schroeder de Witt
Michael D Dennis
Robustness research in reinforcement learning often focuses on ensuring that the policy consistently exhibits capable, goal-driven behavior. However, not every capable behavior is the intended behavior. *Goal misgeneralization* can occur when the policy generalizes capably with respect to a 'proxy goal' whose optimal behavior correlates with the intended goal on the training distribution, but not out of distribution. Though the intended goal would be ambiguous if the two goals were perfectly correlated in training, we show progress can be made if the goals are only *nearly ambiguous*, with the training distribution containing a small proportion of *disambiguating* levels. We observe that the training signal from disambiguating levels can be amplified by regret-based prioritization. We formally show that approximately optimal policies on maximal-regret levels avoid the harmful effects of goal misgeneralization that may exist without this prioritization. Empirically, we find that current regret-based Unsupervised Environment Design (UED) methods can mitigate the effects of goal misgeneralization, though they do not always eliminate it entirely. Our theoretical and empirical results show that, as UED methods improve, they could further mitigate goal misgeneralization in practice.
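As a loose illustration of regret-based prioritization (not a full UED method such as those studied in the paper), the sketch below samples training levels with probability increasing in an estimated regret, so rare disambiguating levels with high regret are amplified; the regret estimates and the softmax form are assumptions.

```python
import numpy as np

def prioritized_level_sampler(estimated_regret, temperature=1.0, rng=None):
    """Sample a level index with probability increasing in estimated regret."""
    if rng is None:
        rng = np.random.default_rng()
    regret = np.asarray(estimated_regret, dtype=float)
    # Softmax over regret: high-regret (e.g. disambiguating) levels dominate.
    scores = np.exp((regret - regret.max()) / temperature)
    probs = scores / scores.sum()
    return rng.choice(len(regret), p=probs), probs

# Hypothetical level buffer: many ambiguous levels, a few disambiguating ones.
regrets = np.array([0.05] * 95 + [1.0] * 5)
level, probs = prioritized_level_sampler(regrets)
print(f"probability mass on disambiguating levels: {probs[-5:].sum():.2f}")
```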
Multi-Task Reinforcement Learning Enables Parameter Scaling
Reginald McLean
Evangelos Chatzaroulas
J K Terry
Isaac Woungang
Nariman Farsad
Multi-task reinforcement learning (MTRL) aims to endow a single agent with the ability to perform well on multiple tasks. Recent works have focused on developing novel sophisticated architectures to improve performance, often resulting in larger models; it is unclear, however, whether the performance gains are a consequence of the architecture design or the extra parameters. We argue that gains are mostly due to scale by demonstrating that naively scaling up a simple MTRL baseline to match parameter counts outperforms the more sophisticated architectures, and these gains benefit most from scaling the critic over the actor. Additionally, we explore the training stability advantages that come with task diversity, demonstrating that increasing the number of tasks can help mitigate plasticity loss. Our findings suggest that MTRL's simultaneous training across multiple tasks provides a natural framework for beneficial parameter scaling in reinforcement learning, challenging the need for complex architectural innovations.
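A hedged sketch of the kind of comparison the abstract describes: the same simple actor-critic architecture instantiated at different widths, with the extra capacity going preferentially to the critic. The layer sizes, dimensions, and width split are illustrative assumptions, not the paper's configuration.

```python
import torch.nn as nn

def mlp(in_dim, hidden, out_dim, depth=2):
    """Simple fully connected network used for both actor and critic."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

def count_params(module):
    return sum(p.numel() for p in module.parameters())

obs_dim, act_dim = 39, 4  # assumed observation/action dimensions

# Baseline vs. a scaled variant that puts most of the extra parameters in the critic.
baseline_actor = mlp(obs_dim, 256, act_dim)
baseline_critic = mlp(obs_dim + act_dim, 256, 1)
scaled_actor = mlp(obs_dim, 512, act_dim)
scaled_critic = mlp(obs_dim + act_dim, 2048, 1)

for name, (a, c) in {"baseline": (baseline_actor, baseline_critic),
                     "scaled": (scaled_actor, scaled_critic)}.items():
    print(name, "actor:", count_params(a), "critic:", count_params(c))
```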
Optimal discounting for offline input-driven MDP
Randy Lefebvre
Offline reinforcement learning has gained a lot of popularity for its potential to solve industry challenges. However, real-world environments are often highly stochastic and partially observable, leading long-term planners to overfit to offline data in model-based settings. Input-driven Markov Decision Processes (IDMDPs) offer a way to handle some of this uncertainty by letting designers separate what the agent has control over (states) from what it cannot control (inputs) in the environment. These stochastic external inputs are often difficult to model. Under the assumption that the input model will be imperfect, we investigate the bias-variance tradeoff under shallow planning in IDMDPs. Paving the way toward input-driven planning horizons, we also investigate the similarity of optimal planning horizons across different inputs given the structure of the input space.
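To make "shallow planning" concrete, here is a small sketch of finite-horizon value iteration on a tabular MDP where the planning depth is truncated; a shorter horizon leans less on a possibly misspecified (input) model at the cost of bias. The MDP is a made-up example, not the paper's benchmark or its IDMDP formulation.

```python
import numpy as np

def shallow_plan_values(P, R, gamma, horizon):
    """Finite-horizon value iteration: a shallow planner truncates the lookahead.

    P: transition probabilities, shape (n_states, n_actions, n_states)
    R: rewards, shape (n_states, n_actions)
    """
    n_states, n_actions, _ = P.shape
    values = np.zeros(n_states)
    for _ in range(horizon):
        q = R + gamma * (P @ values)   # (n_states, n_actions)
        values = q.max(axis=1)
    return values

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(3), size=(3, 2))  # random 3-state, 2-action MDP
R = rng.normal(size=(3, 2))

# A shallow horizon relies less on the (possibly imperfect) model of the dynamics.
print(shallow_plan_values(P, R, gamma=0.95, horizon=3))
print(shallow_plan_values(P, R, gamma=0.95, horizon=50))
```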