Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models
Yinlam Chow
Guy Tennenholtz
Izzeddin Gur
Vincent Zhuang
Bo Dai
Aviral Kumar
Sridhar Thiagarajan
Craig Boutilier
Aleksandra Faust
Recent studies have indicated that effectively utilizing inference-time compute is crucial for attaining better performance from large language models (LLMs). In this work, we propose a novel inference-aware fine-tuning paradigm, in which the model is fine-tuned in a manner that directly optimizes the performance of the inference-time strategy. We study this paradigm using the simple yet effective Best-of-N (BoN) inference strategy, in which a verifier selects the best out of a set of LLM-generated responses. We devise the first imitation learning and reinforcement learning (RL) methods for BoN-aware fine-tuning, overcoming the challenging, non-differentiable argmax operator within BoN. We empirically demonstrate that our BoN-aware models implicitly learn a meta-strategy that interleaves best responses with more diverse responses that might be better suited to a test-time input -- a process reminiscent of the exploration-exploitation trade-off in RL. Our experiments demonstrate the effectiveness of BoN-aware fine-tuning in terms of improved performance and inference-time compute. In particular, we show that our methods improve the Bo32 performance of Gemma 2B on Hendrycks MATH from 26.8% to 30.8%, and pass@32 from 60.0% to 67.0%, as well as the pass@16 on HumanEval from 61.6% to 67.1%.
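As a rough illustration of the Best-of-N strategy and the pass@N metric named in this abstract, the minimal sketch below draws N candidate responses and lets a verifier pick one. The names `best_of_n`, `pass_at_n`, `make_cycling_sampler`, `sample`, `score`, and `is_correct` are hypothetical and not taken from the paper, and the cycling sampler is only a deterministic stand-in for an LLM.

```python
import itertools
from typing import Callable, List, Sequence


def best_of_n(prompt: str,
              sample: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int) -> str:
    """Draw n candidate responses and return the one the verifier scores highest."""
    candidates: List[str] = [sample(prompt) for _ in range(n)]
    # The argmax over verifier scores is the non-differentiable step of BoN.
    return max(candidates, key=lambda response: score(prompt, response))


def pass_at_n(prompt: str,
              sample: Callable[[str], str],
              is_correct: Callable[[str, str], bool],
              n: int) -> bool:
    """Return True if at least one of n sampled responses is correct."""
    return any(is_correct(prompt, sample(prompt)) for _ in range(n))


def make_cycling_sampler(responses: Sequence[str]) -> Callable[[str], str]:
    """Toy stand-in for an LLM (assumption): cycles through fixed responses."""
    iterator = itertools.cycle(responses)
    return lambda prompt: next(iterator)
```

With a toy sampler that cycles through "3", "5", "4", a budget of N = 2 never sees the correct answer to "2+2", while N = 3 does; this is the sense in which a larger inference budget can help.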
Integer Programming Games.
Gabriele Dragotto
Andrea Lodi
Sriram Sankaranarayanan 0002
Integer Programming Games.
Gabriele Dragotto
Andrea Lodi 0001
Sriram Sankaranarayanan 0002
Integrating Generative and Experimental Platforms for Biomolecular Design
Cheng-Hao Liu
Jarrid Rector-Brooks
Soojung Yang
Sidney L Lisanza
Francesca-Zhoufan Li
Hannes Stärk
Jacob Gershon
Lauren Hong
Pranam Chatterjee
Tommi Jaakkola
Regina Barzilay
David Baker
Frances H. Arnold
Biomolecular design, through artificial engineering of proteins, ligands, and nucleic acids, holds immense promise in addressing pressing medical, industrial, and environmental challenges. While generative machine learning has shown significant potential in this area, a palpable disconnect exists with experimental biology: many ML research efforts prioritize static benchmark performance, potentially sidelining impactful biological applications. This workshop seeks to bridge this gap by bringing computationalists and experimentalists together, catalyzing a deeper interdisciplinary discourse. Together, we will explore the strengths and challenges of generative ML in biology, experimental integration of generative ML, and biological problems ready for ML. To attract high-quality and diverse research, we partnered with Nature Biotechnology for a special collection, and we created dedicated tracks for in-silico ML research and hybrid ML-experimental biology research. Our lineup features emerging leaders as speakers and renowned scientists as panelists, encapsulating a spectrum from high-throughput experimentation and computational biology to generative ML. With a diverse organizing team and backed by industry sponsors, we dedicate the workshop to pushing the boundaries of ML's role in biology.
Investigating the Effect of Providing Required Training to Mothers of Children with Surgery and Its Effect on Mothers' Anxiety
Julia Ferreira
Nadia Safa
Fabio Botelho
Robin Petroze
Hussein Wissanji
Pramod Puligandla
Kenneth Shaw
Maeve Trudeau
Elena Guadagno
Jean-Martin Laberge
Sherif Emil
Longitudinal reproducibility of brain and spinal cord quantitative MRI biomarkers
Mathieu Boudreau
Agah Karakuzu
Arnaud Boré
Basile Pinsard
Kiril Zelenkovski
Eva Alonso‐Ortiz
Julie Boyle
Lune Bellec
Quantitative MRI (qMRI) promises better specificity, accuracy, repeatability, and reproducibility relative to its clinically used qualitative MRI counterpart. Longitudinal reproducibility is particularly important in qMRI. The goal is to reliably quantify tissue properties that may be assessed in longitudinal clinical studies throughout disease progression or during treatment. In this work, we present the initial data release of the quantitative MRI portion of the Courtois project on neural modelling (CNeuroMod), where the brain and cervical spinal cord of six participants were scanned at regular intervals over the course of several years. This first release includes 3 years of data collection and up to 10 sessions per participant using quantitative MRI imaging protocols (T1, magnetization transfer (MTR, MTsat), and diffusion). In the brain, T1MP2RAGE, fractional anisotropy (FA), mean diffusivity (MD), and radial diffusivity (RD) all exhibited high longitudinal reproducibility (intraclass correlation coefficient – ICC ≃ 1 and within-subject coefficient of variation – wCV 1%). The spinal cord cross-sectional area (CSA) computed using T2w images and T1MTsat exhibited the best longitudinal reproducibility (ICC ≃ 1 and 0.7 respectively, and wCV 2.4% and 6.9%). Results from this work show the level of longitudinal reproducibility that can be expected from qMRI protocols in the brain and spinal cord in the absence of hardware and software upgrades, and could help in the design of future longitudinal clinical studies.
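To make the wCV figures quoted in this abstract concrete, the sketch below computes a within-subject coefficient of variation under one common convention: the per-subject ratio of sample standard deviation to mean across sessions, averaged over subjects and expressed in percent. This convention is an assumption; the abstract does not state which formula the authors used, and the function name `within_subject_cv` and the example values are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def within_subject_cv(sessions: Dict[str, List[float]]) -> float:
    """Mean over subjects of (sample std / mean) across sessions, in percent.

    Assumed convention, not necessarily the one used in the paper.
    Each subject needs at least two sessions for the sample std to exist.
    """
    per_subject = [stdev(values) / mean(values) for values in sessions.values()]
    return 100.0 * mean(per_subject)


# Hypothetical measurements of one metric, keyed by participant.
example = {"sub-01": [10.0, 10.0, 10.0], "sub-02": [9.0, 11.0]}
wcv_percent = within_subject_cv(example)
```

A participant whose measurements are identical across sessions contributes 0%, so a low wCV indicates that repeated scans of the same person give nearly the same value.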
Machine-learning-assisted preoperative prediction of pediatric appendicitis severity
Aylin Erman
Julia Ferreira
Waseem Abu Ashour
Elena Guadagno
Etienne St-Louis
Sherif Emil
Jackie Cheung
Machine-learning-assisted Preoperative Prediction of Pediatric Appendicitis Severity.
Aylin Erman
Julia Ferreira
Waseem Abu Ashour
Elena Guadagno
Etienne St-Louis
Sherif Emil
Jackie Cheung
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
Claas Voelcker
Marcel Hussing
Eric R. Eaton
Amir-massoud Farahmand
Igor Gilitschenski
Building deep reinforcement learning (RL) agents that find a good policy with few samples has proven notoriously challenging. To achieve sample efficiency, recent work has explored updating neural networks with large numbers of gradient steps for every new sample. While such high update-to-data (UTD) ratios have shown strong empirical performance, they also introduce instability to the training process. Previous approaches need to rely on periodic neural network parameter resets to address this instability, but restarting the training process is infeasible in many real-world applications and requires tuning the resetting interval. In this paper, we focus on one of the core difficulties of stable training with limited samples: the inability of learned value functions to generalize to unobserved on-policy actions. We mitigate this issue directly by augmenting the off-policy RL training process with a small amount of data generated from a learned world model. Our method, Model-Augmented Data for Temporal Difference learning (MAD-TD), uses small amounts of generated data to stabilize high UTD training and achieve competitive performance on the most challenging tasks in the DeepMind control suite. Our experiments further highlight the importance of employing a good model to generate data, MAD-TD's ability to combat value overestimation, and its practical stability gains for continued learning.
Maximizing Data and Hardware Reuse for HLS with Early-Stage Symbolic Partitioning
Tzung-Han Juang