Publications

Relative biological effectiveness of clinically relevant photon energies for the survival of human colorectal, cervical, and prostate cancer cell lines

Joanna Li

N. Chabaytah

Joud Babik

Behnaz Behmand

H. Bekerat

Tanner Connell

Michael D C Evans

Russell Ruo

T. Vuong

Shirin A. Enger

2024-09-19

Physics in Medicine & Biology (published)

doi.org

Training Language Models to Self-Correct via Reinforcement Learning

Aviral Kumar

Vincent Zhuang

Rishabh Agarwal

Yi Su

John D Co-Reyes

Avi Singh

Kate Baumli

Shariq N Iqbal

Colton Bishop

Rebecca Roelofs

Lei M Zhang

Kay McKinney

Disha Shrivastava

Cosmin Paduraru

George Tucker

Doina Precup

Feryal Behbahani

Aleksandra Faust

Self-correction is a highly desirable capability of large language models (LLMs), yet it has consistently been found to be largely ineffective in modern LLMs. Existing approaches for training self-correction either require multiple models or rely on a more capable model or other forms of supervision. To this end, we develop a multi-turn online reinforcement learning (RL) approach, SCoRe, that significantly improves an LLM's self-correction ability using entirely self-generated data. To build SCoRe, we first show that variants of supervised fine-tuning (SFT) on offline model-generated correction traces are insufficient for instilling self-correction behavior. In particular, we observe that training via SFT either suffers from a distribution mismatch between the training data and the model's own responses or implicitly prefers only a certain mode of correction behavior that is often not effective at test time. SCoRe addresses these challenges by training under the model's own distribution of self-generated correction traces and using appropriate regularization to steer the learning process into learning a self-correction strategy that is effective at test time as opposed to simply fitting high-reward responses for a given prompt. This regularization prescribes running a first phase of RL on a base model to generate a policy initialization that is less susceptible to collapse and then using a reward bonus to amplify self-correction during training. When applied to Gemini 1.0 Pro and 1.5 Flash models, we find that SCoRe achieves state-of-the-art self-correction performance, improving the base models' self-correction by 15.6% and 9.1% respectively on the MATH and HumanEval benchmarks.

2024-09-19

ArXiv (preprint)

doi.org

arxiv.org

Understanding Web Application Workloads and Their Applications: Systematic Literature Review and Characterization

Roozbeh Aghili

Qiaolin Qin

Heng Li

Foutse Khomh

2024-09-18

ArXiv (preprint)

doi.org

arxiv.org

An Attentive Approach for Building Partial Reasoning Agents from Pixels

Safa Alver

Doina Precup

We study the problem of building reasoning agents that are able to generalize in an effective manner. Towards this goal, we propose an end-to-end approach for building model-based reinforcement learning agents that dynamically focus their reasoning to the relevant aspects of the environment: after automatically identifying the distinct aspects of the environment, these agents dynamically filter out the relevant ones and then pass them to their simulator to perform partial reasoning. Unlike existing approaches, our approach works with pixel-based inputs and it allows for interpreting the focal points of the agent. Our quantitative analyses show that the proposed approach allows for effective generalization in high-dimensional domains with raw observational inputs. We also perform ablation analyses to validate our design choices. Finally, we demonstrate through qualitative analyses that our approach actually allows for building agents that focus their reasoning on the relevant aspects of the environment.

2024-09-17

TMLR (accepted)

openreview.net

Deep Learning in Ultrasound Localization Microscopy: Applications and Perspectives.

Brice Rauby

Paul Xing

Maxime Gasse

Jean Provost

Ultrasound Localization Microscopy (ULM) is a novel super-resolution imaging technique that can image the vasculature in vivo at depth with resolution far beyond the conventional limit of diffraction. By relying on the localization and tracking of clinically approved microbubbles injected in the blood stream, ULM can provide not only anatomical visualization but also hemodynamic quantification of the microvasculature of different tissues. Various deep-learning approaches have been proposed to address challenges in ULM including denoising, improving microbubble localization, estimating blood flow velocity or performing aberration correction. Proposed deep learning methods often outperform their conventional counterparts by improving image quality and reducing processing time. In addition, their robustness to high concentrations of microbubbles can lead to reduced acquisition times in ULM, addressing a major hindrance to ULM clinical application. Herein, we propose a comprehensive review of the diversity of deep learning applications in ULM focusing on approaches assuming a sparse microbubbles distribution. We first provide an overview of how existing studies vary in the constitution of their datasets or in the tasks targeted by deep learning model. We also take a deeper look into the numerous approaches that have been proposed to improve the localization of microbubbles since they differ highly in their formulation of the optimization problem, their evaluation, or their network architectures. We finally discuss the current limitations and challenges of these methods, as well as the promises and potential of deep learning for ULM in the future.

2024-09-17

IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control (published)

doi.org

An Empirical Study of Sensitive Information in Logs

Roozbeh Aghili

Heng Li

Foutse Khomh

2024-09-17

ArXiv (preprint)

doi.org

arxiv.org

Rethinking Teacher-Student Curriculum Learning through the Cooperative Mechanics of Experience

Manfred Diaz

Liam Paull

Andrea Tacchetti

Teacher-Student Curriculum Learning (TSCL) is a curriculum learning framework that draws inspiration from human cultural transmission and learning. It involves a teacher algorithm shaping the learning process of a learner algorithm by exposing it to controlled experiences. Despite its success, understanding the conditions under which TSCL is effective remains challenging. In this paper, we propose a data-centric perspective to analyze the underlying mechanics of the teacher-student interactions in TSCL. We leverage cooperative game theory to describe how the composition of the set of experiences presented by the teacher to the learner, as well as their order, influences the performance of the curriculum that is found by TSCL approaches. To do so, we demonstrate that for every TSCL problem, there exists an equivalent cooperative game, and several key components of the TSCL framework can be reinterpreted using game-theoretic principles. Through experiments covering supervised learning, reinforcement learning, and classical games, we estimate the cooperative values of experiences and use value-proportional curriculum mechanisms to construct curricula, even in cases where TSCL struggles. The framework and experimental setup we present in this work represent a novel foundation for a deeper exploration of TSCL, shedding light on its underlying mechanisms and providing insights into its broader applicability in machine learning.

2024-09-17

TMLR (accepted)

doi.org

openreview.net

Deconvolving X-ray Galaxy Cluster Spectra Using a Recurrent Inference Machine

C. Rhea

Julie Hlavacek-Larrondo

Alexandre Adam

Ralph P. Kraft

Ákos Bogdán

Laurence Perreault-Levasseur

Marine Prunier

Recent advances in machine learning algorithms have unlocked new insights in observational astronomy by allowing astronomers to probe new frontiers. In this article, we present a methodology to disentangle the intrinsic X-ray spectrum of galaxy clusters from the instrumental response function. Employing state-of-the-art modeling software and data mining techniques of the Chandra data archive, we construct a set of 100,000 mock Chandra spectra. We train a recurrent inference machine (RIM) to take in the instrumental response and mock observation and output the intrinsic X-ray spectrum. The RIM can recover the mock intrinsic spectrum below the 1-

2024-09-16

ArXiv (preprint)

arxiv.org

Swarming Out of the Lab: Comparing Relative Localization Methods for Collective Behavior

Rafael Gomes Braga

Vivek Shankar Vardharajan

Giovanni Beltrame

David St-Onge

2024-09-15

Lecture Notes in Computer Science (published)

doi.org

When Machines Outshine Humans in Object Recognition, Benchmarking Dilemma

Mohammad Javad Darvishi Bayazi

Md Rifat Arefin

Jocelyn Faubert

Irina Rish

2024-09-15

Journal of Vision (published)

doi.org

A high-throughput phenotypic screen combined with an ultra-large-scale deep learning-based virtual screening reveals novel scaffolds of antibacterial compounds

Gabriele Scalia

Steven T. Rutherford

Ziqing Lu

Kerry R. Buchholz

Nicholas Skelton

Kangway Chuang

Nathaniel Diamant

Jan-Christian Hütter

Jerome-Maxim Luescher

Anh Miu

Jeff Blaney

Leo Gendelev

Elizabeth Skippington

Greg Zynda

Nia Dickson

Michał Koziarski

Yoshua Bengio

Aviv Regev

Man-Wah Tan

Tommaso Biancalani

2024-09-14

bioRxiv (preprint)

doi.org
