Publications

Strong Model Collapse

Elvis Dohmatob

Yunzhen Feng

Arjun Subramonian

Julia Kempe

2024-10-07

ArXiv (preprint)

doi.org

arxiv.org

Strong Model Collapse

Elvis Dohmatob

Yunzhen Feng

Arjun Subramonian

Julia Kempe

Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised reg… (see more)ression setting and establish the existance of a strong form of the model collapse phenomenon, a critical performance degradation due to synthetic data in the training corpus. Our results show that even the smallest fraction of synthetic data (e.g., as little as 1\% of the total training dataset) can still lead to model collapse: larger and larger training sets do not enhance performance. We further investigate whether increasing model size, an approach aligned with current trends in training large language models, exacerbates or mitigates model collapse. In a simplified regime where neural networks are approximated via random projections of tunable size, we both theoretically and empirically show that larger models can amplify model collapse. Interestingly, our theory also indicates that, beyond the interpolation threshold (which can be extremely high for very large datasets), larger models may mitigate the collapse, although they do not entirely prevent it. Our theoretical findings are empirically verified through experiments on language models and feed-forward neural networks for images.

2024-10-07

ArXiv (preprint)

doi.org

arxiv.org

Strong Model Collapse

Elvis Dohmatob

Yunzhen Feng

Arjun Subramonian

Julia Kempe

2024-10-07

ArXiv (preprint)

doi.org

arxiv.org

Brain-like neural dynamics for behavioral control develop through reinforcement learning

Olivier Codol

Nanda H Krishna

Guillaume Lajoie

M.G. Perich

During development, neural circuits are shaped continuously as we learn to control our bodies. The ultimate goal of this process is to produ… (see more)ce neural dynamics that enable the rich repertoire of behaviors we perform with our limbs. What begins as a series of “babbles” coalesces into skilled motor output as the brain rapidly learns to control the body. However, the nature of the teaching signal underlying this normative learning process remains elusive. Here, we test two well-established and biologically plausible theories—supervised learning (SL) and reinforcement learning (RL)—that could explain how neural circuits develop the capacity for skilled movements. We trained recurrent neural networks to control a biomechanical model of a primate arm using either SL or RL and compared the resulting neural dynamics to populations of neurons recorded from the motor cortex of monkeys performing the same movements. Intriguingly, only RL-trained networks produced neural activity that matched their biological counterparts in terms of both the geometry and dynamics of population activity. We show that the similarity between RL-trained networks and biological brains depends critically on matching biomechanical properties of the limb. We then demonstrated that monkeys and RL-trained networks, but not SL-trained networks, show a strikingly similar capacity for robust short-term behavioral adaptation to a movement perturbation, indicating a fundamental and general commonality in the neural control policy. Together, our results support the hypothesis that neural dynamics for behavioral control emerge through a process akin to reinforcement learning. The resulting neural circuits offer numerous advantages for adaptable behavioral control over simpler and more efficient learning rules and expand our understanding of how developmental processes shape neural dynamics.

2024-10-06

bioRxiv (preprint)

doi.org

Brain-like neural dynamics for behavioral control develop through reinforcement learning

Olivier Codol

Nanda H Krishna

Guillaume Lajoie

M.G. Perich

During development, neural circuits are shaped continuously as we learn to control our bodies. The ultimate goal of this process is to produ… (see more)ce neural dynamics that enable the rich repertoire of behaviors we perform with our limbs. What begins as a series of “babbles” coalesces into skilled motor output as the brain rapidly learns to control the body. However, the nature of the teaching signal underlying this normative learning process remains elusive. Here, we test two well-established and biologically plausible theories—supervised learning (SL) and reinforcement learning (RL)—that could explain how neural circuits develop the capacity for skilled movements. We trained recurrent neural networks to control a biomechanical model of a primate arm using either SL or RL and compared the resulting neural dynamics to populations of neurons recorded from the motor cortex of monkeys performing the same movements. Intriguingly, only RL-trained networks produced neural activity that matched their biological counterparts in terms of both the geometry and dynamics of population activity. We show that the similarity between RL-trained networks and biological brains depends critically on matching biomechanical properties of the limb. We then demonstrated that monkeys and RL-trained networks, but not SL-trained networks, show a strikingly similar capacity for robust short-term behavioral adaptation to a movement perturbation, indicating a fundamental and general commonality in the neural control policy. Together, our results support the hypothesis that neural dynamics for behavioral control emerge through a process akin to reinforcement learning. The resulting neural circuits offer numerous advantages for adaptable behavioral control over simpler and more efficient learning rules and expand our understanding of how developmental processes shape neural dynamics.

2024-10-06

bioRxiv (preprint)

doi.org

Improved Off-policy Reinforcement Learning in Biological Sequence Design

Hyeonah Kim

Minsu Kim

Taeyoung Yun

Sanghyeok Choi

Emmanuel Bengio

Alex Hernandez-Garcia

Jinkyoo Park

Designing biological sequences with desired properties is a significant challenge due to the combinatorially vast search space and the high … (see more)cost of evaluating each candidate sequence. To address these challenges, reinforcement learning (RL) methods, such as GFlowNets, utilize proxy models for rapid reward evaluation and annotated data for policy training. Although these approaches have shown promise in generating diverse and novel sequences, the limited training data relative to the vast search space often leads to the misspecification of proxy for out-of-distribution inputs. We introduce

2024-10-06

ArXiv (preprint)

doi.org

arxiv.org

Improved Off-policy Reinforcement Learning in Biological Sequence Design

Hyeonah Kim

Minsu Kim

Taeyoung Yun

Sanghyeok Choi

Emmanuel Bengio

Alex Hern'andez-Garc'ia

Jinkyoo Park

Designing biological sequences with desired properties is a significant challenge due to the combinatorially vast search space and the high … (see more)cost of evaluating each candidate sequence. To address these challenges, reinforcement learning (RL) methods, such as GFlowNets, utilize proxy models for rapid reward evaluation and annotated data for policy training. Although these approaches have shown promise in generating diverse and novel sequences, the limited training data relative to the vast search space often leads to the misspecification of proxy for out-of-distribution inputs. We introduce

2024-10-06

ArXiv (preprint)

doi.org

arxiv.org

Toward Debugging Deep Reinforcement Learning Programs with RLExplorer

Rached Bouchoucha

Ahmed Haj Yahmed

Darshan Patil

Janarthanan Rajendran

Amin Nikanjam

Sarath Chandar

Foutse Khomh

Deep reinforcement learning (DRL) has shown success in diverse domains such as robotics, computer games, and recommendation systems. However… (see more), like any other software system, DRL-based software systems are susceptible to faults that pose unique challenges for debugging and diagnosing. These faults often result in unexpected behavior without explicit failures and error messages, making debugging difficult and time-consuming. Therefore, automating the monitoring and diagnosis of DRL systems is crucial to alleviate the burden on developers. In this paper, we propose RLExplorer, the first fault diagnosis approach for DRL-based software systems. RLExplorer automatically monitors training traces and runs diagnosis routines based on properties of the DRL learning dynamics to detect the occurrence of DRL-specific faults. It then logs the results of these diagnoses as warnings that cover theoretical concepts, recommended practices, and potential solutions to the identified faults. We conducted two sets of evaluations to assess RLExplorer. Our first evaluation of faulty DRL samples from Stack Overflow revealed that our approach can effectively diagnose real faults in 83% of the cases. Our second evaluation of RLExplorer with 15 DRL experts/developers showed that (1) RLExplorer could identify 3.6 times more defects than manual debugging and (2) RLExplorer is easily integrated into DRL applications.

2024-10-06

ArXiv (preprint)

doi.org

arxiv.org

Toward Debugging Deep Reinforcement Learning Programs with RLExplorer

Rached Bouchoucha

Ahmed Haj Yahmed

Darshan Patil

Janarthanan Rajendran

Amin Nikanjam

Sarath Chandar

Foutse Khomh

Deep reinforcement learning (DRL) has shown success in diverse domains such as robotics, computer games, and recommendation systems. However… (see more), like any other software system, DRL-based software systems are susceptible to faults that pose unique challenges for debugging and diagnosing. These faults often result in unexpected behavior without explicit failures and error messages, making debugging difficult and time-consuming. Therefore, automating the monitoring and diagnosis of DRL systems is crucial to alleviate the burden on developers. In this paper, we propose RLExplorer, the first fault diagnosis approach for DRL-based software systems. RLExplorer automatically monitors training traces and runs diagnosis routines based on properties of the DRL learning dynamics to detect the occurrence of DRL-specific faults. It then logs the results of these diagnoses as warnings that cover theoretical concepts, recommended practices, and potential solutions to the identified faults. We conducted two sets of evaluations to assess RLExplorer. Our first evaluation of faulty DRL samples from Stack Overflow revealed that our approach can effectively diagnose real faults in 83% of the cases. Our second evaluation of RLExplorer with 15 DRL experts/developers showed that (1) RLExplorer could identify 3.6 times more defects than manual debugging and (2) RLExplorer is easily integrated into DRL applications.

2024-10-06

2024 IEEE International Conference on Software Maintenance and Evolution (ICSME) (published)

doi.org

arxiv.org

Understanding Web Application Workloads and Their Applications: Systematic Literature Review and Characterization

Roozbeh Aghili

Qiaolin Qin

Heng Li

Foutse Khomh

2024-10-06

2024 IEEE International Conference on Software Maintenance and Evolution (ICSME) (published)

doi.org

arxiv.org

Assessment of the Climate Trace global powerplant CO2 emissions

Kevin R. Gurney

Bilal Aslam

Pawlok Dass

Lech Gawuc

Toby Dylan Hocking

Jarrett J Barber

Anna Kato

Accurate estimation of planetary greenhouse gas (GHG) emissions at the scale of individual emitting activities is a critical need for both s… (see more)cientific and policy applications. Powerplants represent the single largest and most concentrated form of global GHG emissions. Climate Trace, co-founded and promoted by former U.S. Vice President Al Gore, is a new effort using, in part, artificial intelligence (AI) approaches to estimate asset-scale GHG emissions. Climate Trace recently released a database of global powerplant CO2 emissions at the facility-scale that uses both AI and non-AI estimation approaches. However, no independent peer-reviewed assessment has been made of this important global emissions database. Here, we compare the Climate Trace powerplant CO2 emissions to an atmospherically calibrated, multi-constraint estimate of powerplant CO2 emissions in the United States. The 3.7% (65) of compared facilities that used an AI-based approach show a mean relative difference (MRD) of −1.1% (SD: 46.4%) in the year 2019. The 96.3% (1726) of the facilities that used a non-AI-based approach show a MRD of −50.0% (SD: 117.7%). Of the non-AI estimated facilities, 151 (8.7%) facilities agree to within ±20%. The large differences between Climate Trace and Vulcan-power emission estimates for these facilities is primarily caused by Climate Trace’ use of a national-mean power plant capacity factor (CF) which is a poor representation of the reported power plant CFs of individual US facilities and leads to very large errors at those same 1726 facilities.

2024-10-04

Environmental Research Letters (published)

doi.org

DiffKillR: Killing and Recreating Diffeomorphisms for Cell Annotation in Dense Microscopy Images

Chen Liu

Danqi Liao

Alejandro Parada-Mayorga

Alejandro Ribeiro

Marcello DiStasio

Smita Krishnaswamy

The proliferation of digital microscopy images, driven by advances in automated whole slide scanning, presents significant opportunities for… (see more) biomedical research and clinical diagnostics. However, accurately annotating densely packed information in these images remains a major challenge. To address this, we introduce DiffKillR, a novel framework that reframes cell annotation as the combination of archetype matching and image registration tasks. DiffKillR employs two complementary neural networks: one that learns a diffeomorphism-invariant feature space for robust cell matching and another that computes the precise warping field between cells for annotation mapping. Using a small set of annotated archetypes, DiffKillR efficiently propagates annotations across large microscopy images, reducing the need for extensive manual labeling. More importantly, it is suitable for any type of pixel-level annotation. We will discuss the theoretical properties of DiffKillR and validate it on three microscopy tasks, demonstrating its advantages over existing supervised, semi-supervised, and unsupervised methods.

2024-10-04

ArXiv (preprint)

doi.org

arxiv.org

AI Advantage

Mila AI Policy Fellowship

Leveraging AI for a Sustainable Future

AI Advantage

Mila AI Policy Fellowship

Publications

AI Advantage

Mila AI Policy Fellowship

Leveraging AI for a Sustainable Future

AI Advantage

Mila AI Policy Fellowship

Popular keywords:

Publications