Portrait of Elvis Dohmatob

Elvis Dohmatob

Associate Academic Member
Associate Professor, Concordia University, Department of Computer Science and Software Engineering
Researcher, Meta Facebook AI Research (FAIR)
Research Topics
Algorithmic Fairness
Optimization
Adversarial Robustness
Machine Learning Theory

Publications

Strong Model Collapse
Yunzhen Feng
Arjun Subramonian
Julia Kempe
Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised regression setting and establish the existence of a strong form of the model collapse phenomenon, a critical performance degradation due to synthetic data in the training corpus. Our results show that even the smallest fraction of synthetic data (e.g., as little as 1% of the total training dataset) can still lead to model collapse: larger and larger training sets do not enhance performance. We further investigate whether increasing model size, an approach aligned with current trends in training large language models, exacerbates or mitigates model collapse. In a simplified regime where neural networks are approximated via random projections of tunable size, we both theoretically and empirically show that larger models can amplify model collapse. Interestingly, our theory also indicates that, beyond the interpolation threshold (which can be extremely high for very large datasets), larger models may mitigate the collapse, although they do not entirely prevent it. Our theoretical findings are empirically verified through experiments on language models and feed-forward neural networks for images.
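The setting this abstract describes can be illustrated with a small simulation. The following is a minimal NumPy sketch of a supervised linear regression problem in which a tiny fraction of labels comes from an imperfect synthetic generator; the dimension, noise level, generator error, and 1% mixing fraction are illustrative assumptions, not the paper's actual experiments or theory.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50                                        # input dimension (illustrative)
w_true = rng.normal(size=d) / np.sqrt(d)      # ground-truth regressor
# Hypothetical "synthetic-data generator": a proxy model whose labels carry
# a systematic error relative to w_true (an assumption for illustration).
w_synth = w_true + rng.normal(size=d) / np.sqrt(d)

def fit_ols(n, frac_synth, noise=0.1):
    """Fit least squares on a mix of real and synthetic labels; return excess risk."""
    X = rng.normal(size=(n, d))
    y = X @ w_true + noise * rng.normal(size=n)
    n_synth = int(frac_synth * n)             # e.g. 1% of the training set
    y[:n_synth] = X[:n_synth] @ w_synth + noise * rng.normal(size=n_synth)
    w_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    return np.sum((w_hat - w_true) ** 2)      # excess risk (bias^2 + variance)

for n in [200, 2_000, 20_000, 200_000]:
    clean = fit_ols(n, frac_synth=0.0)
    mixed = fit_ols(n, frac_synth=0.01)
    print(f"n={n:7d}  clean: {clean:.2e}   1% synthetic: {mixed:.2e}")
# With clean data the excess risk keeps shrinking roughly like noise^2 * d / n;
# with even 1% synthetic labels it plateaus at the bias floor induced by
# w_synth, i.e. more data stops helping, which is the flavour of collapse
# the abstract describes.
```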
Consistent Adversarially Robust Linear Classification: Non-Parametric Setting
For binary classification in …
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Yunzhen Feng
Pu Yang
Francois Charton
Julia Kempe
As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet, the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increased amounts of synthesized data. In this paper we ask: How will the scaling laws change in the inevitable regime where synthetic data makes its way into the training corpus? Will future models still improve, or will they be doomed to degenerate, up to total (model) collapse? We develop a theoretical framework of model collapse through the lens of scaling laws. We discover a wide range of decay phenomena, analyzing loss of scaling, shifted scaling with the number of generations, the "un-learning" of skills, and grokking when mixing human and synthesized data. Our theory is validated by large-scale experiments with a transformer on an arithmetic task and text generation using the large language model Llama2.
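One mechanism consistent with this abstract's framing can be sketched in a toy experiment: repeatedly refitting a heavy-tailed token distribution on samples drawn from the previous generation truncates the tail, the kind of change that alters scaling behaviour. The Zipfian vocabulary, corpus size, and number of generations below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary with a Zipfian (heavy-tailed) token distribution, used here
# as an illustrative stand-in for natural text.
V = 10_000
ranks = np.arange(1, V + 1)
p_true = 1.0 / ranks
p_true /= p_true.sum()

def refit(p, n_samples):
    """One 'generation': sample a finite corpus from p, refit by empirical counts."""
    counts = rng.multinomial(n_samples, p)
    return counts / counts.sum()

p = p_true.copy()
for gen in range(6):
    support = np.count_nonzero(p)             # how much of the tail survives
    print(f"generation {gen}: surviving vocabulary = {support:5d} / {V}")
    p = refit(p, n_samples=100_000)
# Each generation trained on the previous generation's samples loses more of
# the rare-token tail; since neural scaling laws hinge on such heavy tails,
# their progressive truncation is one way synthetic data can shift or break
# the scaling law.
```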
Individual Brain Charting dataset extension, third release for movie watching and retinotopy data
Ana Luísa Pinho
Hugo Richard
Ana Fernanda Ponce
Michael Eickenberg
Alexis Amadon
Isabelle Denghien
Juan Jesús Torre
Swetha Shankar
Himanshu Aggarwal
Alexis Thual
Thomas Chapalain
Chantal Ginisty
Séverine Becuwe-Desmidt
Séverine Roger
Yann Lecomte
Valérie Berland
Laurence Laurier
Véronique Joly-Testault
Gaëlle Médiouni-Cloarec … (see 6 more)
Christine Doublé
Bernadette Martins
Gael Varoquaux
Stanislas Dehaene
Lucie Hertz-Pannier
Bertrand Thirion