Automated UML Visualization of Software Ecosystems: Tracking Versions, Dependencies, and Security Updates
Vanessa Kan
M. P. Lnu
Solomon Berhe
C. El Kari
Marc Maynard
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification
Yunzhen Feng
Pu Yang
Francois Charton
Julia Kempe
Large Language Models (LLMs) are increasingly trained on data generated by other LLMs, either because generated text and images become part of the pre-training corpus, or because synthesized data is used as a replacement for expensive human annotation. This raises concerns about model collapse, a drop in model performance when training sets include generated data. Considering that it is easier for both humans and machines to tell good examples from bad ones than to generate high-quality samples, we investigate the use of verification on synthesized data to prevent model collapse. We provide a theoretical characterization using Gaussian mixtures, linear classifiers, and linear verifiers to derive conditions with measurable proxies to assess whether the verifier can effectively select synthesized data that leads to optimal performance. We experiment with two practical tasks, computing matrix eigenvalues with transformers and news summarization with LLMs, both of which exhibit model collapse when trained on generated data, and show that verifiers, even imperfect ones, can indeed be harnessed to prevent model collapse and that our proposed proxy measure strongly correlates with performance.
Body size and intracranial volume interact with the structure of the central nervous system: A multi-center in vivo neuroimaging study
René Labounek
Monica T. Bondy
Amy L. Paulson
Sandrine Bédard
Mihael Abramovic
Eva Alonso‐Ortiz
Nicole Atcheson
Laura R. Barlow
Robert L. Barry
Markus Barth
Marco Battiston
Christian Büchel
Matthew D. Budde
Virginie Callot
Anna Combes
Benjamin De Leener
Maxime Descoteaux
Paulo Loureiro de Sousa
Marek Dostál
Julien Doyon
Adam Dvorak
Falk Eippert
Karla R. Epperson
Kevin S. Epperson
Patrick Freund
Jürgen Finsterbusch
Alexandru Foias
Michela Fratini
Issei Fukunaga
Claudia A. M. Gandini Wheeler-Kingshott
Giancarlo Germani
Guillaume Gilbert
Federico Giove
Francesco Grussu
Akifumi Hagiwara
Pierre-Gilles Henry
Tomáš Horák
Masaaki Hori
James Joers
Kouhei Kamiya
Haleh Karbasforoushan
Miloš Keřkovský
Ali Khatibi
Joo-won Kim
Nawal Kinany
Hagen H. Kitzler
Shannon Kolind
Yazhuo Kong
Petr Kudlička
Paul Kuntke
Nyoman D. Kurniawan
Slawomir Kusmia
Maria Marcella Lagana
Cornelia Laule
Christine S. W. Law
Csw Law
Tobias Leutritz
Yaou Liu
Sara Llufriu
Sean Mackey
Allan R. Martin
Eloy Martinez-Heras
Loan Mattera
Kristin P. O’Grady
Nico Papinutto
Daniel Papp
Deborah Pareto
Todd B. Parrish
Anna Pichiecchio
Ferran Prados
Àlex Rovira
Marc J. Ruitenberg
Rebecca S. Samson
Giovanni Savini
Maryam Seif
Alan C. Seifert
Alex K. Smith
Seth Aaron Smith
Zachary A. Smith
Elisabeth Solana
Yuichi Suzuki
George Tackley
Alexandra Tinnermann
Jan Valošek
Dimitri Van De Ville
Marios C. Yiannakas
Kenneth A. Weber
Nikolaus Weiskopf
Richard G. Wise
Patrik O. Wyss
Junqian Xu
Christophe Lenglet
Igor Nestrašil
Changing Students' View of Accounting Professions: The Effects of Management Simulation
Yann QUÉMÉNER
Accounting often unfairly carries a dull and boring image among the general public and among young students choosing their course of study. In this article, we examine the effect of teaching practices on students' perception of the soft skills expected by employers. To do so, we conduct a quasi-experiment in which we compare students' perceptions depending on whether the course was taught in a traditional format (applying knowledge through exercises corrected by the instructor) or as a management simulation (applying knowledge to make decisions and run a fictitious company). The results show that a management simulation, more than traditional tutorials, gives first-time accounting learners a better perception of the soft skills expected by practitioners and recruiters. Our results underline the importance of giving a realistic representation of the profession, far from the clichés, in order to make accounting programs more attractive.
Child- and Proxy-Reported Differences in Patient-Reported Outcome and Experience Measures in Pediatric Surgery: Systematic Review and Meta-Analysis
Zanib Nafees
Siena O’Neill
Alexandra Dimmer
Elena Guadagno
Julia Ferreira
Nancy Mayo
Ctrl-V: Higher Fidelity Autonomous Vehicle Video Generation with Bounding-Box Controlled Object Motion
Ge Ya Luo
Zhi Hao Luo
Anthony Gosselin
Alexia Jolicoeur-Martineau
Deflated Dynamics Value Iteration
Jongmin Lee
Amin Rakhsha
Ernest K. Ryu
Amir-massoud Farahmand
The Value Iteration (VI) algorithm is an iterative procedure to compute the value function of a Markov decision process, and is the basis of many reinforcement learning (RL) algorithms as well. As the error convergence rate of VI as a function of iteration
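For readers unfamiliar with the baseline this entry builds on, the following is a minimal sketch of standard value iteration only. It is not the deflated variant proposed in the paper. The function name `value_iteration`, the data layout for `P` and `R`, and the two-state example MDP are all hypothetical choices made for illustration.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Standard value iteration on a small tabular MDP.

    Assumed (hypothetical) layout:
      P[s][a] is a list of (probability, next_state) pairs,
      R[s][a] is the immediate reward for taking action a in state s.
    """
    n_states = len(P)
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman optimality backup for every state.
        V_new = [
            max(
                R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in range(len(P[s]))
            )
            for s in range(n_states)
        ]
        # Stop once successive iterates are close in the sup norm.
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new
        V = V_new
    return V


# Hypothetical two-state MDP: state 0 can stay (reward 0) or move to
# state 1 (reward 1); state 1 loops on itself with reward 2.
P = [
    [[(1.0, 0)], [(1.0, 1)]],
    [[(1.0, 1)]],
]
R = [
    [0.0, 1.0],
    [2.0],
]
V = value_iteration(P, R, gamma=0.5)
```

With a discount factor of 0.5, the fixed point of this example can be checked by hand: state 1 satisfies V = 2 + 0.5 V, and state 0 takes the better of staying or moving.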
A Distributed ADMM-based Deep Learning Approach for Thermal Control in Multi-Zone Buildings
Vincent Taboga
The surge in electricity use, coupled with the dependency on intermittent renewable energy sources, poses significant hurdles to effectively managing power grids, particularly during times of peak demand. Demand Response programs and energy conservation measures are essential to operate energy grids while ensuring a responsible use of our resources. This research combines distributed optimization using ADMM with Deep Learning models to plan indoor temperature setpoints effectively. A two-layer hierarchical structure is used, with a central building coordinator at the upper layer and local controllers at the thermal zone layer. The coordinator must limit the building's maximum power by translating the building's total power to local power targets for each zone. Local controllers can modify the temperature setpoints to meet the local power targets. The resulting control algorithm, called Distributed Planning Networks, is designed to be both adaptable and scalable to many types of buildings, tackling two of the main challenges in the development of such systems. The proposed approach is tested on an 18-zone building modeled in EnergyPlus. The algorithm successfully manages Demand Response peak events.
A Distributed ADMM-Based Deep Learning Approach for Thermal Control in Multi-Zone Buildings Under Demand Response Events
Vincent Taboga
An Effective Theory of Bias Amplification
Arjun Subramonian
Samuel J. Bell
Levent Sagun
Machine learning models may capture and amplify biases present in data, leading to disparate test performance across social groups. To better understand, evaluate, and mitigate these possible biases, a deeper theoretical understanding of how model design choices and data distribution properties could contribute to bias is needed. In this work, we contribute a precise analytical theory in the context of ridge regression, both with and without random projections, where the former models neural networks in a simplified regime. Our theory offers a unified and rigorous explanation of machine learning bias, providing insights into phenomena such as bias amplification and minority-group bias in various feature and parameter regimes. For example, we demonstrate that there may be an optimal regularization penalty or training time to avoid bias amplification, and there can be fundamental differences in test error between groups that do not vanish with increased parameterization. Importantly, our theoretical predictions align with several empirical observations reported in the literature. We extensively empirically validate our theory on diverse synthetic and semi-synthetic datasets.
Efficient Deep Reinforcement Learning-Based Supplementary Damping Control with a Coordinated RMS Training and EMT Testing Scheme
Tao Xue
Mingxuan Zhao
Ilhan Kocar
Mohsen Ghafouri
Siqi Bu
Ziqing Zhu