Publications

Automated UML Visualization of Software Ecosystems: Tracking Versions, Dependencies, and Security Updates

Vanessa Kan

M. P. Lnu

Solomon Berhe

C. El Kari

Marc Maynard

Foutse Khomh

2025-01-01

ANT/EDI40 (published)

doi.org

Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification

Yunzhen Feng

Elvis Dohmatob

Pu Yang

Francois Charton

Julia Kempe

Large Language Models (LLM) are increasingly trained on data generated by other LLM, either because generated text and images become part of… (see more) the pre-training corpus, or because synthetized data is used as a replacement for expensive human-annotation. This raises concerns about \emph{model collapse}, a drop in model performance when their training sets include generated data. Considering that it is easier for both humans and machines to tell between good and bad examples than to generate high-quality samples, we investigate the use of verification on synthesized data to prevent model collapse. We provide a theoretical characterization using Gaussian mixtures, linear classifiers, and linear verifiers to derive conditions with measurable proxies to assess whether the verifier can effectively select synthesized data that leads to optimal performance. We experiment with two practical tasks -- computing matrix eigenvalues with transformers and news summarization with LLMs -- which both exhibit model collapse when trained on generated data, and show that verifiers, even imperfect ones, can indeed be harnessed to prevent model collapse and that our proposed proxy measure strongly correlates with performance.

2025-01-01

ICLR (published)

doi.org

arxiv.org

Body size and intracranial volume interact with the structure of the central nervous system: A multi-center in vivo neuroimaging study

René Labounek

Monica T. Bondy

Amy L. Paulson

Sandrine Bédard

Mihael Abramovic

Eva Alonso‐Ortiz

Nicole Atcheson

Laura R. Barlow

Robert L. Barry

Markus Barth

Marco Battiston

Christian Büchel

Matthew D. Budde

Virginie Callot

Anna Combes

Benjamin De Leener

Maxime Descoteaux

Paulo Loureiro de Sousa

Marek Dostál

Julien Doyon … (see 74 more)

Adam Dvorak

Falk Eippert

Karla R. Epperson

Kevin S. Epperson

Patrick Freund

Jürgen Finsterbusch

Alexandru Foias

Michela Fratini

Issei Fukunaga

Claudia A. M. Gandini Wheeler-Kingshott

Giancarlo Germani

Guillaume Gilbert

Federico Giove

Francesco Grussu

Akifumi Hagiwara

Pierre-Gilles Henry

Tomáš Horák

Masaaki Hori

James Joers

Kouhei Kamiya

Haleh Karbasforoushan

Miloš Keřkovský

Ali Khatibi

Joo-won Kim

Nawal Kinany

Hagen H. Kitzler

Shannon Kolind

Yazhuo Kong

Petr Kudlička

Paul Kuntke

Nyoman D. Kurniawan

Slawomir Kusmia

Maria Marcella Lagana

Cornelia Laule

Christine S. W. Law

Csw Law

Tobias Leutritz

Yaou Liu

Sara Llufriu

Sean Mackey

Allan R. Martin

Eloy Martinez-Heras

Loan Mattera

Kristin P. O’Grady

Nico Papinutto

Daniel Papp

Deborah Pareto

Todd B. Parrish

Anna Pichiecchio

Ferran Prados

Àlex Rovira

Marc J. Ruitenberg

Rebecca S. Samson

Giovanni Savini

Maryam Seif

Alan C. Seifert

Alex K. Smith

Seth Aaron Smith

Zachary A. Smith

Elisabeth Solana

Yuichi Suzuki

George Tackley

Alexandra Tinnermann

Jan Valošek

Dimitri Van De Ville

Marios C. Yiannakas

Kenneth A. Weber

Nikolaus Weiskopf

Richard G. Wise

Patrik O. Wyss

Junqian Xu

Julien Cohen-Adad

Christophe Lenglet

Igor Nestrašil

2025-01-01

Imaging Neuroscience (published)

doi.org

Changer le regard des étudiants sur les métiers de la comptabilité : Les effets de la simulation de gestion

Guillaume Dumas

Yann QUÉMÉNER

La comptabilité véhicule souvent injustement, une image terne et ennuyeuse, auprès du grand public et des jeunes étudiants choisissant l… (see more)eur orientation. Dans cet article, nous questionnons l’effet de pratiques pédagogiques sur la perception par les étudiants, des soft skills attendues par les employeurs. Pour cela nous réalisons une quasi-expérimentation dans laquelle nous comparons les perceptions des étudiants selon que le cours ait été animé sous un format classique (application des connaissances par le biais d’exercices avec corrigé par l’enseignant) ou sous la forme d’une simulation de gestion (application des connaissances en vue de prendre des décisions et piloter une entreprise fictive). Les résultats de la recherche montrent qu’une simulation de gestion, plus que les travaux dirigés classiques, permettent aux primo-apprenants en comptabilité, d’avoir une meilleure perception des soft skills attendues par les praticiens et les recruteurs. Nos résultats rappellent l’importance de donner une représentation réaliste (éloignée des clichés) de la profession, afin de rendre les filières d’enseignement de la comptabilité plus attractives.

2025-01-01

Finance Contrôle Stratégie (published)

doi.org

Child- and Proxy-Reported Differences in Patient-Reported Outcome and Experience Measures in Pediatric Surgery: Systematic Review and Meta-Analysis

Zanib Nafees

Siena O’Neill

Alexandra Dimmer

Elena Guadagno

Julia Ferreira

Nancy Mayo

Dan Poenaru

2025-01-01

Journal of Pediatric Surgery (published)

doi.org

Child- and Proxy-reported Differences in Patient-reported Outcome and Experience Measures in Pediatric Surgery: Systematic Review and Meta-analysis

Zanib Nafees

Siena O'Neill

Alexandra Dimmer

Elena Guadagno

Julia Ferreira

Nancy Mayo

Dan Poenaru

2025-01-01

Journal of Pediatric Surgery (published)

doi.org

Ctrl-V: Higher Fidelity Autonomous Vehicle Video Generation with Bounding-Box Controlled Object Motion

Ge Ya Luo

Zhi Hao Luo

Anthony Gosselin

Alexia Jolicoeur-Martineau

Chris Pal

2025-01-01

Trans. Mach. Learn. Res. (published)

openreview.net

Deflated Dynamics Value Iteration

Jongmin Lee

Amin Rakhsha

Ernest K. Ryu

Amir-massoud Farahmand

The Value Iteration (VI) algorithm is an iterative procedure to compute the value function of a Markov decision process, and is the basis of… (see more) many reinforcement learning (RL) algorithms as well. As the error convergence rate of VI as a function of iteration

2025-01-01

Trans. Mach. Learn. Res. (published)

doi.org

arxiv.org

A Distributed ADMM-based Deep Learning Approach for Thermal Control in Multi-Zone Buildings

Vincent Taboga

Hanane Dagdougui

The surge in electricity use, coupled with the dependency on intermittent renewable energy sources, poses significant hurdles to effectively… (see more) managing power grids, particularly during times of peak demand. Demand Response programs and energy conservation measures are essential to operate energy grids while ensuring a responsible use of our resources This research combines distributed optimization using ADMM with Deep Learning models to plan indoor temperature setpoints effectively. A two-layer hierarchical structure is used, with a central building coordinator at the upper layer and local controllers at the thermal zone layer. The coordinator must limit the building's maximum power by translating the building's total power to local power targets for each zone. Local controllers can modify the temperature setpoints to meet the local power targets. The resulting control algorithm, called Distributed Planning Networks, is designed to be both adaptable and scalable to many types of buildings, tackling two of the main challenges in the development of such systems. The proposed approach is tested on an 18-zone building modeled in EnergyPlus. The algorithm successfully manages Demand Response peak events.

2025-01-01

IEEE Transactions on Automation Science and Engineering (published)

doi.org

arxiv.org

A Distributed ADMM-Based Deep Learning Approach for Thermal Control in Multi-Zone Buildings Under Demand Response Events.

Vincent Taboga

Hanane Dagdougui

2025-01-01

IEEE Trans Autom. Sci. Eng. (published)

doi.org

arxiv.org

An Effective Theory of Bias Amplification

Arjun Subramonian

Samuel J. Bell

Levent Sagun

Elvis Dohmatob

Machine learning models may capture and amplify biases present in data, leading to disparate test performance across social groups. To bette… (see more)r understand, evaluate, and mitigate these possible biases, a deeper theoretical understanding of how model design choices and data distribution properties could contribute to bias is needed. In this work, we contribute a precise analytical theory in the context of ridge regression, both with and without random projections, where the former models neural networks in a simplified regime. Our theory offers a unified and rigorous explanation of machine learning bias, providing insights into phenomena such as bias amplification and minority-group bias in various feature and parameter regimes. For example, we demonstrate that there may be an optimal regularization penalty or training time to avoid bias amplification, and there can be fundamental differences in test error between groups that do not vanish with increased parameterization. Importantly, our theoretical predictions align with several empirical observations reported in the literature. We extensively empirically validate our theory on diverse synthetic and semi-synthetic datasets.

2025-01-01

ICLR (published)

doi.org

arxiv.org

Efficient Deep Reinforcement Learning-Based Supplementary Damping Control with a Coordinated RMS Training and EMT Testing Scheme

Tao Xue

Mingxuan Zhao

Ilhan Kocar

Mohsen Ghafouri

Antoine Lesage-Landry

Siqi Bu

Ziqing Zhu

2025-01-01

IEEE Transactions on Power Delivery (published)

doi.org

AI Advantage

Leveraging AI for a Sustainable Future

Mila AI Policy Fellowship

AI Advantage

Leveraging AI for a Sustainable Future

Publications

AI Advantage

Leveraging AI for a Sustainable Future

Mila AI Policy Fellowship

AI Advantage

Leveraging AI for a Sustainable Future

Popular keywords:

Publications