Publications

Sound and Modular Activity Analysis for Automatic Differentiation in MLIR

Mai Jacob Peng

William S. Moses

Oleksandr Zinenko

Christophe Dubach

2025-10-09

Proceedings of the ACM on Programming Languages (published)

doi.org

Wavefunction Flows: Efficient Quantum Simulation of Continuous Flow Models

David Layden

Ryan Sweke

Vojtěch Havlíček

Anirban Chowdhury

Kirill Neklyudov

Flow models are a cornerstone of modern machine learning. They are generative models that progressively transform probability distributions according to learned dynamics. Specifically, they learn a continuous-time Markov process that efficiently maps samples from a simple source distribution into samples from a complex target distribution. We show that these models are naturally related to the Schrödinger equation, for an unusual Hamiltonian on continuous variables. Moreover, we prove that the dynamics generated by this Hamiltonian can be efficiently simulated on a quantum computer. Together, these results give a quantum algorithm for preparing coherent encodings (a.k.a. qsamples) for a vast family of probability distributions, namely those expressible by flow models, by reducing the task to an existing classical learning problem, plus Hamiltonian simulation. For statistical problems defined by flow models, such as mean estimation and property testing, this enables the use of quantum algorithms tailored to qsamples, which may offer advantages over classical algorithms based only on samples from a flow model. More broadly, these results reveal a close connection between state-of-the-art machine learning models, such as flow matching and diffusion models, and one of the main expected capabilities of quantum computers: simulating quantum dynamics.

2025-10-09

ArXiv (preprint)

arxiv.org

Comparison of Speech Tasks in Human Expert and Machine Detection of Parkinson's Disease

Peter William VanHarn Plantinga

Roozbeh Sattari

Karine Marcotte

Carla Di Gironimo

Madeleine Sharp

Liziane Bouvier

Maiya Geddes

Ingrid Verduyckt

Étienne de Villers-Sidani

Mirco Ravanelli

Denise Klein

The speech of people with Parkinson's Disease (PD) has been shown to hold important clues about the presence and progression of the disease. We investigate the factors on which human experts base their judgments of the presence of disease in speech samples across five different speech tasks: phonations, sentence repetition, reading, recall, and picture description. We make comparisons by conducting listening tests to determine clinicians' accuracy at recognizing signs of PD from audio alone, and we conduct experiments with a machine learning system for detection based on Whisper. Across tasks, Whisper performs on par with or better than human experts when only audio is available, especially on challenging but important subgroups of the data: younger patients, mild cases, and female patients. Whisper's ability to recognize acoustic cues in difficult cases complements the multimodal and contextual strengths of human experts.

2025-10-08

ArXiv (preprint)

arxiv.org

High-Rate Mixout: Revisiting Mixout for Robust Domain Generalization

Masih Aminbeidokhti

Heitor Rapela Medeiros

Srikanth Muralidharan

Eric Granger

Marco Pedersoli

Ensembling fine-tuned models initialized from powerful pre-trained weights is a common strategy to improve robustness under distribution shifts, but it comes with substantial computational costs due to the need to train and store multiple models. Dropout offers a lightweight alternative by simulating ensembles through random neuron deactivation; however, when applied to pre-trained models, it tends to over-regularize and disrupt critical representations necessary for generalization. In this work, we investigate Mixout, a stochastic regularization technique that provides an alternative to Dropout for domain generalization. Rather than deactivating neurons, Mixout mitigates overfitting by probabilistically swapping a subset of fine-tuned weights with their pre-trained counterparts during training, thereby maintaining a balance between adaptation and retention of prior knowledge. Our study reveals that achieving strong performance with Mixout on domain generalization benchmarks requires a notably high masking probability of 0.9 for ViTs and 0.8 for ResNets. While this may seem like a simple adjustment, it yields two key advantages for domain generalization: (1) higher masking rates more strongly penalize deviations from the pre-trained parameters, promoting better generalization to unseen domains; and (2) high-rate masking substantially reduces computational overhead, cutting gradient computation by up to 45% and gradient memory usage by up to 90%. Experiments across five domain generalization benchmarks, PACS, VLCS, OfficeHome, TerraIncognita, and DomainNet, using ResNet and ViT architectures, show that our approach, High-Rate Mixout, achieves out-of-domain accuracy comparable to ensemble-based methods while significantly reducing training costs.

2025-10-08

ArXiv (preprint)

arxiv.org
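
The weight swap described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mixout_swap`, the flat-list weight representation, and the toy values are assumptions made for illustration, and any rescaling or gradient handling a full implementation might apply is omitted.

```python
import random

def mixout_swap(finetuned, pretrained, p, rng):
    """Return a copy of `finetuned` in which each weight is replaced by its
    pretrained counterpart with probability p (the masking probability).
    Hypothetical helper for illustration only."""
    if len(finetuned) != len(pretrained):
        raise ValueError("weight lists must have the same length")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    mixed = []
    for w_ft, w_pre in zip(finetuned, pretrained):
        # rng.random() < p holds with probability p: swap in the pretrained value
        mixed.append(w_pre if rng.random() < p else w_ft)
    return mixed

rng = random.Random(0)
pretrained = [0.0] * 1000   # toy stand-in for pretrained weights
finetuned = [1.0] * 1000    # toy stand-in for fine-tuned weights
mixed = mixout_swap(finetuned, pretrained, p=0.9, rng=rng)

# With p = 0.9, roughly one weight in ten keeps its fine-tuned value.
kept_fraction = sum(mixed) / len(mixed)
print(f"fraction of fine-tuned weights kept: {kept_fraction:.3f}")
```

At the masking probability of 0.9 that the abstract reports for ViTs, only about a tenth of the weights carry their fine-tuned value in any given draw, which is consistent with the reduced gradient cost the abstract mentions.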

Online HD-tRNS over the right temporoparietal junction modulates social inference but not motor coordination

Quentin Moreau

Vincent Chamberland

Lisane Moses

Gabriela Milanova

Guillaume Dumas

2025-10-08

eNeuro (published)

doi.org

Revisiting Mixout: An Overlooked Path to Robust Finetuning

Masih Aminbeidokhti

Heitor Rapela Medeiros

Eric Granger

Marco Pedersoli

Finetuning vision foundation models often improves in-domain accuracy but comes at the cost of robustness under distribution shift. We revisit Mixout, a stochastic regularizer that intermittently replaces finetuned weights with their pretrained reference, through the lens of a single-run, weight-sharing implicit ensemble. This perspective reveals three key levers that govern robustness: the masking anchor, resampling frequency, and mask sparsity. Guided by this analysis, we introduce GMixout, which (i) replaces the fixed anchor with an exponential moving-average snapshot that adapts during training, and (ii) regulates the masking period via an explicit resampling-frequency hyperparameter. Our sparse-kernel implementation updates only a small fraction of parameters with no inference-time overhead, enabling training on consumer-grade GPUs. In experiments on benchmarks covering covariate shift, corruption, and class imbalance (ImageNet / ImageNet-LT, DomainNet, iWildCam, and CIFAR100-C), GMixout consistently improves in-domain accuracy beyond zero-shot performance while surpassing both Model Soups and strong parameter-efficient finetuning baselines under distribution shift.

2025-10-08

ArXiv (preprint)

arxiv.org
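
The two mechanisms named in the abstract above, an exponential moving-average anchor and a mask that is resampled at a fixed period, can be sketched as follows. This is a rough illustration under stated assumptions rather than the GMixout algorithm itself: the helper names, the decay of 0.9, the mask probability of 0.5, the period `resample_every`, and the placeholder "gradient step" are all hypothetical and do not come from the source.

```python
import random

def ema_update(anchor, weights, decay):
    """Move the anchor toward the current weights (exponential moving average)."""
    return [decay * a + (1.0 - decay) * w for a, w in zip(anchor, weights)]

def sample_mask(n, p, rng):
    """Draw a mask; True marks a position that takes the anchor value."""
    return [rng.random() < p for _ in range(n)]

def apply_mask(weights, anchor, mask):
    """Use the anchor value where the mask is True, the current weight elsewhere."""
    return [a if m else w for w, a, m in zip(weights, anchor, mask)]

rng = random.Random(0)
pretrained = [1.0, 2.0, 3.0, 4.0]   # toy stand-in for pretrained weights
weights = list(pretrained)          # finetuning starts from the pretrained weights
anchor = list(pretrained)           # the anchor also starts there, then adapts
resample_every = 4                  # hypothetical resampling-frequency setting
mask, resamples = None, 0

for step in range(10):
    if step % resample_every == 0:  # redraw the mask only every few steps
        mask = sample_mask(len(weights), p=0.5, rng=rng)
        resamples += 1
    effective = apply_mask(weights, anchor, mask)    # weights a forward pass would use
    weights = [w + 0.1 for w in weights]             # placeholder for a gradient step
    anchor = ema_update(anchor, weights, decay=0.9)  # anchor follows the weights slowly
```

In this sketch the anchor lags behind the moving weights instead of staying fixed at the pretrained values, and the mask changes only at steps 0, 4, and 8.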
