Publications

Towards Enhancing the Reproducibility of Deep Learning Bugs: An Empirical Study

Mehil B. Shah

Mohammad Masudur Rahman

2024-11-08

Empirical Software Engineering (published)

A new species of Hoplostethus from Sumatra, eastern Indian Ocean, with comments on its most similar congeners (Trachichthyiformes: Trachichthyidae).

Yo Su

Alexander N. Kotlyar

Hsiu-Chin Lin

Toshio Kawai

Hsuan-Ching Ho

2024-11-07

Journal of Fish Biology (published)

doi.org

Unlearning in- vs. out-of-distribution data in LLMs under gradient-based method

Teodora Baluta

Pascal Lamblin

Daniel Tarlow

Fabian Pedregosa

Gintare Karolina Dziugaite

Machine unlearning aims to solve the problem of removing the influence of selected training examples from a learned model. Despite the increasing attention to this problem, it remains an open research question how to evaluate unlearning in large language models (LLMs), and which properties of the data to be unlearned are critical to the quality and efficiency of unlearning. This work formalizes a metric to evaluate unlearning quality in generative models, and uses it to assess the trade-offs between unlearning quality and performance. We demonstrate that unlearning out-of-distribution examples requires more unlearning steps but presents a better trade-off overall. For in-distribution examples, however, we observe a rapid decay in performance as unlearning progresses. We further evaluate how an example's memorization and difficulty affect unlearning under a classical gradient ascent-based approach.

2024-11-06

ArXiv (preprint)

doi.org

arxiv.org

CSGraph2Vec: Distributed Graph-Based Representation Learning for Assembly Functions

Wael J. Alhashemi

Benjamin C. M. Fung

Adel Abusitta

Claude Fachkha

2024-11-05

2024 IEEE International Conference on Recent Advances in Systems Science and Engineering (RASSE) (published)

doi.org

GAPS Phase III: incorporation of capacity based weighting in the global assessment for pediatric surgery

Yasmine Yousef

Emmanuel Ameh

Luc Malemo Kalisya

Dan Poenaru

2024-11-05

Pediatric Surgery International (published)

doi.org

Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset

Alexandre Galashov

Michalis K. Titsias

András György

Clare Lyle

Razvan Pascanu

Yee Whye Teh

Maneesh Sahani

Neural networks are traditionally trained under the assumption that data come from a stationary distribution. However, settings which violate this assumption are becoming more popular; examples include supervised learning under distributional shifts, reinforcement learning, continual learning and non-stationary contextual bandits. In this work we introduce a novel learning approach that automatically models and adapts to non-stationarity, via an Ornstein-Uhlenbeck process with an adaptive drift parameter. The adaptive drift tends to draw the parameters towards the initialisation distribution, so the approach can be understood as a form of soft parameter reset. We show empirically that our approach performs well in non-stationary supervised and off-policy reinforcement learning settings.

2024-11-05

ArXiv (preprint)

doi.org

arxiv.org
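
As an illustration of the "soft parameter reset" idea in the abstract above, the following is a minimal sketch of one discretised Ornstein-Uhlenbeck step that pulls parameters back towards an initialisation distribution. It is not the authors' algorithm: the function name `soft_reset`, the Gaussian initialisation with `init_mean` and `init_std`, and the fixed drift `gamma` are all assumptions made for this example (the abstract describes an adaptive drift, which is not modelled here).

```python
import math
import random


def soft_reset(params, init_mean, init_std, gamma, rng):
    """One mean-reverting (discretised Ornstein-Uhlenbeck) step.

    gamma = 1.0 leaves the parameters untouched; gamma = 0.0 redraws them
    from the assumed initialisation distribution N(init_mean, init_std**2).
    Values in between interpolate, which is the "soft" part of the reset.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    # This noise scale keeps N(init_mean, init_std**2) as the stationary
    # distribution of the repeated update.
    noise_scale = math.sqrt(1.0 - gamma ** 2) * init_std
    return [
        gamma * p + (1.0 - gamma) * init_mean + noise_scale * rng.gauss(0.0, 1.0)
        for p in params
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [1.5, -2.0, 0.3]
    # Hypothetical fixed drift; an adaptive method would choose this from data.
    print(soft_reset(weights, init_mean=0.0, init_std=0.1, gamma=0.9, rng=rng))
```

In this sketch a drift close to 1 corresponds to a nearly stationary setting, while a smaller drift moves the parameters more strongly towards their initialisation.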

SCIseg: Automatic Segmentation of Intramedullary Lesions in Spinal Cord Injury on T2-weighted MRI Scans

Enamundram Naga Karthik

Jan Valosek

Andrew C. Smith

Dario Pfyffer

Simon Schading-Sassenhausen

Lynn Farner

Kenneth A. Weber

Patrick Freund

Julien Cohen-Adad

The proposed deep learning model accurately segmented the spinal cord and spinal cord injury lesions in a diverse, multicenter dataset of T2-weighted MRI scans.

2024-11-05

Radiology: Artificial Intelligence (published)

doi.org

Spinal cord evaluation in multiple sclerosis: clinical and radiological associations, present and future

B Mark Keegan

Martina Absinta

Julien Cohen-Adad

Eoin P Flanagan

Roland G Henry

Eric C Klawiter

Shannon Kolind

Stephen Krieger

Cornelia Laule

John A Lincoln

Steven Messina

Jiwon Oh

Nico Papinutto

Seth Aaron Smith

Anthony Traboulsee

2024-11-05

Brain Communications (published)

doi.org

Towards Optimizing SQL Generation via LLM Routing

Mohammadhossein Malekpour

Nour Shaheen

Foutse Khomh

Amine Mhedhbi

Text-to-SQL enables users to interact with databases through natural language, simplifying access to structured data. Although highly capable large language models (LLMs) achieve strong accuracy for complex queries, they incur unnecessary latency and dollar cost for simpler ones. In this paper, we introduce the first LLM routing approach for Text-to-SQL, which dynamically selects the most cost-effective LLM capable of generating accurate SQL for each query. We present two routing strategies (score- and classification-based) that achieve accuracy comparable to the most capable LLM while reducing costs. We design the routers for ease of training and efficient inference. In our experiments, we highlight a practical and explainable accuracy-cost trade-off on the BIRD dataset.

2024-11-05

ArXiv (preprint)

doi.org

arxiv.org
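
The score-based routing strategy mentioned in the abstract above can be pictured with a small sketch: try models from cheapest to most expensive and pick the first one whose predicted success score clears a threshold. This is an assumption-laden illustration, not the paper's router: `Model`, `route`, `toy_score`, the cost values, the threshold, and the word-count difficulty heuristic are all hypothetical stand-ins for a trained scorer.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Model:
    name: str
    cost: float  # hypothetical relative cost per query


def route(query: str,
          models: Sequence[Model],
          score: Callable[[str, Model], float],
          threshold: float = 0.5) -> Model:
    """Return the cheapest model whose score for this query meets the threshold.

    Falls back to the most expensive model when no score is high enough.
    """
    ranked = sorted(models, key=lambda m: m.cost)
    for model in ranked:
        if score(query, model) >= threshold:
            return model
    return ranked[-1]


def toy_score(query: str, model: Model) -> float:
    # Hypothetical scorer: treats longer questions as harder. A real router
    # would use a learned predictor of SQL correctness instead.
    difficulty = min(len(query.split()) / 20.0, 1.0)
    capability = {"small": 0.4, "medium": 0.7, "large": 1.0}[model.name]
    return 1.0 if capability >= difficulty else 0.0


MODELS = [Model("large", 20.0), Model("small", 1.0), Model("medium", 5.0)]

if __name__ == "__main__":
    print(route("list all users", MODELS, toy_score).name)
```

Raising the threshold sends more queries to the expensive model, which is one simple way to expose an accuracy-cost trade-off.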

GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models

Terry Yue Zhuo

Massimo Caccia

2024-11-04

arXiv (preprint)

doi.org

arxiv.org

Temporal Residual Jacobians For Rig-free Motion Transfer

Sanjeev Muralikrishnan

Niladri Dutt

Siddhartha Chaudhuri

Noam Aigerman

Vladimir Kim

Matthew Fisher

Niloy J. Mitra

We introduce Temporal Residual Jacobians as a novel representation to enable data-driven motion transfer. Our approach does not assume access to any rigging or intermediate shape keyframes, produces geometrically and temporally consistent motions, and can be used to transfer long motion sequences. Central to our approach are two coupled neural networks that individually predict local geometric and temporal changes that are subsequently integrated, spatially and temporally, to produce the final animated meshes. The two networks are jointly trained, complement each other in producing spatial and temporal signals, and are supervised directly with 3D positional information. During inference, in the absence of keyframes, our method essentially solves a motion extrapolation problem. We test our setup on diverse meshes (synthetic and scanned shapes) to demonstrate its superiority in generating realistic and natural-looking animations on unseen body shapes against SoTA alternatives. Supplemental video and code are available at https://temporaljacobians.github.io/ .

2024-11-04

Lecture Notes in Computer Science (published)

doi.org

arxiv.org

Efficient Assignment with Time Constraints for Heterogeneous DSP Systems.

Jiajie Li

Christophe Dubach

Warren J. Gross

High-level synthesis (HLS) produces hardware automatically by scheduling and assigning resources based on an input control/data-flow graph. One particular aspect of HLS for the digital signal processing (DSP) architecture is the heterogeneous assignment problem (HAP), which maps operations into different types of functional units available in the electronic design automation tools to build efficient implementations. An optimal solution to this assignment problem can be found by formulating the problem as integer linear programming (ILP) and using a solver. However, given the slow nature of this process, heuristics tend to be used instead, leading to sub-optimal designs. This paper revisits the classical ILP formulation of the HAP with time constraints for the DSP architecture by identifying redundant constraints. This paper proves theoretically, and demonstrates experimentally, that removing these constraints does not affect the obtained solution. This technique achieves speedups of more than 100× in terms of runtime and reductions of more than 50× in terms of memory usage of the solver. Also, this work proposes an updated heuristic that keeps reducing the latency of a path instead of finding a new critical path after giving a new node assignment. Runtime reductions (more than another 10×) due to reduced numbers of critical path searches are observed while returning similar results.

2024-11-03

IEEE Workshop on Signal Processing Systems (published)

doi.org

Mila Ventures Launchpad

Mila on Udemy

AI Policy Fellowship Publications
