Eugene Belilovsky

Paul Janson

PhD - Concordia

Co-supervisor:

Charles-Etienne Joseph

Research Master's - UdeM

Co-supervisor:

Zafir Khalid

Research Master's - Concordia

Co-supervisor:

Irina Rish

Website

Gwen Legate

PhD - Concordia

Co-supervisor:

Research Master's - Concordia

Co-supervisor:

Melika Minaei Bidgoli

Research Intern - Concordia University

PhD - Concordia

Adel Nabli

PhD - Concordia

Google Scholar

Geraldin Nanfack

Postdoctorate - Concordia

Co-supervisor:

geraldin.nanfack@mila.quebec

Website

Google Scholar

Albert Orozco Camacho

PhD - Concordia

Co-supervisor:

PhD - Concordia

Co-supervisor:

Irina Rish

Benjamin Therien

PhD - UdeM

Principal supervisor:

Research Collaborator - UdeM

Principal supervisor:

Research Master's - Concordia

AmirHossein Zamani

PhD - Concordia

Co-supervisor:

Congshu Zou

Research Master's - Concordia

Publications

Generalization of deep learning models for hepatic steatosis grading using B-mode ultrasound images

Pedro Vianna

Yue Qi

Michael Chassé

An Tang

Guy Cloutier

2024-03-01

The Journal of the Acoustical Society of America (published)

Channel-Selective Normalization for Label-Shift Robust Test-Time Adaptation

Pedro Vianna

Muawiz Chaudhary

Paria Mehrbod

An Tang

Guy Cloutier

Michael Eickenberg

Deep neural networks have useful applications in many different tasks; however, their performance can be severely affected by changes in the data distribution. For example, in the biomedical field, their performance can be affected by changes in the data (different machines, populations) between training and test datasets. To ensure robustness and generalization to real-world scenarios, test-time adaptation (TTA) has recently been studied as an approach to adjust models to a new data distribution during inference. Test-time batch normalization is a simple and popular method that has achieved compelling performance on domain shift benchmarks. It is implemented by recalculating batch normalization statistics on test batches. Prior work has focused on analysis with test data that has the same label distribution as the training data. However, in many practical applications this technique is vulnerable to label distribution shifts, sometimes producing catastrophic failure. This presents a risk in applying test-time adaptation methods in deployment. We propose to tackle this challenge by selectively adapting only some channels in a deep network, minimizing drastic adaptation that is sensitive to label shifts. Our selection scheme is based on two principles that we empirically motivate: (1) later layers of networks are more sensitive to label shift; (2) individual features can be sensitive to specific classes. We apply the proposed technique to three classification tasks, including CIFAR10-C, ImageNet-C, and diagnosis of fatty liver, where we explore both covariate and label distribution shifts. We find that our method brings the benefits of TTA while significantly reducing the risk of failure common in other methods, and that it is robust to the choice of hyperparameters.
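As a rough illustration of the idea in this abstract, the sketch below recomputes batch normalization statistics on a test batch, but only for channels flagged for adaptation; the remaining channels keep their training statistics. The function names and the `adapt_mask` argument are hypothetical, and the mask is supplied by hand here: the paper's actual channel selection scheme is not reproduced.

```python
from statistics import fmean, pvariance


def adapted_bn_stats(batch, train_mean, train_var, adapt_mask):
    """Return per-channel (mean, var) to use at test time.

    batch: list of samples, each a list with one value per channel.
    adapt_mask[c] is True when channel c should use test-batch statistics
    (hypothetical input; the selection rule itself is not modelled here).
    """
    channels = list(zip(*batch))  # regroup values by channel
    mean, var = [], []
    for c, values in enumerate(channels):
        if adapt_mask[c]:
            # Test-time batch normalization: statistics from the test batch.
            mean.append(fmean(values))
            var.append(pvariance(values))
        else:
            # Non-adapted channel: keep the statistics stored at training time.
            mean.append(train_mean[c])
            var.append(train_var[c])
    return mean, var


def batch_norm(batch, mean, var, eps=1e-5):
    """Normalize every sample channel-wise with the given statistics."""
    return [
        [(x - m) / (v + eps) ** 0.5 for x, m, v in zip(sample, mean, var)]
        for sample in batch
    ]


batch = [[1.0, 10.0], [3.0, 14.0]]
mean, var = adapted_bn_stats(batch, [0.0, 0.0], [1.0, 1.0], [True, False])
normalized = batch_norm(batch, mean, var)
```

In this toy call, channel 0 is re-centred on the test batch while channel 1 still uses its training statistics.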

2024-02-07

ArXiv (preprint)

Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks

MohammadReza Davari

The rapid development of AI systems has been greatly influenced by the emergence of foundation models. A common approach for targeted problems involves fine-tuning these pre-trained foundation models for specific target tasks, resulting in a rapid spread of models fine-tuned across a diverse array of tasks. This work focuses on the problem of merging multiple fine-tunings of the same foundation model derived from a spectrum of auxiliary tasks. We introduce a new simple method, Model Breadcrumbs, which consists of a sparsely defined weight set that guides model adaptation within the weight space of a pre-trained model. These breadcrumbs are constructed by subtracting the weights from a pre-trained model before and after fine-tuning, followed by a sparsification process that eliminates weight outliers and negligible perturbations. Our experiments demonstrate the effectiveness of Model Breadcrumbs to simultaneously improve performance across multiple tasks. This contribution aligns with the evolving paradigm of updatable machine learning, reminiscent of the collaborative principles underlying open-source software development, fostering a community-driven effort to reliably update machine learning models. Our method is shown to be more efficient and unlike previous proposals does not require hyperparameter tuning for each new task added. Through extensive experimentation involving various models, tasks, and modalities we establish that integrating Model Breadcrumbs offers a simple, efficient, and highly effective approach for constructing multi-task models and facilitating updates to foundation models.
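The construction described in this abstract can be sketched on flat weight lists as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the fraction-based thresholds `drop_small` and `drop_large`, and the scaling factor `alpha` in the merge step are all hypothetical choices made for the example.

```python
def breadcrumb(pretrained, finetuned, drop_small=0.25, drop_large=0.25):
    """Sparse weight difference between a fine-tuned and a pre-trained model.

    Masks the smallest-magnitude differences (negligible perturbations) and
    the largest-magnitude ones (outliers). Both fractions are assumed values.
    """
    delta = [f - p for p, f in zip(pretrained, finetuned)]
    # Indices sorted from smallest to largest absolute difference.
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]))
    n = len(delta)
    lo = int(n * drop_small)        # number of smallest entries to mask
    hi = n - int(n * drop_large)    # entries from this rank upward are masked
    keep = set(order[lo:hi])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def merge(pretrained, breadcrumbs, alpha=1.0):
    """Add the scaled sum of several breadcrumbs to the pre-trained weights."""
    return [
        p + alpha * sum(b[i] for b in breadcrumbs)
        for i, p in enumerate(pretrained)
    ]


pretrained = [0.0, 0.0, 0.0, 0.0]
finetuned = [0.01, 0.5, -0.4, 5.0]
crumb = breadcrumb(pretrained, finetuned)
merged = merge(pretrained, [crumb])
```

With these toy weights, the tiny change (0.01) and the outlier (5.0) are masked, so only the two mid-range differences reach the merged model.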

2023-12-11

ArXiv (preprint)

Can We Learn Communication-Efficient Optimizers?

Charles-Étienne Joseph

Benjamin Thérien

Abhinav Moudgil

Boris Knyazev

2023-12-02

ArXiv (preprint)

Channel Selection for Test-Time Adaptation Under Distribution Shift

Pedro Vianna

Muawiz Sajjad Chaudhary

An Tang

Guy Cloutier

Michael Eickenberg

To ensure robustness and generalization to real-world scenarios, test-time adaptation has been recently studied as an approach to adjust models to a new data distribution during inference. Test-time batch normalization is a simple and popular method that achieved compelling performance on domain shift benchmarks by recalculating batch normalization statistics on test batches. However, in many practical applications this technique is vulnerable to label distribution shifts. We propose to tackle this challenge by only selectively adapting channels in a deep network, minimizing drastic adaptation that is sensitive to label shifts. We find that adapted models significantly improve the performance compared to the baseline models and counteract unknown label shifts.

2023-10-27

NeurIPS.cc/2023/Workshop/DistShift (poster)

Learning Optimizers for Local SGD

Charles-Étienne Joseph

Benjamin Thérien

Abhinav Moudgil

Boris Knyazev

2023-10-27

NeurIPS.cc/2023/Workshop/Federated_Learning (poster)

DragD3D: Vertex-based Editing for Realistic Mesh Deformations using 2D Diffusion Priors

Tianhao Xie

Sudhir Mudur

Tiberiu Popa

Direct mesh editing and deformation are key components in the geometric modeling and animation pipeline. Direct mesh editing methods are typically framed as optimization problems combining user-specified vertex constraints with a regularizer that determines the position of the rest of the vertices. The choice of the regularizer is key to the realism and authenticity of the final result. Physics and geometry-based regularizers are not aware of the global context and semantics of the object, and the more recent deep learning priors are limited to a specific class of 3D object deformations. In this work, our main contribution is a local mesh editing method called DragD3D for global context-aware realistic deformation through direct manipulation of a few vertices. DragD3D is not restricted to any class of objects. It achieves this by combining the classic geometric ARAP (as rigid as possible) regularizer with 2D priors obtained from a large-scale diffusion model. Specifically, we render the objects from multiple viewpoints through a differentiable renderer and use the recently introduced DDS loss, which scores the faithfulness of the rendered image to one from a diffusion model. DragD3D combines the approximate gradients of the DDS with gradients from the ARAP loss to modify the mesh vertices via a neural Jacobian field, while also satisfying vertex constraints. We show that our deformations are realistic and aware of the global context of the objects, and provide better results than just using geometric regularizers.

2023-10-06

ArXiv (preprint)

Comparison of Radiologists and Deep Learning for US Grading of Hepatic Steatosis.

Pedro Vianna

Sara-Ivana Calce

Pamela Boustros

Cassandra Larocque-Rigney

Laurent Patry-Beaudoin

Yi Hui Luo

Emre Aslan

John Marinos

Talal M. Alamri

Kim-Nhien Vu

Jessica Murphy-Lavallée

Jean-Sébastien Billiard

Emmanuel Montagnon

Hongliang Li

Samuel Kadoury

Bich Nguyen

Shanel Gauthier

Benjamin Thérien

Irina Rish

Eugene Belilovsky

Michael Chassé

Guy Cloutier

An Tang

Background: Screening for nonalcoholic fatty liver disease (NAFLD) is suboptimal due to the subjective interpretation of US images. Purpose: To evaluate the agreement and diagnostic performance of radiologists and a deep learning model in grading hepatic steatosis in NAFLD at US, with biopsy as the reference standard. Materials and Methods: This retrospective study included patients with NAFLD and control patients without hepatic steatosis who underwent abdominal US and contemporaneous liver biopsy from September 2010 to October 2019. Six readers visually graded steatosis on US images twice, 2 weeks apart. Reader agreement was assessed with use of κ statistics. Three deep learning techniques applied to B-mode US images were used to classify dichotomized steatosis grades. Classification performance of human radiologists and the deep learning model for dichotomized steatosis grades (S0, S1, S2, and S3) was assessed with area under the receiver operating characteristic curve (AUC) on a separate test set. Results: The study included 199 patients (mean age, 53 years ± 13 [SD]; 101 men). On the test set (n = 52), radiologists had fair interreader agreement (0.34 [95% CI: 0.31, 0.37]) for classifying steatosis grades S0 versus S1 or higher, while AUCs were between 0.49 and 0.84 for radiologists and 0.85 (95% CI: 0.83, 0.87) for the deep learning model. For S0 or S1 versus S2 or S3, radiologists had fair interreader agreement (0.30 [95% CI: 0.27, 0.33]), while AUCs were between 0.57 and 0.76 for radiologists and 0.73 (95% CI: 0.71, 0.75) for the deep learning model. For S2 or lower versus S3, radiologists had fair interreader agreement (0.37 [95% CI: 0.33, 0.40]), while AUCs were between 0.52 and 0.81 for radiologists and 0.67 (95% CI: 0.64, 0.69) for the deep learning model. Conclusion: Deep learning approaches applied to B-mode US images provided comparable performance with human readers for detection and grading of hepatic steatosis.
Published under a CC BY 4.0 license. Supplemental material is available for this article. See also the editorial by Tuthill in this issue.

2023-10-01

Radiology (published)

Guiding The Last Layer in Federated Learning with Pre-Trained Models

Gwen Legate

Nicolas Bernier

Lucas Caccia

Edouard Oyallon

$\textbf{A}^2\textbf{CiD}^2$: Accelerating Asynchronous Communication in Decentralized Deep Learning

Adel Nabli

Edouard Oyallon