Eugene Belilovsky

Google Scholar

Xiao Huang

Master's Research - Concordia University

Co-supervisor :

Paul Janson

PhD - Concordia University

Co-supervisor :

Master's Research - Concordia University

Co-supervisor :

Website

Gwen Legate

PhD - Concordia University

Co-supervisor :

Master's Research - Concordia University

Co-supervisor :

Abhinav Moudgil

PhD - Concordia University

Website

Google Scholar

Adel Nabli

PhD - Concordia University

Google Scholar

Geraldin Nanfack

Postdoctorate - Concordia University

Co-supervisor :

Albert Orozco Camacho

PhD - Concordia University

Co-supervisor :

PhD - Concordia University

Co-supervisor :

Benjamin Therien

PhD - Université de Montréal

Principal supervisor :

Collaborating researcher - Université de Montréal

Principal supervisor :

PhD - Concordia University

Co-supervisor :

Publications

Can We Learn Communication-Efficient Optimizers?

Charles-Etienne Joseph

2023-12-02

ArXiv (preprint)

arxiv.org

Channel Selection for Test-Time Adaptation Under Distribution Shift

Pedro Vianna

Muawiz Sajjad Chaudhary

An Tang

Guy Cloutier

Michael Eickenberg

To ensure robustness and generalization to real-world scenarios, test-time adaptation has been recently studied as an approach to adjust models to a new data distribution during inference. Test-time batch normalization is a simple and popular method that achieved compelling performance on domain shift benchmarks by recalculating batch normalization statistics on test batches. However, in many practical applications this technique is vulnerable to label distribution shifts. We propose to tackle this challenge by only selectively adapting channels in a deep network, minimizing drastic adaptation that is sensitive to label shifts. We find that adapted models significantly improve the performance compared to the baseline models and counteract unknown label shifts.

2023-10-27

NeurIPS.cc/2023/Workshop/DistShift (poster)

Learning Optimizers for Local SGD

Charles-Etienne Joseph

2023-10-27

NeurIPS.cc/2023/Workshop/Federated_Learning (poster)

DragD3D: Vertex-based Editing for Realistic Mesh Deformations using 2D Diffusion Priors

Tianhao Xie

Sudhir Mudur

Tiberiu Popa

Direct mesh editing and deformation are key components in the geometric modeling and animation pipeline. Direct mesh editing methods are typically framed as optimization problems combining user-specified vertex constraints with a regularizer that determines the position of the rest of the vertices. The choice of the regularizer is key to the realism and authenticity of the final result. Physics and geometry-based regularizers are not aware of the global context and semantics of the object, and the more recent deep learning priors are limited to a specific class of 3D object deformations. In this work, our main contribution is a local mesh editing method called DragD3D for global context-aware realistic deformation through direct manipulation of a few vertices. DragD3D is not restricted to any class of objects. It achieves this by combining the classic geometric ARAP (as rigid as possible) regularizer with 2D priors obtained from a large-scale diffusion model. Specifically, we render the objects from multiple viewpoints through a differentiable renderer and use the recently introduced DDS loss which scores the faithfulness of the rendered image to one from a diffusion model. DragD3D combines the approximate gradients of the DDS with gradients from the ARAP loss to modify the mesh vertices via neural Jacobian field, while also satisfying vertex constraints. We show that our deformations are realistic and aware of the global context of the objects, and provide better results than just using geometric regularizers.

2023-10-06

ArXiv (preprint)

arxiv.org

Comparison of Radiologists and Deep Learning for US Grading of Hepatic Steatosis.

Pedro Vianna

Sara-Ivana Calce

Pamela Boustros

Cassandra Larocque-Rigney

Laurent Patry-Beaudoin

Yi Hui Luo

Emre Aslan

John Marinos

Talal M. Alamri

Kim-Nhien Vu

Jessica Murphy-Lavallée

Jean-Sébastien Billiard

Emmanuel Montagnon

Hongliang Li

Samuel Kadoury

Bich Nguyen

Shanel Gauthier

Benjamin Therien

Eugene Belilovsky

Michael Chassé

Guy Cloutier

An Tang

Background: Screening for nonalcoholic fatty liver disease (NAFLD) is suboptimal due to the subjective interpretation of US images. Purpose: To evaluate the agreement and diagnostic performance of radiologists and a deep learning model in grading hepatic steatosis in NAFLD at US, with biopsy as the reference standard. Materials and Methods: This retrospective study included patients with NAFLD and control patients without hepatic steatosis who underwent abdominal US and contemporaneous liver biopsy from September 2010 to October 2019. Six readers visually graded steatosis on US images twice, 2 weeks apart. Reader agreement was assessed with use of κ statistics. Three deep learning techniques applied to B-mode US images were used to classify dichotomized steatosis grades. Classification performance of human radiologists and the deep learning model for dichotomized steatosis grades (S0, S1, S2, and S3) was assessed with area under the receiver operating characteristic curve (AUC) on a separate test set. Results: The study included 199 patients (mean age, 53 years ± 13 [SD]; 101 men). On the test set (n = 52), radiologists had fair interreader agreement (0.34 [95% CI: 0.31, 0.37]) for classifying steatosis grades S0 versus S1 or higher, while AUCs were between 0.49 and 0.84 for radiologists and 0.85 (95% CI: 0.83, 0.87) for the deep learning model. For S0 or S1 versus S2 or S3, radiologists had fair interreader agreement (0.30 [95% CI: 0.27, 0.33]), while AUCs were between 0.57 and 0.76 for radiologists and 0.73 (95% CI: 0.71, 0.75) for the deep learning model. For S2 or lower versus S3, radiologists had fair interreader agreement (0.37 [95% CI: 0.33, 0.40]), while AUCs were between 0.52 and 0.81 for radiologists and 0.67 (95% CI: 0.64, 0.69) for the deep learning model. Conclusion: Deep learning approaches applied to B-mode US images provided comparable performance with human readers for detection and grading of hepatic steatosis.
Published under a CC BY 4.0 license. Supplemental material is available for this article. See also the editorial by Tuthill in this issue.

2023-10-01

Radiology (published)

Guiding The Last Layer in Federated Learning with Pre-Trained Models

Lucas Caccia

$\textbf{A}^2\textbf{CiD}^2$: Accelerating Asynchronous Communication in Decentralized Deep Learning

Automated liver segmentation and steatosis grading using deep learning on B-mode ultrasound images

Pedro Vianna

Merve Kulbay

Pamela Boustros

Sara-Ivana Calce

Cassandra Larocque-Rigney

Laurent Patry-Beaudoin

Yi Hui Luo

Muawiz Chaudhary

Samuel Kadoury

Bich Nguyen

Emmanuel Montagnon

Michael Chassé

An Tang

Guy Cloutier

Early detection of nonalcoholic fatty liver disease (NAFLD) is crucial to avoid further complications. Ultrasound is often used for screening and monitoring of hepatic steatosis; however, it is limited by the subjective interpretation of images. Computer-assisted diagnosis could help radiologists achieve objective grading, and artificial intelligence approaches have been tested across various medical applications. In this study, we evaluated the performance of a two-stage hepatic steatosis detection deep learning framework, with a first step of liver segmentation and a subsequent step of hepatic steatosis classification. We evaluated the models on internal and external datasets, aiming to understand the generalizability of the framework. In the external dataset, our segmentation model achieved a Dice score of 0.92 (95% CI: 0.78, 1.00), and our classification model achieved an area under the receiver operating characteristic curve of 0.84 (95% CI: 0.79, 0.89). Our findings highlight the potential benefits of applying artificial intelligence models in NAFLD assessment.

2023-09-03

IUS (published)
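The segmentation result above is reported as a Dice score. As a reminder of what that number measures, here is a minimal sketch of the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. The function name and the convention of returning 1.0 for two empty masks are assumptions of this sketch, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total

# One of the two predicted foreground pixels overlaps the single true pixel.
score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.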

Continual Pre-Training of Large Language Models: How to (re)warm your model?

Quentin Gregory Anthony

Timothee Lesort

Large language models (LLMs) are routinely pre-trained on billions of tokens, only to restart the process over again once new data becomes available. A much cheaper and more efficient solution would be to enable the continual pre-training of these models, i.e. updating pre-trained models with new data instead of re-training them from scratch. However, the distribution shift induced by novel data typically results in degraded performance on past data. Taking a step towards efficient continual pre-training, in this work, we examine the effect of different warm-up strategies. Our hypothesis is that the learning rate must be re-increased to improve compute efficiency when training on a new dataset. We study the warmup phase of models pre-trained on the Pile (upstream data, 300B tokens) as we continue to pre-train on SlimPajama (downstream data, 297B tokens), following a linear warmup and cosine decay schedule. We conduct all experiments on the Pythia 410M language model architecture and evaluate performance through validation perplexity. We experiment with different pre-training checkpoints, various maximum learning rates, and various warmup lengths. Our results show that while rewarming models first increases the loss on upstream and downstream data, in the longer run it improves the downstream performance, outperforming models trained from scratch.

2023-06-20

ICML.cc/2023/Workshop/ES-FoMO (poster)
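The abstract above mentions a linear warmup and cosine decay learning-rate schedule. A minimal sketch of one common form of such a schedule is given below. The function name, the choice to warm up from zero, and all numeric values are hypothetical and are not the settings used in the paper.

```python
import math

def lr_at_step(step, max_lr, min_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to max_lr, then cosine decay from max_lr to min_lr."""
    if step < warmup_steps:
        # Warmup phase: the rate grows linearly with the step count.
        return max_lr * step / warmup_steps
    # Decay phase: fraction of the post-warmup budget already consumed.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Hypothetical values chosen only to illustrate the shape of the curve.
schedule = [lr_at_step(s, 3e-4, 3e-5, 100, 1000) for s in (0, 50, 100, 1000)]
```

Rewarming in this picture amounts to restarting the step counter at zero when training continues on the new dataset, so the rate climbs back toward `max_lr` before decaying again.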

Learning to Optimize with Recurrent Hierarchical Transformers

2023-06-19

ICML.cc/2023/Workshop/Frontiers4LCD (published)

Simulated Annealing in Early Layers Leads to Better Generalization

Amir M. Sarfi

Zahra Karimpour

Muawiz Chaudhary

Nasir M. Khalid

Mirco Ravanelli

Sudhir Mudur

Recently, a number of iterative learning methods have been introduced to improve generalization. These typically rely on training for longer periods of time in exchange for improved generalization. LLF (later-layer-forgetting) is a state-of-the-art method in this category. It strengthens learning in early layers by periodically re-initializing the last few layers of the network. Our principal innovation in this work is to use Simulated annealing in EArly Layers (SEAL) of the network in place of re-initialization of later layers. Essentially, later layers go through the normal gradient descent process, while the early layers go through short stints of gradient ascent followed by gradient descent. Extensive experiments on the popular Tiny-ImageNet dataset benchmark and a series of transfer learning and few-shot learning tasks show that we outperform LLF by a significant margin. We further show that, compared to normal training, LLF features, although improving on the target task, degrade the transfer learning performance across all datasets we explored. In comparison, our method outperforms LLF across the same target datasets by a large margin. We also show that the prediction depth of our method is significantly lower than that of LLF and normal training, indicating on average better prediction performance. The code to reproduce our results is publicly available at: https://github.com/amiiir-sarfi/SEAL

2023-06-17

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

arxiv.org

Preventing Dimensional Collapse in Contrastive Local Learning with Subsampling

Louis Fournier

Adeetya Patel

Michael Eickenberg

Edouard Oyallon