Irina Rish

sayed.mansouri-tehrani@mila.quebec

Amin Memarian

Visiteur de recherche indépendant

memariaa@mila.quebec

Amin Mansouri

Maîtrise recherche - Université de Montréal

andrew.williams@mila.quebec

Amin Darabi

Doctorat - Université de Montréal

amin.darabi@mila.quebec

Andrei Mircea Romascanu

Doctorat - Université de Montréal

Doctorat - Université de Montréal

arian.khorasani@mila.quebec

Arian Khorasani

Maîtrise recherche - Université de Montréal

arnav-kumar.jain@mila.quebec

Arjun Ashok

Doctorat

Co-superviseur⋅e :

Alexandre Drouin

arjun.ashok@mila.quebec

Doctorat - Université de Montréal

Doctorat - Université de Montréal

Co-superviseur⋅e :

Collaborateur·rice de recherche

ayush.kaushal@mila.quebec

Benjamin Therien

Doctorat - Université de Montréal

Co-superviseur⋅e :

benjamin.therien@mila.quebec

Collaborateur·rice de recherche - Université de Montréal

connor.brennan@mila.quebec

Daria Yasafova

Stagiaire de recherche - Technical University of Munich

daria.yasafova@mila.quebec

Dave Whipps

Maîtrise recherche - Université de Montréal

whippsda@mila.quebec

diganta.misra@mila.quebec

Diganta Misra

Maîtrise recherche - Université de Montréal

Postdoctorat

Superviseur⋅e principal⋅e :

Nicolas Le Roux

ekaterina.lobacheva@mila.quebec

Doctorat - McGill University

Superviseur⋅e principal⋅e :

Blake Richards

ethan.caballero@mila.quebec

george.adamopoulos@mila.quebec

George Adamopoulos

Stagiaire de recherche

gopeshh.subbaraj@mila.quebec

Germán Abrevaya

Visiteur de recherche indépendant - Université de Montréal

Co-superviseur⋅e :

Doctorat - Université de Montréal

Gwen Legate

Doctorat - Concordia University

Superviseur⋅e principal⋅e :

gwendolyne.legate@mila.quebec

Ivan Anokhin

Doctorat - Université de Montréal

Co-superviseur⋅e :

Samira Ebrahimi Kahou

ivan.anokhin@mila.quebec

juan.mayor-torres@mila.quebec

Juan Manuel Mayor-Torres

Collaborateur·rice de recherche

Collaborateur·rice alumni - Université de Montréal

Co-superviseur⋅e :

Sarath Chandar Anbil Parthipan

kshitij.gupta@mila.quebec

Mahta Ramezanian

Maîtrise recherche - Université de Montréal

Co-superviseur⋅e :

mahta.ramezanian@mila.quebec

Matthew Riemer

Doctorat - Université de Montréal

matthew.riemer@mila.quebec

Maximilian Puelma Touzel

Collaborateur·rice de recherche

Doctorat - Université de Montréal

arefinmr@mila.quebec

Mohammad Pezeshki

Collaborateur·rice de recherche

pezeshki@mila.quebec

Mohammad-Javad Darvishi Bayazi

Doctorat - Université de Montréal

mohammad-javad.darvishi-bayasi@mila.quebec

Doctorat - Université de Montréal

faramarm@mila.quebec

Motahareh Pourrahimi

Doctorat - McGill University

Superviseur⋅e principal⋅e :

Pouya Bashivan

motahareh.pourrahimi@mila.quebec

nadhir.hassen@mila.quebec

Nadhir Hassen

Stagiaire de recherche - Université de Montréal

Neeraj Kumar

Maîtrise professionnelle - Université de Montréal

neeraj.kumar@mila.quebec

Nizar Islah

Doctorat - Université de Montréal

Superviseur⋅e principal⋅e :

Eilif Benjamin Muller

nizar.islah@mila.quebec

paolo.cudrano@mila.quebec

Omar Younis

Stagiaire de recherche - Université de Montréal

omar.younis@mila.quebec

Collaborateur·rice de recherche - Politecnico di Milano

Pascal Tikeng Notsawo

Doctorat - Université de Montréal

Co-superviseur⋅e :

pascal.tikeng@mila.quebec

Collaborateur·rice de recherche

prateek.humane@mila.quebec

Maîtrise recherche - Université de Montréal

remus.mocanu@mila.quebec

Reza Bayat

Maîtrise recherche - Université de Montréal

Co-superviseur⋅e :

Pouya Bashivan

reza.bayat@mila.quebec

rishika.bhagwatkar@mila.quebec

Rishika Bhagwatkar

Maîtrise recherche - Université de Montréal

Collaborateur·rice de recherche - Université de Montréal

roland.riachi@mila.quebec

Simon Dufort-Labbé

Doctorat - Université de Montréal

simon.dufort-labbe@mila.quebec

Sparsha Mishra

Maîtrise recherche - Université de Montréal

sparsha.mishra@mila.quebec

Tejas Vaidhya

Maîtrise recherche - Université de Montréal

tejas.vaidhya@mila.quebec

Doctorat - Université de Montréal

Co-superviseur⋅e :

Eilif Benjamin Muller

timothy.nest@mila.quebec

Vaibhav Singh

Doctorat - Concordia University

Superviseur⋅e principal⋅e :

vaibhav.singh@mila.quebec

Zahra Sheikhbahaee

Postdoctorat - Université de Montréal

Superviseur⋅e principal⋅e :

zahra.sheikhbahaee@mila.quebec

Publications

Comparison of Radiologists and Deep Learning for US Grading of Hepatic Steatosis.

Pedro Vianna

Sara-Ivana Calce

Pamela Boustros

Cassandra Larocque-Rigney

Laurent Patry-Beaudoin

Yi Hui Luo

Emre Aslan

John Marinos

Talal M. Alamri

Kim-Nhien Vu

Jessica Murphy-Lavallée

Jean-Sébastien Billiard

Emmanuel Montagnon

Hongliang Li

Samuel Kadoury

Bich Nguyen

Shanel Gauthier

Benjamin Thérien

Eugene Belilovsky … (voir 4 de plus)

Guy Wolf

Michaël Chassé

Guy Cloutier

An Tang

Background Screening for nonalcoholic fatty liver disease (NAFLD) is suboptimal due to the subjective interpretation of US images. Purpose T… (voir plus)o evaluate the agreement and diagnostic performance of radiologists and a deep learning model in grading hepatic steatosis in NAFLD at US, with biopsy as the reference standard. Materials and Methods This retrospective study included patients with NAFLD and control patients without hepatic steatosis who underwent abdominal US and contemporaneous liver biopsy from September 2010 to October 2019. Six readers visually graded steatosis on US images twice, 2 weeks apart. Reader agreement was assessed with use of κ statistics. Three deep learning techniques applied to B-mode US images were used to classify dichotomized steatosis grades. Classification performance of human radiologists and the deep learning model for dichotomized steatosis grades (S0, S1, S2, and S3) was assessed with area under the receiver operating characteristic curve (AUC) on a separate test set. Results The study included 199 patients (mean age, 53 years ± 13 [SD]; 101 men). On the test set (n = 52), radiologists had fair interreader agreement (0.34 [95% CI: 0.31, 0.37]) for classifying steatosis grades S0 versus S1 or higher, while AUCs were between 0.49 and 0.84 for radiologists and 0.85 (95% CI: 0.83, 0.87) for the deep learning model. For S0 or S1 versus S2 or S3, radiologists had fair interreader agreement (0.30 [95% CI: 0.27, 0.33]), while AUCs were between 0.57 and 0.76 for radiologists and 0.73 (95% CI: 0.71, 0.75) for the deep learning model. For S2 or lower versus S3, radiologists had fair interreader agreement (0.37 [95% CI: 0.33, 0.40]), while AUCs were between 0.52 and 0.81 for radiologists and 0.67 (95% CI: 0.64, 0.69) for the deep learning model. Conclusion Deep learning approaches applied to B-mode US images provided comparable performance with human readers for detection and grading of hepatic steatosis. Published under a CC BY 4.0 license. Supplemental material is available for this article. See also the editorial by Tuthill in this issue.

2023-10-01

Radiology (publié)

LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression

Ayush Kaushal

Tejas Vaidhya

Low Rank Decomposition of matrix - splitting a large matrix into a product of two smaller matrix offers a means for compression that reduces… (voir plus) the parameters of a model without sparsification, and hence delivering more speedup on modern hardware. Moreover, unlike quantization, the compressed linear layers remain fully differentiable and all the parameters trainable, while being able to leverage the existing highly efficient kernels over floating point matrices. We study the potential to compress Large Language Models (LLMs) for monolingual Code generation via Low Rank Decomposition (LoRD) and observe that ranks for the linear layers in these models can be reduced by upto 39.58% with less than 1% increase in perplexity. We then use Low Rank Decomposition (LoRD) to compress StarCoder 16B to 13.2B parameter with no drop and to 12.3B with minimal drop in HumanEval Pass@1 score, in less than 10 minutes on a single A100. The compressed models speeds up inference by up to 22.35% with just a single line of change in code over huggingface's implementation with pytorch backend. Low Rank Decomposition (LoRD) models remain compatible with state of the art near-lossless quantization method such as SpQR, which allows leveraging further compression gains of quantization. Lastly, QLoRA over Low Rank Decomposition (LoRD) model further reduces memory requirements by as much as 21.2% over vanilla QLoRA while offering similar gains from parameter efficient fine tuning. Our work shows Low Rank Decomposition (LoRD) as a promising new paradigm for LLM compression.

2023-09-25

ArXiv (prépublication)

arxiv.org

Maximum State Entropy Exploration using Predecessor and Successor Representations

Arnav Kumar Jain

Lucas Lehnert

Glen Berseth

Animals have a developed ability to explore that aids them in important tasks such as locating food, exploring for shelter, and finding misp… (voir plus)laced items. These exploration skills necessarily track where they have been so that they can plan for finding items with relative efficiency. Contemporary exploration algorithms often learn a less efficient exploration strategy because they either condition only on the current state or simply rely on making random open-loop exploratory moves. In this work, we propose

WOODS: Benchmarks for Out-of-Distribution Generalization in Time Series

Jean-Christophe Gagnon-Audet

Kartik Ahuja

Mohammad Javad Darvishi Bayazi

Pooneh Mousavi

2023-09-01

TMLR (accepté)

Beyond performance: the role of task demand, effort, and individual differences in ab initio pilots

Mohammad-Javad Darvishi-Bayazi

Andrew Law

Sergio Mejia Romero

Sion Jennings

Jocelyn Faubert

2023-08-28

Scientific Reports (publié)

Neural efficiency in an aviation task with different levels of difficulty: Assessing different biometrics during a performance task

Mohammad Javad Darvishi Bayazi

Andrew Law

Sergio Mejia Romero

Sion Jennings

Jocelyn Faubert

2023-08-01

Journal of Vision (publié)

Cognitive Models as Simulators: Using Cognitive Models to Tap into Implicit Human Feedback

Ardavan S. Nobandegani

Thomas Shultz

2023-06-20

ICML.cc/2023/Workshop/ILHF (publié)

Continual Pre-Training of Large Language Models: How to (re)warm your model?

Kshitij Gupta

Benjamin Thérien

Adam Ibrahim

Mats Leon Richter

Quentin Gregory Anthony

Timothee LESORT

Large language models (LLMs) are routinely pre-trained on billions of tokens, only to restart the process over again once new data becomes a… (voir plus)vailable. A much cheaper and more efficient solution would be to enable the continual pre-training of these models, i.e. updating pre-trained models with new data instead of re-training them from scratch. However, the distribution shift induced by novel data typically results in degraded performance on past data. Taking a step towards efficient continual pre-training, in this work, we examine the effect of different warm-up strategies. Our hypothesis is that the learning rate must be re-increased to improve compute efficiency when training on a new dataset. We study the warmup phase of models pre-trained on the Pile (upstream data, 300B tokens) as we continue to pre-train on SlimPajama (downstream data, 297B tokens), following a linear warmup and cosine decay schedule. We conduct all experiments on the Pythia 410M language model architecture and evaluate performance through validation perplexity. We experiment with different pre-training checkpoints, various maximum learning rates, and various warmup lengths. Our results show that while rewarming models first increases the loss on upstream and downstream data, in the longer run it improves the downstream performance, outperforming models trained from scratch

2023-06-20

ICML.cc/2023/Workshop/ES-FoMO (poster)

Towards Out-of-Distribution Adversarial Robustness

Adam Ibrahim

Charles Guille-Escuret

Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fail… (voir plus)s to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different

2023-06-20

ICML.cc/2023/Workshop/AdvML-Frontiers (publié)

Dialogue System with Missing Observation

Djallel Bouneffouf

Mayank Agarwal

Within the domain of dialogue, the ability to orchestrate multiple independently trained dialogue agents to create a unified system is of pa… (voir plus)rticular importance. Where we define orchestration as the task of selecting a subset of skills which most appropriately answer a user input using features extracted from both the user input and the individual skills. In this work, we study the task of online dialogue orchestration where the user feedback associated with the dialogue agent may not always be observed. In order to address the missing feedback setting, we propose to combine the attentive contextual bandit approach with an unsupervised learning mechanism such as clustering. By leveraging clustering to estimate missing reward, we are able to learn from each incoming event, even those with missing rewards. Promising empirical results are obtained on proprietary conversational datasets.

2023-06-04

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (publié)

Estimating individual minimum calibration for deep-learning with predictive performance recovery: An example case of gait surface classification from wearable sensor gait data.

Guillaume Lam

P. Dixon

2023-06-01

Journal of Biomechanics (publié)

Towards ethical multimodal systems

Alexis Roger

Esma Aimeur

2023-04-26

ArXiv (prépublication)