Publications

Learning from uncertain concepts via test time interventions
Ivaxi Sheth
Aamer Abdul Rahman
Laya Rafiee Sevyeri
Mohammad Havaei
With neural networks applied to safety-critical applications, it has become increasingly important to understand the defining features of decision-making. Hence there is an apparent need to uncover the black box and expose the rational representational space of these neural networks. Concept bottleneck models (CBMs) encourage interpretability by predicting human-understandable concepts: they predict concepts from input images and then labels from concepts. Test-time intervention, a salient feature of CBMs, allows for human-model interactions. However, these interactions are prone to information leakage and can often be ineffective because of inappropriate communication with humans. We propose a novel uncertainty-based strategy, SIUL (Single Interventional Uncertainty Learning), to select the interventions. Additionally, we empirically test the robustness of CBMs and the effect of SIUL interventions under adversarial attack and distributional shift. Using SIUL, we observe that the suggested interventions lead to meaningful corrections along with mitigation of concept leakage. Extensive experiments on three vision datasets along with a histopathology dataset validate the effectiveness of our interventional learning.
Striving for data-model efficiency: Identifying data externalities on group performance
Esther Rolf
Ben Packer
Alex Beutel
GPS++: An Optimised Hybrid MPNN/Transformer for Molecular Property Prediction
Dominic Masters
Josef Dean
Kerstin Klaeser
Zhiyi Li
Samuel Maddrell-Mander
Adam Sanders
Hatem Helal
Deniz Beker
Ladislav Rampášek
APP: Anytime Progressive Pruning
Diganta Misra
Bharat Runwal
Tianlong Chen
Zhangyang Wang
With the latest advances in deep learning, several methods have been investigated for optimal learning settings in scenarios where the data stream is continuous over time. However, training sparse networks in such settings has often been overlooked. In this paper, we explore the problem of training a neural network with a target sparsity in a particular case of online learning: the anytime learning at macroscale paradigm (ALMA). We propose a novel way of progressive pruning, referred to as Anytime Progressive Pruning (APP); the proposed approach significantly outperforms the baseline dense and Anytime OSP models across multiple architectures and datasets under short, moderate, and long-sequence training. Our method, for example, shows an improvement in accuracy of
Clinically Plausible Pathology-Anatomy Disentanglement in Patient Brain MRI with Structured Variational Priors
Anjun Hu
Jean-Pierre R. Falet
Brennan Nichyporuk
Changjian Shui
Douglas Arnold
Sotirios A. Tsaftaris
We propose a hierarchically structured variational inference model for accurately disentangling observable evidence of disease (e.g. brain lesions or atrophy) from subject-specific anatomy in brain MRIs. With flexible, partially autoregressive priors, our model (1) addresses the subtle and fine-grained dependencies that typically exist between anatomical and pathological generating factors of an MRI to ensure the clinical validity of generated samples; (2) preserves and disentangles finer pathological details pertaining to a patient's disease state. Additionally, we experiment with an alternative training configuration where we provide supervision to a subset of latent units. It is shown that (1) a partially supervised latent space achieves a higher degree of disentanglement between evidence of disease and subject-specific anatomy; (2) when the prior is formulated with an autoregressive structure, knowledge from the supervision can propagate to the unsupervised latent units, resulting in more informative latent representations capable of modelling anatomy-pathology interdependencies.
PatchBlender: A Motion Prior for Video Transformers
Gabriele Prato
Yale Song
Janarthanan Rajendran
Neel Joshi
SVRG meets AdaGrad: painless variance reduction
Benjamin Dubois-Taine
Sharan Vaswani
Reza Babanezhad Harikandeh
Mark Schmidt
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Teven Le Scao
Angela Fan
Christopher Akiki
Ellie Pavlick
Suzana Ilić
Daniel Hesslow
Roman Castagné
Alexandra Luccioni
François Yvon
Matthias Gallé
J. Tow
Alexander M. Rush
Stella Biderman
Albert Webson
Pawan Sasanka Ammanamanchi
Thomas Wang
Benoît Sagot
Niklas Muennighoff
Albert Villanova del Moral
Olatunji Ruwase
Rachel Bawden
Stas Bekman
Angelina McMillan-Major
Iz Beltagy
Huu Nguyen
Lucile Saulnier
Samson Tan
Pedro Ortiz Suarez
Victor Sanh
Hugo Laurençon
Yacine Jernite
Julien Launay
Margaret Mitchell
Colin Raffel
Aaron Gokaslan
Adi Simhi
Aitor Soroa
Alham Fikri Aji
Amit Alfassy
Anna Rogers
Ariel Kreisberg Nitzav
Canwen Xu
Chenghao Mou
Chris Emezue
Christopher Klamm
Colin D. Leong
Daniel Van Strien
Dragomir R. Radev
Eduardo González Ponferrada
Efrat Levkovizh
Ethan Kim
Eyal Bar Natan
Francesco De Toni
Gérard Dupont
Germán Kruszewski
Giada Pistilli
Hady Elsahar
Hamza Benyamina
Hieu Tran
Ian W. Yu
Idris Abdulmumin
Isaac L. Johnson
Itziar Gonzalez-Dios
Javier de la Rosa
Jenny Chim
Jesse Dodge
Jian Zhu
Jonathan Chang
Jörg Frohberg
Josephine L. Tobing
J. Bhattacharjee
Khalid Almubarak
Kimbo Chen
Kyle Lo
Leandro Von Werra
Leon Weber
Long Phan
Loubna Ben Allal
Ludovic Tanguy
Manan Dey
Manuel Romero Muñoz
Maraim Masoud
María Grandury
Mario Šaško
Max Huang
Maximin Coavoux
Mayank Singh
Mike Tian-Jian Jiang
Vu Minh Chien
Mohammad Ali Jauhar
Mustafa Ghaleb
Nishant Subramani
Nora Kassner
Nurulaqilla Khamis
Olivier Nguyen
Omar Espejel
Ona de Gibert
Paulo Villegas
Peter Henderson
Pierre Colombo
Priscilla A. Amuok
Quentin Lhoest
Rheza Harliman
Rishi Bommasani
Roberto Luis López
Rui Ribeiro
Salomey Osei
Sampo Pyysalo
Sebastian Nagel
Shamik Bose
Shamsuddeen Hassan Muhammad
Shanya Sharma Sharma
Shayne Longpre
Somaieh Nikpoor
S. Silberberg
Suhas Pai
Sydney Zink
Tiago Timponi Torrent
Timo Schick
Tristan Thrush
Valentin Danchev
Vassilina Nikoulina
Veronika Laippala
Violette Lepercq
Vrinda Prabhu
Zaid Alyafeai
Zeerak Talat
Arun Raja
Benjamin Heinzerling
Chenglei Si
Elizabeth E Salesky
Sabrina J. Mielke
Wilson Y. Lee
Abheesht Sharma
Andrea Santilli
Antoine Chaffin
Arnaud Stiegler
Debajyoti Datta
Eliza Szczechla
Gunjan Chhablani
Han Wang
Harshit Pandey
Hendrik Strobelt
Jason Alan Fries
Jos Rozen
Leo Gao
Lintang A. Sutawika
M. Saiful Bari
Maged S. Al-shaibani
Matteo Manica
Nihal V. Nayak
Ryan Teehan
Samuel Albanie
Sheng Shen
Srulik Ben-David
Stephen H. Bach
Taewoon Kim
T. Bers
Thibault Févry
Trishala Neeraj
Urmish Thakker
Vikas Raunak
Xiang Tang
Zheng Xin Yong
Zhiqing Sun
Shaked Brody
Y. Uri
Hadar Tojarieh
Adam Roberts
Hyung Won Chung
Jaesung Tae
Jason Phang
Ofir Press
Conglong Li
D. Narayanan
Hatim Bourfoune
Jared Casper
Jeff Rasley
Max Ryabinin
Mayank Mishra
Minjia Zhang
Mohammad Shoeybi
Myriam Peyrounette
Nicolas Patry
Nouamane Tazi
Omar Sanseviero
Patrick von Platen
Pierre Cornette
Pierre François Lavallée
Rémi Lacroix
Samyam Rajbhandari
Sanchit Gandhi
Shaden Smith
Stéphane Requena
Suraj Patil
Tim Dettmers
Ahmed Baruwa
Amanpreet Singh
Anastasia Cheveleva
Anne-Laure Ligozat
Arjun Subramonian
Aurélie Névéol
Charles Lovering
Dan Garrette
D. Tunuguntla
Ehud Reiter
Ekaterina Taktasheva
E. Voloshina
Eli Bogdanov
Genta Indra Winata
Hailey Schoelkopf
Jan-Christoph Kalo
Jekaterina Novikova
Jessica Zosa Forde
Xiangru Tang
Jungo Kasai
Ken Kawamura
Liam Hazan
Marine Carpuat
Miruna-Adriana Clinciu
Najoung Kim
Newton Cheng
O. Serikov
Omer Antverg
Oskar van der Wal
Rui Zhang
Ruochen Zhang
Sebastian Gehrmann
Shachar Mirkin
S. Pais
Tatiana Shavrina
Thomas Scialom
Tian Yun
Tomasz Limisiewicz
Verena Teresa Rieser
Vitaly Protasov
V. Mikhailov
Yada Pruksachatkun
Yonatan Belinkov
Zachary Bamberger
Zdeněk Kasner
A. Pestana
Amir Feizpour
Ammar Khan
Amy Faranak
A. Santos
Anthony Hevia
Antigona Unldreaj
Arash Aghagol
Arezoo Abdollahi
Aycha Tammour
Azadeh Hajihosseini
Bahareh Behroozi
Benjamin A. Ajibade
B. Saxena
Carlos Muñoz Ferrandis
Danish Contractor
D. Lansky
Davis David
Douwe Kiela
Duong Anh Nguyen
Edward Chwee Kheng Tan
Emi Baylor
Ezinwanne Ozoani
F. Mirza
Frankline Ononiwu
Habib Rezanejad
H.A. Jones
Indrani Bhattacharya
Irene Solaiman
Irina Sedenko
Isar Nejadgholi
J. Passmore
Joshua Seltzer
Julio Bonis Sanz
Karen Fort
Livia Macedo Dutra
Mairon Samagaio
Maraim Elbadri
Margot Mieskes
Marissa Kumar Gerchick
Martha Akinlolu
Michael McKenna
Mike Qiu
M. Ghauri
Mykola Burynok
Nafis Abrar
Nazneen Fatema Rajani
Nour Elkott
N. Fahmy
Olanrewaju Samuel
Ran An
R. Kromann
Ryan Hao
Samira Hassan Alizadeh
Sarmad Shubber
Silas L. Wang
Sourav Roy
Sylvain Viguier
Thanh-Cong Le
Tobi Oyebade
T. Le
Yoyo Yang
Zach Nguyen
Abhinav R. Kashyap
Alfredo Palasciano
Alison Callahan
Anima Shukla
Antonio Miranda-Escalada
Ayush Kumar Singh
Benjamin Beilharz
Bo Wang
Caio Matheus Fonseca De Brito
Chenxi Zhou
Chirag Jain
Chuxin Xu
Clémentine Fourrier
Daniel León Periñán
Daniel Molano
Dian Yu
Enrique Manjavacas
Fabio Barth
Florian Fuhrimann
Gabriel Altay
Giyaseddin Bayrak
Gully Burns
Helena U. Vrabec
I. Bello
Isha Dash
J. Kang
John Michael Giorgi
Jonas Golde
J. Posada
Karthi Sivaraman
Lokesh Bulchandani
Lu Liu
Luisa Shinzato
Madeleine Hahn de Bykhovetz
Maiko Takeuchi
Marc Pamies
M. A. Castillo
Marianna Nezhurina
Mario Sanger
Matthias Samwald
Michael Joseph Cullan
Michael Weinberg
Michiel De Wolf
Mina Mihaljcic
Minna Liu
Moritz Freidank
Myungsun Kang
Natasha Seelam
Nathan Dahlberg
Nicholas Michio Broad
Nikolaus Muellner
Pascale Fung
Patricia Haller
Ramya Chandrasekhar
Patrick Haller
Renata Eisenberg
Robert Martin
Rodrigo Canalli
Rosaline Su
Ruisi Su
Samuel Cahyawijaya
Samuele Garda
Shlok S Deshmukh
Shubhanshu Mishra
Sid Kiblawi
Simon Ott
Sinee Sang-aroonsiri
Srishti Kumar
Stefan Schweter
Sushil Pratap Bharati
Tanmay Laud
Théo Gigant
Tomoya Kainuma
Wojciech Kusa
Yanis Labrak
Yashasvi Bajaj
Yash Venkatraman
Yifan Xu
Ying Xu
Yu Xu
Z. Tan
Zhongli Xie
Zifan Ye
Mathilde Le Bras
Younes Belkada
Thomas Wolf
Flaky Performances when Pretraining on Relational Databases
Shengchao Liu
David Vazquez
Pierre-Andre Noel
Knowledge Distillation for Federated Learning: a Practical Guide
Alessio Mora
Irene Tenison
Paolo Bellavista
Federated Learning (FL) enables the training of Deep Learning models without centrally collecting possibly sensitive raw data. This paves the way for stronger privacy guarantees when building predictive models. The most used algorithms for FL are parameter-averaging based schemes (e.g., Federated Averaging) that, however, have well-known limits: (i) clients must implement the same model architecture; (ii) transmitting model weights and model updates implies high communication cost, which scales up with the number of model parameters; (iii) in the presence of non-IID data distributions, parameter-averaging aggregation schemes perform poorly due to client model drifts. Federated adaptations of regular Knowledge Distillation (KD) can solve and/or mitigate the weaknesses of parameter-averaging FL algorithms while possibly introducing other trade-offs. In this article, we provide a review of KD-based algorithms tailored for specific FL issues.
A debriefing tool to acquire non-technical skills in trauma courses
Fabio Botelho
Jason M. Harley
Natalie Yanchar
Simone Abib
Ilana Bank
Multi-Head Adapter Routing for Cross-Task Generalization
Lucas Caccia
Edoardo Ponti
Zhan Su
Matheus Pereira
Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists in pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] (