Publications

Typology of ICU-Healthcare Providers Who Delayed or Declined COVID-19 Vaccination

Elie Azoulay

Frédéric Pochard

Guillaume Dumas

Nancy Kentish-Barnes

OBJECTIVES: To assess COVID-19 vaccination rates in ICU-healthcare providers (HCPs) in France and to identify the typology of those who delayed or declined vaccination. DESIGN: Cross-sectional study. SETTING: Twenty-one ICUs in France. SUBJECTS: Members of the nursing and medical staff and other allied professionals. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Six hundred ninety-six of the 950 respondents (73.3%) had undergone a full vaccination schedule. Other HCPs either declined vaccination (n = 112) or delayed vaccination until it became mandatory (n = 142). Factors independently associated with full vaccination were age older than 50 years (odds ratio, 0.25 [95% CI, 0.12–0.51]), more than 5 years of ICU experience (0.66 [0.47–0.93]), increasing working time during the surge (0.94 [0.88–1.00]), and spending time with the family (0.92 [0.85–0.99]). Conversely, being a nurse (1.94 [1.25–2.99]) or a nurse assistant (2.77 [1.62–4.73]), and feeling not supported by hospital and ICU directors (1.49 [1.01–2.20]) was independently associated with not being vaccinated. CONCLUSIONS: These results are important to take into account to better implement vaccination strategies in HCPs for existing or future pandemics.

2023-10-30

Critical Care Medicine (published)

Unraveling the Interconnected Axes of Heterogeneity in Machine Learning for Democratic and Inclusive Advancements

Maryam Molamohammadi

Afaf Taïk

Nicolas Le Roux

Golnoosh Farnadi

2023-10-30

Equity and Access in Algorithms, Mechanisms, and Optimization (published)

Object-centric architectures enable efficient causal representation learning

Amin Mansouri

Jason Hartford

Yan Zhang

Causal representation learning has shown a variety of settings in which we can disentangle latent variables with identifiability guarantees (up to some reasonable equivalence class). Common to all of these approaches is the assumption that (1) the latent variables are represented as

2023-10-29

ArXiv (preprint)

Proving Linear Mode Connectivity of Neural Networks via Optimal Transport

Damien Ferbach

Baptiste Goujaud

Gauthier Gidel

Aymeric Dieuleveut

The energy landscape of high-dimensional non-convex optimization problems is crucial to understanding the effectiveness of modern deep neural network architectures. Recent works have experimentally shown that two different solutions found after two runs of a stochastic training are often connected by very simple continuous paths (e.g., linear) modulo a permutation of the weights. In this paper, we provide a framework theoretically explaining this empirical observation. Based on convergence rates in Wasserstein distance of empirical measures, we show that, with high probability, two wide enough two-layer neural networks trained with stochastic gradient descent are linearly connected. Additionally, we express upper and lower bounds on the width of each layer of two deep neural networks with independent neuron weights to be linearly connected. Finally, we empirically demonstrate the validity of our approach by showing how the dimension of the support of the weight distribution of neurons, which dictates Wasserstein convergence rates, is correlated with linear mode connectivity.

2023-10-29

ArXiv (preprint)

A Case Study of Instruction Tuning with Mixture of Parameter-Efficient Experts

Oleksiy Ostapenko

Lucas Caccia

Zhan Su

Nicolas Le Roux

Laurent Charlin

Alessandro Sordoni

We study the applicability of mixture of parameter-efficient experts (MoPEs) for instruction-tuning large decoder-only language models. Recent literature indicates that MoPEs might enhance performance in specific multi-task instruction-following datasets. In this paper, we extend such previous results and study the applicability of MoPEs in settings previously overlooked: a) with open-domain instruction-following datasets; b) with recent decoder-only models; and c) with downstream out-of-distribution test sets. We build on top of LLaMA1-13B/-7B and LLaMA2-13B. We study different variants of learned routing, namely per-example ([PE]) routing and a more expensive per-token ([PT]) routing. Overall, we are unable to substantiate strong performance gains observed in related studies in our setting. We observe occasional enhancements of LLaMA2 fine-tuned on the Open Platypus dataset in 0-shot SNI evaluation and TruthfulQA evaluation after fine-tuning on a subset of Flan. We shed some light on the inner workings of MoPEs by comparing different routing strategies. We find that [PE] routing tends to collapse at downstream evaluation time, reducing the importance of the router's application. We plan to publicly release our code.

2023-10-28

NeurIPS.cc/2023/Workshop/Instruction (published)

openreview.net

Detecting Backdoors with Meta-Models

Lauro Langosco

Neel Alex

William Baker

David John Quarel

Herbie Bradley

David Scott Krueger

It is widely known that it is possible to implant backdoors into neural networks, by which an attacker can choose an input to produce a particular undesirable output (e.g., misclassify an image). We propose to use meta-models, neural networks that take another network's parameters as input, to detect backdoors directly from model weights. To this end we present a meta-model architecture and train it on a dataset of approx. 4000 clean and backdoored CNNs trained on CIFAR-10. Our approach is simple and scalable, and is able to detect the presence of a backdoor with

2023-10-28

NeurIPS.cc/2023/Workshop/BUGS (poster)

openreview.net

Generative AI models should include detection mechanisms as a condition for public release

Alistair Knott

Dino Pedreschi

Raja Chatila

Tapabrata Chakraborti

Susan Leavy

Ricardo Baeza-Yates

D. Eyers

Andrew Trotman

Paul D. Teal

Przemyslaw Biecek

Stuart Russell

2023-10-28

Ethics and Information Technology (published)
