Publications

Association between arterial oxygen and mortality across critically ill patients with hematologic malignancies: results from an international collaborative network

Guillaume Dumas

Idunn S. Morris

Tamishta Hensman

Alexandre Demoule

Achille Kouatchet

Virginie Lemiale

Djamel Mokart

Frédéric Pène

Elie Azoulay

Laveena Munshi

Laurent François Dominique Naike Fabrice Emmanuel Yves Mic Argaud Barbier Benoit Bigé Bruneel Canet Cohen Dar

Laurent Argaud

François Barbier

Dominique Benoit

Naike Bigé

Fabrice Bruneel

Emmanuel Canet

Yves Cohen

Michael Darmon

Didier Gruson … (see 31 more)

Kada Klouche

Loay Kontar

Alexandre Lautrette

Christine Lebert

Guillaume Louis

Julien Mayaux

Anne-Pascale Meert

Anne-Sophie Moreau

Martine Nyunga

Vincent Peigne

Pierre Perez

Jean Herlé Raphalen

Carole Schwebel

Jean-Marie Tonnelier

Florent Wallet

Lara Zafrani

Bram Rochwerg

Farah Shoukat

Dean Fergusson

Bruno Ferreyro

Paul Heffernan

Margaret Herridge

Sheldon Magder

Mark Minden

Rakesh Patel

Salman Qureshi

Aaron Schimmer

Santhosh Thyagu

Han Ting Wang

Sangeeta Mehta

Sean M. Bagshaw

2024-04-09

Intensive Care Medicine (published)

doi.org

Deep Generative Sampling in the Dual Divergence Space: A Data-efficient&Interpretative Approach for Generative AI

Sahil Garg

Anderson Schneider

Anant Raj

Kashif Rasul

Yuriy Nevmyvaka

S. Gopal

Amit Dhurandhar

Guillermo A. Cecchi

Irina Rish

Building on the remarkable achievements in generative sampling of natural images, we propose an innovative challenge, potentially overly amb… (see more)itious, which involves generating samples of entire multivariate time series that resemble images. However, the statistical challenge lies in the small sample size, sometimes consisting of a few hundred subjects. This issue is especially problematic for deep generative models that follow the conventional approach of generating samples from a canonical distribution and then decoding or denoising them to match the true data distribution. In contrast, our method is grounded in information theory and aims to implicitly characterize the distribution of images, particularly the (global and local) dependency structure between pixels. We achieve this by empirically estimating its KL-divergence in the dual form with respect to the respective marginal distribution. This enables us to perform generative sampling directly in the optimized 1-D dual divergence space. Specifically, in the dual space, training samples representing the data distribution are embedded in the form of various clusters between two end points. In theory, any sample embedded between those two end points is in-distribution w.r.t. the data distribution. Our key idea for generating novel samples of images is to interpolate between the clusters via a walk as per gradients of the dual function w.r.t. the data dimensions. In addition to the data efficiency gained from direct sampling, we propose an algorithm that offers a significant reduction in sample complexity for estimating the divergence of the data distribution with respect to the marginal distribution. We provide strong theoretical guarantees along with an extensive empirical evaluation using many real-world datasets from diverse domains, establishing the superiority of our approach w.r.t. state-of-the-art deep learning methods.

2024-04-09

ArXiv (preprint)

doi.org

arxiv.org

On the Neurobiological Basis of Chronotype: Insights from a Multimodal Population Neuroscience Study

Le Zhou

Karin Saltoun

Julie Carrier

Kai-Florian Storch

Robin Dunbar

Danilo Bzdok

Abstract

The rapid shifts of society have brought about changes in human behavioral patterns, with increased eveni… (see more)ng activities, increased screen time, and postponed sleep schedules. As an explicit manifestation of circadian rhythms, chronotype is closely intertwined with both physical and mental health. Night owls often exhibit more unhealthy lifestyle habits, are more susceptible to mood disorders, and have poorer physical fitness. Although individual differences in chronotype yield varying consequences, their neurobiological underpinnings remain elusive. Here we carry out a pattern-learning analysis, and capitalize on a vast array of ~ 1,000 phenome-wide phenotypes with three brain-imaging modalities (region volume of gray matter, whiter-matter fiber tracts, and functional connectivity) in 27,030 UK Biobank participants. The resulting multi-level depicts of brain images converge on the basal ganglia, limbic system, hippocampus, as well as cerebellum vermis, thus implicating key nodes in habit formation, emotional regulation and reward processing. Complementary by comprehensive investigations of in-deep phenotypic collections, our population study offers evidence of behavioral pattern disparities linked to distinct chronotype-related behavioral tendencies in our societies.

2024-04-09

Research Square (preprint)

doi.org

AI healthcare research: Pioneering iSMART Lab

Narges Armanfard

Dr Narges Armanfard, Professor, talks us through the AI healthcare research at McGill University which is spearheading a groundbreaking init… (see more)iative – the iSMART Lab. Access to high-quality healthcare is not just a fundamental human right; it is the bedrock of our societal wellbeing, with the crucial roles played by doctors, nurses, and hospitals. Yet, healthcare systems globally face mounting challenges, particularly from aging populations. Dr Narges Armanfard, affiliated with McGill University and Mila Quebec AI Institute in Montreal, Canada, has spearheaded a groundbreaking initiative – the iSMART Lab. This laboratory represents a revolutionary leap into the future of healthcare, with its pioneering research in AI for health applications garnering significant attention. Renowned for its innovative integration of AI across diverse domains, iSMART Lab stands at the forefront of harnessing Artificial Intelligence to elevate and streamline health services.

2024-04-08

Open Access Government (published)

doi.org

Interpretable machine learning for finding intermediate-mass black holes

Mario Pasquato

PIERO TREVISAN

ABBAS ASKAR

Pablo Lemos

GAIA CARENINI

MICHELA MAPELLI

Yashar Hezaveh

Definitive evidence that globular clusters (GCs) host intermediate-mass black holes (IMBHs) is elusive. Machine learning (ML) models trained… (see more) on GC simulations can in principle predict IMBH host candidates based on observable features. This approach has two limitations: first, an accurate ML model is expected to be a black box due to complexity; second, despite our efforts to realistically simulate GCs, the simulation physics or initial conditions may fail to fully reflect reality. Therefore our training data may be biased, leading to a failure in generalization on observational data. Both the first issue -- explainability/interpretability -- and the second -- out of distribution generalization and fairness -- are active areas of research in ML. Here we employ techniques from these fields to address them: we use the anchors method to explain an XGBoost classifier; we also independently train a natively interpretable model using Certifiably Optimal RulE ListS (CORELS). The resulting model has a clear physical meaning, but loses some performance with respect to XGBoost. We evaluate potential candidates in real data based not only on classifier predictions but also on their similarity to the training data, measured by the likelihood of a kernel density estimation model. This measures the realism of our simulated data and mitigates the risk that our models may produce biased predictions by working in extrapolation. We apply our classifiers to real GCs, obtaining a predicted classification, a measure of the confidence of the prediction, an out-of-distribution flag, a local rule explaining the prediction of XGBoost and a global rule from CORELS.

2024-04-08

The Astrophysical Journal (published)

doi.org

arxiv.org

Learning Minimal NAP Specifications for Neural Network Verification

Chuqin Geng

Zhaoyue Wang

Haolin Ye

Saifei Liao

Xujie Si

Specifications play a crucial role in neural network verification. They define the precise input regions we aim to verify, typically represe… (see more)nted as L-infinity norm balls. While recent research suggests using neural activation patterns (NAPs) as specifications for verifying unseen test set data, it focuses on computing the most refined NAPs, often limited to very small regions in the input space. In this paper, we study the following problem: Given a neural network, find a minimal (coarsest) NAP that is sufficient for formal verification of the network's robustness. Finding the minimal NAP specification not only expands verifiable bounds but also provides insights into which neurons contribute to the model's robustness. To address this problem, we propose several exact and approximate approaches. Our exact approaches leverage the verification tool to find minimal NAP specifications in either a deterministic or statistical manner. Whereas the approximate methods efficiently estimate minimal NAPs using adversarial examples and local gradients, without making calls to the verification tool. This allows us to inspect potential causal links between neurons and the robustness of state-of-the-art neural networks, a task for which existing verification frameworks fail to scale. Our experimental results suggest that minimal NAP specifications require much smaller fractions of neurons compared to the most refined NAP specifications, yet they can significantly expand the verifiable boundaries to several orders of magnitude larger.

2024-04-05

ArXiv (preprint)

doi.org

arxiv.org

SAT-DIFF: A Tree Diffing Framework Using SAT Solving

Chuqin Geng

Haolin Ye

Yihan Zhang

Brigitte Pientka

Xujie Si

Computing differences between tree-structured data is a critical but challenging problem in software analysis. In this paper, we propose a n… (see more)ovel tree diffing approach called SatDiff, which reformulates the structural diffing problem into a MaxSAT problem. By encoding the necessary transformations from the source tree to the target tree, SatDiff generates correct, minimal, and type safe low-level edit scripts with formal guarantees. We then synthesize concise high-level edit scripts by effectively merging low-level edits in the appropriate topological order. Our empirical results demonstrate that SatDiff outperforms existing heuristic-based approaches by a significant margin in terms of conciseness while maintaining a reasonable runtime.

2024-04-05

ArXiv (preprint)

doi.org

arxiv.org

PopulAtion Parameter Averaging (PAPA)

Alexia Jolicoeur-Martineau

Emy Gervais

Kilian Fatras

Yang Zhang

Simon Lacoste-Julien

2024-04-04

TMLR (accepted)

doi.org

openreview.net

Applying Recurrent Neural Networks and Blocked Cross-Validation to Model Conventional Drinking Water Treatment Processes

Aleksandar Jakovljevic

Laurent Charlin

Benoit Barbeau

The jar test is the current standard method for predicting the performance of a conventional drinking water treatment (DWT) process and opti… (see more)mizing the coagulant dose. This test is time-consuming and requires human intervention, meaning it is infeasible for making continuous process predictions. As a potential alternative, we developed a machine learning (ML) model from historical DWT plant data that can operate continuously using real-time sensor data without human intervention for predicting clarified water turbidity 15 min in advance. We evaluated three types of models: multilayer perceptron (MLP), the long short-term memory (LSTM) recurrent neural network (RNN), and the gated recurrent unit (GRU) RNN. We also employed two training methodologies: the commonly used holdout method and the theoretically correct blocked cross-validation (BCV) method. We found that the RNN with GRU was the best model type overall and achieved a mean absolute error on an independent production set of as low as 0.044 NTU. We further found that models trained using BCV typically achieve errors equal to or lower than their counterparts trained using holdout. These results suggest that RNNs trained using BCV are superior for the development of ML models for DWT processes compared to those reported in earlier literature.

2024-04-03

Water (published)

doi.org

Assessing the emergence time of SARS-CoV-2 zoonotic spillover

Stéphane Samson

Étienne Lord

Vladimir Makarenkov

2024-04-03

PLoS ONE (published)

doi.org

Regulating advanced artificial agents

Michael K. Cohen