Publications

DECOLLAGE: 3D Detailization by Controllable, Localized, and Learned Geometry Enhancement

Qimin Chen

Zhiqin Chen

Vladimir Kim

Noam Aigerman

Hao (Richard) Zhang

Siddhartha Chaudhuri

2024-10-02

Lecture Notes in Computer Science (published)

doi.org

arxiv.org

Probabilistic Temporal Prediction of Continuous Disease Trajectories and Treatment Effects Using Neural SDEs

Joshua D. Durso-Finley

Berardino Barile

Jean-Pierre R. Falet

Douglas Arnold

Nick Pawlowski

Tal Arbel

Personalized medicine based on medical images, including predicting future individualized clinical disease progression and treatment response, would have an enormous impact on healthcare and drug development, particularly for diseases (e.g. multiple sclerosis (MS)) with long-term, complex, heterogeneous evolutions and no cure. In this work, we present the first stochastic causal temporal framework to model the continuous temporal evolution of disease progression via Neural Stochastic Differential Equations (NSDE). The proposed causal inference model takes as input the patient's high-dimensional images (MRI) and tabular data, and predicts both factual and counterfactual progression trajectories on different treatments in latent space. The NSDE permits the estimation of high-confidence personalized trajectories and treatment effects. Extensive experiments were performed on a large, multi-centre, proprietary dataset of patient 3D MRI and clinical data acquired during several randomized clinical trials for MS treatments. Our results present the first successful uncertainty-based causal Deep Learning (DL) model to: (a) accurately predict future patient MS disability evolution (e.g. EDSS) and treatment effects leveraging baseline MRI, and (b) permit the discovery of subgroups of patients for which the model has high confidence in their response to treatment even in clinical trials which did not reach their clinical endpoints.
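As a rough illustration of what simulating a stochastic differential equation involves, the sketch below uses the standard Euler-Maruyama scheme to sample many trajectories and summarize their spread. It is not the authors' model: `toy_drift` and `toy_diffusion` are hypothetical stand-ins for the learned networks, and the state is a single scalar rather than a latent vector.

```python
import math
import random


def euler_maruyama(x0, drift, diffusion, t_end, n_steps, rng):
    """Simulate one path of dX = drift(X) dt + diffusion(X) dW."""
    dt = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


# Hypothetical stand-ins for learned networks (assumptions, not from the paper):
def toy_drift(x):
    return 0.5 * (2.0 - x)  # pulls the state toward 2.0


def toy_diffusion(x):
    return 0.3  # constant noise level


rng = random.Random(0)
paths = [euler_maruyama(1.0, toy_drift, toy_diffusion, 4.0, 200, rng)
         for _ in range(500)]
finals = [p[-1] for p in paths]
mean_final = sum(finals) / len(finals)
std_final = math.sqrt(sum((v - mean_final) ** 2 for v in finals) / len(finals))
print(f"final state: mean={mean_final:.3f}, std={std_final:.3f}")
```

The spread of the sampled end points is one simple way to read off a per-trajectory uncertainty; a narrow spread would correspond to a high-confidence prediction.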

2024-10-02

Lecture Notes in Computer Science (published)

doi.org

arxiv.org

Sparse Bayesian Networks: Efficient Uncertainty Quantification in Medical Image Analysis

Zeinab Abboud

Hervé Lombaert

Samuel Kadoury

Efficiently quantifying predictive uncertainty in medical images remains a challenge. While Bayesian neural networks (BNN) offer predictive uncertainty, they require substantial computational resources to train. Although Bayesian approximations such as ensembles have shown promise, they still suffer from high training and inference costs. Existing approaches mainly address the costs of BNN inference post-training, with little focus on improving training efficiency and reducing parameter complexity. This study introduces a training procedure for a sparse (partial) Bayesian network. Our method selectively assigns a subset of parameters as Bayesian by assessing their deterministic saliency through gradient sensitivity analysis. The resulting network combines deterministic and Bayesian parameters, exploiting the advantages of both representations to achieve high task-specific performance and minimize predictive uncertainty. Demonstrated on multi-label ChestMNIST for classification and on ISIC and LIDC-IDRI for segmentation, our approach achieves competitive performance and predictive uncertainty estimation while reducing Bayesian parameters by over 95%, significantly reducing computational expenses compared to fully Bayesian and ensemble methods.
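The selection step can be pictured with a small sketch. It assumes that saliency is the absolute gradient of the loss with respect to each parameter and that the most salient parameters are the ones made Bayesian; the abstract does not state either detail, and `select_bayesian_parameters` and the gradient values are hypothetical.

```python
def select_bayesian_parameters(grad_magnitudes, fraction):
    """Return the indices of the top `fraction` of parameters by gradient magnitude."""
    n_select = max(1, round(fraction * len(grad_magnitudes)))
    ranked = sorted(range(len(grad_magnitudes)),
                    key=lambda i: grad_magnitudes[i], reverse=True)
    return set(ranked[:n_select])


# Made-up absolute gradients for 20 parameters of a trained deterministic network.
grads = [0.01, 0.50, 0.03, 0.002, 0.90, 0.04, 0.005, 0.02, 0.07, 0.001,
         0.015, 0.025, 0.006, 0.008, 0.30, 0.011, 0.009, 0.012, 0.06, 0.013]

# Keeping 5% of the parameters Bayesian mirrors the "over 95%" reduction.
bayesian_idx = select_bayesian_parameters(grads, fraction=0.05)
deterministic_idx = set(range(len(grads))) - bayesian_idx
print(sorted(bayesian_idx), len(deterministic_idx))
```

Only the selected indices would then carry a weight distribution; every other parameter keeps its point estimate.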

2024-10-02

Lecture Notes in Computer Science (published)

doi.org

arxiv.org

Top-down feedback matters: Functional impact of brainlike connectivity motifs on audiovisual integration

Mashbayar Tugsbayar

Mingze Li

Eilif B Muller

Blake Richards

Artificial neural networks (ANNs) are an important tool for studying neural computation, but many features of the brain are not captured by standard ANN architectures. One notable missing feature in most ANN models is top-down feedback, i.e. projections from higher-order layers to lower-order layers in the network. Top-down feedback is ubiquitous in the brain, and it has a unique modulatory impact on activity in neocortical pyramidal neurons. However, we still do not understand its computational role. Here we develop a deep neural network model that captures the core functional properties of top-down feedback in the neocortex, allowing us to construct hierarchical recurrent ANN models that more closely reflect the architecture of the brain. We use this to explore the impact of different hierarchical recurrent architectures on an audiovisual integration task. We find that certain hierarchies, namely those that mimic the architecture of the human brain, impart ANN models with a light visual bias similar to that seen in humans. This bias does not impair performance on the audiovisual tasks. The results further suggest that different configurations of top-down feedback make otherwise identically connected models functionally distinct from each other, and from traditional feedforward and laterally recurrent models. Altogether our findings demonstrate that modulatory top-down feedback is a computationally relevant feature of biological brains, and that incorporating it into ANNs affects their behavior and constrains the solutions they are likely to discover.

2024-10-02

bioRxiv (preprint)

doi.org

TrajGPT: Irregular Time-Series Representation Learning for Health Trajectory Analysis

Ziyang Song

Qincheng Lu

Mike He Zhu

David L Buckeridge

Yuemei Li

2024-10-02

arXiv (preprint)

doi.org

openreview.net

MiRGraph: A hybrid deep learning approach to identify microRNA-target interactions by integrating heterogeneous regulatory network and genomic sequences

Pei Liu

Yang Liu

Jiawei Luo

Yuemei Li

2024-10-01

bioRxiv (preprint)

doi.org

VinePPO: Refining Credit Assignment in RL Training of LLMs

Amirhossein Kazemnejad

Large language models (LLMs) are increasingly applied to complex reasoning tasks that require executing several complex steps before receiving any reward. Properly assigning credit to these steps is essential for enhancing model performance. Proximal Policy Optimization (PPO), a common reinforcement learning (RL) algorithm used for LLM finetuning, employs value networks to tackle credit assignment. However, recent approaches achieve strong results without it, raising questions about the efficacy of value networks in practice. In this work, we systematically evaluate the efficacy of value networks and reveal their significant shortcomings in reasoning-heavy LLM tasks, showing that they often produce poor estimates of expected return and barely outperform a random baseline when comparing alternative steps. This motivates our key question: Can improved credit assignment enhance RL training for LLMs? To address this, we propose VinePPO, a straightforward approach that leverages the flexibility of language environments to compute unbiased Monte Carlo-based estimates. Our method consistently outperforms PPO and other baselines across MATH and GSM8K datasets in less wall-clock time (up to 3.0x). Crucially, it achieves higher test accuracy for a given training accuracy, capturing more generalization signal per sample. These results emphasize the importance of accurate credit assignment in RL training of LLMs.
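The idea of replacing a learned value network with Monte Carlo estimates can be sketched on a toy problem. This is an illustration under assumptions, not the authors' implementation: `toy_rollout` is a hypothetical environment in which a state is summarized by its probability of ending in a correct answer, and the advantage uses the usual one-step form with no intermediate reward and no discounting.

```python
import random


def mc_value(state, rollout, k, rng):
    """Unbiased Monte Carlo value estimate: mean return of k independent rollouts."""
    return sum(rollout(state, rng) for _ in range(k)) / k


def toy_rollout(state, rng):
    """Hypothetical rollout: reward 1.0 with probability `state`, else 0.0."""
    return 1.0 if rng.random() < state else 0.0


rng = random.Random(0)

# Value of the partial solution before and after one reasoning step.
v_before = mc_value(0.4, toy_rollout, 5000, rng)
v_after = mc_value(0.7, toy_rollout, 5000, rng)

# Credit assigned to the step: how much it changed the expected final reward.
step_advantage = v_after - v_before
print(f"V(before)={v_before:.3f} V(after)={v_after:.3f} A={step_advantage:.3f}")
```

Resetting to an intermediate state and sampling again is cheap in a language environment, because a state is just a text prefix that can be continued any number of times.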

2024-10-01

arXiv (preprint)

doi.org

proceedings.mlr.press

Were RNNs All We Needed?

Leo Feng

Frederick Tung

Mohamed Osama Ahmed

Yoshua Bengio

Hossein Hajimirsadeghi

The introduction of Transformers in 2017 reshaped the landscape of deep learning. Originally proposed for sequence modelling, Transformers have since achieved widespread success across various domains. However, the scalability limitations of Transformers, particularly with respect to sequence length, have sparked renewed interest in novel recurrent models that are parallelizable during training, offer comparable performance, and scale more effectively. In this work, we revisit sequence modelling from a historical perspective, focusing on Recurrent Neural Networks (RNNs), which dominated the field for two decades before the rise of Transformers. Specifically, we examine LSTMs (1997) and GRUs (2014). We demonstrate that by simplifying these models, we can derive minimal versions (minLSTMs and minGRUs) that (1) use fewer parameters than their traditional counterparts, (2) are fully parallelizable during training, and (3) achieve surprisingly competitive performance on a range of tasks, rivalling recent models including Transformers.
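A minimal sketch of what such a simplified GRU might look like follows. The abstract does not give the equations, so the update used here is an assumption: the gate and the candidate state depend only on the current input, and the new state is a gated blend of the previous state and the candidate. The helper names and weights are hypothetical, and the loop is sequential for readability.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def linear(w, b, x):
    """Dense layer: w is a list of rows, b a list of biases, x an input vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def min_gru(xs, wz, bz, wh, bh, h0):
    """Run the assumed minimal GRU over a sequence of input vectors."""
    h = list(h0)
    outputs = []
    for x in xs:
        z = [sigmoid(v) for v in linear(wz, bz, x)]  # gate uses x_t only
        h_tilde = linear(wh, bh, x)                   # candidate uses x_t only
        h = [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_tilde)]
        outputs.append(h)
    return outputs


xs = [[1.0, 0.0], [0.0, 1.0]]
wz, bz = [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]  # gate fixed at sigmoid(0) = 0.5
wh, bh = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]  # candidate equals the input
outputs = min_gru(xs, wz, bz, wh, bh, h0=[0.0, 0.0])
print(outputs[-1])
```

Because neither the gate nor the candidate reads the previous hidden state, the recurrence is linear in the hidden state, which is the property that lets a parallel scan replace the loop during training.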

2024-10-01

arXiv (preprint)

doi.org

arxiv.org

Challenges for impact evaluation of WHO’s normative output

Catherine Régis

Gaëlle Foucault

Jean-Louis Denis

Pierre Larouche

Miriam Cohen

2024-09-30

Bulletin of the World Health Organization (published)

doi.org

Efficient line search for optimizing Area Under the ROC Curve in gradient descent

Jadon Fowler

Toby Dylan Hocking

Receiver Operating Characteristic (ROC) curves are useful for evaluation in binary classification and changepoint detection, but difficult to use for learning since the Area Under the Curve (AUC) is piecewise constant (gradient zero almost everywhere). Recently the Area Under Min (AUM) of false positive and false negative rates has been proposed as a differentiable surrogate for AUC. In this paper we study the piecewise linear/constant nature of the AUM/AUC, and propose new efficient path-following algorithms for choosing the learning rate which is optimal for each step of gradient descent (line search), when optimizing a linear model. Remarkably, our proposed line search algorithm has the same log-linear asymptotic time complexity as gradient descent with constant step size, but it computes a complete representation of the AUM/AUC as a function of step size. In our empirical study of binary classification problems, we verify that our proposed algorithm is fast and exact; in changepoint detection problems we show that the proposed algorithm is just as accurate as grid search, but faster.
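The piecewise-constant behaviour of AUC along a search direction is easy to see on a toy linear model. The sketch below is a brute-force check, not the path-following algorithm from the paper; the data set, starting weights, and search direction are all made up for illustration.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


def auc_at_step(features, labels, weights, direction, step):
    """AUC of the linear model with weights moved `step` along `direction`."""
    moved = [w + step * d for w, d in zip(weights, direction)]
    scores = [sum(w * x for w, x in zip(moved, row)) for row in features]
    return auc(scores, labels)


features = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.2), (0.2, 0.9)]
labels = [1, 1, 0, 0]
weights = (1.0, 0.0)
direction = (-1.0, 1.0)  # hypothetical search direction

for step in (0.0, 0.1, 0.42, 0.45, 0.6):
    print(step, auc_at_step(features, labels, weights, direction, step))
```

AUC only changes at step sizes where two scores cross, so it is flat between crossings and carries no gradient signal, which is why a surrogate such as AUM is used for the descent direction.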

2024-09-30

arXiv (published)

doi.org

arxiv.org

Finite Sample Complexity Analysis of Binary Segmentation

Toby Dylan Hocking

Binary segmentation is the classic greedy algorithm which recursively splits a sequential data set by optimizing some loss or likelihood function. Binary segmentation is widely used for changepoint detection in data sets measured over space or time, and as a sub-routine for decision tree learning. In theory it should be extremely fast for
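For readers unfamiliar with the algorithm, here is a small sketch of binary segmentation with the square loss. It is a plain, unoptimized version written for clarity, with hypothetical function names; it recomputes the best split of every segment at each iteration and says nothing about the complexity results of the paper.

```python
from itertools import accumulate


def binary_segmentation(data, max_segments):
    """Greedily split `data` into up to `max_segments` segments (square loss)."""
    prefix = [0.0] + list(accumulate(data))
    prefix_sq = [0.0] + list(accumulate(v * v for v in data))

    def cost(a, b):
        """Square loss of data[a:b] around its own mean."""
        total = prefix[b] - prefix[a]
        return (prefix_sq[b] - prefix_sq[a]) - total * total / (b - a)

    def best_split(a, b):
        """Return (loss decrease, split position) for data[a:b], or None."""
        candidates = [(cost(a, b) - cost(a, s) - cost(s, b), s)
                      for s in range(a + 1, b)]
        return max(candidates) if candidates else None

    segments = [(0, len(data))]
    changepoints = []
    while len(segments) < max_segments:
        options = [(best_split(a, b), (a, b)) for a, b in segments]
        options = [(split, seg) for split, seg in options if split is not None]
        if not options:
            break  # every remaining segment has a single point
        (_, s), (a, b) = max(options)  # split with the largest loss decrease
        segments.remove((a, b))
        segments += [(a, s), (s, b)]
        changepoints.append(s)
    return sorted(changepoints)


signal = [0.0] * 5 + [5.0] * 5 + [1.0] * 5
changepoints = binary_segmentation(signal, max_segments=3)
print(changepoints)
```

Each iteration keeps all earlier splits fixed and adds the single split that reduces the loss most, which is what makes the procedure greedy rather than globally optimal.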

2024-09-30

arXiv (published)

doi.org

arxiv.org
