Samira Ebrahimi Kahou

Research Master's - École de technologie supérieure

PhD - École de technologie supérieure

PhD - UdeM

Principal supervisor:

Research collaborator - McGill

Co-supervisor:

Professional Master's - UdeM

Meghana Bhange

PhD - École de technologie supérieure

Principal supervisor:

Research Master's - École de technologie supérieure

PhD - École de technologie supérieure

Principal supervisor:

Somjit Nath

PhD - McGill

Co-supervisor:

Research Master's - École de technologie supérieure

Priyesh Vijayan

PhD - McGill

Principal supervisor:

Publications

Empowering Clinicians with Medical Decision Transformers: A Framework for Sepsis Treatment

Rita Noumeir

Philippe Jouvet

Offline reinforcement learning has shown promise for solving tasks in safety-critical settings, such as clinical decision support. Its application, however, has been limited by the lack of interpretability and interactivity for clinicians. To address these challenges, we propose the medical decision transformer (MeDT), a novel and versatile framework based on the goal-conditioned reinforcement learning paradigm for sepsis treatment recommendation. MeDT uses the decision transformer architecture to learn a policy for drug dosage recommendation. During offline training, MeDT utilizes collected treatment trajectories to predict administered treatments for each time step, incorporating known treatment outcomes, target acuity scores, past treatment decisions, and current and past medical states. This analysis enables MeDT to capture complex dependencies among a patient's medical history, treatment decisions, outcomes, and short-term effects on stability. Our proposed conditioning uses acuity scores to address sparse reward issues and to facilitate clinician-model interactions, enhancing decision-making. Following training, MeDT can generate tailored treatment recommendations by conditioning on the desired positive outcome (survival) and user-specified short-term stability improvements. We carry out rigorous experiments on data from the MIMIC-III dataset and use off-policy evaluation to demonstrate that MeDT recommends interventions that outperform or are competitive with existing offline reinforcement learning methods while enabling a more interpretable, personalized and clinician-directed approach.

2024-07-28

arXiv (preprint)

Reinforcement Learning for Sequence Design Leveraging Protein Language Models

Jithendaraa Subramanian

Shiva Kanth Sujit

Niloy Irtisam

Umong Sain

Derek Nowrouzezahrai

Riashat Islam

2024-07-03

arXiv (preprint)

Handling Delay in Reinforcement Learning Caused by Parallel Computations of Neurons

Ivan Anokhin

Rishav

Stephen Chung

Irina Rish

Biological neural networks operate in parallel, a feature that sets them apart from artificial neural networks and can significantly enhance inference speed. However, this parallelism introduces challenges: when each neuron operates asynchronously with a fixed execution time, an

2024-06-19

ICML.cc/2024/Workshop/ARLET (poster)

openreview.net

A Survey on Fairness Without Demographics

Patrik Joslin Kenfack

Éts Montréal

Ulrich Aivodji

The issue of bias in Machine Learning (ML) models is a significant challenge for the machine learning community. Real-world biases can be embedded in the data used to train models, and prior studies have shown that ML models can learn and even amplify these biases. This can result in unfair treatment of individuals based on their inherent characteristics or sensitive attributes such as gender, race, or age. Ensuring fairness is crucial with the increasing use of ML models in high-stakes scenarios and has gained significant attention from researchers in recent years. However, the challenge of ensuring fairness becomes much greater when the assumption of full access to sensitive attributes does not hold. The settings where the hypothesis does not hold include cases where (1) only limited or noisy demographic information is available or (2) demographic information is entirely unobserved due to privacy restrictions. This survey reviews recent research efforts to enforce fairness when sensitive attributes are missing. We propose a taxonomy of existing works and, more importantly, highlight current challenges and future research directions to stimulate research in ML fairness in the setting of missing sensitive attributes.

2024-06-16

TMLR (accepted)

openreview.net

Learning to Play Atari in a World of Tokens

Sheldon Andrews

Model-based reinforcement learning agents utilizing transformers have shown improved sample efficiency due to their ability to model extended context, resulting in more accurate world models. However, for complex reasoning and planning tasks, these methods primarily rely on continuous representations. This complicates modeling of discrete properties of the real world such as disjoint object classes between which interpolation is not plausible. In this work, we introduce discrete abstract representations for transformer-based learning (DART), a sample-efficient method utilizing discrete representations for modeling both the world and learning behavior. We incorporate a transformer-decoder for auto-regressive world modeling and a transformer-encoder for learning behavior by attending to task-relevant cues in the discrete representation of the world model. For handling partial observability, we aggregate information from past time steps as memory tokens. DART outperforms previous state-of-the-art methods that do not use look-ahead search on the Atari 100k sample efficiency benchmark with a median human-normalized score of 0.790 and beats humans in 9 out of 26 games. We release our code at https://pranaval.github.io/DART/.

2024-06-03

arXiv (preprint)

On the Limits of Multi-modal Meta-Learning with Auxiliary Task Modulation Using Conditional Batch Normalization

Jordi Armengol-Estapé

Ramnath Kumar

Pierre-Luc St-Charles

Doina Precup

Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples. Recent studies show that cross-modal learning can improve representations for few-shot classification. More specifically, language is a rich modality that can be used to guide visual learning. In this work, we experiment with a multi-modal architecture for few-shot learning that consists of three components: a classifier, an auxiliary network, and a bridge network. While the classifier performs the main classification task, the auxiliary network learns to predict language representations from the same input, and the bridge network transforms high-level features of the auxiliary network into modulation parameters for layers of the few-shot classifier using conditional batch normalization. The bridge should encourage a form of lightweight semantic alignment between language and vision which could be useful for the classifier. However, after evaluating the proposed approach on two popular few-shot classification benchmarks we find that a) the improvements do not reproduce across benchmarks, and b) when they do, the improvements are due to the additional compute and parameters introduced by the bridge network. We contribute insights and recommendations for future work in multi-modal meta-learning, especially when using language representations.

2024-05-29

arXiv (preprint)

Neural semantic tagging for natural language-based search in building information models: Implications for practice

Mehrzad Shahinmoghadam

Ali Motamedi

2024-02-01

Computers in Industry (Print) (published)

Spectral Temporal Contrastive Learning

Sacha Morin

Somjit Nath

Guy Wolf

Learning useful data representations without requiring labels is a cornerstone of modern deep learning. Self-supervised learning methods, particularly contrastive learning (CL), have proven successful by leveraging data augmentations to define positive pairs. This success has prompted a number of theoretical studies to better understand CL and investigate theoretical bounds for downstream linear probing tasks. This work is concerned with the temporal contrastive learning (TCL) setting where the sequential structure of the data is used instead to define positive pairs, which is more commonly used in RL and robotics contexts. In this paper, we adapt recent work on Spectral CL to formulate Spectral Temporal Contrastive Learning (STCL). We discuss a population loss based on a state graph derived from a time-homogeneous reversible Markov chain with uniform stationary distribution. The STCL loss makes it possible to connect the linear probing performance to the spectral properties of the graph, and can be estimated by considering previously observed data sequences as an ensemble of MCMC chains.

2023-12-01

arXiv (preprint)

Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies

Shiva Kanth Sujit

Pedro Braga

Jorg Bornschein

Reinforcement learning (RL) has shown great promise with algorithms learning in environments with large state and action spaces purely from scalar reward signals. A crucial challenge for current deep RL algorithms is that they require a tremendous amount of environment interactions for learning. This can be infeasible in situations where such interactions are expensive, such as in robotics. Offline RL algorithms try to address this issue by bootstrapping the learning process from existing logged data without needing to interact with the environment from the very beginning. While online RL algorithms are typically evaluated as a function of the number of environment interactions, there isn't a single established protocol for evaluating offline RL methods. In this paper, we propose a sequential approach to evaluate offline RL algorithms as a function of the training set size and thus by their data efficiency. Sequential evaluation provides valuable insights into the data efficiency of the learning process and the robustness of algorithms to distribution changes in the dataset while also harmonizing the visualization of the offline and online learning phases. Our approach is generally applicable and easy to implement. We compare several existing offline RL algorithms using this approach and present insights from a variety of tasks and offline datasets.

2023-11-10

TMLR (accepted)

openreview.net

Empowering Clinicians with MeDT: A Framework for Sepsis Treatment

Rita Noumeir