Amin Emad

david.hostallero@mila.quebec

David Earl Hostallero

PhD - McGill University

Safyan Aman Memon

Master's Research - McGill University

safyanaman.memon@mila.quebec

Google Scholar

Ali Saberi

PhD - McGill University

ali.saberi@mila.quebec

Cedrique Shum-Tim

PhD - McGill University

cedrique.shum-tim@mila.quebec

joseph.szymborski@mila.quebec

Kiri Stern

PhD - McGill University

kiri.stern@mila.quebec

Chen Su

PhD - McGill University

chen.su@mila.quebec

Joseph Szymborski

PhD - McGill University

Website

yazdan.zinati@mila.quebec

Yazdan Zinati

Master's Research - McGill University

Website

Google Scholar

Publications

INTREPPPID - An Orthologue-Informed Quintuplet Network for Cross-Species Prediction of Protein-Protein Interaction

Joseph Szymborski

An overwhelming majority of protein-protein interaction (PPI) studies are conducted in a select few model organisms largely due to constraints in time and cost of the associated “wet lab” experiments. In silico PPI inference methods are ideal tools to overcome these limitations, but often struggle with cross-species predictions. We present INTREPPPID, a method which incorporates orthology data using a new “quintuplet” neural network, which is constructed with five parallel encoders with shared parameters. INTREPPPID incorporates both a PPI classification task and an orthologous locality task. The latter learns embeddings of orthologues that have small Euclidean distances between them and large distances between embeddings of all other proteins. INTREPPPID outperforms all other leading PPI inference methods tested on both the intra-species and cross-species tasks using strict evaluation datasets. We show that INTREPPPID’s orthologous locality loss increases performance because of the biological relevance of the orthologue data, and not due to some other specious aspect of the architecture. Finally, we introduce PPI.bio and PPI Origami, a web server interface for INTREPPPID and a software tool for creating strict evaluation datasets, respectively. Together, these two initiatives aim to make both the use and development of PPI inference tools more accessible to the community.

2024-02-16

bioRxiv (preprint)

Validation of ANG-1 and P-SEL as biomarkers of post-COVID-19 conditions using data from the Biobanque québécoise de la COVID-19 (BQC-19)

Eric Yamga

Antoine Soulé

Alain Piché

Madeleine Durand

Simon Rousseau

2023-10-24

Clinical Proteomics (published)

GRouNdGAN: GRN-guided simulation of single-cell RNA-seq data using causal generative adversarial networks

Yazdan Zinati

Abdulrahman Takiddeen

We introduce GRouNdGAN, a gene regulatory network (GRN)-guided causal implicit generative model for simulating single-cell RNA-seq data, in-silico perturbation experiments, and benchmarking GRN inference methods. Through the imposition of a user-defined GRN in its architecture, GRouNdGAN simulates steady-state and transient-state single-cell datasets where genes are causally expressed under the control of their regulating transcription factors (TFs). Training on three experimental datasets, we show that our model captures non-linear TF-gene dependences and preserves gene identities, cell trajectories, pseudo-time ordering, and technical and biological noise, with no user manipulation and only implicit parameterization. Despite imposing rigid causality constraints, it outperforms state-of-the-art simulators in generating realistic cells. GRouNdGAN learns meaningful causal regulatory dynamics, allowing sampling from both observational and interventional distributions. This enables it to synthesize cells under conditions that do not occur in the dataset at inference time, allowing to perform in-silico TF knockout experiments. Our results show that in-silico knockout of cell type-specific TFs significantly reduces cells of that type being generated. Interactions imposed through the GRN are emphasized in the simulated datasets, resulting in GRN inference algorithms assigning them much higher scores than interactions not imposed but of equal importance in the experimental training dataset. Benchmarking various GRN inference algorithms reveals that GRouNdGAN effectively bridges the existing gap between simulated and biological data benchmarks of GRN inference algorithms, providing gold standard ground truth GRNs and realistic cells corresponding to the biological system of interest. Our results show that GRouNdGAN is a stable, realistic, and effective simulator with various applications in single-cell RNA-seq analysis.

2023-07-31

bioRxiv (preprint)

Interpretable deep learning architectures for improving drug response prediction performance: myth or reality?

Yihui Li

David Earl Hostallero

Motivation: Recent advances in deep learning model development have enabled more accurate prediction of drug response in cancer. However, the black-box nature of these models still remains a hurdle in their adoption for precision cancer medicine. Recent efforts have focused on making these models interpretable by incorporating signaling pathway information in model architecture. While these models improve interpretability, it is unclear whether this higher interpretability comes at the cost of less accurate predictions, or a prediction improvement can also be obtained. Results: In this study, we comprehensively and systematically assessed four state-of-the-art interpretable models developed for drug response prediction to answer this question using three pathway collections. Our results showed that models that explicitly incorporate pathway information in the form of a latent layer perform worse compared to models that incorporate this information implicitly. Moreover, in most evaluation setups the best performance is achieved using a simple black-box model. In addition, replacing the signaling pathways with randomly generated pathways shows a comparable performance for the majority of these interpretable models. Our results suggest that new interpretable models are necessary to improve the drug response prediction performance. In addition, the current study provides different baseline models and evaluation setups necessary for such new models to demonstrate their superior prediction performance. Availability and Implementation: Implementation of all methods are provided in https://github.com/Emad-COMBINE-lab/InterpretableAI_for_DRP. Generated uniform datasets are in https://zenodo.org/record/7101665#.YzS79HbMKUk. Contact: amin.emad@mcgill.ca Supplementary Information: Online-only supplementary data is available at the journal’s website.

2023-06-16

Bioinformatics (published)

MARSY: a multitask deep-learning framework for prediction of drug combination synergy scores

Mohamed Reda El Khili

Safyan Aman Memon

Motivation: Combination therapies have emerged as a treatment strategy for cancers to reduce the probability of drug resistance and to improve outcome. Large databases curating the results of many drug screening studies on preclinical cancer cell lines have been developed, capturing the synergistic and antagonistic effects of combination of drugs in different cell lines. However, due to the high cost of drug screening experiments and the sheer size of possible drug combinations, these databases are quite sparse. This necessitates the development of transductive computational models to accurately impute these missing values. Results: Here, we developed MARSY, a deep learning multi-task model that incorporates information on gene expression profile of cancer cell lines, as well as the differential expression signature induced by each drug to predict drug-pair synergy scores. By utilizing two encoders to capture the interplay between the drug-pairs, as well as the drug-pairs and cell lines, and by adding auxiliary tasks in the predictor, MARSY learns latent embeddings that improve the prediction performance compared to state-of-the-art and traditional machine learning models. Using MARSY, we then predicted the synergy scores of 133,722 new drug-pair cell line combinations, which we have made available to the community as part of this study. Moreover, we validated various insights obtained from these novel predictions using independent studies, confirming the ability of MARSY in making accurate novel predictions. Availability and Implementation: An implementation of the algorithms in Python and cleaned input datasets are provided in https://github.com/Emad-COMBINE-lab/MARSY. Contact: amin.emad@mcgill.ca Supplementary Information: Online-only supplementary data is available at the journal’s website.

2023-04-06

Bioinformatics (published)

Analysis of gene expression and use of connectivity mapping to identify drugs for treatment of human glomerulopathies

Chen-Fang Chung

Joan Papillon

José R. Navarro-Betancourt

Julie Guillemette

Ameya Bhope

Andrey V. Cybulsky

2023-03-13

Frontiers in Medicine (published)

Preclinical-to-clinical Anti-cancer Drug Response Prediction and Biomarker Identification Using TINDL

David Earl Hostallero

Lixuan Wei

Liewei Wang

Junmei Cairns

2023-02-01

Genomics, Proteomics & Bioinformatics (published)

A circulating proteome-informed prognostic model of COVID-19 disease activity that relies on routinely available clinical laboratories

William Ma

Antoine Soulé

Karine Tremblay

Simon Rousseau

2023-01-01

(published)

www.semanticscholar.org

Poisson Group Testing: A Probabilistic Model for Boolean Compressed Sensing

Olgica Milenkovic

We introduce a novel probabilistic group testing framework, termed Poisson group testing, in which the number of defectives follows a right-truncated Poisson distribution. The Poisson model has a number of new applications, including dynamic testing with diminishing relative rates of defectives. We consider both nonadaptive and semi-adaptive identification methods. For nonadaptive methods, we derive a lower bound on the number of tests required to identify the defectives with a probability of error that asymptotically converges to zero; in addition, we propose test matrix constructions for which the number of tests closely matches the lower bound. For semi-adaptive methods, we describe a lower bound on the expected number of tests required to identify the defectives with zero error probability. In addition, we propose a stage-wise reconstruction algorithm for which the expected number of tests is only a constant factor away from the lower bound. The methods rely only on an estimate of the average number of defectives, rather than on the individual probabilities of subjects being defective.

2015-08-15

IEEE Transactions on Signal Processing (published)

arxiv.org

Poisson Group Testing: A Probabilistic Model for Boolean Compressed Sensing

Olgica Milenkovic

2014-10-20

ArXiv (preprint)