
Sébastien Lemieux

Associate Academic Member
Associate Professor, Université de Montréal, Department of Biochemistry and Molecular Medicine
Université de Montréal
Research Topics
Computational Biology
Molecular Modeling

Biography

Trained as a microbiologist, Sébastien Lemieux turned to bioinformatics in 1997 and completed his master's and doctoral studies at the Université de Montréal under the supervision of François Major. After obtaining his PhD in 2002, he moved to the private sector for a postdoctoral fellowship at Elitra Canada (now Merck & Co.) under the supervision of Bo Jiang. There he gained expertise in sequence analysis and DNA microarray data analysis, as well as in the computational integration of experimental data.

He eventually joined the Institute for Research in Immunology and Cancer (IRIC) in 2005. In 2018, he was appointed associate professor in the Department of Biochemistry and Molecular Medicine of the Faculty of Medicine at the Université de Montréal.

Current Students

Research Master's - UdeM
Research Master's - UdeM

Publications

Toward computing attributions for dimensionality reduction techniques
Jean-Christophe Grenier
Raphaël Poujol
Julie G. Hussin
We describe the problem of computing local feature attributions for dimensionality reduction methods. We use one such method that is well established within the context of supervised classification (using the gradients of target outputs with respect to the inputs) on the popular dimensionality reduction technique t-SNE, widely used in analyses of biological data. We provide an efficient implementation for the gradient computation for this dimensionality reduction technique. We show that our explanations identify significant features using a novel validation methodology on synthetic datasets and the popular MNIST benchmark dataset. We then demonstrate the practical utility of our algorithm by showing that it can produce explanations that agree with domain knowledge on a SARS-CoV-2 sequence dataset. Throughout, we provide a road map so that similar explanation methods could be applied to other dimensionality reduction techniques to rigorously analyze biological datasets. We have created a Python package that can be installed using the following command: pip install interpretable_tsne. All code used can be found at github.com/MattScicluna/interpretable_tsne.
Two types of human TCR differentially regulate reactivity to self and non-self antigens
Jean-David Larouche
Jonathan Séguin
Jean-Philippe Laverdure
Ann Brasey
Gregory Ehx
Denis-Claude Roy
Lambert Busque
Silvy Lachance
Claude Perreault
Based on analyses of TCR sequences from over 1,000 individuals, we report that the TCR repertoire is composed of two ontogenically and functionally distinct types of TCRs. Their production is regulated by variations in thymic output and terminal deoxynucleotidyl transferase (TDT) activity. Neonatal TCRs derived from TDT-negative progenitors persist throughout life, are highly shared among subjects, and are reported as disease-associated. Thus, 10%–30% of most frequent cord blood TCRs are associated with common pathogens and autoantigens. TDT-dependent TCRs present distinct structural features and are less shared among subjects. TDT-dependent TCRs are produced in maximal numbers during infancy when thymic output and TDT activity reach a summit, are more abundant in subjects with AIRE mutations, and seem to play a dominant role in graft-versus-host disease. Factors decreasing thymic output (age, male sex) negatively impact TCR diversity. Males compensate for their lower repertoire diversity via hyperexpansion of selected TCR clonotypes.
Proteogenomics and Differential Ion Mobility Enable the Exploration of the Mutational Landscape in Colon Cancer Cells
Zhaoguan Wu
Éric Bonneil
Michael Belford
Cornelia Boeser
Maria Virginia Ruiz Cuevas
Jean-Jacques Dunyach
Pierre Thibault
The sensitivity and depth of proteomic analyses are limited by isobaric ions and interferences that preclude the identification of low abundance peptides. Extensive sample fractionation is often required to extend proteome coverage when sample amount is not a limitation. Ion mobility devices provide a viable alternate approach to resolve confounding ions and improve peak capacity and mass spectrometry (MS) sensitivity. Here, we report the integration of differential ion mobility with segmented ion fractionation (SIFT) to enhance the comprehensiveness of proteomic analyses. The combination of differential ion mobility and SIFT, where narrow windows of ∼m/z 100 are acquired in turn, is found particularly advantageous in the analysis of protein digests and typically provided more than 60% gain in identification compared to conventional single-shot LC–MS/MS. The application of this approach is further demonstrated for the analysis of tryptic digests from different colorectal cancer cell lines where the enhanced sensitivity enabled the identification of single amino acid variants that were correlated with the corresponding transcriptomic data sets.
Induced pluripotent stem cells display a distinct set of MHC I-associated peptides shared by human cancers
Anca Apavaloaei
Leslie Hesnard
Marie-Pierre Hardy
Basma Benabdallah
Gregory Ehx
Catherine Thériault
Jean-Philippe Laverdure
Chantal Durette
Joël Lanoix
Mathieu Courcelles
Nandita Noronha
Kapil Dev Chauhan
Christian Beauséjour
Mick Bhatia
Pierre Thibault
Claude Perreault
Unified gene expression signature of novel NPM1 exon 5 mutations in acute myeloid leukemia
Véronique Lisi
Ève Blanchard
Michael Vladovsky
Éric Audemard
Albert Feghaly
Josée Hébert
Guy Sauvageau
Vincent-Philippe Lavallee
Monoallelic Heb/Tcf12 Deletion Reduces the Requirement for NOTCH1 Hyperactivation in T-Cell Acute Lymphoblastic Leukemia
Diogo F. T. Veiga
Mathieu Tremblay
Bastien Gerby
Sabine Herblot
André Haman
Patrick Gendron
Juan Carlos Zúñiga-Pflücker
Josée Hébert
Trang Hoang
Early T-cell development is precisely controlled by E proteins, which include the HEB/TCF12 and E2A/TCF3 transcription factors, together with NOTCH1 and pre-T cell receptor (TCR) signalling. Importantly, perturbations of early T-cell regulatory networks are implicated in leukemogenesis. NOTCH1 gain of function mutations invariably lead to T-cell acute lymphoblastic leukemia (T-ALL), whereas inhibition of E proteins accelerates leukemogenesis. Thus, NOTCH1, pre-TCR, E2A and HEB functions are intertwined, but how these pathways contribute individually or synergistically to leukemogenesis remains to be documented. To directly address these questions, we leveraged Cd3e-deficient mice in which pre-TCR signaling and progression through β-selection is abrogated to dissect and decouple the roles of pre-TCR, NOTCH1, E2A and HEB in SCL/TAL1-induced T-ALL, via the use of Notch1 gain of function transgenic (Notch1ICtg) and Tcf12+/- or Tcf3+/- heterozygote mice. As a result, we now provide evidence that both HEB and E2A restrain cell proliferation at the β-selection checkpoint while the clonal expansion of SCL-LMO1-induced pre-leukemic stem cells in T-ALL is uniquely dependent on Tcf12 gene dosage. At the molecular level, HEB protein levels are decreased via proteasomal degradation at the leukemic stage, pointing to a reversible loss of function mechanism. Moreover, in SCL-LMO1-induced T-ALL, loss of one Tcf12 allele is sufficient to bypass pre-TCR signaling which is required for Notch1 gain of function mutations and for progression to T-ALL. In contrast, Tcf12 monoallelic deletion does not accelerate Notch1IC-induced T-ALL, indicating that Tcf12 and Notch1 operate in the same pathway. Finally, we identify a tumor suppressor gene set downstream of HEB, exhibiting significantly lower expression levels in pediatric T-ALL compared to B-ALL and brain cancer samples, the three most frequent pediatric cancers.
In summary, our results indicate a tumor suppressor function of HEB/TCF12 in T-ALL to mitigate cell proliferation controlled by NOTCH1 in pre-leukemic stem cells and prevent NOTCH1-driven progression to T-ALL.
CAMAP: Artificial neural networks unveil the role of codon arrangement in modulating MHC-I peptides presentation
Tariq Daouda
Maude Dumont-Lagacé
Albert Feghaly
Yahya Benslimane
Rebecca Panes
Mathieu Courcelles
Mohamed Benhammadi
Lea Harrington
Pierre Thibault
François Major
Étienne Gagnon
Claude Perreault
MHC-I associated peptides (MAPs) are small fragments of intracellular proteins presented at the surface of cells and used by the immune system to detect and eliminate cancerous or virus-infected cells. While it is theoretically possible to predict which portions of the intracellular proteins will be naturally processed by the cells to ultimately reach the surface, current methodologies have prohibitively high false discovery rates. Here we introduce an artificial neural network called Codon Arrangement MAP Predictor (CAMAP) which integrates information from mRNA-to-protein translation to other factors regulating MAP biogenesis (e.g. MAP ligand score and transcript expression levels) to improve MAP prediction accuracy. While most MAP predictive approaches focus on MAP sequences per se, CAMAP’s novelty is to analyze the MAP-flanking mRNA sequences, thereby providing completely independent information for MAP prediction. We show on several datasets that the integration of CAMAP scores with other known factors involved in MAP presentation (i.e. MAP ligand score and mRNA expression) significantly improves MAP prediction accuracy, and further validate CAMAP learned features using an in vitro assay. These findings may have major implications for the design of vaccines against cancers and viruses, and in times of pandemics could accelerate the identification of relevant MAPs of viral origins.
Factorized embeddings learns rich and biologically meaningful embedding spaces using factorized tensor decomposition
The recent development of sequencing technologies revolutionized our understanding of the inner workings of the cell as well as the way disease is treated. A single RNA sequencing (RNA-Seq) experiment, however, measures tens of thousands of parameters simultaneously. While the results are information rich, data analysis provides a challenge. Dimensionality reduction methods help with this task by extracting patterns from the data by compressing it into compact vector representations. We present the factorized embeddings (FE) model, a self-supervised deep learning algorithm that learns simultaneously, by tensor factorization, gene and sample representation spaces. We ran the model on RNA-Seq data from two large-scale cohorts and observed that the sample representation captures information on single gene and global gene expression patterns. Moreover, we found that the gene representation space was organized such that tissue-specific genes, highly correlated genes as well as genes participating in the same GO terms were grouped. Finally, we compared the vector representation of samples learned by the FE model to other similar models on 49 regression tasks. We report that the representations trained with FE rank first or second in all of the tasks, surpassing, sometimes by a considerable margin, other representations. A toy example in the form of a Jupyter Notebook as well as the code and trained embeddings for this project can be found at: https://github.com/TrofimovAssya/FactorizedEmbeddings. Supplementary data are available at Bioinformatics online.