Publications
SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval)
Patch foraging is one of the most heavily studied behavioral optimization challenges in biology. However, despite its importance to biological intelligence, this behavioral optimization problem is understudied in artificial intelligence research. Patch foraging is especially amenable to study given that it has a known optimal solution, which may be difficult to discover given current techniques in deep reinforcement learning. Here, we investigate deep reinforcement learning agents in an ecological patch foraging task. For the first time, we show that machine learning agents can learn to patch forage adaptively in patterns similar to biological foragers, and approach optimal patch foraging behavior when accounting for temporal discounting. Finally, we show emergent internal dynamics in these agents that resemble single-cell recordings from foraging non-human primates, which complements experimental and theoretical work on the neural mechanisms of biological foraging. This work suggests that agents interacting in complex environments with ecologically valid pressures arrive at common solutions, suggesting the emergence of foundational computations behind adaptive, intelligent behavior in both biological and artificial agents.
In this paper, we derive an algorithm that learns a principal subspace from sample entries, can be applied when the approximate subspace is represented by a neural network, and hence can be scaled to datasets with an effectively infinite number of rows and columns. Our method consists in defining a loss function whose minimizer is the desired principal subspace, and constructing a gradient estimate of this loss whose bias can be controlled.
2023-04-11
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics (published)
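The principal-subspace abstract above describes defining a loss whose minimizer is the desired subspace. The sketch below illustrates only that general idea on a small dense matrix, and it is not the paper's algorithm: it uses the exact full gradient of a standard reconstruction loss rather than a gradient estimate built from sampled entries, and it parameterizes the subspace with a plain matrix `U` rather than a neural network. The matrix `M`, its eigenvalues, the step size, and the iteration count are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 2  # ambient dimension and subspace dimension (illustrative values)

# Build a symmetric positive semi-definite matrix with known eigenvectors Q.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigvals = np.array([5.0, 4.0, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05])
M = (Q * eigvals) @ Q.T  # equals Q @ diag(eigvals) @ Q.T


def loss(U):
    # Reconstruction loss ||M - U U^T||_F^2; at its minimizer the columns
    # of U span the top-k eigenvectors of M.
    return np.sum((M - U @ U.T) ** 2)


def grad(U):
    # Exact gradient of the loss above (M is symmetric).
    return -4.0 * (M - U @ U.T) @ U


# Plain gradient descent from a small random initialization.
U = 0.1 * rng.standard_normal((d, k))
for _ in range(5000):
    U -= 0.01 * grad(U)


def projector(B):
    # Orthogonal projector onto the column span of B.
    Qb, _ = np.linalg.qr(B)
    return Qb @ Qb.T


# Distance between the learned subspace and the true principal subspace.
gap = np.linalg.norm(projector(U) - projector(Q[:, :k]))
print("final loss:", loss(U))
print("subspace gap:", gap)
```

Comparing projectors rather than `U` itself matters because the loss only identifies the subspace: any rotation of the columns of `U` within that subspace gives the same loss.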
In this work we theoretically show that conservative objective models (COMs) for offline model-based optimisation (MBO) are a special kind of contrastive divergence-based energy model, one where the energy function represents both the unconditional probability of the input and the conditional probability of the reward variable. While the initial formulation only samples modes from its learned distribution, we propose a simple fix that replaces its gradient ascent sampler with a Langevin MCMC sampler. This gives rise to a special probabilistic model where the probability of sampling an input is proportional to its predicted reward. Lastly, we show that better samples can be obtained if the model is decoupled so that the unconditional and conditional probabilities are modelled separately.
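The contrast drawn in the abstract above, between a gradient ascent sampler that only finds modes and a Langevin MCMC sampler that draws from the distribution, can be seen on a toy problem. This is a minimal sketch under stated assumptions, not the paper's model: the target is a one-dimensional standard normal standing in for a learned density, and the function names, step size, and number of steps are hypothetical choices.

```python
import numpy as np


def grad_log_p(x):
    # Toy target (an assumption): standard normal, log p(x) = -x^2 / 2 + const,
    # so the gradient of the log-density is -x.
    return -x


def gradient_ascent(x0, step=0.1, n_steps=500):
    # Deterministic ascent on log p: every chain collapses onto the mode.
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * grad_log_p(x)
    return x


def langevin(x0, step=0.1, n_steps=500, seed=0):
    # Unadjusted Langevin dynamics: half a gradient step plus Gaussian noise
    # scaled by sqrt(step), so chains spread out according to p.
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + 0.5 * step * grad_log_p(x) + np.sqrt(step) * noise
    return x


starts = np.linspace(-3.0, 3.0, 2000)  # 2000 independent chains
modes = gradient_ascent(starts)
samples = langevin(starts)

print("gradient ascent spread:", modes.std())
print("Langevin spread:", samples.std())
```

The only difference between the two update rules is the injected noise term. Without it, all starting points end at the single mode. With it, the final points keep a spread close to that of the target, with a small bias because the discretized chain is not corrected by an accept/reject step.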
Motivation: Combination therapies have emerged as a treatment strategy for cancers to reduce the probability of drug resistance and to improve outcome. Large databases curating the results of many drug screening studies on preclinical cancer cell lines have been developed, capturing the synergistic and antagonistic effects of combination of drugs in different cell lines. However, due to the high cost of drug screening experiments and the sheer size of possible drug combinations, these databases are quite sparse. This necessitates the development of transductive computational models to accurately impute these missing values. Results: Here, we developed MARSY, a deep learning multi-task model that incorporates information on gene expression profile of cancer cell lines, as well as the differential expression signature induced by each drug to predict drug-pair synergy scores. By utilizing two encoders to capture the interplay between the drug-pairs, as well as the drug-pairs and cell lines, and by adding auxiliary tasks in the predictor, MARSY learns latent embeddings that improve the prediction performance compared to state-of-the-art and traditional machine learning models. Using MARSY, we then predicted the synergy scores of 133,722 new drug-pair cell line combinations, which we have made available to the community as part of this study. Moreover, we validated various insights obtained from these novel predictions using independent studies, confirming the ability of MARSY in making accurate novel predictions. Availability and Implementation: An implementation of the algorithms in Python and cleaned input datasets are provided in https://github.com/Emad-COMBINE-lab/MARSY. Contact: amin.emad@mcgill.ca. Supplementary Information: Online-only supplementary data is available at the journal’s website.
While neural networks are capable of achieving human-like performance in many tasks such as image classification, the impressive performance of each model is limited to its own dataset. Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data, thus increasing data privacy. Diversity in representation space can be vital to a model's adaptability in varied and difficult domains. In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor. Motivated by the improved predictive performance of ensembles, we propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors with Distinct Backbone Architectures (DBA). Although diversity in feature space is increased, the unconstrained mutual information (MI) maximization may potentially introduce amplification of weak hypotheses. Thus we introduce the Weak Hypothesis Penalization (WHP) regularizer as a mitigation strategy. Our work proposes Penalized Diversity (PD) where the synergy of DBA and WHP is applied to unsupervised source-free domain adaptation for covariate shift. In addition, PD is augmented with a weighted MI maximization objective for label distribution shift. Empirical results on natural, synthetic, and medical domains demonstrate the effectiveness of PD under different distributional shifts.
MHC class I-associated peptides (MAPs), collectively referred to as the immunopeptidome, have a pivotal role in cancer immunosurveillance. While MAPs were long thought to be solely generated by the degradation of canonical proteins, recent advances in the field of proteogenomics (genomically-informed proteomics) evidenced that ∼10% of them originate from allegedly noncoding genomic sequences. Among these sequences, endogenous retroelements (EREs) are under intense scrutiny as a possible source of actionable tumor antigens (TAs). With the increasing number of cancer-oriented immunopeptidomic and proteogenomic studies comes the need to accurately attribute an RNA expression level to each MAP identified by mass-spectrometry. Here, we introduce BamQuery (BQ), a computational tool to attribute an exhaustive RNA expression to MAPs of any genomic origin (exon, intron, UTR, intergenic) from bulk and single-cell RNA-sequencing data. By using BQ on large datasets of published MAPs identified by mass spectrometry, we show that many of them can arise from more than one genomic region. Indeed, 27% of MAPs reported as deriving from protein-coding exons (canonical MAPs) could also arise from non-canonical genomic regions, sometimes with greater probability, and 61% of non-canonical MAPs could arise from more than a single genomic origin (334 possible regions on average per non-canonical MAP; up to 35,343 for EREs). The consideration of all these origins evidenced an unsuspected high RNA expression in normal human tissues of (i) published neoantigens/TAs (mutated or not); (ii) MAPs derived from proteasomal splicing, supposedly not genomically templated, and (iii) MAPs derived from viruses. In particular, the high expression of candidate immunotherapeutic targets such as TAs highlights the relevance of BamQuery and the necessity of using it to validate such antigens before translating their usage in clinical trials.
We also demonstrate that BamQuery can be used to directly identify safe and actionable TAs as well as to predict their immunogenicity through our freely accessible web portal (https://bamquery.iric.ca/search). Therefore, BQ could become an essential tool in any TA prioritization pipeline in the near future.
Citation Format: Maria-Virginia Ruiz Cuevas, Marie-Pierre Hardy, Jean-David Larouche, Anca Apavaloaei, Eralda Kina, Krystel Vincent, Patrick Gendron, Jean-Philippe Laverdure, Chantal Durette, Pierre Thibault, Sebastien Lemieux, Claude Perreault, Gregory Ehx. BamQuery: a new proteogenomic tool to explore the immunopeptidome and prioritize actionable tumor antigens [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 2987.