Peu importe la taille : démocratiser la découverte de protéines avec l'IA
Des chercheurs de Mila ont créé un puissant modèle de langage protéique à source ouverte plus compact et efficace afin de démocratiser la découverte de protéines.
La prochaine cohorte de notre programme, conçu pour fournir aux participant·e·s une compréhension fondamentale des technologies de l'IA, se déroulera à Ottawa les 28 et 29 novembre.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Multimedia Player
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
Stochastic Bit-Wise Iterative Decoding of Polar Codes
Polar codes have received recent attention due to their potential to be applied in advanced wireless communication protocols such as the fif… (voir plus)th generation mobile communication system (5G). Among the existing decoding algorithms, Belief Propagation (BP) exhibits high-throughput, low-latency, and soft output with a high hardware cost. Stochastic computing, as a form of approximate computing, provides a potential low-cost implementation solution for the BP algorithm. However, existing stochastic BP decoders suffer from a relatively long decoding latency resulting in low hardware efficiency. In this paper, a novel bit-wise iterative stochastic decoding architecture for the BP algorithm is proposed to improve the throughput and hardware efficiency. By utilizing the frozen bits of polar codes and stochastic computing, multiple novel optimization methods are presented to further speed up convergence and increase the hardware efficiency.
Polar codes have received recent attention due to their potential to be applied in advanced wireless communication protocols such as the fif… (voir plus)th generation mobile communication system (5G). Among the existing decoding algorithms, Belief Propagation (BP) exhibits high-throughput, low-latency, and soft output with a high hardware cost. Stochastic computing, as a form of approximate computing, provides a potential low-cost implementation solution for the BP algorithm. However, existing stochastic BP decoders suffer from a relatively long decoding latency resulting in low hardware efficiency. In this paper, a novel bit-wise iterative stochastic decoding architecture for the BP algorithm is proposed to improve the throughput and hardware efficiency. By utilizing the frozen bits of polar codes and stochastic computing, multiple novel optimization methods are presented to further speed up convergence and increase the hardware efficiency.
2019-03-01
IEEE Transactions on Signal Processing (published)
We present the first automatic end-to-end deep learning framework for the prediction of future patient disability progression (one year from… (voir plus) baseline) based on multi-modal brain Magnetic Resonance Images (MRI) of patients with Multiple Sclerosis (MS). The model uses parallel convolutional pathways, an idea introduced by the popular Inception net and is trained and tested on two large proprietary, multi-scanner, multi-center, clinical trial datasets of patients with Relapsing-Remitting Multiple Sclerosis (RRMS). Experiments on 465 patients on the placebo arms of the trials indicate that the model can accurately predict future disease progression, measured by a sustained increase in the extended disability status scale (EDSS) score over time. Using only the multi-modal MRI provided at baseline, the model achieves an AUC of 0.66 +- 0.055. However, when supplemental lesion label masks are provided as inputs as well, the AUC increases to 0.701 +- 0.027. Furthermore, we demonstrate that uncertainty estimates based on Monte Carlo dropout sample variance correlate with errors made by the model. Clinicians provided with the predictions computed by the model can therefore use the associated uncertainty estimates to assess which scans require further examination.
In this work, we consider the problem of autonomously discovering behavioral abstractions, or options, for reinforcement learning agents. We… (voir plus) propose an algorithm that focuses on the termination function, as opposed to - as is common - the policy. The termination function is usually trained to optimize a control objective: an option ought to terminate if another has better value. We offer a different, information-theoretic perspective, and propose that terminations should focus instead on the compressibility of the option’s encoding - arguably a key reason for using abstractions.To achieve this algorithmically, we leverage the classical options framework, and learn the option transition model as a “critic” for the termination function. Using this model, we derive gradients that optimize the desired criteria. We show that the resulting options are non-trivial, intuitively meaningful, and useful for learning.
Recent progress in artificial intelligence provides researchers with a powerful set of machine learning tools for analyzing brain imaging da… (voir plus)ta. In this work, we explore a variety of classification algorithms and functional network features derived from resting-state fMRI data collected from clinical high-risk (prodromal schizophrenia) patients and controls, trying to identify features predictive of conversion to psychosis among a subset of CHR patients. While there are many existing studies suggesting that functional network features can be highly discriminative of schizophrenia when analyzing fMRI of patients suffering from the disease vs controls, few studies attempt to explore a similar approach to actual prediction of future psychosis development ahead of time, in the prodromal stage. Our preliminary results demonstrate the potential of fMRI functional network features to predict the conversion to psychosis in CHR patients. However, given the high variance of our results across different classifiers and subsets of data, a more extensive empirical investigation is required to reach more robust conclusions.
2019-02-16
Medical Imaging 2019: Biomedical Applications in Molecular, Structural, and Functional Imaging (publié)
Tail averaging consists in averaging the last examples in a stream. Common techniques either have a memory requirement which grows with the … (voir plus)number of samples to average, are not available at every timestep or do not accomodate growing windows. We propose two techniques with a low constant memory cost that perform tail averaging with access to the average at every time step. We also show how one can improve the accuracy of that average at the cost of increased memory consumption.
Despite many algorithmic advances, our theoretical understanding of practical distributional reinforcement learning methods remains limited.… (voir plus) One exception is Rowland et al. (2018)'s analysis of the C51 algorithm in terms of the Cramer distance, but their results only apply to the tabular setting and ignore C51's use of a softmax to produce normalized distributions. In this paper we adapt the Cramer distance to deal with arbitrary vectors. From it we derive a new distributional algorithm which is fully Cramer-based and can be combined to linear function approximation, with formal guarantees in the context of policy evaluation.
In allowing the model's prediction to be any real vector, we lose the probabilistic interpretation behind the method, but otherwise maintain the appealing properties of distributional approaches. To the best of our knowledge, ours is the first proof of convergence of a distributional algorithm combined with function approximation. Perhaps surprisingly, our results provide evidence that Cramer-based distributional methods may perform worse than directly approximating the value function.
Despite many algorithmic advances, our theoretical understanding of practical distributional reinforcement learning methods remains limited.… (voir plus) One exception is Rowland et al. (2018)'s analysis of the C51 algorithm in terms of the Cramer distance, but their results only apply to the tabular setting and ignore C51's use of a softmax to produce normalized distributions. In this paper we adapt the Cramer distance to deal with arbitrary vectors. From it we derive a new distributional algorithm which is fully Cramer-based and can be combined to linear function approximation, with formal guarantees in the context of policy evaluation.
In allowing the model's prediction to be any real vector, we lose the probabilistic interpretation behind the method, but otherwise maintain the appealing properties of distributional approaches. To the best of our knowledge, ours is the first proof of convergence of a distributional algorithm combined with function approximation. Perhaps surprisingly, our results provide evidence that Cramer-based distributional methods may perform worse than directly approximating the value function.
We propose a new perspective on representation learning in reinforcement learning based on geometric properties of the space of value functi… (voir plus)ons. We leverage this perspective to provide formal evidence regarding the usefulness of value functions as auxiliary tasks. Our formulation considers adapting the representation to minimize the (linear) approximation of the value function of all stationary policies for a given environment. We show that this optimization reduces to making accurate predictions regarding a special class of value functions which we call adversarial value functions (AVFs). We demonstrate that using value functions as auxiliary tasks corresponds to an expected-error relaxation of our formulation, with AVFs a natural candidate, and identify a close relationship with proto-value functions (Mahadevan, 2005). We highlight characteristics of AVFs and their usefulness as auxiliary tasks in a series of experiments on the four-room domain.
Online communities such as Facebook and Twitter are enormously popular and have become an essential part of the daily life of many of their … (voir plus)users. Through these platforms, users can discover and create information that others will then consume. In that context, recommending relevant information to users becomes critical for viability. However, recommendation in online communities is a challenging problem: 1) users' interests are dynamic, and 2) users are influenced by their friends. Moreover, the influencers may be context-dependent. That is, different friends may be relied upon for different topics. Modeling both signals is therefore essential for recommendations. We propose a recommender system for online communities based on a dynamic-graph-attention neural network. We model dynamic user behaviors with a recurrent neural network, and context-dependent social influence with a graph-attention neural network, which dynamically infers the influencers based on users' current interests. The whole model can be efficiently fit on large-scale data. Experimental results on several real-world data sets demonstrate the effectiveness of our proposed approach over several competitive baselines including state-of-the-art models.
2019-01-30
Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (publié)