Peu importe la taille : démocratiser la découverte de protéines avec l'IA
Des chercheurs de Mila ont créé un puissant modèle de langage protéique à source ouverte plus compact et efficace afin de démocratiser la découverte de protéines.
La prochaine cohorte de notre programme, conçu pour fournir aux participant·e·s une compréhension fondamentale des technologies de l'IA, se déroulera à Ottawa les 28 et 29 novembre.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Multimedia Player
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
Should We Feed the Trolls? Using Marketer-Generated Content to Explain Average Toxicity and Product Usage
Disentanglement aims to recover meaningful latent ground-truth factors from the observed distribution solely, and is formalized through the … (voir plus)theory of identifiability. The identifiability of independent latent factors is proven to be impossible in the unsupervised i.i.d. setting under a general nonlinear map from factors to observations. In this work, however, we demonstrate that it is possible to recover quantized latent factors under a generic nonlinear diffeomorphism. We only assume that the latent factors have independent discontinuities in their density, without requiring the factors to be statistically independent. We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.
Modeling strong gravitational lenses in order to quantify distortions in the images of background sources and to reconstruct the mass densit… (voir plus)y in foreground lenses has been a difficult computational challenge. As the quality of gravitational lens images increases, the task of fully exploiting the information they contain becomes computationally and algorithmically more difficult. In this work, we use a neural network based on the recurrent inference machine to reconstruct simultaneously an undistorted image of the background source and the lens mass density distribution as pixelated maps. The method iteratively reconstructs the model parameters (the image of the source and a pixelated density map) by learning the process of optimizing the likelihood given the data using the physical model (a ray-tracing simulation), regularized by a prior implicitly learned by the neural network through its training data. When compared to more traditional parametric models, the proposed method is significantly more expressive and can reconstruct complex mass distributions, which we demonstrate by using realistic lensing galaxies taken from the IllustrisTNG cosmological hydrodynamic simulation.
In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objec… (voir plus)tive and data from a set of near-optimal RL solutions for training tasks. This work relates to meta RL, contextual RL, and transfer learning, with a particular focus on zero-shot performance at test time, enabled by knowledge of the task parameters (also known as context). Our technical approach is based upon viewing each RL algorithm as a mapping from the MDP specifics to the near-optimal value function and policy and seek to approximate it with a hypernetwork that can generate near-optimal value functions and policies, given the parameters of the MDP. We show that, under certain conditions, this mapping can be considered as a supervised learning problem. We empirically evaluate the effectiveness of our method for zero-shot transfer to new reward and transition dynamics on a series of continuous control tasks from DeepMind Control Suite. Our method demonstrates significant improvements over baselines from multitask and meta RL approaches.
2023-06-26
Proceedings of the AAAI Conference on Artificial Intelligence (publié)
This work investigates the evolution of latent space when deep learning models are trained incrementally in non-stationary environments that… (voir plus) stem from concept drift. We propose a methodology for visualizing the incurred change in latent representations. We further show that classes not targeted by concept drift can be negatively affected, suggesting that the observation of all classes during learning may regularize the latent space.
2023-06-26
Proceedings of the AAAI Conference on Artificial Intelligence (publié)
Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples. Recent studies show that task … (voir plus)distribution plays a vital role in the performance of the model. Conventional wisdom is that task diversity should improve the performance of meta-learning. In this work, we find evidence to the contrary; we study different task distributions on a myriad of models and datasets to evaluate the effect of task diversity on meta-learning algorithms. For this experiment, we train on multiple datasets, and with three broad classes of meta-learning models - Metric-based (i.e., Protonet, Matching Networks), Optimization-based (i.e., MAML, Reptile, and MetaOptNet), and Bayesian meta-learning models (i.e., CNAPs). Our experiments demonstrate that the effect of task diversity on all these algorithms follows a similar trend, and task diversity does not seem to offer any benefits to the learning of the model. Furthermore, we also demonstrate that even a handful of tasks, repeated over multiple batches, would be sufficient to achieve a performance similar to uniform sampling and draws into question the need for additional tasks to create better models.
2023-06-26
Proceedings of the AAAI Conference on Artificial Intelligence (publié)
Guessing Random Additive Noise Decoding (GRAND) excels at decoding high-rate codes but struggles to decode low-rate codes with reasonable co… (voir plus)mplexity. Ordered Statistics Decoding (OSD) specifically excels in decoding short codes irrespective of rates; however, OSD necessitates the use of Gaussian elimination which introduces additional time, space and computational complexity. Partial Ordered Statistics Decoding (POSD) was proposed to reduce the time, space, and computational complexity of OSD; however, the current partition-based POSD has poor decoding performance since it does not generate test error patterns across partitions. In this paper, we propose to improve the decoding performance of POSD by incorporating test error patterns inspired by GRAND methods. This work offers a trade-off between performance and complexity compared to existing decoders such as GRAND and OSD. We enhance POSD by optimizing the scheduling of Test Error Patterns (TEPs) and show that our technique can be applied to any code in a standard form. At a target BER 10−4 with eBCH (128,64) the enhanced error patterns achieve more than 0.6 dB gain in performance compared to the POSD with partition-based error patterns. Moreover, at a target frame error rate of 10−5, POSD uses 10× less binary operations compared to GRAND when decoding eBCH (128,64) and RLC(128,64) codes. With BCH (127,29) and RLC(128,32), at a target frame error rate of 10−2, POSD with enhanced error patterns with a maximum number of queries (MQ) of 104 achieves up to a 2 dB gain to its GRAND equivalent which is using 107 maximum number of queries.
2023-06-25
2023 IEEE International Symposium on Information Theory (ISIT) (publié)
In recent years, in-silico molecular design has received much attention from the machine learning community. When designing a new compound f… (voir plus)or pharmaceutical applications, there are usually multiple properties of such molecules that need to be optimised: binding energy to the target, synthesizability, toxicity, EC50, and so on. While previous approaches have employed a scalarization scheme to turn the multi-objective problem into a preference-conditioned single objective, it has been established that this kind of reduction may produce solutions that tend to slide towards the extreme points of the objective space when presented with a problem that exhibits a concave Pareto front. In this work we experiment with an alternative formulation of goal-conditioned molecular generation to obtain a more controllable conditional model that can uniformly explore solutions along the entire Pareto front.