Offert en partenariat avec Indspire, ce parcours professionnel sur mesure est conçu pour permettre aux talents autochtones d'apprendre, de développer et de diriger l'évolution de l'IA. Les candidatures pour le programme 2025 sont ouvertes jusqu'au 31 janvier.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Multimedia Player
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Publications
Efficient Leverage Score Sampling for Tensor Train Decomposition
Tensor Train~(TT) decomposition is widely used in the machine learning and quantum physics communities as a popular tool to efficiently comp… (voir plus)ress high-dimensional tensor data. In this paper, we propose an efficient algorithm to accelerate computing the TT decomposition with the Alternating Least Squares (ALS) algorithm relying on exact leverage scores sampling. For this purpose, we propose a data structure that allows us to efficiently sample from the tensor with time complexity logarithmic in the product of the tensor dimensions. Our contribution specifically leverages the canonical form of the TT decomposition. By maintaining the canonical form through each iteration of ALS, we can efficiently compute (and sample from) the leverage scores, thus achieving significant speed-up in solving each sketched least-square problem. Experiments on synthetic and real data on dense and sparse tensors demonstrate that our method outperforms SVD-based and ALS-based algorithms.
Reinforcement learning (RL) algorithms have been very successful at tackling complex control problems, such as AlphaGo or fusion control. Ho… (voir plus)wever, current research mainly emphasizes solution quality, often achieved by using large models trained on large amounts of data, and does not account for the financial, environmental, and societal costs associated with developing and deploying such models. Modern neural networks are often overparameterized and a significant number of parameters can be pruned without meaningful loss in performance, resulting in more efficient use of the model's capacity lottery ticket. We present a methodology for identifying sub-networks within a larger network in reinforcement learning (RL). We call such sub-networks, neural pathways. We show empirically that even very small learned sub-networks, using less than 5% of the large network's parameters, can provide very good quality solutions. We also demonstrate the training of multiple pathways within the same networks in a multitask setup, where each pathway is encouraged to tackle a separate task. We evaluate empirically our approach on several continuous control tasks, in both online and offline training
Predicting low-energy molecular conformations given a molecular graph is an
important but challenging task in computational drug discovery.… (voir plus) Existing state-
of-the-art approaches either resort to large scale transformer-based models that
diffuse over conformer fields, or use computationally expensive methods to gen-
erate initial structures and diffuse over torsion angles. In this work, we introduce
Equivariant Transformer Flow (ET-Flow). We showcase that a well-designed
flow matching approach with equivariance and harmonic prior alleviates the need
for complex internal geometry calculations and large architectures, contrary to
the prevailing methods in the field. Our approach results in a straightforward
and scalable method that directly operates on all-atom coordinates with minimal
assumptions. With the advantages of equivariance and flow matching, ET-Flow
significantly increases the precision and physical validity of the generated con-
formers, while being a lighter model and faster at inference. Code is available
https://github.com/shenoynikhil/ETFlow.
Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional querie… (voir plus)s comprised of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on which requires substantial training time before being deployed on a new graph. Here we present UltraQuery, the first foundation model for inductive reasoning that can zero-shot answer logical queries on any KG. The core idea of UltraQuery is to derive both projections and logical operations as vocabulary-independent functions which generalize to new entities and relations in any KG.
With the projection operation initialized from a pre-trained inductive KG completion model, UltraQuery can solve CLQA on any KG after finetuning on a single dataset. Experimenting on 23 datasets, UltraQuery in the zero-shot inference mode shows competitive or better query answering performance than best available baselines and sets a new state of the art on 15 of them.
Correctly capturing the symmetry transformations of data can lead to efficient models with strong generalization capabilities, though method… (voir plus)s incorporating symmetries often require prior knowledge.
While recent advancements have been made in learning those symmetries directly from the dataset, most of this work has focused on the discriminative setting.
In this paper, we take inspiration from group theoretic ideas to construct a generative model that explicitly aims to capture the data's approximate symmetries.
This results in a model that, given a prespecified broad set of possible symmetries, learns to what extent, if at all, those symmetries are actually present.
Our model can be seen as a generative process for data augmentation.
We provide a simple algorithm for learning our generative model and empirically demonstrate its ability to capture symmetries under affine and color transformations, in an interpretable way.
Combining our symmetry model with standard generative models results in higher marginal test-log-likelihoods and improved data efficiency.
Learning generalist embodied agents, able to solve multitudes of tasks in different domains is a long-standing problem. Reinforcement learni… (voir plus)ng (RL) is hard to scale up as it requires a complex reward design for each task. In contrast, language can specify tasks in a more natural way. Current foundation vision-language models (VLMs) generally require fine-tuning or other adaptations to be adopted in embodied contexts, due to the significant domain gap. However, the lack of multimodal data in such domains represents an obstacle to developing foundation models for embodied applications. In this work, we overcome these problems by presenting multimodal-foundation world models, able to connect and align the representation of foundation VLMs with the latent space of generative world models for RL, without any language annotations. The resulting agent learning framework, GenRL, allows one to specify tasks through vision and/or language prompts, ground them in the embodied domain’s dynamics, and learn the corresponding behaviors in imagination.
As assessed through large-scale multi-task benchmarking in locomotion and manipulation domains, GenRL enables multi-task generalization from language and visual prompts. Furthermore, by introducing a data-free policy learning strategy, our approach lays the groundwork for foundational policy learning using generative world models.
Website, code and data: https://mazpie.github.io/genrl/
Multimodal Large Language Models (MLLMs) have demonstrated a wide range of capabilities across many domains including Embodied AI. In this w… (voir plus)ork, we study how to best ground a MLLM into different embodiments and their associated action spaces, including both continuous and discrete actions. For continuous actions, a set of learned tokenizations that capture an action at various resolutions allows for sufficient modeling precision, yielding the best performance on downstream tasks. For discrete actions, semantically aligning these actions with the native output token space of the MLLM leads to the strongest performance. We arrive at these lessons via a thorough study of seven action grounding approaches on five different environments, encompassing over 114 embodied tasks.
Efficiently determining the satisfiability of a boolean equation --- known as the SAT problem for brevity --- is crucial in various industri… (voir plus)al problems. Recently, the advent of deep learning methods has introduced significant potential for enhancing SAT solving. However, a major barrier to the advancement of this field has been the scarcity of large, realistic datasets. The majority of current public datasets are either randomly generated or extremely limited, containing only a few examples from unrelated problem families. These datasets are inadequate for meaningful training of deep learning methods. In light of this, researchers have started exploring generative techniques to create data that more accurately reflect SAT problems encountered in practical situations. These methods have so far suffered from either the inability to produce challenging SAT problems or time-scalability obstacles. In this paper we address both by identifying and manipulating the key contributors to a problem's ``hardness'', known as cores. Although some previous work has addressed cores, the time costs are unacceptably high due to the expense of traditional heuristic core detection techniques. We introduce a fast core detection procedure that uses a graph neural network. Our empirical results demonstrate that we can efficiently generate problems that remain hard to solve and retain key attributes of the original example problems. We show via experiment that the generated synthetic SAT problems can be used in a data augmentation setting to provide improved prediction of solver runtimes.
Predicting molecular impact on cellular function is a core challenge in therapeutic design. Phenomic experiments, designed to capture cellul… (voir plus)ar morphology, utilize microscopy based techniques and demonstrate a high throughput solution for uncovering molecular impact on the cell. In this work, we learn a joint latent space between molecular structures and microscopy phenomic experiments, aligning paired samples with contrastive learning. Specifically, we study the problem ofContrastive PhenoMolecular Retrieval, which consists of zero-shot molecular structure identification conditioned on phenomic experiments. We assess challenges in multi-modal learning of phenomics and molecular modalities such as experimental batch effect, inactive molecule perturbations, and encoding perturbation concentration. We demonstrate improved multi-modal learner retrieval through (1) a uni-modal pre-trained phenomics model, (2) a novel inter sample similarity aware loss, and (3) models conditioned on a representation of molecular concentration. Following this recipe, we propose MolPhenix, a molecular phenomics model. MolPhenix leverages a pre-trained phenomics model to demonstrate significant performance gains across perturbation concentrations, molecular scaffolds, and activity thresholds. In particular, we demonstrate an 8.1x improvement in zero shot molecular retrieval of active molecules over the previous state-of-the-art, reaching 77.33% in top-1% accuracy. These results open the door for machine learning to be applied in virtual phenomics screening, which can significantly benefit drug discovery applications.
Large-scale training of latent diffusion models (LDMs) has enabled unprecedented quality in image generation.
However, large-scale end-to-e… (voir plus)nd training of these models is computationally costly, and hence most research focuses either on finetuning pretrained models or experiments at smaller scales.
In this work we aim to improve the training efficiency and performance of LDMs with the goal of scaling to larger datasets and higher resolutions.
We focus our study on two points that are critical for good performance and efficient training:
(i) the mechanisms used for semantic level (\eg a text prompt, or class name) and low-level (crop size, random flip, \etc) conditioning of the model, and
(ii) pre-training strategies to transfer representations learned on smaller and lower-resolution datasets to larger ones.
The main contributions of our work are the following:
we present systematic experimental study of these points,
we propose a novel conditioning mechanism that disentangles semantic and low-level conditioning,
we obtain state-of-the-art performance on CC12M for text-to-image at 512 resolution.