Mila's AI for Climate Studio aims to bridge the gap between technology and impact in order to unlock AI's potential to fight the climate crisis quickly and at scale.
The program recently published its first policy brief, "Policy Considerations at the Intersection of Quantum Technologies and Artificial Intelligence," written by Padmapriya Mohan.
Hugo Larochelle appointed Scientific Director of Mila
An adjunct professor at Université de Montréal and former head of Google's AI research lab in Montreal, Hugo Larochelle is a pioneer of deep learning and one of the most respected researchers in Canada.
Publications
Divergent Perception: Framing Creative Cognition Through the Lens of Sensory Flexibility
Annotating chromatin loops is essential for understanding the 3D genome’s role in gene regulation, but current methods struggle with low coverage, particularly in single-cell datasets. Chromatin loops are kilo- to mega-range structures that exhibit broader features, such as co-occurring loops, stripes, and domain boundaries along axial directions of Hi-C contact maps. However, existing tools primarily focus on detecting localized, highly concentrated interactions. Furthermore, the wide variety of available chromatin conformation datasets is rarely utilized in developing effective loop callers. Here, we present Polaris, a universal tool that integrates axial attention with a U-shaped backbone to accurately detect loops across different 3D genome assays. By leveraging extensive Hi-C contact maps in a pretrain-finetune paradigm, Polaris achieves consistent performance across various datasets. We compare Polaris against existing tools in loop annotation from both bulk and single-cell data and find that Polaris outperforms other programs across different cell types, species, sequencing depths, and assays.
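To make the "axial attention" ingredient concrete, here is a minimal PyTorch sketch of attention applied along the rows and then the columns of a 2D contact map. It is purely illustrative of the mechanism named in the abstract; the layer sizes, class name, and wiring are assumptions, not Polaris's actual architecture.

```python
# Minimal sketch of axial attention over a 2D contact map (illustrative, not Polaris's code).
import torch
import torch.nn as nn

class AxialAttention2D(nn.Module):
    """Self-attention along rows, then along columns, of a (B, C, H, W) map."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Row attention: each row becomes a sequence of length W.
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Column attention: each column becomes a sequence of length H.
        cols = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(b, w, h, c).permute(0, 3, 2, 1)

# Example: a 64x64 contact-map patch with 32 feature channels.
patch = torch.randn(1, 32, 64, 64)
print(AxialAttention2D(32)(patch).shape)  # torch.Size([1, 32, 64, 64])
```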
Large language models (LLMs) possess vast semantic knowledge but often struggle with complex reasoning tasks, particularly in relational reasoning problems such as kinship or spatial reasoning. In this paper, we present Path-of-Thoughts (PoT), a novel framework designed to tackle relation reasoning by decomposing the task into three key stages: graph extraction, path identification, and reasoning. Unlike previous approaches, PoT efficiently extracts a task-agnostic graph that identifies crucial entities, relations, and attributes within the problem context. Subsequently, PoT identifies relevant reasoning chains within the graph corresponding to the posed question, facilitating inference of potential answers. Experimental evaluations on four benchmark datasets, demanding long reasoning chains, demonstrate that PoT surpasses state-of-the-art baselines by a significant margin (maximum 21.3%) without necessitating fine-tuning or extensive LLM calls. Furthermore, as opposed to prior neuro-symbolic methods, PoT exhibits improved resilience against LLM errors by leveraging the compositional nature of graphs.
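The three stages are easy to picture on a toy kinship question. In the sketch below, the graph-extraction step (normally performed by an LLM) is hard-coded, and all names and the tiny relation-composition table are illustrative assumptions rather than the paper's implementation.

```python
# Toy sketch of the three Path-of-Thoughts stages on a kinship question.
import networkx as nx

# Stage 1: graph extraction -- entities, relations, attributes from the problem text.
story = "Alice is Bob's mother. Bob is Carol's father."
graph = nx.DiGraph()
graph.add_edge("Alice", "Bob", relation="mother_of")
graph.add_edge("Bob", "Carol", relation="father_of")

# Stage 2: path identification -- the reasoning chain linking the question entities.
question = ("Alice", "Carol")  # "What is Alice to Carol?"
path = nx.shortest_path(graph, *question)
chain = [graph.edges[u, v]["relation"] for u, v in zip(path, path[1:])]

# Stage 3: reasoning -- compose relations along the chain (tiny illustrative lookup).
COMPOSE = {("mother_of", "father_of"): "grandmother_of",
           ("mother_of", "mother_of"): "grandmother_of"}
print(chain, "->", COMPOSE.get(tuple(chain), "unknown"))
# ['mother_of', 'father_of'] -> grandmother_of
```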
The Cambrian explosion of easily accessible pre-trained diffusion models suggests a demand for methods that combine multiple different pre-trained diffusion models without incurring the significant computational burden of re-training a larger combined model. In this paper, we cast the problem of combining multiple pre-trained diffusion models at the generation stage under a novel proposed framework termed superposition. Theoretically, we derive superposition from rigorous first principles stemming from the celebrated continuity equation and design two novel algorithms tailor-made for combining diffusion models in SuperDiff. SuperDiff leverages a new scalable Itô density estimator for the log likelihood of the diffusion SDE which incurs no additional overhead compared to the well-known Hutchinson's estimator needed for divergence calculations. We demonstrate that SuperDiff is scalable to large pre-trained diffusion models as superposition is performed solely through composition during inference, and also enjoys painless implementation as it combines different pre-trained vector fields through an automated re-weighting scheme. Notably, we show that SuperDiff is efficient during inference time, and mimics traditional composition operators such as the logical OR and the logical AND. We empirically demonstrate the utility of using SuperDiff for generating more diverse images on CIFAR-10, more faithful prompt conditioned image editing using Stable Diffusion, as well as improved conditional molecule generation and unconditional de novo structure design of proteins. https://github.com/necludov/super-diffusion
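The key practical point is that the models are combined purely at inference time by mixing their predicted vector fields at each denoising step. The sketch below uses a fixed convex combination as a stand-in for SuperDiff's automated re-weighting (which relies on its Itô density estimator); function names and the toy "models" are assumptions.

```python
# Sketch: combine two pre-trained diffusion models at inference by re-weighting their
# vector fields at every step (fixed weights here; not SuperDiff's automated scheme).
import torch

def superposed_sampler(model_a, model_b, x_T, timesteps, weight_a=0.5, step_size=0.01):
    """Euler-style reverse process driven by a weighted sum of two vector fields."""
    x = x_T
    for t in timesteps:
        v_a = model_a(x, t)                           # vector field from the first model
        v_b = model_b(x, t)                           # vector field from the second model
        v = weight_a * v_a + (1.0 - weight_a) * v_b   # superposition by re-weighting
        x = x + step_size * v                         # one integration step
    return x

# Stand-in "models" (simple callables) on a 2D toy sample.
model_a = lambda x, t: -x            # pulls samples toward the origin
model_b = lambda x, t: -(x - 2.0)    # pulls samples toward (2, 2)
x0 = superposed_sampler(model_a, model_b, torch.randn(16, 2), range(100))
print(x0.mean(0))                    # ends up roughly between the two attractors
```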
Ensuring long-term fairness is crucial when developing automated decision making systems, specifically in dynamic and sequential environments. By maximizing their reward without consideration of fairness, AI agents can introduce disparities in their treatment of groups or individuals. In this paper, we establish the connection between bisimulation metrics and group fairness in reinforcement learning. We propose a novel approach that leverages bisimulation metrics to learn reward functions and observation dynamics, ensuring that learners treat groups fairly while reflecting the original problem. We demonstrate the effectiveness of our method in addressing disparities in sequential decision making problems through empirical evaluation on a standard fairness benchmark consisting of lending and college admission scenarios.
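For readers unfamiliar with bisimulation metrics: they measure how behaviourally different two states are, combining immediate reward differences with (discounted) differences in where the states lead. A toy fixed-point computation on a deterministic MDP is sketched below; the states, rewards, and dynamics are invented for illustration and do not come from the paper's benchmark.

```python
# Toy bisimulation metric on a small deterministic MDP, by fixed-point iteration:
# d(s, t) = max_a ( |r(s,a) - r(t,a)| + gamma * d(s'_a, t'_a) ).
# With deterministic transitions the Wasserstein term reduces to the successor distance.
import itertools

states = ["A", "B", "C", "D"]
actions = ["approve", "deny"]
reward = {("A", "approve"): 1.0, ("A", "deny"): 0.0,
          ("B", "approve"): 1.0, ("B", "deny"): 0.0,   # B mirrors A -> should end up close to A
          ("C", "approve"): 0.2, ("C", "deny"): 0.0,
          ("D", "approve"): 1.0, ("D", "deny"): 0.5}
nxt = {(s, a): s for s in states for a in actions}      # absorbing toy dynamics
gamma = 0.9

d = {pair: 0.0 for pair in itertools.product(states, states)}
for _ in range(100):  # iterate the operator toward its fixed point
    d = {(s, t): max(abs(reward[s, a] - reward[t, a]) + gamma * d[nxt[s, a], nxt[t, a]]
                     for a in actions)
         for s, t in d}

print(round(d["A", "B"], 3), round(d["A", "C"], 3))  # 0.0 vs. a strictly positive distance
```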
Time-series studies are critical for uncovering dynamic biological processes, but achieving comprehensive profiling and resolution across multiple time points and modalities (multi-omics) remains challenging due to cost and scalability constraints. Current methods for studying temporal dynamics, whether at the bulk or single-cell level, often require extensive sampling, making it impractical to deeply profile all time points and modalities. To overcome these limitations, we present DTPSP, a deep learning framework designed to identify the most informative time points in any time-series study, enabling resource-efficient and targeted analyses. DTPSP models temporal gene expression patterns using readily obtainable data, such as bulk RNA-seq, to select time points that capture key system dynamics. It also integrates a deep generative module to infer data for non-sampled time points based on the selected time points, reconstructing the full temporal trajectory. This dual capability enables DTPSP to prioritize key time points for in-depth profiling, such as single-cell sequencing or multi-omics analyses, while filling gaps in the temporal landscape with high fidelity. We apply DTPSP to developmental and disease-associated time courses, demonstrating its ability to optimize experimental designs across bulk and single-cell studies. By reducing costs, enabling strategic multi-omics profiling, and enhancing biological insights, DTPSP provides a scalable and generalized solution for investigating dynamic systems.
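The underlying objective of time-point selection can be illustrated with a much simpler stand-in: greedily choose the subset of time points whose interpolation best reconstructs the full bulk expression trajectory. DTPSP itself uses a deep model plus a generative imputation module; the sketch below only conveys the selection idea, and all variable names and the synthetic data are assumptions.

```python
# Simplified stand-in for informative time-point selection on a synthetic trajectory.
import numpy as np

times = np.linspace(0, 10, 11)                                   # 11 candidate time points
expr = np.stack([np.sin(times * f) for f in (0.3, 0.7, 1.1)])    # 3 genes x 11 time points

def reconstruction_error(selected):
    sel = sorted(selected)
    interp = np.stack([np.interp(times, times[sel], g[sel]) for g in expr])
    return float(np.mean((interp - expr) ** 2))

budget = 5
selected = [0, len(times) - 1]                                    # always keep the endpoints
while len(selected) < budget:
    remaining = [i for i in range(len(times)) if i not in selected]
    best = min(remaining, key=lambda i: reconstruction_error(selected + [i]))
    selected.append(best)

print(sorted(selected), reconstruction_error(selected))
```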
In single-cell sequencing analysis, several computational methods have been developed to map the cellular state space, but little has been done to map or create embeddings of the gene space. Here, we formulate the gene embedding problem, design tasks with simulated single-cell data to evaluate representations, and establish ten relevant baselines. We then present a graph signal processing approach we call gene signal pattern analysis (GSPA) that learns rich gene representations from single-cell data using a dictionary of diffusion wavelets on the cell-cell graph. GSPA enables characterization of genes based on their patterning on the cellular manifold. It also captures how localized or diffuse the expression of a gene is, for which we present a score called the gene localization score. We motivate and demonstrate the efficacy of GSPA as a framework for a range of biological tasks, such as capturing gene coexpression modules, condition-specific enrichment, and perturbation-specific gene-gene interactions. Then, we showcase the broad utility of gene representations derived from GSPA, including for cell-cell communication (GSPA-LR), spatial transcriptomics (GSPA-multimodal), and patient response (GSPA-Pt) analysis.
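As a rough intuition for "diffusion wavelets on the cell-cell graph": build a diffusion operator P from cell-cell affinities, form wavelets as differences of dyadic powers of P, and characterize each gene by how its expression pattern responds to each wavelet. The bare-bones sketch below is a simplified reading of that general idea, not GSPA's implementation; the synthetic data and names are assumptions.

```python
# Bare-bones sketch: gene embeddings from diffusion wavelets on a cell-cell graph.
import numpy as np

rng = np.random.default_rng(1)
X = rng.poisson(1.0, size=(200, 50)).astype(float)        # cells x genes expression matrix

# Cell-cell affinity (Gaussian kernel) and row-normalized diffusion operator P.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
A = np.exp(-d2 / d2.mean())
P = A / A.sum(axis=1, keepdims=True)

# Diffusion wavelets at dyadic scales: Psi_j = P^(2^(j-1)) - P^(2^j).
scales, powers, Pk = [1, 2, 4, 8], {}, np.eye(len(X))
for k in range(1, max(scales) + 1):
    Pk = Pk @ P
    powers[k] = Pk
wavelets = [np.eye(len(X)) - powers[1]] + [powers[s // 2] - powers[s] for s in scales[1:]]

# A gene's embedding: its expression signal filtered by each wavelet, concatenated.
gene_embeddings = np.concatenate([W @ X for W in wavelets], axis=0).T  # genes x (scales*cells)
print(gene_embeddings.shape)  # (50, 800)
```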
One of the most fundamental laws of physics is the principle of least action. Motivated by its predictive power, we introduce a neuronal least-action principle for cortical processing of sensory streams to produce appropriate behavioural outputs in real time. The principle postulates that the voltage dynamics of cortical pyramidal neurons prospectively minimize the local somato-dendritic mismatch error within individual neurons. For motor output neurons, it implies minimizing an instantaneous behavioural error. For deep network neurons, it implies a prospective firing to overcome integration delays and correct for possible output errors right in time. The neuron-specific errors are extracted in the apical dendrites of pyramidal neurons through a cortical microcircuit that tries to explain away the feedback from the periphery, and correct the trajectory on the fly. Any motor output is in a moving equilibrium with the sensory inputs and the motor feedback during the whole sensory-motor trajectory. Ongoing synaptic plasticity reduces the somato-dendritic mismatch error within each cortical neuron and performs gradient descent on the output cost at any moment in time. The neuronal least-action principle offers an axiomatic framework to derive local neuronal and synaptic dynamics for global real-time computation and learning in the brain and in physical substrates in general.
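The two quantities at play, a local somato-dendritic mismatch and an output (behavioural) error, can be caricatured in a few lines: the voltage relaxes under both signals while the synapses reduce the local mismatch. The sketch below is a heavily simplified cartoon of that interplay, not the paper's prospective dynamics; every symbol and rate is an assumption.

```python
# Cartoon of least-action-style quantities for a single output neuron (illustrative only).
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(0, 0.1, size=(1, 5))      # dendritic weights of one output neuron
r = rng.random(5)                         # presynaptic rates (fixed input pattern)
u_target = np.array([0.8])                # desired output voltage (behavioural target)

u = np.zeros(1)                           # somatic voltage
eta_u, eta_w = 0.2, 0.05                  # relaxation and learning rates
for step in range(200):
    v_dend = W @ r                        # dendritic prediction of the soma
    e_out = u_target - u                  # behavioural (output) error nudging the soma
    e_local = u - v_dend                  # somato-dendritic mismatch error
    u = u + eta_u * (-e_local + e_out)    # voltage dynamics settle between the two signals
    W = W + eta_w * np.outer(e_local, r)  # plasticity reduces the local mismatch

print(u.item(), (W @ r).item())           # both approach the 0.8 target
```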
Combining multiple machine learning models has long been a technique for enhancing performance, particularly in distributed settings. Traditional approaches, such as model ensembles, work well, but are expensive in terms of memory and compute. Recently, methods based on averaging model parameters have achieved good results in some settings and have gained popularity. However, merging models initialized differently that do not share a part of their training trajectories can yield worse results than simply using the base models, even after aligning their neurons. In this paper, we introduce a novel approach, Non-uniform Parameter-wise Model Merging, or NP Merge, which merges models by learning the contribution of each parameter to the final model using gradient-based optimization. We empirically demonstrate the effectiveness of our method for merging models of various architectures in multiple settings, outperforming past methods. We also extend NP Merge to handle the merging of multiple models, showcasing its scalability and robustness.
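The general recipe, each parameter gets its own interpolation coefficient, learned by gradient descent on a task loss, admits a compact sketch. The code below follows that recipe in PyTorch but is not the paper's implementation; the two-model restriction, the sigmoid parameterization, and all names are assumptions.

```python
# Sketch of non-uniform, parameter-wise merging of two models with learned coefficients.
import copy
import torch

def np_merge(model_a, model_b, data_loader, loss_fn, steps=100, lr=0.1):
    """Return a model with parameters sigmoid(alpha)*theta_A + (1-sigmoid(alpha))*theta_B,
    where each alpha has the same shape as its parameter and is fit by gradient descent."""
    merged = copy.deepcopy(model_a)
    pa = {n: p.detach() for n, p in model_a.named_parameters()}
    pb = {n: p.detach() for n, p in model_b.named_parameters()}
    alphas = {n: torch.zeros_like(p, requires_grad=True) for n, p in pa.items()}
    opt = torch.optim.Adam(alphas.values(), lr=lr)

    for _, (x, y) in zip(range(steps), data_loader):
        # Build merged parameters functionally so gradients flow into the alphas.
        params = {n: torch.sigmoid(a) * pa[n] + (1 - torch.sigmoid(a)) * pb[n]
                  for n, a in alphas.items()}
        loss = loss_fn(torch.func.functional_call(merged, params, (x,)), y)
        opt.zero_grad(); loss.backward(); opt.step()

    # Write the final interpolated parameters into the merged model.
    with torch.no_grad():
        for n, p in merged.named_parameters():
            a = torch.sigmoid(alphas[n])
            p.copy_(a * pa[n] + (1 - a) * pb[n])
    return merged
```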