Mila’s AI for Climate Studio aims to bridge the gap between technology and impact to unlock the potential of AI in tackling the climate crisis rapidly and on a massive scale.
The program recently published its first policy brief, titled "Policy Considerations at the Intersection of Quantum Technologies and Artificial Intelligence," authored by Padmapriya Mohan.
Hugo Larochelle appointed Scientific Director of Mila
An adjunct professor at the Université de Montréal and former head of Google's AI lab in Montréal, Hugo Larochelle is a pioneer in deep learning and one of Canada’s most respected researchers.
We use cookies to analyze the browsing and usage of our website and to personalize your experience. You can disable these technologies at any time, but this may limit certain functionalities of the site. Read our Privacy Policy for more information.
Setting cookies
You can enable and disable the types of cookies you wish to accept. However certain choices you make could affect the services offered on our sites (e.g. suggestions, personalised ads, etc.).
Essential cookies
These cookies are necessary for the operation of the site and cannot be deactivated. (Still active)
Analytics cookies
Do you accept the use of cookies to measure the audience of our sites?
Multimedia Player
Do you accept the use of cookies to display and allow you to watch the video content hosted by our partners (YouTube, etc.)?
Conditional image generative models hold considerable promise to produce infinite amounts of synthetic training data. Yet, recent progress i… (see more)n generation quality has come at the expense of generation diversity, limiting the utility of these models as a source of synthetic training data. Although guidance-based approaches have been introduced to improve the utility of generated data by focusing on quality or diversity, the (implicit or explicit) utility functions oftentimes disregard the potential distribution shift between synthetic and real data. In this work, we introduce Chamfer Guidance: a training-free guidance approach which leverages a handful of real exemplar images to characterize the quality and diversity of synthetic data. We show that by leveraging the proposed Chamfer Guidance, we can boost the diversity of the generations w.r.t. a dataset of real images while maintaining or improving the generation quality on ImageNet-1k and standard geo-diversity benchmarks. Our approach achieves state-of-the-art few-shot performance with as little as 2 exemplar real images, obtaining 96.4\% in terms of precision, and 86.4\% in terms of distributional coverage, which increase to 97.5\% and 92.7\%, respectively, when using 32 real images. We showcase the benefits of the Chamfer Guidance generation by training downstream image classifiers on synthetic data, achieving accuracy boost of up to 15\% for in-distribution over the baselines, and up to 16\% in out-of-distribution. Furthermore, our approach does not require using the unconditional model, and thus obtains a 31\% reduction in FLOPs w.r.t. classifier-free-guidance-based approaches at sampling time.
Precise radial velocity (pRV) measurements of M-dwarfs in the near-infrared (NIR) rely on empirical templates due to the lack of accurate st… (see more)ellar spectral models in this regime. Templates are assumed to approximate the true spectrum when constructed from many observations or in the high signal-to-noise limit. We develop a numerical simulation that generates SPIRou-like pRV observations from PHOENIX spectra, constructs empirical templates, and estimates radial velocities. This simulation solely considers photon noise and evaluates when empirical templates remain reliable for pRV analysis. Our results reveal a previously unrecognized noise source in templates, establishing a fundamental floor for template-based pRV measurements. We find that templates inherently include distortions in stellar line shapes due to imperfect interpolation at the detector's sampling resolution. The magnitude of this interpolation error depends on sampling resolution and RV content. Consequently, while stars with a higher RV content, such as cooler M-dwarfs are expected to yield lower RV uncertainties, their dense spectral features can amplify interpolation errors, potentially biasing RV estimates. For a typical M4V star, SPIRou's spectral and sampling resolution imposes an RV uncertainty floor of 0.5-0.8 m/s, independent of the star's magnitude or the telescope's aperture. These findings reveal a limitation of template-based pRV methods, underscoring the need for improved spectral modeling and better-than-Nyquist detector sampling to reach the next level of RV precision.
Forecasting in real-world settings requires models to integrate not only historical data but also relevant contextual information, often ava… (see more)ilable in textual form. While recent work has shown that large language models (LLMs) can be effective context-aided forecasters via na\"ive direct prompting, their full potential remains underexplored. We address this gap with 4 strategies, providing new insights into the zero-shot capabilities of LLMs in this setting. ReDP improves interpretability by eliciting explicit reasoning traces, allowing us to assess the model's reasoning over the context independently from its forecast accuracy. CorDP leverages LLMs solely to refine existing forecasts with context, enhancing their applicability in real-world forecasting pipelines. IC-DP proposes embedding historical examples of context-aided forecasting tasks in the prompt, substantially improving accuracy even for the largest models. Finally, RouteDP optimizes resource efficiency by using LLMs to estimate task difficulty, and routing the most challenging tasks to larger models. Evaluated on different kinds of context-aided forecasting tasks from the CiK benchmark, our strategies demonstrate distinct benefits over na\"ive prompting across LLMs of different sizes and families. These results open the door to further simple yet effective improvements in LLM-based context-aided forecasting.
Deep learning models operating in the image domain are vulnerable to small input perturbations. For years, robustness to such perturbations … (see more)was pursued by training models from scratch (i.e., with random initializations) using specialized loss objectives. Recently, robust fine-tuning has emerged as a more efficient alternative: instead of training from scratch, pretrained models are adapted to maximize predictive performance and robustness. To conduct robust fine-tuning, practitioners design an optimization strategy that includes the model update protocol (e.g., full or partial) and the specialized loss objective. Additional design choices include the architecture type and size, and the pretrained representation. These design choices affect robust generalization, which is the model's ability to maintain performance when exposed to new and unseen perturbations at test time. Understanding how these design choices influence generalization remains an open question with significant practical implications. In response, we present an empirical study spanning 6 datasets, 40 pretrained architectures, 2 specialized losses, and 3 adaptation protocols, yielding 1,440 training configurations and 7,200 robustness measurements across five perturbation types. To our knowledge, this is the most diverse and comprehensive benchmark of robust fine-tuning to date. While attention-based architectures and robust pretrained representations are increasingly popular, we find that convolutional neural networks pretrained in a supervised manner on large datasets often perform best. Our analysis both confirms and challenges prior design assumptions, highlighting promising research directions and offering practical guidance.