Mila’s mission is to be a global pole for scientific advances that inspires innovation and the progress of AI for the benefit of all. As part of this mission, Mila recognizes the significant potential of AI and the importance of making research more open, interdisciplinary and accessible.
Explore a selection of notable open source software efforts led by or co-developed with Mila researchers over the years.
Academic Torrents is a scalable platform that uses BitTorrent to distribute the cost of hosting data across its users, so that datasets do not disappear with the rise and fall of the providers that host them.
The Arcade Learning Environment (ALE) is a reinforcement learning benchmark and framework that lets researchers develop AI agents for Atari 2600 games. It continues to be maintained by Mila researchers.
AxonDeepSeg is a segmentation framework for microscopy data of nerve fibers, based on convolutional neural networks.
BabyAI is a testbed for training agents to understand and execute language commands.
Chester is a free and accessible prototype system that can be used by medical professionals to understand the reality of deep learning tools for chest X-ray diagnostics.
Distributed Evolutionary Algorithms in Python (DEAP) is an evolutionary computation framework for rapid prototyping and testing of ideas. It incorporates the data structures and tools required to implement most common evolutionary computation techniques, such as genetic algorithms, genetic programming, evolution strategies, particle swarm optimization, differential evolution, and estimation of distribution algorithms. It has been developed at Université Laval since 2009.
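The selection/crossover/mutation cycle that frameworks like DEAP package up can be sketched in a few lines of plain Python. This is a toy OneMax genetic algorithm (maximize the number of 1-bits), not DEAP's API; all names here are illustrative.

```python
import random

random.seed(0)

def fitness(ind):
    # OneMax: count the 1-bits in the individual
    return sum(ind)

def tournament(pop, k=3):
    # Tournament selection: best of k randomly drawn individuals
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    # One-point crossover between two parents
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(ind, rate=0.05):
    # Independent bit-flip mutation
    return [1 - g if random.random() < rate else g for g in ind]

def evolve(n_ind=30, n_genes=20, n_gen=40):
    pop = [[random.randint(0, 1) for _ in range(n_genes)]
           for _ in range(n_ind)]
    for _ in range(n_gen):
        pop = [mutate(crossover(tournament(pop), tournament(pop)))
               for _ in range(n_ind)]
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))
```

DEAP itself lets you register operators like these on a toolbox and swap them freely, which is what makes it suited to rapid prototyping.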
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms, co-developed by Professor Marc G. Bellemare at Google.
Code and data for "Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding".
With rich and powerful built-in molecular featurizers, Graphium is an open-source library designed for graph representation learning on real-world chemistry tasks. Graphium provides access to state-of-the-art GNN architectures via an extensible API, enabling researchers to build and train their own large-scale GNNs with ease.
Hrepr outputs HTML/pretty representations for Python objects.
A Hyperscanning Python Pipeline for inter-brain connectivity analysis.
Ivadomed is an integrated framework for medical image analysis with deep learning, based on PyTorch. The name is a portmanteau of IVADO (the Institute for Data Valorization) and "medical".
Jurigged lets you update your code while it runs.
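Live code updating can be illustrated with the standard library alone. The sketch below reloads a module after its source changes on disk via `importlib.reload`; this is not Jurigged's mechanism (Jurigged patches functions in place while the program runs), and the module name `live` is hypothetical.

```python
import importlib
import pathlib
import sys
import tempfile

# Create a throwaway module on disk
workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
mod_path = pathlib.Path(workdir) / "live.py"
mod_path.write_text("def greet():\n    return 'hello'\n")

import live  # hypothetical module created above
print(live.greet())   # 'hello'

# Edit the source while the program is running, then reload it
mod_path.write_text("def greet():\n    return 'bonjour'\n")
importlib.reload(live)
print(live.greet())   # 'bonjour'
```

The limitation of plain `reload` — existing references to the old function objects are not updated — is precisely the gap tools like Jurigged address.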
The MEDomics UdeS research laboratory, led by Professor Martin Vallières at the Université de Sherbrooke, has been focused on the creation of predictive models in health informatics since its founding in 2020.
MAPL: Parameter-Efficient Adaptation of Unimodal Pre-Trained Models for Vision-Language Few-Shot Prompting. Large pre-trained models have proved to be remarkable zero- and (prompt-based) few-shot learners in unimodal vision and language tasks. We propose MAPL, a simple and parameter-efficient method that reuses frozen pre-trained unimodal models and leverages their strong generalization capabilities in multimodal vision-language (VL) settings. MAPL learns a lightweight mapping between the representation spaces of unimodal models using aligned image-text data, and can generalize to unseen VL tasks from just a few in-context examples. The small number of trainable parameters makes MAPL effective at low-data and in-domain learning. Moreover, MAPL's modularity enables easy extension to other pre-trained models. Extensive experiments on several visual question answering and image captioning benchmarks show that MAPL achieves superior or competitive performance compared to similar methods while training orders of magnitude fewer parameters. MAPL can be trained in just a few hours using modest computational resources and public datasets.
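The core idea — training only a small mapping between the representation spaces of two frozen models from aligned pairs — can be shown with a toy example. Everything below is synthetic and hypothetical (MAPL's actual mapper is a small neural network between real image and text encoders); here a linear map is fit by SGD on fake "aligned embeddings".

```python
import random

random.seed(0)

DIM_IMG, DIM_TXT = 4, 3

def apply(W, x):
    # Matrix-vector product over plain lists
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Hidden "ground truth" relation, used only to generate toy aligned pairs
TRUE_W = [[random.uniform(-1, 1) for _ in range(DIM_IMG)]
          for _ in range(DIM_TXT)]
pairs = []
for _ in range(200):
    x = [random.uniform(-1, 1) for _ in range(DIM_IMG)]  # "image" embedding
    pairs.append((x, apply(TRUE_W, x)))                  # aligned "text" embedding

# Train only the mapping W; both "encoders" (the data) stay frozen
W = [[0.0] * DIM_IMG for _ in range(DIM_TXT)]
lr = 0.1
for _ in range(300):
    for x, y in pairs:
        pred = apply(W, x)
        for i in range(DIM_TXT):
            err = pred[i] - y[i]
            for j in range(DIM_IMG):
                W[i][j] -= lr * err * x[j]  # SGD on squared error

loss = sum((p - yi) ** 2
           for x, y in pairs
           for p, yi in zip(apply(W, x), y)) / len(pairs)
print(round(loss, 6))
```

Because only the mapping's parameters are trained, the parameter count stays tiny relative to the frozen backbones — the source of MAPL's efficiency.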
MilaBench is a repository of training benchmarks.
MiniGrid, a minimalistic gridworld environment for Gym, is maintained by the Farama Foundation.
Myia (a follow-up to Theano) is a differentiable programming language capable of supporting large-scale, high-performance computations (e.g. linear algebra) and their gradients.
Myriad is a real-world testbed that aims to bridge the gap between trajectory optimization and deep learning. It offers many real-world-relevant continuous-time dynamical system environments, along with several trajectory optimization algorithms. All are written in JAX and can therefore be easily integrated into a deep learning workflow. The environments and tools in Myriad can be used for trajectory optimization, system identification, imitation learning, and reinforcement learning.
A collaboration between Mila and IBM, Oríon is a black-box function optimization library with a key focus on usability and ease of integration.
Paperoni allows users to search for scientific papers from the command line.
Ptera allows you to instrument code from the outside by specifying a set of variables to watch in an arbitrary Python call graph and manipulate a stream of their values.
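The underlying idea — watching the values a named variable takes anywhere in a call graph, without editing the code being observed — can be sketched with the standard library's `sys.settrace`. This is not Ptera's API; all names below are illustrative.

```python
import sys

def watch_variable(name, func, *args):
    """Run func, recording the successive values that the local
    variable `name` takes in any frame of the call graph."""
    seen = []

    def tracer(frame, event, arg):
        if event == "line" and name in frame.f_locals:
            value = frame.f_locals[name]
            if not seen or seen[-1] != value:  # record only changes
                seen.append(value)
        return tracer  # keep tracing inside every new frame

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return seen

# Code under observation: unmodified, instrumented "from the outside"
def inner(n):
    total = 0
    for i in range(n):
        total += i
    return total

def outer(n):
    return inner(n) + inner(n)

print(watch_variable("total", outer, 3))
```

Ptera goes further by selecting variables with a query language and exposing the recorded values as a manipulable stream, but the instrument-without-editing principle is the same.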
PyNM is a lightweight Python implementation of normative modeling, making the method approachable.
qMRLab is a MATLAB/Octave open-source software for quantitative MR image analysis. The main goal of the qMRLab project is to provide the community with software that makes data fitting, simulation and protocol optimization as easy as possible for a myriad of different quantitative models.
Sequoia is a software framework that unifies continual learning research: a playground for work at the intersection of continual, reinforcement, and self-supervised learning.
Shimming-Toolbox is an open-source Python software package enabling a variety of MRI shimming (magnetic field homogenization) techniques, such as static and real-time shimming, for use with standard manufacturer-supplied gradient/shim coils or with custom "multi-coil" arrays. The toolbox provides a useful set of command-line tools, as well as an FSLeyes plugin dedicated to making shimming more accessible and more reproducible.
SpeechBrain is an open-source, general-purpose PyTorch speech processing toolkit designed to make the research and development of neural speech processing technologies easier by being simple, flexible, user-friendly, and well-rounded.
The Spinal Cord Toolbox (SCT) is a comprehensive, free and open-source set of command-line tools dedicated to the processing and analysis of spinal cord MRI data.
Patient stratification with Graph-regularized Non-negative Matrix Factorization (GNMF) in Python.
Mila PhD student Scott Fujimoto, co-supervised by Doina Precup and David Meger, maintains the open-source code for TD3 (Twin Delayed Deep Deterministic Policy Gradient), one of the best-performing deep reinforcement learning methods.
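TD3's two signature tricks — target policy smoothing and the clipped double-Q minimum — show up in how the learning target is built. The schematic below illustrates that target computation in plain Python; it is not Fujimoto's reference implementation, and the toy stand-in networks are hypothetical.

```python
import random

random.seed(0)

def td3_target(reward, done, next_state, q1_target, q2_target,
               actor_target, gamma=0.99, noise_std=0.2,
               noise_clip=0.5, max_action=1.0):
    """Schematic TD3 learning target for a 1-D state/action toy case."""
    # Target policy smoothing: perturb the target action with clipped noise
    noise = max(-noise_clip, min(noise_clip, random.gauss(0, noise_std)))
    action = actor_target(next_state) + noise
    action = max(-max_action, min(max_action, action))
    # Clipped double Q-learning: take the minimum of the two target critics
    q = min(q1_target(next_state, action), q2_target(next_state, action))
    return reward + (0.0 if done else gamma * q)

# Toy stand-in networks (hypothetical, scalar state and action):
def actor(s):
    return 0.5 * s

def q1(s, a):
    return s + a

def q2(s, a):
    return s + a + 0.1   # deliberately over-estimates; min() discards the bias

y = td3_target(reward=1.0, done=False, next_state=0.4,
               q1_target=q1, q2_target=q2, actor_target=actor)
print(y)
```

Taking the minimum of the two critics is what counters the value over-estimation that plagues its predecessor, DDPG.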
The Temporal Graph Benchmark (TGB) is a collection of challenging and diverse benchmark datasets for realistic, reproducible, and robust evaluation of machine learning models on temporal graphs. TGB datasets are large in scale, span years in duration, incorporate both node- and edge-level prediction tasks, and cover a diverse set of domains including social, trade, transaction, and transportation networks. For both tasks, evaluation protocols are designed around realistic use cases. TGB provides an automated machine learning pipeline for reproducible and accessible temporal graph research, including data loading, experiment setup, and performance evaluation. TGB is maintained and updated on a regular basis and welcomes community feedback.
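TGB's link-level tasks are scored with ranking metrics such as mean reciprocal rank (MRR): each positive edge is ranked against a set of negative samples. The function below is a standalone sketch of the metric itself, not TGB's own evaluator.

```python
def mean_reciprocal_rank(pos_scores, neg_scores):
    """MRR for link prediction: for each positive edge, rank its score
    against the scores of its negative samples (rank 1 = best).
    Ties count against the positive (pessimistic handling)."""
    total = 0.0
    for pos, negs in zip(pos_scores, neg_scores):
        rank = 1 + sum(1 for n in negs if n >= pos)
        total += 1.0 / rank
    return total / len(pos_scores)

# Two positive edges, each ranked against three negatives (toy scores)
pos = [0.9, 0.4]
negs = [[0.1, 0.3, 0.2],   # positive outranks all negatives -> rank 1
        [0.8, 0.5, 0.3]]   # two negatives score higher      -> rank 3
print(mean_reciprocal_rank(pos, negs))  # (1/1 + 1/3) / 2 = 2/3
```

Ranking against sampled negatives, rather than classifying edges in isolation, is what makes the evaluation reflect realistic retrieval-style use cases.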
Theano, one of the earliest programming frameworks for deep learning, originated at Mila and Université de Montréal. Theano is a Python library and optimizing compiler for manipulating and evaluating mathematical expressions. Active development of Theano ended in 2017.
TorchDrug is an open-source machine learning platform for drug discovery, covering techniques ranging from graph machine learning and deep generative models to reinforcement learning.
Available as part of TorchDrug, TorchProtein is an ML library for protein science, providing representation learning models for both protein sequences and structures, as well as fundamental protein tasks like function prediction and structure prediction.