Home

Inspiring the development of artificial intelligence for the benefit of all

Located in the heart of Quebec’s AI ecosystem, Mila is a community of more than 1,200 researchers specializing in machine learning and dedicated to scientific excellence and innovation.

About

Featured

Mila AI for Climate Studio

Leveraging AI for a Sustainable Future

Mila’s AI for Climate Studio aims to bridge the gap between technology and impact to unlock the potential of AI in tackling the climate crisis rapidly and on a massive scale.

Get involved

Learning

AI Advantage

Learn how to leverage generative AI to support and improve your productivity at work. The next cohort will take place online on August 26 and 28, 2025.

AI Governance

Mila AI Policy Fellowship

The program recently published its first policy brief, titled "Policy Considerations at the Intersection of Quantum Technologies and Artificial Intelligence," authored by Padmapriya Mohan.

Read the brief

News

20 Jun 2025

Mila’s Science Communication Contest: AI Research in 3 Minutes

Read the story

20 Jun 2025

How Indigenous Pathfinders in AI are Contributing to Transforming the Future of AI

Read the story

19 Jun 2025

Mila and CertX partner to promote AI safety and cybersecurity

Read the story

See more news

Faculty

Founded in 1993 by Professor Yoshua Bengio, Mila today brings together over 140 professors affiliated with Université de Montréal, McGill University, Polytechnique Montréal and HEC Montréal. Mila also welcomes professors from Université Laval, Université de Sherbrooke, École de technologie supérieure (ÉTS) and Concordia University.

Browse the online directory

Latest Publications

Adaptation, Comparison and Practical Implementation of Fairness Schemes in Kidney Exchange Programs

William St-Arnaud

Margarida Carvalho

Golnoosh Farnadi

In Kidney Exchange Programs (KEPs), each participating patient is registered together with an incompatible donor. Donors without an incompatible patient can also register. Then, KEPs typically maximize overall patient benefit through donor exchanges. This aggregation of benefits calls into question potential individual patient disparities in terms of access to transplantation in KEPs. Considering solely this utilitarian objective may become an issue in the case where multiple exchange plans are optimal or near-optimal. In fact, current KEP policies are all-or-nothing, meaning that only one exchange plan is determined. Each patient is either selected or not as part of that unique solution. In this work, we seek instead to find a policy that contemplates the probability of patients of being in a solution. To guide the determination of our policy, we adapt popular fairness schemes to KEPs to balance the usual approach of maximizing the utilitarian objective. Different combinations of fairness and utilitarian objectives are modelled as conic programs with an exponential number of variables. We propose a column generation approach to solve them effectively in practice. Finally, we make an extensive comparison of the different schemes in terms of the balance of utility and fairness score, and validate the scalability of our methodology for benchmark instances from the literature.

2025-08-01

European Journal of Operational Research (published)

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

Roger Creus Castanyer

Johan Samir Obando Ceron

Lu Li

Scaling deep reinforcement learning networks is challenging and often results in degraded performance, yet the root causes of this failure mode remain poorly understood. Several recent works have proposed mechanisms to address this, but they are often complex and fail to highlight the causes underlying this difficulty. In this work, we conduct a series of empirical analyses which suggest that the combination of non-stationarity with gradient pathologies, due to suboptimal architectural choices, underlie the challenges of scale. We propose a series of direct interventions that stabilize gradient flow, enabling robust performance across a range of network depths and widths. Our interventions are simple to implement and compatible with well-established algorithms, and result in an effective mechanism that enables strong performance even at large scales. We validate our findings on a variety of agents and suites of environments.

2025-06-18

ArXiv (preprint)

Can GPT4 Generate Effective Feedback on Code Readability?

Xiaotian Su

Yajie Song

Marcus Messer

Jaromir Savelka

Maria Cutumisu

April Wang

2025-06-17

Proceedings of the 30th ACM Conference on Innovation and Technology in Computer Science Education V. 2 (published)

Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning

Martin Klissarov

Akhil Bagaria

Ziyan Luo

George Konidaris

Doina Precup

Marlos C. Machado

Developing agents capable of exploring, planning and learning in complex open-ended environments is a grand challenge in artificial intelligence (AI). Hierarchical reinforcement learning (HRL) offers a promising solution to this challenge by discovering and exploiting the temporal structure within a stream of experience. The strong appeal of the HRL framework has led to a rich and diverse body of literature attempting to discover a useful structure. However, it is still not clear how one might define what constitutes good structure in the first place, or the kind of problems in which identifying it may be helpful. This work aims to identify the benefits of HRL from the perspective of the fundamental challenges in decision-making, as well as highlight its impact on the performance trade-offs of AI agents. Through these benefits, we then cover the families of methods that discover temporal structure in HRL, ranging from learning directly from online experience to offline datasets, to leveraging large language models (LLMs). Finally, we highlight the challenges of temporal structure discovery and the domains that are particularly well-suited for such endeavours.

2025-06-16

ArXiv (preprint)

See more publications

AI for Humanity

Socially responsible and beneficial development of AI is a fundamental component of Mila’s mission. As a leader in the field, we wish to contribute to social dialogue and the development of applications that will benefit society.

Learn more
