
Inspiring the development of artificial intelligence for the benefit of all 


Located in the heart of Quebec’s AI ecosystem, Mila is a community of more than 1,200 researchers specializing in machine learning and dedicated to scientific excellence and innovation.

About

Featured

Faculty 

Founded in 1993 by Professor Yoshua Bengio, Mila today brings together over 140 professors affiliated with Université de Montréal, McGill University, Polytechnique Montréal and HEC Montréal. Mila also welcomes professors from Université Laval, Université de Sherbrooke, École de technologie supérieure (ÉTS) and Concordia University. 



Latest Publications

Adaptation, Comparison and Practical Implementation of Fairness Schemes in Kidney Exchange Programs
In Kidney Exchange Programs (KEPs), each participating patient is registered together with an incompatible donor. Donors without an incompatible patient can also register. Then, KEPs typically maximize overall patient benefit through donor exchanges. This aggregation of benefits calls into question potential individual patient disparities in terms of access to transplantation in KEPs. Considering solely this utilitarian objective may become an issue in the case where multiple exchange plans are optimal or near-optimal. In fact, current KEP policies are all-or-nothing, meaning that only one exchange plan is determined. Each patient is either selected or not as part of that unique solution. In this work, we seek instead to find a policy that contemplates the probability of patients of being in a solution. To guide the determination of our policy, we adapt popular fairness schemes to KEPs to balance the usual approach of maximizing the utilitarian objective. Different combinations of fairness and utilitarian objectives are modelled as conic programs with an exponential number of variables. We propose a column generation approach to solve them effectively in practice. Finally, we make an extensive comparison of the different schemes in terms of the balance of utility and fairness score, and validate the scalability of our methodology for benchmark instances from the literature.
Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
Roger Creus Castanyer
Johan Samir Obando Ceron
Lu Li
Scaling deep reinforcement learning networks is challenging and often results in degraded performance, yet the root causes of this failure mode remain poorly understood. Several recent works have proposed mechanisms to address this, but they are often complex and fail to highlight the causes underlying this difficulty. In this work, we conduct a series of empirical analyses which suggest that the combination of non-stationarity with gradient pathologies, due to suboptimal architectural choices, underlie the challenges of scale. We propose a series of direct interventions that stabilize gradient flow, enabling robust performance across a range of network depths and widths. Our interventions are simple to implement and compatible with well-established algorithms, and result in an effective mechanism that enables strong performance even at large scales. We validate our findings on a variety of agents and suites of environments.
Can GPT4 Generate Effective Feedback on Code Readability?
Xiaotian Su
Yajie Song
Marcus Messer
Jaromir Savelka
April Wang
Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning
Martin Klissarov
Akhil Bagaria
Ziyan Luo
George Konidaris
Marlos C. Machado
Developing agents capable of exploring, planning and learning in complex open-ended environments is a grand challenge in artificial intelligence (AI). Hierarchical reinforcement learning (HRL) offers a promising solution to this challenge by discovering and exploiting the temporal structure within a stream of experience. The strong appeal of the HRL framework has led to a rich and diverse body of literature attempting to discover a useful structure. However, it is still not clear how one might define what constitutes good structure in the first place, or the kind of problems in which identifying it may be helpful. This work aims to identify the benefits of HRL from the perspective of the fundamental challenges in decision-making, as well as highlight its impact on the performance trade-offs of AI agents. Through these benefits, we then cover the families of methods that discover temporal structure in HRL, ranging from learning directly from online experience to offline datasets, to leveraging large language models (LLMs). Finally, we highlight the challenges of temporal structure discovery and the domains that are particularly well-suited for such endeavours.

AI for Humanity

Socially responsible and beneficial development of AI is a fundamental component of Mila’s mission. As a leader in the field, we wish to contribute to social dialogue and the development of applications that will benefit society.

