Publications

Fairness in Reinforcement Learning with Bisimulation Metrics

Ensuring long-term fairness is crucial when developing automated decision making systems, specifically in dynamic and sequential environments. By maximizing their reward without consideration of fairness, AI agents can introduce disparities in their treatment of groups or individuals. In this paper, we establish the connection between bisimulation metrics and group fairness in reinforcement learning. We propose a novel approach that leverages bisimulation metrics to learn reward functions and observation dynamics, ensuring that learners treat groups fairly while reflecting the original problem. We demonstrate the effectiveness of our method in addressing disparities in sequential decision making problems through empirical evaluation on a standard fairness benchmark consisting of lending and college admission scenarios.

2024-12-21

ArXiv (preprint)

doi.org

arxiv.org

DTPSP: A Deep Learning Framework for Optimized Time Point Selection in Time-Series Single-Cell Studies

Michel Hijazin

Pumeng Shi

Jingtao Wang

Jun Ding

Time-series studies are critical for uncovering dynamic biological processes, but achieving comprehensive profiling and resolution across multiple time points and modalities (multi-omics) remains challenging due to cost and scalability constraints. Current methods for studying temporal dynamics, whether at the bulk or single-cell level, often require extensive sampling, making it impractical to deeply profile all time points and modalities. To overcome these limitations, we present DTPSP, a deep learning framework designed to identify the most informative time points in any time-series study, enabling resource-efficient and targeted analyses. DTPSP models temporal gene expression patterns using readily obtainable data, such as bulk RNA-seq, to select time points that capture key system dynamics. It also integrates a deep generative module to infer data for non-sampled time points based on the selected time points, reconstructing the full temporal trajectory. This dual capability enables DTPSP to prioritize key time points for in-depth profiling, such as single-cell sequencing or multi-omics analyses, while filling gaps in the temporal landscape with high fidelity. We apply DTPSP to developmental and disease-associated time courses, demonstrating its ability to optimize experimental designs across bulk and single-cell studies. By reducing costs, enabling strategic multi-omics profiling, and enhancing biological insights, DTPSP provides a scalable and generalized solution for investigating dynamic systems.

2024-12-19

bioRxiv (preprint)

doi.org

Mapping the gene space at single-cell resolution with gene signal pattern analysis

Aarthi Venkat

Martina Damo

Samuel Leone

Scott E. Youlten

Nikhil S. Joshi

Eric Fagerberg

Smita Krishnaswamy

John Attanasio

Michael Perlmutter

In single-cell sequencing analysis, several computational methods have been developed to map the cellular state space, but little has been done to map or create embeddings of the gene space. Here, we formulate the gene embedding problem, design tasks with simulated single-cell data to evaluate representations, and establish ten relevant baselines. We then present a graph signal processing approach we call gene signal pattern analysis (GSPA) that learns rich gene representations from single-cell data using a dictionary of diffusion wavelets on the cell-cell graph. GSPA enables characterization of genes based on their patterning on the cellular manifold. It also captures how localized or diffuse the expression of a gene is, for which we present a score called the gene localization score. We motivate and demonstrate the efficacy of GSPA as a framework for a range of biological tasks, such as capturing gene coexpression modules, condition-specific enrichment, and perturbation-specific gene-gene interactions. Then, we showcase the broad utility of gene representations derived from GSPA, including for cell-cell communication (GSPA-LR), spatial transcriptomics (GSPA-multimodal), and patient response (GSPA-Pt) analysis.

2024-12-19

Nature Computational Science (published)

doi.org

A neuronal least-action principle for real-time learning in cortical circuits

Walter Senn

Dominik Dold

Akos F. Kungl

Benjamin Ellenberger

Jakob Jordan

Yoshua Bengio

João Sacramento

Mihai A. Petrovici

One of the most fundamental laws of physics is the principle of least action. Motivated by its predictive power, we introduce a neuronal least-action principle for cortical processing of sensory streams to produce appropriate behavioural outputs in real time. The principle postulates that the voltage dynamics of cortical pyramidal neurons prospectively minimize the local somato-dendritic mismatch error within individual neurons. For motor output neurons, it implies minimizing an instantaneous behavioural error. For deep network neurons, it implies a prospective firing to overcome integration delays and correct for possible output errors right in time. The neuron-specific errors are extracted in the apical dendrites of pyramidal neurons through a cortical microcircuit that tries to explain away the feedback from the periphery, and correct the trajectory on the fly. Any motor output is in a moving equilibrium with the sensory inputs and the motor feedback during the whole sensory-motor trajectory. Ongoing synaptic plasticity reduces the somato-dendritic mismatch error within each cortical neuron and performs gradient descent on the output cost at any moment in time. The neuronal least-action principle offers an axiomatic framework to derive local neuronal and synaptic dynamics for global real-time computation and learning in the brain and in physical substrates in general.

2024-12-19

eLife (published)

doi.org

Roboethics for Everyone – A Hands-On Teaching Module for K-12 and Beyond

In this work, we address the evolving landscape of roboethics, expanding beyond physical safety to encompass broader societal implications. Recognizing the siloed nature of existing initiatives to teach and inform ethical implications of artificial intelligence (AI) and robotic systems, we present a roboethics teaching module designed for K-12 students and general audiences. The module focuses on the high-level analysis of the interplay between robot behaviour design choices and ethics, using everyday social dilemmas. We delivered the module in a workshop to high school students in Montreal, Canada. From this experience, we observed that the module successfully fostered critical thinking and ethical considerations in students, without requiring advanced technical knowledge. This teaching module holds promise to reach a wider range of populations. We urge the education community to explore similar approaches and engage in interdisciplinary training opportunities regarding the ethical implications of AI and robotics.

2024-12-19

Proceedings of the Canadian Engineering Education Association (CEEA) (published)

doi.org

Robust Guided Diffusion for Offline Black-Box Optimization

Can Chen

Christopher Beckham

Zixuan Liu

Xue Liu

Christopher Pal

Offline black-box optimization aims to maximize a black-box function using an offline dataset of designs and their measured properties. Two main approaches have emerged: the forward approach, which learns a mapping from input to its value, thereby acting as a proxy to guide optimization, and the inverse approach, which learns a mapping from value to input for conditional generation. (a) Although proxy-free (classifier-free) diffusion shows promise in robustly modeling the inverse mapping, it lacks explicit guidance from proxies, essential for generating high-performance samples beyond the training distribution. Therefore, we propose proxy-enhanced sampling, which utilizes the explicit guidance from a trained proxy to bolster proxy-free diffusion with enhanced sampling control. (b) Yet, the trained proxy is susceptible to out-of-distribution issues. To address this, we devise the module diffusion-based proxy refinement, which seamlessly integrates insights from proxy-free diffusion back into the proxy for refinement. To sum up, we propose Robust Guided Diffusion for Offline Black-box Optimization (RGD), combining the advantages of proxy (explicit guidance) and proxy-free diffusion (robustness) for effective conditional generation. RGD achieves state-of-the-art results on various design-bench tasks, underscoring its efficacy. Our code is at https://anonymous.4open.science/r/RGD-27A5/README.md.

2024-12-19

TMLR (accepted)

doi.org

openreview.net

Editorial: Special Issue on Software Engineering and AI for Data Quality

Foutse Khomh

Andreas Metzger

Phu Nguyen

Sagar Sen

This editorial summarizes the content of the Special Issue on Software Engineering and AI for Data Quality of the Journal of Data and Information Quality (JDIQ).

2024-12-18

Journal of Data and Information Quality (published)

doi.org

Delays in Care for Children With Low Anorectal Malformations in Southwestern Uganda.

Felix Oyania

Caroline Q. Stephens

Sarah Ullrich

Meera Kotagal

Daniel Kisitu

Francis Bajunirwe

Doruk Ozgediz

Dan Poenaru

2024-12-17

Journal of Surgical Research (published)

doi.org

Large scale Raman spectrum calculations in defective 2D materials using deep learning

Olivier Malenfant-Thuot

Dounia Shaaban Kabakibo

Simon Blackburn

Bruno Rousseau

Michel Côté

We introduce a machine learning prediction workflow to study the impact of defects on the Raman response of 2D materials. By combining the use of machine-learned interatomic potentials, the Raman-active Γ-weighted density of states method and splitting configurations into independent patches, we are able to reach simulation sizes in the tens of thousands of atoms, with diagonalization now being the main bottleneck of the simulation. We apply the method to two systems, isotopic graphene and defective hexagonal boron nitride, and compare our predicted Raman response to experimental results, with good agreement. Our method opens up many possibilities for future studies of Raman response in solid-state physics.

2024-12-17

Journal of Physics: Condensed Matter (published)

doi.org

arxiv.org

MetaMorph: Multimodal Understanding and Generation via Instruction Tuning

Shengbang Tong

David Fan

Jiachen Zhu

Yunyang Xiong

Xinlei Chen

Koustuv Sinha

Michael G. Rabbat

Yann Lecun

Saining Xie

Zhuang Liu

In this work, we propose Visual-Predictive Instruction Tuning (VPiT) - a simple and effective extension to visual instruction tuning that enables a pretrained LLM to quickly morph into a unified autoregressive model capable of generating both text and visual tokens. VPiT teaches an LLM to predict discrete text tokens and continuous visual tokens from any input sequence of image and text data curated in an instruction-following format. Our empirical investigation reveals several intriguing properties of VPiT: (1) visual generation ability emerges as a natural byproduct of improved visual understanding, and can be unlocked efficiently with a small amount of generation data; (2) while we find understanding and generation to be mutually beneficial, understanding data contributes to both capabilities more effectively than generation data. Building upon these findings, we train our MetaMorph model and achieve competitive performance on both visual understanding and generation. In visual generation, MetaMorph can leverage the world knowledge and reasoning abilities gained from LLM pretraining, and overcome common failure modes exhibited by other generation models. Our results suggest that LLMs may have strong "prior" vision capabilities that can be efficiently adapted to both visual understanding and generation with a relatively simple instruction tuning process.

2024-12-17

ArXiv (preprint)

doi.org

arxiv.org

Agent-state based policies in POMDPs: Beyond belief-state MDPs

Amit Sinha

Aditya Mahajan

The traditional approach to POMDPs is to convert them into fully observed MDPs by considering a belief state as an information state. However, a belief-state based approach requires perfect knowledge of the system dynamics and is therefore not applicable in the learning setting where the system model is unknown. Various approaches to circumvent this limitation have been proposed in the literature. We present a unified treatment of some of these approaches by viewing them as models where the agent maintains a local recursively updateable “agent state” and chooses actions based on the agent state. We highlight the different classes of agent-state based policies and the various approaches that have been proposed in the literature to find good policies within each class. These include the designer’s approach to find optimal non-stationary agent-state based policies, policy search approaches to find locally optimal stationary agent-state based policies, and the approximate information state approach to find approximately optimal stationary agent-state based policies. We then present how ideas from the approximate information state approach have been used to improve Q-learning and actor-critic algorithms for learning in POMDPs.

2024-12-15

2024 IEEE 63rd Conference on Decision and Control (CDC) (published)

doi.org

arxiv.org

Constant step-size stochastic approximation with delayed updates

Aditya Mahajan

Silviu-Iulian Niculescu

Mathukumalli Vidyasagar

In this paper, we consider constant step-size stochastic approximation with delayed updates. For the non-delayed case, it is well known that under appropriate conditions, the discrete-time iterates of stochastic approximation track the trajectory of a continuous-time ordinary differential equation (ODE). For the delayed case, we show in this paper that, under appropriate conditions, the discrete-time iterates track the trajectory of a delay-differential equation (DDE) rather than an ODE. Thus, delayed updates lead to a qualitative change in the behavior of constant step-size stochastic approximation. We present multiple examples to illustrate the qualitative effect of delay and show that increasing the delay is generally destabilizing but, for some systems, it can be stabilizing as well.

2024-12-15

2024 IEEE 63rd Conference on Decision and Control (CDC) (published)

doi.org
