Publications

A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale

Hao-Jun Michael Shi

Tsung-Hsien Lee

Shintaro Iwasaki

Jose Gallego-Posada

Zhijing Li

Kaushik Rangadurai

Dheevatsa Mudigere

Michael Rabbat

2023-09-12

ArXiv (preprint)

doi.org

arxiv.org

Leveraging ChatGPT to Democratize and Decolonize Global Surgery: Large Language Models for Small Healthcare Budgets

Fabio Botelho

Jean Marie Tshimula

Dan Poenaru

2023-09-10

World Journal of Surgery (published)

doi.org

Local field potentials in human motor and non-motor brain areas encode the direction of upcoming movements: An intracerebral EEG classification study

Etienne Combrisson

Franck Di Rienzo

Anne-Lise Saive

Marcela Perrone-Bertolotti

Juan LP Soto

Philippe Kahane

Jean-Philippe Lachaux

Aymeric Guillot

Karim Jerbi

2023-09-10

bioRxiv (preprint)

doi.org

Neural Causal Structure Discovery from Interventions

Nan Rosemary Ke

Olexa Bilaniuk

Anirudh Goyal

Stefan Bauer

Hugo Larochelle

Bernhard Schölkopf

Michael Curtis Mozer

Chris Pal

Yoshua Bengio

Recent promising results have generated a surge of interest in continuous optimization methods for causal discovery from observational data. However, there are theoretical limitations on the identifiability of underlying structures obtained solely from observational data. Interventional data, on the other hand, provides richer information about the underlying data-generating process. Nevertheless, extending and applying methods designed for observational data to include interventions is a challenging problem. To address this issue, we propose a general framework based on neural networks to develop models that incorporate both observational and interventional data. Notably, our method can handle the challenging and realistic scenario where the identity of the intervened upon variable is unknown. We evaluate our proposed approach in the context of graph recovery, both de novo and from a partially-known edge set. Our method achieves strong benchmark results on various structure learning tasks, including structure recovery of synthetic graphs as well as standard graphs from the Bayesian Network Repository.

2023-09-10

TMLR (accepted)

openreview.net

Let Coarse-Grained Resources Be Shared: Mapping Entire Neural Networks on FPGAs

Tzung-Han Juang

Christof Schlaak

Christophe Dubach

2023-09-09

ACM Transactions on Embedded Computing Systems (published)

doi.org

Leveraging World Model Disentanglement in Value-Based Multi-Agent Reinforcement Learning

Zhizun Wang

David Meger

In this paper, we propose a novel model-based multi-agent reinforcement learning approach named Value Decomposition Framework with Disentangled World Model to address the challenge of achieving a common goal of multiple agents interacting in the same environment with reduced sample complexity. Due to scalability and non-stationarity problems posed by multi-agent systems, model-free methods rely on a considerable number of samples for training. In contrast, we use a modularized world model, composed of action-conditioned, action-free, and static branches, to unravel the environment dynamics and produce imagined outcomes based on past experience, without sampling directly from the real environment. We employ variational auto-encoders and variational graph auto-encoders to learn the latent representations for the world model, which is merged with a value-based framework to predict the joint action-value function and optimize the overall training objective. We present experimental results in Easy, Hard, and Super-Hard StarCraft II micro-management challenges to demonstrate that our method achieves high sample efficiency and exhibits superior performance in defeating the enemy armies compared to other baselines.

2023-09-08

ArXiv (preprint)

doi.org

arxiv.org

Bridging the Gap Between Target Networks and Functional Regularization

Alexandre Piché

Valentin Thomas

Joseph Marino

Gian Maria Marconi

Rafael Pardinas

Chris Pal

Mohammad Emtiyaz Khan

2023-09-06

TMLR (accepted)

doi.org

openreview.net

Cardiomyocyte orientation recovery at micrometer scale reveals long‐axis fiber continuum in heart walls

Drisya Dileep

Tabish A Syed

Tyler FW Sloan

Perundurai S Dhandapany

Kaleem Siddiqi

Minhajuddin Sirajuddin

2023-09-06

The EMBO Journal (published)

doi.org

Using Multiple Vector Channels Improves E(n)-Equivariant Graph Neural Networks

Daniel Levy

Sékou-Oumar Kaba

Carmelo Gonzales

Santiago Miret

Siamak Ravanbakhsh

2023-09-06

ArXiv (preprint)

doi.org

arxiv.org

Breaking Barriers to Creative Expression: Co-Designing and Implementing an Accessible Text-to-Image Interface

Atieh Taheri

Mohammad Izadi

Gururaj Shriram

Negar Rostamzadeh

Shaun Kane

Text-to-image generation models have grown in popularity due to their ability to produce high-quality images from a text prompt. One use for this technology is to enable the creation of more accessible art creation software. In this paper, we document the development of an alternative user interface that reduces the typing effort needed to enter image prompts by providing suggestions from a large language model, developed through iterative design and testing within the project team. The results of this testing demonstrate how generative text models can support the accessibility of text-to-image models, enabling users with a range of abilities to create visual art.

2023-09-05

ArXiv (preprint)

doi.org

arxiv.org

Deep reinforcement learning for option pricing and hedging under dynamic expectile risk measures

Option Pricing

Saeed Marzban

Erick Delage

Jonathan Yu-Meng Li

2023-09-05

Quantitative Finance (published)

doi.org

Tidying Up the Conversational Recommender Systems' Biases

Armin Moradi

Golnoosh Farnadi

The growing popularity of language models has sparked interest in conversational recommender systems (CRS) within both industry and research circles. However, concerns regarding biases in these systems have emerged. While individual components of CRS have been subject to bias studies, a literature gap remains in understanding specific biases unique to CRS and how these biases may be amplified or reduced when integrated into complex CRS models. In this paper, we provide a concise review of biases in CRS by surveying recent literature. We examine the presence of biases throughout the system's pipeline and consider the challenges that arise from combining multiple models. Our study investigates biases in classic recommender systems and their relevance to CRS. Moreover, we address specific biases in CRS, considering variations with and without natural language understanding capabilities, along with biases related to dialogue systems and language models. Through our findings, we highlight the necessity of adopting a holistic perspective when dealing with biases in complex CRS models.

2023-09-05

ArXiv (preprint)

doi.org

arxiv.org
