Publications

Gap Minimization for Knowledge Sharing and Transfer

Boyu Wang

Jorge A. Mendez

Changjian Shui

Fan Zhou

Di Wu

Gezheng Xu

Christian Gagné

Eric R. Eaton

Learning from multiple related tasks by knowledge sharing and transfer has become increasingly relevant over the last two decades. In order to successfully transfer information from one task to another, it is critical to understand the similarities and differences between the domains. In this paper, we introduce the notion of performance gap, an intuitive and novel measure of the distance between learning tasks. Unlike existing measures which are used as tools to bound the difference of expected risks between tasks (e.g.,

arxiv.org

General Purpose AI Systems in the AI Act: Trying to Fit a Square Peg Into a Round Hole

Claire Boine

David Rolnick

2023-01-01

SSRN Electronic Journal (published)

doi.org

Generating QM1B with PySCF_IPU

Alexander Mathiasen

Hatem Helal

Kerstin Klaeser

Paul Balanca

Josef Dean

Carlo Luschi

Dominique Beaini

Andrew William Fitzgibbon

Dominic Masters

openreview.net

Generating QM1B with PySCF_IPU

Alexander Mathiasen

Hatem Helal

Kerstin Klaser

Paul Balanca

Josef Dean

Carlo Luschi

Dominique Beaini

Andrew William Fitzgibbon

Dominic Masters

2023-01-01

NeurIPS (published)

doi.org

arxiv.org

Geodesic Sinkhorn for Fast and Accurate Optimal Transport on Manifolds

Guillaume Huguet

Alexander Tong

María Ramos Zapatero

Christopher J. Tape

Guy Wolf

Smita Krishnaswamy

Efficient computation of optimal transport distance between distributions is of growing importance in data science. Sinkhorn-based methods are currently the state-of-the-art for such computations, but require O(n^2) computations. In addition, Sinkhorn-based methods commonly use a Euclidean ground distance between datapoints. However, with the prevalence of manifold structured scientific data, it is often desirable to consider geodesic ground distance. Here, we tackle both issues by proposing Geodesic Sinkhorn, based on diffusing a heat kernel on a manifold graph. Notably, Geodesic Sinkhorn requires only O(n log n) computation, as we approximate the heat kernel with Chebyshev polynomials based on the sparse graph Laplacian. We apply our method to the computation of barycenters of several distributions of high dimensional single cell data from patient samples undergoing chemotherapy. In particular, we define the barycentric distance as the distance between two such barycenters. Using this definition, we identify an optimal transport distance and path associated with the effect of treatment on cellular data.

2023-01-01

2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP) (published)

doi.org

arxiv.org

GFlowNets for AI-Driven Scientific Discovery

Moksh J. Jain

Tristan Deleu

Jason Hartford

Cheng-Hao Liu

Alex Hernandez-Garcia

Yoshua Bengio

Tackling the most pressing problems for humanity, such as the climate crisis and the threat of global pandemics, requires accelerating the pace of scientific discovery. While science has traditionally relied...

2023-01-01

Digital Discovery (published)

doi.org

arxiv.org

GFlowOut: Dropout with Generative Flow Networks

Dianbo Liu

Moksh J. Jain

Bonaventure F. P. Dossou

Qianli Shen

Salem Lahlou

Anirudh Goyal

Nikolay Malkin

Chris Emezue

Dinghuai Zhang

Nadhir Hassen

Xu Ji

Kenji Kawaguchi

Yoshua Bengio

2023-01-01

ICML (published)

doi.org

openreview.net

GitHub Copilot AI pair programmer: Asset or Liability?

Arghavan Moradi Dakhel

Vahid Majdinasab

Amin Nikanjam

Foutse Khomh

Michel C. Desmarais

Z. Jiang

Automatic program synthesis is a long-lasting dream in software engineering. Recently, a promising Deep Learning (DL) based solution, called Copilot, has been proposed by OpenAI and Microsoft as an industrial product. Although some studies evaluate the correctness of Copilot solutions and report its issues, more empirical evaluations are necessary to understand how developers can benefit from it effectively. In this paper, we study the capabilities of Copilot in two different programming tasks: (i) generating (and reproducing) correct and efficient solutions for fundamental algorithmic problems, and (ii) comparing Copilot's proposed solutions with those of human programmers on a set of programming tasks. For the former, we assess the performance and functionality of Copilot in solving selected fundamental problems in computer science, like sorting and implementing data structures. In the latter, a dataset of programming problems with human-provided solutions is used. The results show that Copilot is capable of providing solutions for almost all fundamental algorithmic problems; however, some solutions are buggy and non-reproducible. Moreover, Copilot has some difficulties in combining multiple methods to generate a solution. Comparing Copilot to humans, our results show that the correct ratio of humans' solutions is greater than that of Copilot's suggestions, while the buggy solutions generated by Copilot require less effort to be repaired.

2023-01-01

J. Syst. Softw. (published)

doi.org

arxiv.org

GOKU-UI: Ubiquitous Inference through Attention and Multiple Shooting for Continuous-time Generative Models

Germán Abrevaya

Mahta Ramezanian-Panahi

Jean-Christophe Gagnon-Audet

Irina Rish

Pablo Polosecki

Silvina Ponce Dawson

Guillermo Cecchi

Guillaume Dumas

Scientific Machine Learning (SciML) is a burgeoning field that synergistically combines domain-aware and interpretable models with agnostic machine learning techniques. In this work, we introduce GOKU-UI, an evolution of the SciML generative model GOKU-nets. The GOKU-UI broadens the original model’s spectrum to incorporate other classes of differential equations, such as Stochastic Differential Equations (SDEs), and integrates a distributed, i.e. ubiquitous, inference through attention mechanisms and a novel multiple shooting training strategy in the latent space. These enhancements have led to a significant increase in its performance in both reconstruction and forecast tasks, as demonstrated by our evaluation of simulated and empirical data. Specifically, GOKU-UI outperformed all baseline models on synthetic datasets even with a training set 32-fold smaller, underscoring its remarkable data efficiency. Furthermore, when applied to empirical human brain data, while incorporating stochastic Stuart-Landau

2023-01-01

arXiv.org (preprint)

doi.org

Grammar Generative Models for Music Notation

Anna (Cheng-Zhi) Huang

Deep generative models have been successfully applied in many learning experiments with digital data, such as images or audio. In the field of music, they can also be used to generate symbolic representations, in the context of problems such as automatic music generation or transcription [1-3]. A significant challenge for generating structured symbolic data in general is obtaining well-formed results. This is especially true in the case of music. It is indeed widely accepted that musical notation represents, well beyond simple sequences of notes, a hierarchical organization of melodic and harmonic information, inducing non-local dependencies between musical objects [4]. A good representation of this information is essential for the interpretation and analysis of music pieces.

2023-01-01

(published)

www.semanticscholar.org

Graph Inductive Biases in Transformers without Message Passing

Liheng Ma

Chen Lin

Derek Lim

Adriana Romero Soriano

Puneet K. Dokania

Mark Coates

Philip Torr

Ser-Nam Lim

Transformers for graph data are increasingly widely studied and successful in numerous learning tasks. Graph inductive biases are crucial for Graph Transformers, and previous works incorporate them using message-passing modules and/or positional encodings. However, Graph Transformers that use message-passing inherit known issues of message-passing, and differ significantly from Transformers used in other domains, thus making transfer of research advances more difficult. On the other hand, Graph Transformers without message-passing often perform poorly on smaller datasets, where inductive biases are more crucial. To bridge this gap, we propose the Graph Inductive bias Transformer (GRIT), a new Graph Transformer that incorporates graph inductive biases without using message passing. GRIT is based on several architectural changes that are each theoretically and empirically justified, including: learned relative positional encodings initialized with random walk probabilities, a flexible attention mechanism that updates node and node-pair representations, and injection of degree information in each layer. We prove that GRIT is expressive: it can express shortest path distances and various graph propagation matrices. GRIT achieves state-of-the-art empirical performance across a variety of graph datasets, thus showing the power that Graph Transformers without message-passing can deliver.

2023-01-01

ICML (published)

doi.org

openreview.net
