Hugo Larochelle

David Rolnick

2023-05-01

ArXiv (preprint)

Repository-Level Prompt Generation for Large Language Models of Code

Disha Shrivastava

Danny Tarlow

With the success of large language models (LLMs) of code and their use as code assistants (e.g. Codex used in GitHub Copilot), techniques for introducing domain-specific knowledge in the prompt design process become important. In this work, we propose a framework called Repo-Level Prompt Generator that learns to generate example-specific prompts using prompt proposals. The prompt proposals take context from the entire repository, thereby incorporating both the structure of the repository and the context from other relevant files (e.g. imports, parent class files). Our technique doesn't require any access to the weights of the LLM, making it applicable in cases where we only have black-box access to the LLM. We conduct experiments on the task of single-line code auto-completion using code repositories taken from Google Code archives. We demonstrate that an oracle constructed from our prompt proposals gives a relative improvement of 36% over Codex, showing the quality of these proposals. Further, we show that when we train a model to predict a prompt proposal, we can achieve significant performance gains over Codex and other baselines. We release our code, data, and trained checkpoints at: https://github.com/shrivastavadisha/repo_level_prompt_generation.

2023-04-24

ICML.cc/2023/Conference (poster)

Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions

David Bieber

Rishab Goel

Dan Zheng

Daniel Tarlow

The execution behavior of a program often depends on external resources, such as program inputs or file contents, and so cannot be run in isolation. Nevertheless, software developers benefit from fast iteration loops where automated tools identify errors as early as possible, even before programs can be compiled and run. This presents an interesting machine learning challenge: can we predict runtime errors in a "static" setting, where program execution is not possible? Here, we introduce a real-world dataset and task for predicting runtime errors, which we show is difficult for generic models like Transformers. We approach this task by developing an interpreter-inspired architecture with an inductive bias towards mimicking program executions, which models exception handling and "learns to execute" descriptions of the contents of external resources. Surprisingly, we show that the model can also predict the location of the error, despite being trained only on labels indicating the presence/absence and kind of error. In total, we present a practical and difficult-yet-approachable challenge problem related to learning program execution and we demonstrate promising new capabilities of interpreter-inspired machine learning models for code.

2023-02-01

ICLR.cc/2023/Conference (poster)

Neural Causal Structure Discovery from Interventions

Nan Rosemary Ke

Olexa Bilaniuk

Anirudh Goyal

Stefan Bauer

Bernhard Schölkopf

Michael Curtis Mozer

Chris Pal

Yoshua Bengio

Recent promising results have generated a surge of interest in continuous optimization methods for causal discovery from observational data. However, there are theoretical limitations on the identifiability of underlying structures obtained solely from observational data. Interventional data, on the other hand, provides richer information about the underlying data-generating process. Nevertheless, extending and applying methods designed for observational data to include interventions is a challenging problem. To address this issue, we propose a general framework based on neural networks to develop models that incorporate both observational and interventional data. Notably, our method can handle the challenging and realistic scenario where the identity of the intervened-upon variable is unknown. We evaluate our proposed approach in the context of graph recovery, both de novo and from a partially-known edge set. Our method achieves strong benchmark results on various structure learning tasks, including structure recovery of synthetic graphs as well as standard graphs from the Bayesian Network Repository.

2023-01-01

Trans. Mach. Learn. Res. (published)

dblp.uni-trier.de

Survey of Scientific Rigor Studied in Machine Learning

D. Sculley

Gary R. Holt

Daniel R. Golovin

Eugene V. Davydov

Todd Phillips

Dietmar Ebner

Michael Young

Jean-francois Crespo

Dan Dennison

Emily Fox

The concern that Artificial Intelligence (AI) and Machine Learning (ML) are entering a “reproducibility crisis” has spurred significant research in the past few years. Yet with each paper, it is often unclear what someone means by “reproducibility” and where it fits in the larger scope of what we will call the “scientific rigor” literature. Ultimately, the lack of clear rigor standards can affect the manner in which businesses seeking to adopt AI/ML implement such capabilities. In this survey, we will use 66 papers published since 2017 to construct a proposed set of 8 high-level categories of scientific rigor, what they are, and the history of work conducted in each. Our proposal is that these eight rigor types are not mutually exclusive and present a model for how they influence each other. To encourage more to study these questions, we map these rigors to the adoption process in real-world business use cases. In doing so, we can quantify gaps in the literature that suggest an under focus on the issues necessary for scientific rigor research to transition to practice.

Teaching Algorithmic Reasoning via In-context Learning

Hattie Zhou

Azade Nova

Aaron Courville

Behnam Neyshabur

Hanie Sedghi

2022-11-15

ArXiv (preprint)

Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning

Utku Evci

Vincent Dumoulin

Michael Curtis Mozer

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

proceedings.mlr.press

Uniform Priors for Data-Efficient Learning

Samarth Sinha

Karsten Roth

Anirudh Goyal

Marzyeh Ghassemi

Zeynep Akata

Animesh Garg

Few or zero-shot adaptation to novel tasks is important for the scalability and deployment of machine learning models. It is therefore crucial to find properties that encourage more transferable features in deep networks for generalization. In this paper, we show that models that learn uniformly distributed features from the training data are able to perform better transfer learning at test-time. Motivated by this, we evaluate our method, uniformity regularization (UR), on its ability to facilitate adaptation to unseen tasks and data on six distinct domains: Few-shot Learning with Images, Few-shot Learning with Language, Deep Metric Learning, 0-Shot Domain Adaptation, Out-of-Distribution classification, and Neural Radiance Fields. Across all experiments, we show that using UR, we are able to learn robust vision systems which consistently offer benefits over baselines trained without uniformity regularization and are able to achieve state-of-the-art performance in Deep Metric Learning, Few-shot learning with images and language.

2022-06-19

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (published)

Matching Feature Sets for Few-Shot Image Classification

Arman Afrasiyabi

Jean‐François Lalonde

Christian Gagné

In image classification, it is common practice to train deep networks to extract a single feature vector per input image. Few-shot classification methods also mostly follow this trend. In this work, we depart from this established direction and instead propose to extract sets of feature vectors for each image. We argue that a set-based representation intrinsically builds a richer representation of images from the base classes, which can subsequently better transfer to the few-shot classes. To do so, we propose to adapt existing feature extractors to instead produce sets of feature vectors from images. Our approach, dubbed SetFeat, embeds shallow self-attention mechanisms inside existing encoder architectures. The attention modules are lightweight, and as such our method results in encoders that have approximately the same number of parameters as their original versions. During training and inference, a set-to-set matching metric is used to perform image classification. The effectiveness of our proposed architecture and metrics is demonstrated via thorough experiments on standard few-shot datasets, namely miniImageNet, tieredImageNet, and CUB, in both the 1- and 5-shot scenarios. In all cases but one, our method outperforms the state-of-the-art.

2022-06-18

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (published)

Matching Feature Sets for Few-Shot Image Classification

Arman Afrasiyabi

Jean‐François Lalonde

Christian Gagné

In image classification, it is common practice to train deep networks to extract a single feature vector per input image. Few-shot classification methods also mostly follow this trend. In this work, we depart from this established direction and instead propose to extract sets of feature vectors for each image. We argue that a set-based representation intrinsically builds a richer representation of images from the base classes, which can subsequently better transfer to the few-shot classes. To do so, we propose to adapt existing feature extractors to instead produce sets of feature vectors from images. Our approach, dubbed SetFeat, embeds shallow self-attention mechanisms inside existing encoder architectures. The attention modules are lightweight, and as such our method results in encoders that have approximately the same number of parameters as their original versions. During training and inference, a set-to-set matching metric is used to perform image classification. The effectiveness of our proposed architecture and metrics is demonstrated via thorough experiments on standard few-shot datasets, namely miniImageNet, tieredImageNet, and CUB, in both the 1- and 5-shot scenarios. In all cases but one, our method outperforms the state-of-the-art.

2022-04-02

ArXiv (preprint)

Fortuitous Forgetting in Connectionist Networks

Hattie Zhou

Ankit Vani

Aaron Courville

Forgetting is often seen as an unwanted characteristic in both human and machine learning. However, we propose that forgetting can in fact be favorable to learning. We introduce "forget-and-relearn" as a powerful paradigm for shaping the learning trajectories of artificial neural networks. In this process, the forgetting step selectively removes undesirable information from the model, and the relearning step reinforces features that are consistently useful under different conditions. The forget-and-relearn framework unifies many existing iterative training algorithms in the image classification and language emergence literature, and allows us to understand the success of these algorithms in terms of the disproportionate forgetting of undesirable information. We leverage this understanding to improve upon existing algorithms by designing more targeted forgetting operations. Insights from our analysis provide a coherent view on the dynamics of iterative training in neural networks and offer a clear path towards performance improvements.

2022-01-28

ICLR.cc/2022/Conference (poster)

Learning to Combine Per-Example Solutions for Neural Program Synthesis

Disha Shrivastava

Daniel Tarlow

The goal of program synthesis from examples is to find a computer program that is consistent with a given set of input-output examples. Most learning-based approaches try to find a program that satisfies all examples at once. Our work, by contrast, considers an approach that breaks the problem into two stages: (a) find programs that satisfy only one example, and (b) leverage these per-example solutions to yield a program that satisfies all examples. We introduce the Cross Aggregator neural network module based on a multi-head attention mechanism that learns to combine the cues present in these per-example solutions to synthesize a global solution. Evaluation across programs of different lengths and under two different experimental settings reveals that when given the same time budget, our technique significantly improves the success rate over PCCoder [Zohar et al. 2018] and other ablation baselines.