Hugo Larochelle

Consolidating Separate Degradations Model via Weights Fusion and Distillation

Dinesh Daultani

Real-world images prevalently contain different varieties of degradation, such as motion blur and luminance noise. Computer vision recognition models trained on clean images perform poorly on degraded images. Previously, several works have explored how to perform image classification of degraded images while training a single model for each degradation. Nevertheless, it becomes challenging to host several degradation models for each degradation on limited hardware applications and to estimate degradation parameters correctly at the run-time. This work proposes a method for effectively combining several models trained separately on different degradations into a single model to classify images with different types of degradations. Our proposed method is four-fold: (1) train a base model on clean images, (2) fine-tune the base model individually for all given image degradations, (3) perform a fusion of weights given the fine-tuned models for individual degradations, (4) perform fine-tuning on given task using distillation and cross-entropy loss. Our proposed method can outperform previous state-of-the-art methods of pretraining in out-of-distribution generalization based on degradations such as JPEG compression, salt-and-pepper noise, Gaussian blur, and additive white Gaussian noise by 2.5% on CIFAR-100 dataset and by 1.3% on CIFAR-10 dataset. Moreover, our proposed method can handle degradation used for training without any explicit information about degradation at the inference time. Code will be available at https://github.com/dineshdaultani/FusionDistill.

2024-01-01

2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) (published)

Unlearning via Sparse Representations

Vedant Shah

Frederik Träuble

Ashish Malik

Michael Curtis Mozer

Sanjeev Arora

Anirudh Goyal

Machine unlearning, which involves erasing knowledge about a forget set from a trained model, can prove to be costly and infeasible by existing techniques. We propose a nearly compute-free zero-shot unlearning technique based on a discrete representational bottleneck. We show that the proposed technique efficiently unlearns the forget set and incurs negligible damage to the model's performance on the rest of the data set. We evaluate the proposed technique on the problem of class unlearning using three datasets: CIFAR-10, CIFAR-100, and LACUNA-100. We compare the proposed technique to SCRUB, a state-of-the-art approach which uses knowledge distillation for unlearning. Across all three datasets, the proposed technique performs as well as, if not better than, SCRUB while incurring almost no computational cost.

2023-11-26

ArXiv (preprint)

arxiv.org

SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data

Mélisande Teng

Amna Elmustafa

Benjamin Akera

Biodiversity is declining at an unprecedented rate, impacting ecosystem services necessary to ensure food, water, and human health and well-being. Understanding the distribution of species and their habitats is crucial for conservation policy planning. However, traditional methods in ecology for species distribution models (SDMs) generally focus either on narrow sets of species or narrow geographical areas and there remain significant knowledge gaps about the distribution of species. A major reason for this is the limited availability of data traditionally used, due to the prohibitive amount of effort and expertise required for traditional field monitoring. The wide availability of remote sensing data and the growing adoption of citizen science tools to collect species observations data at low cost offer an opportunity for improving biodiversity monitoring and enabling the modelling of complex ecosystems. We introduce a novel task for mapping bird species to their habitats by predicting species encounter rates from satellite images, and present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird, considering summer (breeding) and winter seasons. We also provide a dataset in Kenya representing low-data regimes. We additionally provide environmental data and species range maps for each location. We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks. SatBird opens up possibilities for scalably modelling properties of ecosystems worldwide.

2023-11-02

ArXiv (preprint)

arxiv.org

SatBird: a Dataset for Bird Species Distribution Modeling using Remote Sensing and Citizen Science Data

Mélisande Teng

Amna Elmustafa

Benjamin Akera

Neural Causal Structure Discovery from Interventions

Nan Rosemary Ke

Bernhard Schölkopf

Michael Curtis Mozer

Chris Pal

Recent promising results have generated a surge of interest in continuous optimization methods for causal discovery from observational data. However, there are theoretical limitations on the identifiability of underlying structures obtained solely from observational data. Interventional data, on the other hand, provides richer information about the underlying data-generating process. Nevertheless, extending and applying methods designed for observational data to include interventions is a challenging problem. To address this issue, we propose a general framework based on neural networks to develop models that incorporate both observational and interventional data. Notably, our method can handle the challenging and realistic scenario where the identity of the intervened-upon variable is unknown. We evaluate our proposed approach in the context of graph recovery, both de novo and from a partially-known edge set. Our method achieves strong benchmark results on various structure learning tasks, including structure recovery of synthetic graphs as well as standard graphs from the Bayesian Network Repository.

2023-09-10

TMLR (accepted)

Bird Distribution Modelling using Remote Sensing and Citizen Science data

Mélisande Teng

Amna Elmustafa

Benjamin Akera

David Rolnick

2023-05-01

ArXiv (preprint)

arxiv.org

Repository-Level Prompt Generation for Large Language Models of Code

Disha Shrivastava

Danny Tarlow

With the success of large language models (LLMs) of code and their use as code assistants (e.g. Codex used in GitHub Copilot), techniques for introducing domain-specific knowledge in the prompt design process become important. In this work, we propose a framework called Repo-Level Prompt Generator that learns to generate example-specific prompts using prompt proposals. The prompt proposals take context from the entire repository, thereby incorporating both the structure of the repository and the context from other relevant files (e.g. imports, parent class files). Our technique doesn't require any access to the weights of the LLM, making it applicable in cases where we only have black-box access to the LLM. We conduct experiments on the task of single-line code auto-completion using code repositories taken from Google Code archives. We demonstrate that an oracle constructed from our prompt proposals gives a relative improvement of 36% over Codex, showing the quality of these proposals. Further, we show that when we train a model to predict a prompt proposal, we can achieve significant performance gains over Codex and other baselines. We release our code, data, and trained checkpoints at: https://github.com/shrivastavadisha/repo_level_prompt_generation.

2023-04-24

ICML.cc/2023/Conference (poster)

Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions

David Bieber

Rishab Goel

Dan Zheng

Daniel Tarlow

The execution behavior of a program often depends on external resources, such as program inputs or file contents, and so cannot be run in isolation. Nevertheless, software developers benefit from fast iteration loops where automated tools identify errors as early as possible, even before programs can be compiled and run. This presents an interesting machine learning challenge: can we predict runtime errors in a "static" setting, where program execution is not possible? Here, we introduce a real-world dataset and task for predicting runtime errors, which we show is difficult for generic models like Transformers. We approach this task by developing an interpreter-inspired architecture with an inductive bias towards mimicking program executions, which models exception handling and "learns to execute" descriptions of the contents of external resources. Surprisingly, we show that the model can also predict the location of the error, despite being trained only on labels indicating the presence/absence and kind of error. In total, we present a practical and difficult-yet-approachable challenge problem related to learning program execution and we demonstrate promising new capabilities of interpreter-inspired machine learning models for code.

2023-02-01

ICLR.cc/2023/Conference (poster)

Neural Causal Structure Discovery from Interventions

Nan Rosemary Ke

Bernhard Schölkopf

Michael Curtis Mozer

Chris Pal

Recent promising results have generated a surge of interest in continuous optimization methods for causal discovery from observational data. However, there are theoretical limitations on the identifiability of underlying structures obtained solely from observational data. Interventional data, on the other hand, provides richer information about the underlying data-generating process. Nevertheless, extending and applying methods designed for observational data to include interventions is a challenging problem. To address this issue, we propose a general framework based on neural networks to develop models that incorporate both observational and interventional data. Notably, our method can handle the challenging and realistic scenario where the identity of the intervened-upon variable is unknown. We evaluate our proposed approach in the context of graph recovery, both de novo and from a partially-known edge set. Our method achieves strong benchmark results on various structure learning tasks, including structure recovery of synthetic graphs as well as standard graphs from the Bayesian Network Repository.

2023-01-01

Trans. Mach. Learn. Res. (published)

dblp.uni-trier.de

Survey of Scientific Rigor Studied in Machine Learning

D. Sculley

Gary R. Holt

Daniel R. Golovin

Eugene V. Davydov

Todd Phillips

Dietmar Ebner

Michael Young

Jean-francois Crespo

Dan Dennison

Emily Fox