Isometric Energies for Recovering Injectivity in Constrained Mapping
Xingyi Du
Danny M. Kaufman
Qingnan Zhou
Shahar Kovalsky
Yajie Yan
Tao Ju
Score-based Denoising Diffusion with Non-Isotropic Gaussian Noise Models
Vikram Voleti
Generative models based on denoising diffusion techniques have led to an unprecedented increase in the quality and diversity of imagery that is now possible to create with neural generative models. However, most contemporary state-of-the-art methods are derived from a standard isotropic Gaussian formulation. In this work we examine the situation where non-isotropic Gaussian distributions are used. We present the key mathematical derivations for creating denoising diffusion models using an underlying non-isotropic Gaussian noise model. We also provide initial experiments with the CIFAR10 dataset to help verify empirically that this more general modelling approach can also yield high-quality samples.
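To make the non-isotropic idea concrete, the sketch below shows what a single forward noising step could look like when the usual isotropic covariance (a scalar times the identity) is replaced with a per-dimension diagonal covariance. The diagonal parameterization, variable names, and schedule value are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch (not the paper's exact formulation): one forward
# diffusion step q(x_t | x_{t-1}) where the Gaussian noise has a per-dimension
# (diagonal, hence non-isotropic) covariance instead of sigma^2 * I.
import numpy as np

def forward_step(x_prev, beta_t, sigma_diag, rng):
    """One forward noising step with diagonal (non-isotropic) covariance.

    x_prev     : current sample, shape (d,)
    beta_t     : scalar noise-schedule value at step t
    sigma_diag : per-dimension variance scaling, shape (d,) -- the
                 non-isotropic part; sigma_diag = ones(d) recovers the
                 usual isotropic formulation
    rng        : numpy random generator
    """
    mean = np.sqrt(1.0 - beta_t) * x_prev
    noise = rng.standard_normal(x_prev.shape)
    # Covariance beta_t * diag(sigma_diag): scale each dimension separately.
    return mean + np.sqrt(beta_t * sigma_diag) * noise

rng = np.random.default_rng(0)
x0 = rng.standard_normal(8)
sigma_diag = np.linspace(0.5, 2.0, 8)   # assumed per-dimension variances
x1 = forward_step(x0, beta_t=0.02, sigma_diag=sigma_diag, rng=rng)
```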
Continual Learning with Foundation Models: An Empirical Study of Latent Replay
Oleksiy Ostapenko
Timothee LESORT
Pau Rodriguez
Md Rifat Arefin
Arthur Douillard
Rapid development of large-scale pre-training has resulted in foundation models that can act as effective feature extractors on a variety of downstream tasks and domains. Motivated by this, we study the efficacy of pre-trained vision models as a foundation for downstream continual learning (CL) scenarios. Our goal is twofold. First, we want to understand the compute-accuracy trade-off between CL in the raw-data space and in the latent space of pre-trained encoders. Second, we investigate how the characteristics of the encoder, the pre-training algorithm and data, as well as of the resulting latent space affect CL performance. For this, we compare the efficacy of various pre-trained models in large-scale benchmarking scenarios with a vanilla replay setting applied in the latent and in the raw-data space. Notably, this study shows how transfer, forgetting, task similarity and learning are dependent on the input data characteristics and not necessarily on the CL algorithms. First, we show that under some circumstances reasonable CL performance can readily be achieved with a non-parametric classifier at negligible compute. We then show how models pre-trained on broader data result in better performance for various replay sizes. We explain this with representational similarity and transfer properties of these representations. Finally, we show the effectiveness of self-supervised pre-training for downstream domains that are out-of-distribution as compared to the pre-training domain. We point out and validate several research directions that can further increase the efficacy of latent CL including representation ensembling. The diverse set of datasets used in this study can serve as a compute-efficient playground for further CL research. We will publish the code.
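The core recipe studied here, replay in the latent space of a frozen pre-trained encoder, can be summarized in a few lines. The sketch below is a minimal illustration under assumed details (a stand-in linear encoder, a naive unbounded buffer); it is not the benchmarking setup of the paper.

```python
# Minimal sketch of latent replay (details assumed, not from the paper):
# a frozen pre-trained encoder maps raw inputs to latents once; a small
# classifier is then trained continually on a mix of current-task latents
# and latents replayed from a buffer, so raw images never need revisiting.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 256))  # stand-in for a pre-trained backbone
for p in encoder.parameters():
    p.requires_grad_(False)            # frozen: acts as a fixed feature extractor

classifier = nn.Linear(256, 10)
opt = torch.optim.SGD(classifier.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
replay_z, replay_y = [], []            # latent replay buffer

def train_task(images, labels):
    with torch.no_grad():
        z = encoder(images)            # compute latents once, cheaply
    batch_z, batch_y = z, labels
    if replay_z:                       # mix in replayed latents from earlier tasks
        batch_z = torch.cat([z, torch.cat(replay_z)])
        batch_y = torch.cat([labels, torch.cat(replay_y)])
    opt.zero_grad()
    loss = loss_fn(classifier(batch_z), batch_y)
    loss.backward()
    opt.step()
    replay_z.append(z)                 # naive buffer: keep everything (a real buffer would subsample)
    replay_y.append(labels)

train_task(torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,)))
train_task(torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,)))
```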
Improving Meta-Learning Generalization with Activation-Based Early-Stopping
Simon Guiroy
Goncalo Mordido
Shimming toolbox: An open‐source software toolbox for B0 and B1 shimming in MRI
Alexandre D'Astous
Gaspard Cereza
Daniel Papp
Kyle M. Gilbert
Jason P. Stockmann
Eva Alonso‐Ortiz
Receptive Field Refinement for Convolutional Neural Networks Reliably Improves Predictive Performance
Mats Leon Richter
Minimal changes to neural architectures (e.g. changing a single hyperparameter in a key layer) can lead to significant gains in predictive performance in Convolutional Neural Networks (CNNs). In this work, we present a new approach to receptive field analysis that can yield these types of theoretical and empirical performance gains across twenty well-known CNN architectures examined in our experiments. By further developing and formalizing the analysis of receptive field expansion in convolutional neural networks, we can predict unproductive layers in an automated manner before ever training a model. This allows us to optimize the parameter-efficiency of a given architecture at low cost. Our method is computationally simple and can be done in an automated manner or even manually with minimal effort for most common architectures. We demonstrate the effectiveness of this approach by increasing parameter efficiency across past and current top-performing CNN-architectures. Specifically, our approach is able to improve ImageNet1K performance across a wide range of well-known, state-of-the-art (SOTA) model classes, including: VGG Nets, MobileNetV1, MobileNetV3, NASNet A (mobile), MnasNet, EfficientNet, and ConvNeXt - leading to a new SOTA result for each model class.
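For intuition, receptive field expansion in a sequential CNN can be tracked with the standard recurrence r_l = r_{l-1} + (k_l - 1) * j_{l-1}, j_l = j_{l-1} * s_l. The sketch below applies it to an assumed VGG-like stack and flags layers whose receptive field already covers the input before they run; note that this flagging rule is a simplified illustration of "unproductive layers", not the paper's exact criterion.

```python
# Simplified illustration of receptive field analysis for a sequential CNN.
# The recurrence is the standard one; the "unproductive" criterion used here
# (receptive field already covering the whole input before the layer is
# applied) is an assumption for illustration, not the paper's exact rule.

def receptive_fields(layers, input_size):
    """layers: list of (name, kernel_size, stride); returns per-layer report."""
    rf, jump = 1, 1                      # receptive field and cumulative stride ("jump")
    report = []
    for name, k, s in layers:
        unproductive = rf >= input_size  # RF saturated before this layer expands it
        rf = rf + (k - 1) * jump
        jump = jump * s
        report.append((name, rf, unproductive))
    return report

vgg_like = [("conv1", 3, 1), ("conv2", 3, 1), ("pool1", 2, 2),
            ("conv3", 3, 1), ("conv4", 3, 1), ("pool2", 2, 2),
            ("conv5", 3, 1), ("conv6", 3, 1)]

for name, rf, flag in receptive_fields(vgg_like, input_size=32):
    print(f"{name}: receptive field {rf}x{rf}" + ("  <- candidate unproductive layer" if flag else ""))
```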
Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning
Sébastien Lachapelle
Tristan Deleu
Divyat Mahajan
Quentin Bertrand
Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding is limited. In this work, we provide evidence that disentangled representations coupled with sparse base-predictors improve generalization. In the context of multi-task learning, we prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations. Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem. Finally, we explore a meta-learning version of this algorithm based on group Lasso multiclass SVM base-predictors, for which we derive a tractable dual formulation. It obtains competitive results on standard few-shot classification benchmarks, while each task uses only a fraction of the learned representations.
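The sparsity mechanism can be illustrated on a fixed representation: a linear multiclass predictor with a group-Lasso penalty that groups all weights attached to one latent dimension switches entire dimensions off, so each task ends up using only a subset of the representation. The sketch below uses a plain proximal-gradient step on a softmax loss as a stand-in; the paper's bi-level formulation and SVM dual are not reproduced here, and the data and dimension names are invented for illustration.

```python
# Hedged sketch: group-Lasso-sparse linear predictor on top of fixed latents.
# Each row of W (one latent dimension, all classes) forms a group, so the
# proximal step can set whole dimensions exactly to zero.
import numpy as np

def group_lasso_softmax(Z, y, n_classes, lam=0.1, lr=0.1, steps=500):
    n, d = Z.shape
    W = np.zeros((d, n_classes))
    Y = np.eye(n_classes)[y]                       # one-hot labels
    for _ in range(steps):
        logits = Z @ W
        logits -= logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        W -= lr * Z.T @ (P - Y) / n                # gradient step on the softmax loss
        norms = np.linalg.norm(W, axis=1, keepdims=True)          # one group per latent dim
        shrink = np.maximum(1 - lr * lam / np.maximum(norms, 1e-12), 0.0)
        W *= shrink                                # proximal step: rows (dims) go exactly to zero
    return W

rng = np.random.default_rng(0)
Z = rng.standard_normal((200, 20))                 # latents from some learned representation
y = (Z[:, 0] + 0.5 * Z[:, 3] > 0).astype(int)      # only dims 0 and 3 matter for this toy task
W = group_lasso_softmax(Z, y, n_classes=2, lam=0.05)
print("active latent dimensions:", np.flatnonzero(np.linalg.norm(W, axis=1) > 1e-8))
```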
Applied artificial intelligence in healthcare: Listening to the winds of change in a post-COVID-19 world
Arash Shaban-Nejad
Martin Michalowski
Simone Bianco
John S. Brownstein
Robert L Davis
Beyond Mahalanobis-Based Scores for Textual OOD Detection
Pierre Colombo
Eduardo Dadalto Câmara Gomes
Guillaume Staerman
Nathan Noiry
Towards Adaptive Cybersecurity for Green IoT
Talal Halabi
Martine Bellaiche
The Internet of Things (IoT) paradigm has led to an explosion in the number of IoT devices and an exponential rise in the carbon footprint incurred by overburdened IoT networks and pervasive cloud/edge communications. Hence, there is a growing interest in industry and academia to enable the efficient use of computing infrastructures by optimizing the management of data center and IoT resources (hardware, software, network, and data) and reducing operational costs to slash greenhouse gas emissions and create healthy environments. Cybersecurity has also been considered in such efforts as a contributor to these environmental issues. Nonetheless, most green security approaches focus on designing low-overhead encryption schemes and do not emphasize energy-efficient security from architectural and deployment viewpoints. This paper sheds light on the emerging paradigm of adaptive cybersecurity as one of the research directions to support sustainable computing in green IoT. It presents three potential research directions and their associated methods for designing and deploying adaptive security in green computing and resource-constrained IoT environments to save on energy consumption. Such efforts will transform the development of data-driven IoT security solutions to be greener and more environmentally friendly.