Publications
A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix
Continual learning (CL) is a setting in which an agent has to learn from an incoming stream of data during its entire lifetime. Although major advances have been made in the field, one recurring problem which remains unsolved is that of Catastrophic Forgetting (CF). While the issue has been extensively studied empirically, it has received little attention from a theoretical angle. In this paper, we show that the impact of CF increases as two tasks increasingly align. We introduce a measure of task similarity called the NTK overlap matrix which is at the core of CF. We analyze common projected gradient algorithms and demonstrate how they mitigate forgetting. Then, we propose a variant of Orthogonal Gradient Descent (OGD) which leverages the structure of the data through Principal Component Analysis (PCA). Experiments support our theoretical findings and show how our method reduces CF on classical CL datasets.
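The projection step behind plain Orthogonal Gradient Descent can be sketched as follows. This is a minimal illustration only, not the PCA-based variant proposed in the paper: the toy vectors and the helper names `orthonormal_basis` and `project_orthogonal` are assumptions made for the example. The idea is that gradients stored from earlier tasks span a subspace, and the update for the new task is stripped of any component lying in that subspace.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(vectors: List[Vector], tol: float = 1e-12) -> List[Vector]:
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis: List[Vector] = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = dot(w, u)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors already (numerically) in the span
            basis.append([wi / norm for wi in w])
    return basis


def project_orthogonal(grad: Vector, basis: List[Vector]) -> Vector:
    """Remove from grad every component along the orthonormal basis vectors."""
    g = list(grad)
    for u in basis:
        c = dot(g, u)
        g = [gi - c * ui for gi, ui in zip(g, u)]
    return g


# Hypothetical gradients stored while training on an earlier task.
old_task_grads = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
basis = orthonormal_basis(old_task_grads)

# Hypothetical gradient computed on the new task.
new_grad = [0.5, -2.0, 3.0]
update = project_orthogonal(new_grad, basis)
```

In this toy case only the third coordinate of the new gradient survives, since the first two directions are already occupied by the earlier task's gradients.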
The coronavirus disease 2019 (COVID-19) pandemic has quickly become a global public health crisis unseen in recent years. It is known that the structure of the human contact network plays an important role in the spread of transmissible diseases. In this work, we study a structure-aware model of COVID-19, CGEM. This model becomes similar to the classical compartment-based models in epidemiology if we assume the contact network is an Erdős-Rényi (ER) graph, i.e. everyone comes into contact with everyone else with the same probability. In contrast, CGEM is more expressive and allows for plugging in the actual contact networks, or more realistic proxies for them. Moreover, CGEM enables more precise modelling of enforcing and releasing different non-pharmaceutical intervention (NPI) strategies. Through a set of extensive experiments, we demonstrate significant differences between the epidemic curves when assuming different underlying structures. More specifically, we demonstrate that the compartment-based models overestimate the spread of the infection by a factor of 3 and, under some realistic assumptions on the compliance factor, underestimate the effectiveness of some NPIs, mischaracterize others (e.g. predicting a later peak), and underestimate the scale of the second peak after reopening.
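To make the role of the contact network concrete, the following is a minimal sketch of a discrete-time SIR process on an Erdős-Rényi graph. It is not CGEM itself: the function names, the per-contact infection probability `beta`, the recovery probability `gamma`, and all numeric values are assumptions chosen for illustration. Swapping `erdos_renyi` for a different graph generator is the kind of structural change the abstract describes.

```python
import random
from typing import Dict, List, Set, Tuple


def erdos_renyi(n: int, p: float, rng: random.Random) -> Dict[int, Set[int]]:
    """Undirected ER graph: each pair of nodes is linked with probability p."""
    neighbors: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def simulate_sir(
    neighbors: Dict[int, Set[int]],
    beta: float,
    gamma: float,
    seeds: List[int],
    steps: int,
    rng: random.Random,
) -> List[Tuple[int, int, int]]:
    """Return the (S, I, R) counts recorded at the start of each step."""
    state = {v: "S" for v in neighbors}
    for s in seeds:
        state[s] = "I"
    history: List[Tuple[int, int, int]] = []
    for _ in range(steps):
        values = list(state.values())
        history.append((values.count("S"), values.count("I"), values.count("R")))
        new_state = dict(state)
        for v, status in state.items():
            if status != "I":
                continue
            # Each infected node may infect each susceptible neighbor.
            for u in neighbors[v]:
                if state[u] == "S" and rng.random() < beta:
                    new_state[u] = "I"
            # The infected node recovers with probability gamma.
            if rng.random() < gamma:
                new_state[v] = "R"
        state = new_state
    return history


rng = random.Random(0)  # fixed seed so repeated runs give the same curve
n = 60
graph = erdos_renyi(n, p=0.1, rng=rng)
history = simulate_sir(graph, beta=0.2, gamma=0.1, seeds=[0], steps=30, rng=rng)
```

Each entry of `history` is one point on the epidemic curve, so curves produced under different network assumptions can be compared directly.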
Most of today's popular deep architectures are hand-engineered for general purpose applications. However, this design procedure usually leads to massively redundant, useless, or even harmful features for specific tasks. Such unnecessarily high complexities render deep nets impractical for many real-world applications, especially those without powerful GPU support. In this paper, we attempt to derive task-dependent compact models from a deep discriminant analysis perspective. We propose an iterative and proactive approach for classification tasks which alternates between (1) a pushing step, with an objective to simultaneously maximize class separation, penalize co-variances, and push deep discriminants into alignment with a compact set of neurons, and (2) a pruning step, which discards less useful or even interfering neurons. Deconvolution is adopted to reverse the effects of 'unimportant' filters and recover useful contributing sources. A simple network growing strategy based on the basic Inception module is proposed for challenging tasks requiring larger capacity than what the base net can offer. Experiments on the MNIST, CIFAR10, and ImageNet datasets demonstrate our approach's efficacy. On ImageNet, by pushing and pruning our grown Inception-88 model, we achieve better-performing models than smaller grown deep Inception nets, residual nets, and famous compact nets at similar sizes. We also show that our grown deep Inception nets (without hard-coded dimension alignment) can beat residual nets of similar complexities.
In this article, we investigate the optimal control of network-coupled subsystems with coupled dynamics and costs. The dynamics coupling may be represented by the adjacency matrix, the Laplacian matrix, or any other symmetric matrix corresponding to an underlying weighted undirected graph. Cost couplings are represented by two coupling matrices which have the same eigenvectors as the coupling matrix in the dynamics. We use the spectral decomposition of these three coupling matrices to decompose the overall system into
Amorphous molecular assemblies appear in a vast array of systems: from living cells to chemical plants and from everyday items to new devices. The absence of long-range order in amorphous materials implies that precise knowledge of their underlying structures throughout is needed to rationalize and control their properties at the mesoscale. Standard computational simulations suffer from exponentially unfavorable scaling of the required compute with system size. We present a method based on deep learning that leverages the finite range of structural correlations for an autoregressive generation of disordered molecular aggregates up to arbitrary size from small-scale computational or experimental samples. We benchmark performance on self-assembled nanoparticle aggregates and proceed to simulate monolayer amorphous carbon with atomistic resolution. This method bridges the gap between the nanoscale and mesoscale simulations of amorphous molecular systems.
SC-Flip (SCF) is a low-complexity polar code decoding algorithm with improved performance, and is an alternative to high-complexity (CRC)-aided SC-List (CA-SCL) decoding. However, the performance improvement of SCF is limited since it can correct up to only one channel error (