Publications
Predicting Visual Improvement After Macular Hole Surgery: A Combined Model Using Deep Learning and Clinical Features
Purpose: To assess the feasibility of deep learning (DL) methods for enhancing the prediction of visual acuity (VA) improvement after macular hole (MH) surgery, using a combined model of DL on high-definition optical coherence tomography (HD-OCT) B-scans and clinical features.

Methods: We trained a DL convolutional neural network (CNN) on pre-operative HD-OCT B-scans of the macula, combined with a logistic regression model of pre-operative clinical features, to predict a VA increase of ≥15 Early Treatment Diabetic Retinopathy Study (ETDRS) letters at 6 months post-vitrectomy in closed MHs. A total of 121 MHs with 242 HD-OCT B-scans and 484 clinical data points were used to train, validate, and test the model. Prediction of VA increase was evaluated using the area under the receiver operating characteristic curve (AUROC) and F1 scores. We also extracted the weight of each input feature in the hybrid model.

Results: All performances are reported on the held-out test set and match the results obtained with cross-validation. Using a regression on clinical features alone, the AUROC was 80.6, with an F1 score of 79.7. For the CNN, relying solely on the HD-OCT B-scans, the AUROC was 72.8 ± 14.6, with an F1 score of 61.5 ± 23.7. For the hybrid regression model combining clinical features with the CNN prediction, the AUROC was 81.9 ± 5.2, with an F1 score of 80.4 ± 7.7. In the hybrid model, baseline VA was the most important feature (weight = 59.1 ± 6.9%), while the weight of the HD-OCT prediction was 9.6 ± 4.2%.

Conclusions: Both the clinical-data and HD-OCT models can predict postoperative VA improvement in patients undergoing vitrectomy for an MH with good discriminative performance. Combining them into a hybrid model did not significantly improve performance.

Translational Relevance: OCT-based DL models can predict postoperative VA improvement following vitrectomy for MH, but fusing these models with clinical data may not improve predictive performance.
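To make the hybrid setup concrete, here is a minimal sketch in Python: a logistic regression over pre-operative clinical features with the CNN's predicted probability appended as one extra input. All data, feature names, and shapes below are synthetic placeholders, not the study's actual pipeline.

```python
# Minimal sketch of the hybrid model: logistic regression over
# clinical features plus the CNN's probability as one extra input.
# Everything here is synthetic and illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 121  # number of macular holes in the study

# Hypothetical clinical features: baseline VA, MH diameter, duration, age.
clinical = rng.normal(size=(n, 4))
# Stand-in for the CNN's predicted probability from HD-OCT B-scans.
cnn_prob = rng.uniform(size=(n, 1))
y = rng.integers(0, 2, size=n)  # 1 = gained >= 15 ETDRS letters

X = np.hstack([clinical, cnn_prob])
hybrid = LogisticRegression().fit(X, y)

# Relative feature weights, analogous to the per-feature importances
# reported above (e.g., baseline VA dominating).
w = np.abs(hybrid.coef_[0])
print(w / w.sum())
```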
Planning as Inference in Epidemiological Dynamics Models
Frank N. Wood
Andrew Warrington
Saeid Naderiparizi
Christian Dietrich Weilbach
Vaden Masrani
William Harvey
Adam Ścibior
Boyan Beronov
John Grefenstette
S. Ali Nasseri
Duncan Campbell
In this work we demonstrate how to automate parts of the infectious-disease-control policy-making process by performing inference in existing epidemiological models. The inference tasks undertaken include computing the posterior distribution over simulation-model parameters that are controllable via direct policy choices and that give rise to acceptable disease-progression outcomes. Among other things, we illustrate the use of a probabilistic programming language that automates inference in existing simulators. Neither the full capabilities of this tool for automating inference nor its utility for planning is widely disseminated at the current time. Timely gains in understanding how such simulation-based models and inference-automation tools can be applied in support of policy-making could lead to less economically damaging policy prescriptions, particularly during the current COVID-19 pandemic.
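As a toy illustration of the planning-as-inference idea, the sketch below runs rejection sampling over a controllable parameter of a simple SIR simulator, keeping only policy draws whose simulated epidemic stays under a capacity threshold. The simulator, threshold, and parameterization are assumptions for illustration, not the epidemiological models used in the paper.

```python
# Planning as inference, minimally: condition a toy SIR simulator on
# an acceptable outcome and read off the posterior over the policy.
import numpy as np

def sir(beta, gamma=0.1, i0=0.001, days=180):
    """Return the peak infected fraction of a discrete-time SIR run."""
    s, i, peak = 1.0 - i0, i0, i0
    for _ in range(days):
        new = beta * s * i
        s, i = s - new, i + new - gamma * i
        peak = max(peak, i)
    return peak

rng = np.random.default_rng(0)
capacity = 0.05  # assumed acceptable peak infected fraction

accepted = []
for _ in range(10_000):
    reduction = rng.uniform(0.0, 1.0)   # policy: fraction of contacts removed
    beta = 0.3 * (1.0 - reduction)      # effective transmission rate
    if sir(beta) < capacity:            # condition on an acceptable outcome
        accepted.append(reduction)

# Surviving samples approximate the posterior over acceptable policies.
print(f"acceptable contact reductions start near {np.quantile(accepted, 0.05):.2f}")
```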
The ability to integrate context, including perceptual and temporal cues, plays a pivotal role in grounding the meaning of a linguistic utterance. In order to measure to what extent current vision-and-language models master this ability, we devise a new multimodal challenge, Image Retrieval from Contextual Descriptions (ImageCoDe). In particular, models are tasked with retrieving the correct image from a set of 10 minimally contrastive candidates based on a contextual description. As such, each description contains only the details that help distinguish between images. Because of this, descriptions tend to be complex in terms of syntax and discourse and require drawing pragmatic inferences. Images are sourced from both static pictures and video frames. We benchmark several state-of-the-art models, including both cross-encoders such as ViLBERT and bi-encoders such as CLIP, on ImageCoDe. Our results reveal that these models dramatically lag behind human performance: the best variant achieves an accuracy of 20.9 on video frames and 59.4 on static pictures, compared with 90.8 in humans. Furthermore, we experiment with new model variants that are better equipped to incorporate visual and temporal context into their representations, and these achieve modest gains. Our hope is that ImageCoDe will foster progress in grounded language understanding by encouraging models to focus on fine-grained visual differences.
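For a concrete picture of the bi-encoder baseline, the sketch below scores one contextual description against ten candidate images with CLIP via the Hugging Face transformers API and picks the argmax. The random placeholder images and the example description are illustrative assumptions, not ImageCoDe data.

```python
# Bi-encoder retrieval sketch: rank 10 candidate images against one
# contextual description with CLIP. Dummy images stand in for a real
# minimally contrastive candidate set.
import numpy as np
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

description = "the man has just released the frisbee from his right hand"
candidates = [Image.fromarray(np.random.randint(0, 255, (224, 224, 3),
                                                dtype=np.uint8))
              for _ in range(10)]

inputs = processor(text=[description], images=candidates,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    sims = model(**inputs).logits_per_text   # shape (1, 10)
print("predicted image index:", sims.argmax().item())
```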
Probabilistic bits (p-bits) have recently been presented as a spin (basic computing element) for the simulated annealing (SA) of Ising models. In this brief, we introduce fast-converging SA based on p-bits designed using integral stochastic computing. The stochastic implementation approximates a p-bit function, which can search for a solution to a combinatorial optimization problem at lower energy than conventional p-bits. Searching around the global minimum energy can increase the probability of finding a solution. The proposed stochastic computing-based SA method is compared with conventional SA and quantum annealing (QA) with a D-Wave Two quantum annealer on the traveling salesman, maximum cut (MAX-CUT), and graph isomorphism (GI) problems. The proposed method achieves a convergence speed a few orders of magnitude faster while dealing with an order of magnitude larger number of spins than the other methods.
2022-03-28
IEEE Transactions on Neural Networks and Learning Systems (published)
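A minimal sketch of p-bit simulated annealing on a small random Ising instance: each p-bit update draws sign(tanh(β·I_i) − r) with r uniform on [−1, 1], where I_i is the local field and β is ramped up as the annealing schedule. The coupling matrix and schedule are illustrative; the paper's integral stochastic-computing implementation of this update is not reproduced.

```python
# p-bit simulated annealing on a toy Ising model.
import numpy as np

rng = np.random.default_rng(0)
n = 32
J = rng.normal(size=(n, n))
J = (J + J.T) / 2               # symmetric couplings
np.fill_diagonal(J, 0)

m = rng.choice([-1.0, 1.0], size=n)
for beta in np.linspace(0.1, 5.0, 2000):          # annealing schedule
    i = rng.integers(n)
    I = J[i] @ m                                  # local field on p-bit i
    # Stochastic p-bit update rule.
    m[i] = np.sign(np.tanh(beta * I) - rng.uniform(-1, 1))

print("final energy:", -0.5 * m @ J @ m)
```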
Diffusion-based generative models learn to iteratively transfer unstructured noise to a complex target distribution, as opposed to Generative Adversarial Networks (GANs) or the decoder of Variational Autoencoders (VAEs), which produce samples from the target distribution in a single step. Thus, in diffusion models every sample is naturally connected to a random trajectory which is a solution to a learned stochastic differential equation (SDE). Generative models are only concerned with the final state of this trajectory, which delivers samples from the desired distribution. Abstreiter et al. showed that these stochastic trajectories can be seen as continuous filters that wash out information along the way. Consequently, it is reasonable to ask whether there is an intermediate time step at which the preserved information is optimal for a given downstream task. In this work, we show that a combination of information content from different time steps gives a strictly better representation for the downstream task. We introduce attention- and recurrence-based modules that ``learn to mix'' information content of various time steps such that the resultant representation leads to superior performance in downstream tasks.
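A minimal sketch of the "learn to mix" idea: an attention-pooling layer assigns one weight per time-step representation and returns their weighted combination for a downstream head. The shapes and module are illustrative stand-ins for the paper's attention- and recurrence-based modules.

```python
# Attention pooling over per-time-step diffusion representations.
import torch
import torch.nn as nn

class TimeStepMixer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # one attention logit per time step

    def forward(self, reps):             # reps: (batch, n_steps, dim)
        attn = self.score(reps).softmax(dim=1)    # (batch, n_steps, 1)
        return (attn * reps).sum(dim=1)           # mixed representation

# Toy usage: mix 4 time-step representations of dimension 128.
reps = torch.randn(8, 4, 128)
mixed = TimeStepMixer(128)(reps)
print(mixed.shape)  # torch.Size([8, 128])
```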
We propose a new method for training a supervised source separation system that aims to learn the interdependent relationships between all combinations of sources in a mixture. Rather than independently estimating each source from a mix, we reframe the source separation problem as an Orderless Neural Autoregressive Density Estimator (NADE), and estimate each source from both the mix and a random subset of the other sources. We adapt a standard source separation architecture, Demucs, with additional inputs for each individual source, in addition to the input mixture. We randomly mask these input sources during training so that the network learns the conditional dependencies between the sources. By pairing this training method with a blocked Gibbs sampling procedure at inference time, we demonstrate that the network can iteratively improve its separation performance by conditioning a source estimate on its earlier source estimates. Experiments on two source separation datasets show that training a Demucs model with an Orderless NADE approach and using Gibbs sampling (up to 512 steps) at inference time strongly outperforms a Demucs baseline that uses a standard regression loss and direct (one step) estimation of sources.
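A sketch of the blocked Gibbs procedure at inference time: starting from direct estimates, each source is repeatedly re-estimated conditioned on the mixture and the current estimates of the other sources. The model call signature is an assumption standing in for the modified Demucs network with per-source conditioning inputs.

```python
# Blocked Gibbs refinement of source estimates (signature assumed).
import torch

def gibbs_separate(model, mix, n_sources=4, steps=512):
    # Initial pass: estimate each source with no conditioning sources.
    sources = [model(mix, conditioning=None, target=k)
               for k in range(n_sources)]
    for step in range(steps):
        k = step % n_sources                       # visit sources in turn
        others = [s for j, s in enumerate(sources) if j != k]
        # Re-estimate source k given the mix and the other estimates.
        sources[k] = model(mix, conditioning=others, target=k)
    return sources

# Dummy stand-in model so the sketch runs end to end.
dummy = lambda mix, conditioning, target: mix / 4
est = gibbs_separate(dummy, torch.randn(1, 2, 44100), steps=8)
print(len(est), est[0].shape)
```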
Taxonomies have been widely used in various domains to underpin numerous applications. In particular, product taxonomies serve an essential role in the e-commerce domain for recommendation, browsing, and query understanding. However, taxonomies need to constantly capture newly emerging terms and concepts on e-commerce platforms to stay up to date, which is expensive and labor-intensive if it relies on manual maintenance and updates. Therefore, we target the taxonomy expansion task, attaching new concepts to existing taxonomies automatically. In this paper, we present a self-supervised and user-behavior-oriented product taxonomy expansion framework to append new concepts to existing taxonomies. Our framework extracts hyponymy relations that conform to users' intentions and cognition. Specifically, (i) to fully exploit user behavioral information, we extract candidate hyponymy relations that match user interests from query-click concepts; (ii) to enhance the semantic information of new concepts and better detect hyponymy relations, we model concepts and relations through both user-generated content and structural information in existing taxonomies and user click logs, leveraging pre-trained language models and a graph neural network combined with contrastive learning; (iii) to reduce the cost of dataset construction and overcome data skews, we construct a high-quality and balanced training dataset from the existing taxonomy with no supervision. Extensive experiments on real-world product taxonomies from the Meituan platform, a leading Chinese vertical e-commerce platform for ordering take-out with more than 70 million daily active users, demonstrate the superiority of our proposed framework over state-of-the-art methods. Notably, our method enlarges real-world product taxonomies from 39,263 to 94,698 relations with 88% precision. Our implementation is available at https://github.com/AdaCheng/Product_Taxonomy_Expansion.
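As a simplified sketch of the attachment step, the code below embeds a new concept and candidate parent concepts with a pre-trained language model and picks the most similar parent. The model choice, concepts, and plain cosine scoring are illustrative assumptions; the paper's click-log features, graph neural network, and contrastive training are not reproduced.

```python
# Toy concept-attachment sketch: nearest parent by LM embedding.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
lm = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    batch = tok(texts, padding=True, return_tensors="pt")
    with torch.no_grad():
        out = lm(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (out * mask).sum(1) / mask.sum(1)     # mean pooling

new_concept = ["bubble tea"]                      # hypothetical new concept
parents = ["beverages", "desserts", "hot pot", "fast food"]
sims = torch.cosine_similarity(embed(new_concept), embed(parents))
print("attach under:", parents[sims.argmax()])
```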
In Multiple Sclerosis (MS), there is a large discrepancy between the clinical observations and how the pathology is exhibited on brain images; this is known as the clinical-radiological paradox. One hypothesis is that the clinical deficit may be more related to spinal cord damage than to the number or location of lesions in the brain. Therefore, investigating how the spinal cord is damaged becomes an acute challenge in better understanding and overcoming this paradox. Diffusion MRI is known to provide quantitative measures of neuronal degeneration and axonal loss, in the brain as well as in the spinal cord. In this paper, we propose to investigate how diffusion MRI metrics vary across the different cervical regions with the progression of the disease. We first study the reproducibility of diffusion MRI on healthy volunteers with a test-retest procedure using both standard diffusion tensor imaging (DTI) and multi-compartment Ball-and-Stick models. Then, based on the test-retest quantitative calibration, we provide quantitative figures of pathology evolution between M0 and M12 in the cervical spine on a set of 31 MS patients, exhibiting how the pathological damage spans the cervical spinal cord.
2022-03-28
2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI) (published)
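For concreteness, the sketch below computes one of the standard DTI metrics discussed above, fractional anisotropy (FA), from the eigenvalues of a 3×3 diffusion tensor. The example tensor is made up, not patient data.

```python
# Fractional anisotropy from a 3x3 diffusion tensor.
import numpy as np

def fractional_anisotropy(tensor):
    lam = np.linalg.eigvalsh(tensor)       # eigenvalues l1, l2, l3
    md = lam.mean()                        # mean diffusivity
    num = np.sqrt(((lam - md) ** 2).sum())
    den = np.sqrt((lam ** 2).sum())
    return np.sqrt(1.5) * num / den        # FA in [0, 1]

# A strongly anisotropic example tensor (diffusion mostly along one
# axis), as expected in coherent white-matter tracts.
D = np.diag([1.7e-3, 0.3e-3, 0.3e-3])
print(f"FA = {fractional_anisotropy(D):.2f}")   # ~0.80
```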
Forgetting is a normal process in healthy brains, and evidence suggests that the mammalian brain forgets more than is required based on limitations of mnemonic capacity. Episodic memories, in particular, are liable to be forgotten over time. Researchers have hypothesized that it may be beneficial for decision making to forget episodic memories over time. Reinforcement learning offers a normative framework in which to test such hypotheses. Here, we show that a reinforcement learning agent that uses an episodic memory cache to find rewards in maze environments can forget a large percentage of older memories without any performance impairment, provided it utilizes mnemonic representations that contain structural information about space. Moreover, we show that some forgetting can actually provide a benefit in performance compared to agents with unbounded memories. Our analyses of the agents show that forgetting reduces the influence of outdated information and infrequently visited states on the policies produced by the episodic control system. These results support the hypothesis that some degree of forgetting can be beneficial for decision making, which can help to explain why the brain forgets more than is required by capacity limitations.
2022-03-25
Frontiers in Computational Neuroscience (published)
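A minimal sketch of the forgetting mechanism: a bounded episodic cache that stores per-state value estimates and drops the least-recently-used entries once capacity is exceeded. The capacity, eviction rule, and toy usage are illustrative assumptions, not the paper's agent or representations.

```python
# Bounded episodic memory cache with LRU forgetting.
from collections import OrderedDict

class EpisodicCache:
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.values = OrderedDict()          # state -> estimated return

    def write(self, state, value):
        # Keep the best return seen for this state.
        self.values[state] = max(value, self.values.get(state, value))
        self.values.move_to_end(state)       # mark as recently used
        while len(self.values) > self.capacity:
            self.values.popitem(last=False)  # forget the oldest entry

    def read(self, state, default=0.0):
        if state in self.values:
            self.values.move_to_end(state)
        return self.values.get(state, default)

cache = EpisodicCache(capacity=2)
for s, v in [("A", 1.0), ("B", 0.5), ("C", 2.0)]:
    cache.write(s, v)
print(list(cache.values))   # ['B', 'C'] -- 'A' was forgotten
```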
Current deep learning approaches have shown good in-distribution performance but struggle in out-of-distribution settings. This is especially true in the case of tasks involving abstract relations like recognizing rules in sequences, as required in many intelligence tests. In contrast, our brains are remarkably flexible at such tasks, an attribute that is likely linked to anatomical constraints on computations. Inspired by this, recent work has explored how enforcing that relational representations remain distinct from sensory representations can help artificial systems. Building on this work, we further explore and formalize the advantages afforded by ``partitioned'' representations of relations and sensory details. We investigate inductive biases that ensure abstract relations are learned and represented distinctly from sensory data across several neural network architectures and show that they outperform existing architectures on out-of-distribution generalization for various relational tasks. These results show that partitioning relational representations from other information streams may be a simple way to augment existing network architectures' robustness when performing relational computations.
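A minimal sketch of a partitioned architecture: the relational head only ever receives pairwise similarities between sensory embeddings, never the embeddings themselves, so the learned rule cannot depend on sensory detail. The specific modules and shapes are illustrative assumptions, not the paper's architectures.

```python
# Partitioned relational network: the relation head sees similarities only.
import torch
import torch.nn as nn

class PartitionedRelationNet(nn.Module):
    def __init__(self, in_dim, emb_dim=64, n_classes=2):
        super().__init__()
        self.encoder = nn.Linear(in_dim, emb_dim)     # sensory stream
        self.relation_head = nn.Linear(3, n_classes)  # relational stream

    def forward(self, seq):                 # seq: (batch, 3, in_dim)
        z = self.encoder(seq)
        # Pairwise cosine similarities are the only relational input.
        sims = torch.stack([
            torch.cosine_similarity(z[:, 0], z[:, 1], dim=-1),
            torch.cosine_similarity(z[:, 1], z[:, 2], dim=-1),
            torch.cosine_similarity(z[:, 0], z[:, 2], dim=-1),
        ], dim=-1)
        return self.relation_head(sims)     # e.g., rule present vs absent

logits = PartitionedRelationNet(in_dim=32)(torch.randn(8, 3, 32))
print(logits.shape)  # torch.Size([8, 2])
```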
We propose INFERNO, a method to infer object-centric representations of visual scenes without annotations.
Our method decomposes a scene into multiple objects, with each object having a structured representation that disentangles its shape, appearance and pose.
Each object representation defines a localized neural radiance field used to generate 2D views of the scene through differentiable rendering.
Our model is subsequently trained by minimizing a reconstruction loss between inputs and corresponding rendered scenes.
We empirically show that INFERNO discovers objects in a scene without supervision.
We also validate the interpretability of the learned representations by manipulating inferred scenes and showing the corresponding effect in the rendered output.
Finally, we demonstrate the usefulness of our 3D object representations in a visual reasoning task using the CATER dataset.
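A minimal sketch of how several object-centric radiance fields can be composited along one camera ray: per-object densities add, colors are density-weighted, and the result goes through standard volume rendering. The random MLP fields and ray sampling are illustrative stand-ins for INFERNO's structured object representations and differentiable renderer.

```python
# Compositing per-object radiance fields along a single ray.
import torch
import torch.nn as nn

class ObjectField(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 4))

    def forward(self, x):                   # x: (n_pts, 3) sample points
        out = self.net(x)
        return out[:, :1].relu(), out[:, 1:].sigmoid()   # density, rgb

def render_ray(fields, pts, deltas):
    dens = torch.stack([f(pts)[0] for f in fields])   # (n_obj, n_pts, 1)
    cols = torch.stack([f(pts)[1] for f in fields])   # (n_obj, n_pts, 3)
    sigma = dens.sum(0)                               # densities add
    rgb = (dens * cols).sum(0) / (sigma + 1e-8)       # density-weighted color
    alpha = 1 - torch.exp(-sigma.squeeze(-1) * deltas)
    trans = torch.cumprod(torch.cat([torch.ones(1), 1 - alpha[:-1]]), dim=0)
    return (trans * alpha).unsqueeze(-1).mul(rgb).sum(0)  # pixel color

pts = torch.linspace(0, 1, 64).unsqueeze(-1).repeat(1, 3)  # one toy ray
pixel = render_ray([ObjectField(), ObjectField()], pts,
                   torch.full((64,), 1 / 64))
print(pixel)  # a single rendered RGB value
```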