We use cookies to analyze the browsing and usage of our website and to personalize your experience. You can disable these technologies at any time, but this may limit certain functionalities of the site. Read our Privacy Policy for more information.
Setting cookies
You can enable and disable the types of cookies you wish to accept. However certain choices you make could affect the services offered on our sites (e.g. suggestions, personalised ads, etc.).
Essential cookies
These cookies are necessary for the operation of the site and cannot be deactivated. (Still active)
Analytics cookies
Do you accept the use of cookies to measure the audience of our sites?
Multimedia Player
Do you accept the use of cookies to display and allow you to watch the video content hosted by our partners (YouTube, etc.)?
Publications
Nearest Neighbour Score Estimators for Diffusion Generative Models
Score function estimation is the cornerstone of both training and sampling from diffusion generative models. Despite this fact, the most com… (see more)monly used estimators are either biased neural network approximations or high variance Monte Carlo estimators based on the conditional score. We introduce a novel nearest neighbour score function estimator which utilizes multiple samples from the training set to dramatically decrease estimator variance. We leverage our low variance estimator in two compelling applications. Training consistency models with our estimator, we report a significant increase in both convergence speed and sample quality. In diffusion models, we show that our estimator can replace a learned network for probability-flow ODE integration, opening promising new avenues of future research. Code will be released upon paper acceptance.
2024-07-08
Proceedings of the 41st International Conference on Machine Learning (published)
Bayesian Persuasion is proposed as a tool for social media platforms to combat the spread of misinformation. Since platforms can use machine… (see more) learning to predict the popularity and misinformation features of to-be-shared posts, and users are largely motivated to share popular content, platforms can strategically signal this informational advantage to change user beliefs and persuade them not to share misinformation. We characterize the optimal signaling scheme with imperfect predictions as a linear program and give sufficient and necessary conditions on the classifier to ensure optimal platform utility is non-decreasing and continuous. Next, this interaction is considered under a performative model, wherein platform intervention affects the user's future behaviour. The convergence and stability of optimal signaling under this performative process are fully characterized. Lastly, we experimentally validate that our approach significantly reduces misinformation in both the single round and performative setting.
2024-07-08
Proceedings of the 41st International Conference on Machine Learning (published)
Ca2+ imaging methods are widely used for studying cellular activity in the brain, allowing detailed analysis of dynamic processes across var… (see more)ious scales. Enhanced by high-contrast optical microscopy and fluorescent Ca2+ sensors, this technique can be used to reveal localized Ca2+ fluctuations within neurons, including in sub-cellular compartments, such as the dendritic shaft or spines. Despite advances in Ca2+ sensors, the analysis of miniature Synaptic Calcium Transients (mSCTs), characterized by variability in morphology and low signal-to-noise ratios, remains challenging. Traditional threshold-based methods struggle with the detection and segmentation of these small, dynamic events. Deep learning (DL) approaches offer promising solutions but are limited by the need for large annotated datasets. Positive Unlabeled (PU) learning addresses this limitation by leveraging unlabeled instances to increase dataset size and enhance performance. This approach is particularly useful in the case of mSCTs that are scarce and small, associated with a very small proportion of the foreground pixels. PU learning significantly increases the effective size of the training dataset, improving model performance. Here, we present a PU learning-based strategy for detecting and segmenting mSCTs. We evaluate the performance of two 3D deep learning models, StarDist-3D and 3D U-Net, which are well established for the segmentation of small volumetric structures in microscopy datasets. By integrating PU learning, we enhance the 3D U-Net’s performance, demonstrating significant gains over traditional methods. This work pioneers the application of PU learning in Ca2+ imaging analysis, offering a robust framework for mSCT detection and segmentation. We also demonstrate how this quantitative analysis pipeline can be used for subsequent mSCTs feature analysis. We characterize morphological and kinetic changes of mSCTs associated with the application of chemical long-term potentiation (cLTP) stimulation in cultured rat hippocampal neurons. Our data-driven approach shows that a cLTP-inducing stimulus leads to the emergence of new active dendritic regions and differently affects mSCTs subtypes.
Property-driven AI-automated material discovery presents unique challenges owing to the complex nature of the chemical structural space and … (see more)computationally expensive simulations. For crystalline solids, the band gap is an important property for designing semiconductors and batteries. However, optimizing crystals for a target band gap is difficult and not well-explored. Reinforcement learning (RL) shows promise towards optimizing crystals, as it can freely explore the chemical space. However, it relies on regular band gap evaluations, which can only be accurately computed through expensive Density Functional Theory (DFT) simulations. In this study, we propose an active learning-inspired pipeline that combines RL and DFT simulations for optimizing crystal compositions given a target band gap. The pipeline includes an RL policy for predicting atom types and a band gap network that is fine-tuned with DFT data. Preliminary results indicate the need for furthering the state-of-the-art to address the inherent challenges of the problem.
The abundance of data has led to the emergence of a variety of optimization techniques that attempt to leverage available side information t… (see more)o provide more anticipative decisions. The wide range of methods and contexts of application have motivated the design of a universal unitless measure of performance known as the coefficient of prescriptiveness. This coefficient was designed to quantify both the quality of contextual decisions compared to a reference one and the prescriptive power of side information. To identify policies that maximize the former in a data-driven context, this paper introduces a distributionally robust contextual optimization model where the coefficient of prescriptiveness substitutes for the classical empirical risk minimization objective. We present a bisection algorithm to solve this model, which relies on solving a series of linear programs when the distributional ambiguity set has an appropriate nested form and polyhedral structure. Studying a contextual shortest path problem, we evaluate the robustness of the resulting policies against alternative methods when the out-of-sample dataset is subject to varying amounts of distribution shift.
2024-07-08
Proceedings of the 41st International Conference on Machine Learning (published)
This paper addresses the limitations of current satellite payload architectures, which are predominantly hardware-driven and lack the flexib… (see more)ility to adapt to increasing data demands and uneven traffic. To overcome these challenges, we present a novel architecture for future regenerative and programmable satellite payloads and utilize interconnected modem banks to promote higher scalability and flexibility. We formulate an optimization problem to efficiently manage traffic among these modem banks and balance the load. Additionally, we provide comparative numerical simulation results, considering end-to-end delay and packet loss analysis. The results illustrate that our proposed architecture maintains lower delays and packet loss even with higher traffic demands and smaller buffer sizes.
We introduce the first model-stealing attack that extracts precise,
nontrivial information from black-box production language models like … (see more)OpenAI's ChatGPT or Google's PaLM-2.
Specifically, our attack recovers the embedding projection layer (up to symmetries)
of a transformer model, given typical API access.
For under \\
2024-07-08
Proceedings of the 41st International Conference on Machine Learning (published)