Multi-Scale Representation Learning for Protein Fitness Prediction
Zuobai Zhang
Pascal Notin
Yining Huang
Aurelie Lozano
Vijil Chenthamarakshan
Debora Susan Marks
Payel Das
Designing novel functional proteins crucially depends on accurately modeling their fitness landscape. Given the limited availability of functional annotations from wet-lab experiments, previous methods have primarily relied on self-supervised models trained on vast, unlabeled protein sequence or structure datasets. While initial protein representation learning studies solely focused on either sequence or structural features, recent hybrid architectures have sought to merge these modalities to harness their respective strengths. However, these sequence-structure models have so far achieved only incremental improvements when compared to the leading sequence-only approaches, highlighting unresolved challenges in effectively leveraging these modalities together. Moreover, the function of certain proteins is highly dependent on the granular aspects of their surface topology, which have been overlooked by prior models. To address these limitations, we introduce the Sequence-Structure-Surface Fitness (**S3F**) model, a novel multimodal representation learning framework that integrates protein features across several scales. Our approach combines sequence representations from a protein language model with Geometric Vector Perceptron networks encoding protein backbone and detailed surface topology. The proposed method achieves state-of-the-art fitness prediction on the ProteinGym benchmark encompassing 217 substitution deep mutational scanning assays, and provides insights into the determinants of protein function. Our code is at https://github.com/DeepGraphLearning/S3F.
Normalization and effective learning rates in reinforcement learning
Clare Lyle
Zeyu Zheng
James Martens
Hado van Hasselt
Will Dabney
Offline Multitask Representation Learning for Reinforcement Learning
Haque Ishfaq
Thanh Nguyen-Tang
Songtao Feng
Raman Arora
Mengdi Wang
Ming Yin
Parseval Regularization for Continual Reinforcement Learning
Wesley Chung
Lynn Cherif
Periodic agent-state based Q-learning for POMDPs
Amit Sinha
Matthieu Geist
Predicting Future Actions of Reinforcement Learning Agents
Stephen Chung
Scott Niekum
QGFN: Controllable Greediness with Action Values
Elaine Lau
Stephen Zhewen Lu
Ling Pan
Generative Flow Networks (GFlowNets; GFNs) are a family of energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, consistently biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate,
RGFN: Synthesizable Molecular Generation Using GFlowNets
Michał Koziarski
Andrei Rekesh
Dmytro Shevchuk
Almer M. van der Sloot
Piotr Gaiński
Cheng-Hao Liu
Mike Tyers
Robert A. Batey
Self-Consuming Generative Models with Curated Data Provably Optimize Human Preferences
Damien Ferbach
Quentin Bertrand
The rapid progress in generative models has resulted in impressive leaps in generation quality, blurring the lines between synthetic and real data. Web-scale datasets are now prone to the inevitable contamination by synthetic data, directly impacting the training of future generated models. Already, some theoretical results on self-consuming generative models (a.k.a., iterative retraining) have emerged in the literature, showcasing that either model collapse or stability could be possible depending on the fraction of generated data used at each retraining step. However, in practice, synthetic data is often subject to human feedback and curated by users before being used and uploaded online. For instance, many interfaces of popular text-to-image generative models, such as Stable Diffusion or Midjourney, produce several variations of an image for a given query which can eventually be curated by the users. In this paper, we theoretically study the impact of data curation on iterated retraining of generative models and show that it can be seen as an implicit preference optimization mechanism. However, unlike standard preference optimization, the generative model does not have access to the reward function or negative samples needed for pairwise comparisons. Moreover, our study doesn't require access to the density function, only to samples. We prove that, if the data is curated according to a reward model, then the expected reward of the iterative retraining procedure is maximized. We further provide theoretical results on the stability of the retraining loop when using a positive fraction of real data at each step. Finally, we conduct illustrative experiments on both synthetic datasets and on CIFAR10 showing that such a procedure amplifies biases of the reward model.
Simplifying Constraint Inference with Inverse Reinforcement Learning
Adriana Hugessen
Harley Wiltzer
Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space
Leo Schwinn
David Dobre
Sophie Xhonneux
Stephan Günnemann
Current research in adversarial robustness of LLMs focuses on discrete input manipulations in the natural language space, which can be directly transferred to closed-source models. However, this approach neglects the steady progression of open-source models. As open-source models advance in capability, ensuring their safety also becomes increasingly imperative. Yet, attacks tailored to open-source LLMs that exploit full model access remain largely unexplored. We address this research gap and propose the embedding space attack, which directly attacks the continuous embedding representation of input tokens. We find that embedding space attacks circumvent model alignments and trigger harmful behaviors more efficiently than discrete attacks or model fine-tuning. Furthermore, we present a novel threat model in the context of unlearning and show that embedding space attacks can extract supposedly deleted information from unlearned LLMs across multiple datasets and models. Our findings highlight embedding space attacks as an important threat model in open-source LLMs. Trigger Warning: the appendix contains LLM-generated text with violence and harassment.
Source-Free Domain Adaptation for YOLO Object Detection
Simon Varailhon
Masih Aminbeidokhti
Eric Granger
Source-free domain adaptation (SFDA) is a challenging problem in object detection, where a pre-trained source model is adapted to a new target domain without using any source domain data for privacy and efficiency reasons. Most state-of-the-art SFDA methods for object detection have been proposed for Faster-RCNN, a detector that is known to have high computational complexity. This paper focuses on domain adaptation techniques for real-world vision systems, particularly for the YOLO family of single-shot detectors known for their fast baselines and practical applications. Our proposed SFDA method - Source-Free YOLO (SF-YOLO) - relies on a teacher-student framework in which the student receives images with a learned, target domain-specific augmentation, allowing the model to be trained with only unlabeled target data and without requiring feature alignment. A challenge with self-training using a mean-teacher architecture in the absence of labels is the rapid decline of accuracy due to noisy or drifting pseudo-labels. To address this issue, a teacher-to-student communication mechanism is introduced to help stabilize the training and reduce the reliance on annotated target data for model selection. Despite its simplicity, our approach is competitive with state-of-the-art detectors on several challenging benchmark datasets, even sometimes outperforming methods that use source data for adaptation.