Publications
Learning to combine top-down context and feed-forward representations under ambiguity with apical and basal dendrites
In the rapidly evolving landscape of software development, Large Language Models (LLMs) have emerged as powerful tools that can significantly impact the way software code is written, reviewed, and optimized, making them invaluable resources for programmers. They offer developers the ability to leverage pre-trained knowledge and tap into vast code repositories, enabling faster development cycles and reducing the time spent on repetitive or mundane coding tasks. However, while these models offer substantial benefits, their adoption also presents multiple challenges. For example, they might generate code snippets that are syntactically correct but functionally flawed, requiring human review and validation. Moreover, the ethical considerations surrounding these models, such as biases in the training data, should be carefully addressed to ensure fair and inclusive software development practices. This talk will provide an overview and reflection on some of these challenges, present some preliminary solutions, and discuss opportunities for predictive models and data analytics.
2023-12-08
International Conference on Predictive Models in Software Engineering (published)
The development of signal unmixing algorithms is essential for leveraging multimodal datasets acquired through a wide array of scientific imaging technologies, including hyperspectral or time-resolved acquisitions. In experimental physics, enhancing the spatio-temporal resolution or expanding the number of detection channels often leads to diminished sampling rate and signal-to-noise ratio (SNR), significantly affecting the efficacy of signal unmixing algorithms. We propose Latent Unmixing, a new approach which applies band-pass filters to the latent space of a multi-dimensional convolutional neural network to disentangle overlapping signal components. It enables better isolation and quantification of individual signal contributions, especially in the context of undersampled distributions. Using multi-dimensional convolution kernels to process all dimensions simultaneously enhances the network's ability to extract information from adjacent pixels and from adjacent time or spectral bins. This approach enables more effective separation of components in cases where individual pixels do not provide clear, well-resolved information. We showcase the method's practical use in experimental physics through two test cases that highlight the versatility of our approach: fluorescence lifetime microscopy and mode decomposition in optical fibers. The latent unmixing method extracts valuable information from complex signals that cannot be resolved by standard methods. It opens new possibilities in optics and photonics for multichannel separations at an increased sampling rate.
In the realm of antibody therapeutics development, increasing the binding affinity of an antibody to its target antigen is a crucial task. This paper presents GearBind, a pretrainable deep neural network designed to be effective for in silico affinity maturation. Leveraging multi-level geometric message passing alongside contrastive pretraining on protein structural data, GearBind capably models the complex interplay of atom-level interactions within protein complexes, surpassing previous state-of-the-art approaches on SKEMPI v2 in terms of Pearson correlation, mean absolute error (MAE) and root mean square error (RMSE). In silico experiments elucidate that pretraining helps GearBind become sensitive to mutation-induced binding affinity changes and reflective of amino acid substitution tendency. Using an ensemble model based on pretrained GearBind, we successfully optimize the affinity of CR3022 to the spike (S) protein of the SARS-CoV-2 Omicron strain. Our strategy yields a high success rate with up to 17-fold affinity increase. GearBind proves to be an effective tool in narrowing the search space for in vitro antibody affinity maturation, underscoring the utility of geometric deep learning and adept pre-training in macromolecule interaction modeling.
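For readers unfamiliar with the three evaluation metrics named in the abstract above, the following is a minimal sketch of how Pearson correlation, MAE and RMSE are commonly computed between predicted and measured values. The function names and the toy numbers are illustrative assumptions; this is not the GearBind evaluation code, and the values are not SKEMPI v2 data.

```python
import math
from typing import Sequence


def mae(pred: Sequence[float], true: Sequence[float]) -> float:
    """Mean absolute error between predictions and reference values."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def rmse(pred: Sequence[float], true: Sequence[float]) -> float:
    """Root mean square error between predictions and reference values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def pearson(pred: Sequence[float], true: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumes neither input is constant)."""
    n = len(true)
    mean_p = sum(pred) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    norm_t = math.sqrt(sum((t - mean_t) ** 2 for t in true))
    return cov / (norm_p * norm_t)


if __name__ == "__main__":
    # Toy values only (hypothetical predicted vs. measured affinity changes).
    predicted = [1.0, 2.0, 3.0]
    measured = [2.0, 4.0, 6.0]
    print(pearson(predicted, measured), mae(predicted, measured), rmse(predicted, measured))
```

Pearson correlation rewards getting the ranking and linear trend right, while MAE and RMSE penalize the size of the errors, which is why the toy example scores a perfect correlation despite a nonzero error.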
Finetuning language models with reinforcement learning (RL), e.g. from human feedback (HF), is a prominent method for alignment. But optimizing against a reward model can improve on reward while degrading performance in other areas, a phenomenon known as reward hacking, alignment tax, or language drift. First, we argue that commonly-used test metrics are insufficient and instead measure how different algorithms trade off between reward and drift. The standard method modifies the reward with a Kullback-Leibler (KL) penalty between the online and initial model. We propose Elastic Reset, a new algorithm that achieves higher reward with less drift without explicitly modifying the training objective. We periodically reset the online model to an exponential moving average (EMA) of itself, then reset the EMA model to the initial model. Through the use of an EMA, our model recovers quickly after resets and achieves higher reward with less drift in the same number of steps. We demonstrate that fine-tuning language models with Elastic Reset leads to state-of-the-art performance on a small-scale pivot-translation benchmark, outperforms all baselines in a medium-scale RLHF-like IMDB mock sentiment task and leads to a more performant and more aligned technical QA chatbot with LLaMA-7B. Code available at github.com/mnoukhov/elastic-reset.
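The reset schedule described in this abstract can be illustrated with a minimal sketch. It assumes model parameters are a flat list of floats and that `train_step` is a hypothetical stand-in for one RL update; the names `elastic_reset_train`, `reset_every` and `decay` are illustrative and are not taken from the authors' repository.

```python
from typing import Callable, List, Tuple

Params = List[float]


def ema_update(ema: Params, online: Params, decay: float) -> Params:
    """Move the EMA copy a fraction (1 - decay) of the way toward the online parameters."""
    return [decay * e + (1.0 - decay) * o for e, o in zip(ema, online)]


def elastic_reset_train(
    initial: Params,
    train_step: Callable[[Params], Params],  # hypothetical: one RL update step
    num_steps: int,
    reset_every: int,
    decay: float = 0.9,
) -> Tuple[Params, Params]:
    """Train with periodic resets: online <- EMA, then EMA <- initial."""
    online = list(initial)
    ema = list(initial)
    for step in range(1, num_steps + 1):
        online = train_step(online)            # ordinary training update
        ema = ema_update(ema, online, decay)   # track a smoothed copy
        if step % reset_every == 0:
            online = list(ema)                 # reset online model to its EMA
            ema = list(initial)                # reset EMA to the initial model
    return online, ema


if __name__ == "__main__":
    # Toy "training" that adds 1.0 to every parameter at each step.
    online, ema = elastic_reset_train(
        [0.0], lambda p: [x + 1.0 for x in p], num_steps=2, reset_every=2, decay=0.5
    )
    print(online, ema)
```

In this sketch the training objective itself is untouched; only the parameters are periodically pulled back, which matches the abstract's description of reducing drift without adding a penalty term to the reward.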
Parameter-efficient transfer learning (PETL) methods have emerged as a solid alternative to the standard full fine-tuning approach. They only train a few extra parameters for each downstream task, without sacrificing performance and dispensing with the issue of storing a copy of the pre-trained model for each task. For audio classification tasks, the Audio Spectrogram Transformer (AST) model shows impressive results. However, surprisingly, how to efficiently adapt it to several downstream tasks has not been tackled before. In this paper, we bridge this gap and present a detailed investigation of common PETL methods for the adaptation of the AST model to audio/speech tasks. Furthermore, we propose a new adapter design that exploits the convolution module of the Conformer model, leading to superior performance over the standard PETL approaches and surpassing or achieving performance parity with full fine-tuning by updating only 0.29% of the parameters. Finally, we provide ablation studies revealing that our proposed adapter: 1) proves to be effective in few-shot efficient transfer learning, 2) attains optimal results regardless of the amount of the allocated parameters, and 3) can be applied to other pre-trained models. Our code is available at https://github.com/umbertocappellazzo/PETL_AST.