Publications

In the rapidly evolving landscape of software development, Large Language Models (LLMs) have emerged as powerful tools that can significantly impact the way software code is written, reviewed, and optimized, making them invaluable resources for programmers. They offer developers the ability to leverage pre-trained knowledge and tap into vast code repositories, enabling faster development cycles and reducing the time spent on repetitive or mundane coding tasks. However, while these models offer substantial benefits, their adoption also presents multiple challenges. For example, they might generate code snippets that are syntactically correct but functionally flawed, requiring human review and validation. Moreover, the ethical considerations surrounding these models, such as biases in the training data, should be carefully addressed to ensure fair and inclusive software development practices. This talk will provide an overview and reflection on some of these challenges, present some preliminary solutions, and discuss opportunities for predictive models and data analytics.

2023-12-08

International Conference on Predictive Models in Software Engineering (published)

doi.org

Unmixing Optical Signals from Undersampled Volumetric Measurements by Filtering the Pixel Latent Variables

Catherine Bouchard

Andréanne Deschênes

Vincent Boulanger

Jean-Michel Bellavance

Julia Chabbert

Alexy Pelletier-Rioux

Flavie Lavoie-Cardinal

Christian Gagné

The development of signal unmixing algorithms is essential for leveraging multimodal datasets acquired through a wide array of scientific imaging technologies, including hyperspectral or time-resolved acquisitions. In experimental physics, enhancing the spatio-temporal resolution or expanding the number of detection channels often leads to diminished sampling rate and signal-to-noise ratio (SNR), significantly affecting the efficacy of signal unmixing algorithms. We propose Latent Unmixing, a new approach which applies band-pass filters to the latent space of a multi-dimensional convolutional neural network to disentangle overlapping signal components. It enables better isolation and quantification of individual signal contributions, especially in the context of undersampled distributions. Using multi-dimensional convolution kernels to process all dimensions simultaneously enhances the network's ability to extract information from adjacent pixels and from adjacent time or spectral bins. This approach enables more effective separation of components in cases where individual pixels do not provide clear, well-resolved information. We showcase the method's practical use in experimental physics through two test cases that highlight the versatility of our approach: fluorescence lifetime microscopy and mode decomposition in optical fibers. The Latent Unmixing method extracts valuable information from complex signals that cannot be resolved by standard methods. It opens new possibilities in optics and photonics for multichannel separations at an increased sampling rate.

2023-12-08

ArXiv (preprint)

arxiv.org

Pretrainable Geometric Graph Neural Network for Antibody Affinity Maturation

Huiyu Cai

Zuobai Zhang

Mingkai Wang

Bozitao Zhong

Yanling Wu

Tianlei Ying

Jian Tang

In the realm of antibody therapeutics development, increasing the binding affinity of an antibody to its target antigen is a crucial task. This paper presents GearBind, a pretrainable deep neural network designed to be effective for in silico affinity maturation. Leveraging multi-level geometric message passing alongside contrastive pretraining on protein structural data, GearBind capably models the complex interplay of atom-level interactions within protein complexes, surpassing previous state-of-the-art approaches on SKEMPI v2 in terms of Pearson correlation, mean absolute error (MAE) and root mean square error (RMSE). In silico experiments elucidate that pretraining helps GearBind become sensitive to mutation-induced binding affinity changes and reflective of amino acid substitution tendency. Using an ensemble model based on pretrained GearBind, we successfully optimize the affinity of CR3022 to the spike (S) protein of the SARS-CoV-2 Omicron strain. Our strategy yields a high success rate with up to 17-fold affinity increase. GearBind proves to be an effective tool in narrowing the search space for in vitro antibody affinity maturation, underscoring the utility of geometric deep learning and adept pretraining in macromolecule interaction modeling.
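The abstract above names three evaluation metrics: Pearson correlation, MAE, and RMSE. As a reminder of what they measure, here is a minimal sketch in plain Python. The function name `regression_metrics` and the toy numbers are made up for illustration; they are not taken from the paper or from SKEMPI v2.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return Pearson correlation, MAE and RMSE between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n

    # Pearson: covariance divided by the product of the standard deviations.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("Pearson correlation is undefined for a constant sequence")
    pearson = cov / math.sqrt(var_t * var_p)

    # MAE: mean absolute difference. RMSE: square root of the mean squared difference.
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return {"pearson": pearson, "mae": mae, "rmse": rmse}


# Toy values (hypothetical): predictions are offset by a constant 1.0.
metrics = regression_metrics([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

In the toy call, a constant offset leaves the correlation perfect while both error metrics equal the offset, which is why correlation and error are usually reported together.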

2023-12-07

bioRxiv (preprint)

doi.org

Language Model Alignment with Elastic Reset

Michael Noukhovitch

Samuel Lavoie

Florian Strub

Aaron Courville

Finetuning language models with reinforcement learning (RL), e.g. from human feedback (HF), is a prominent method for alignment. But optimizing against a reward model can improve on reward while degrading performance in other areas, a phenomenon known as reward hacking, alignment tax, or language drift. First, we argue that commonly used test metrics are insufficient and instead measure how different algorithms trade off reward against drift. The standard method modifies the reward with a Kullback-Leibler (KL) penalty between the online and initial model. We propose Elastic Reset, a new algorithm that achieves higher reward with less drift without explicitly modifying the training objective. We periodically reset the online model to an exponential moving average (EMA) of itself, then reset the EMA model to the initial model. Through the use of an EMA, our model recovers quickly after resets and achieves higher reward with less drift in the same number of steps. We demonstrate that fine-tuning language models with Elastic Reset leads to state-of-the-art performance on a small-scale pivot-translation benchmark, outperforms all baselines in a medium-scale RLHF-like IMDB mock sentiment task, and leads to a more performant and more aligned technical QA chatbot with LLaMA-7B. Code available at github.com/mnoukhov/elastic-reset.
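The reset schedule described in the abstract above can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation (see their repository for that): parameters are a flat list of floats, and `step_fn`, `reset_every`, and `decay` are hypothetical names and values chosen for the example.

```python
from typing import Callable, List, Tuple

Params = List[float]


def ema_update(ema: Params, online: Params, decay: float) -> Params:
    """Move the EMA parameters a fraction (1 - decay) toward the online parameters."""
    return [decay * e + (1.0 - decay) * o for e, o in zip(ema, online)]


def elastic_reset_train(
    initial: Params,
    step_fn: Callable[[Params], Params],  # hypothetical: one RL update on the online model
    num_steps: int,
    reset_every: int,  # assumed reset period, in training steps
    decay: float = 0.5,  # assumed EMA decay, for illustration only
) -> Tuple[Params, Params]:
    online = list(initial)
    ema = list(initial)
    for step in range(1, num_steps + 1):
        online = step_fn(online)
        ema = ema_update(ema, online, decay)
        if step % reset_every == 0:
            online = list(ema)   # reset the online model to its EMA
            ema = list(initial)  # then reset the EMA model to the initial model
    return online, ema


# Toy usage: each "training step" simply adds 1.0 to every parameter.
final_online, final_ema = elastic_reset_train(
    initial=[0.0],
    step_fn=lambda params: [p + 1.0 for p in params],
    num_steps=2,
    reset_every=2,
)
```

In the toy run, the online parameter reaches 2.0 before the reset and is pulled back to the EMA value of 1.25, while the EMA itself returns to the initial 0.0. Note that the training objective is never modified, which matches the abstract's contrast with the KL-penalty approach.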

2023-12-06

ArXiv (preprint)

doi.org

arxiv.org

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers

Umberto Cappellazzo

Daniele Falavigna

Alessio Brutti

Mirco Ravanelli

Parameter-efficient transfer learning (PETL) methods have emerged as a solid alternative to the standard full fine-tuning approach. They only train a few extra parameters for each downstream task, without sacrificing performance, and they dispense with the need to store a copy of the pre-trained model for each task. For audio classification tasks, the Audio Spectrogram Transformer (AST) model shows impressive results. However, surprisingly, how to efficiently adapt it to several downstream tasks has not been tackled before. In this paper, we bridge this gap and present a detailed investigation of common PETL methods for the adaptation of the AST model to audio/speech tasks. Furthermore, we propose a new adapter design that exploits the convolution module of the Conformer model, leading to superior performance over the standard PETL approaches and surpassing or achieving performance parity with full fine-tuning by updating only 0.29% of the parameters. Finally, we provide ablation studies revealing that our proposed adapter: 1) proves to be effective in few-shot efficient transfer learning, 2) attains optimal results regardless of the amount of allocated parameters, and 3) can be applied to other pre-trained models. Our code is available at https://github.com/umbertocappellazzo/PETL_AST.

2023-12-06

ArXiv (preprint)

doi.org

arxiv.org

Bug characterization in machine learning-based systems

Mohammad Mehdi Morovati

Amin Nikanjam

Florian Tambon

Foutse Khomh

Z. Jiang

2023-12-05

Empirical Software Engineering (published)

doi.org

arxiv.org

Deep Neural Networks pruning via the Structured Perspective Regularization

Matteo Cacciola

Antonio Frangioni

Xinlin Li

Andrea Lodi

2023-12-05

SIAM Journal on Mathematics of Data Science (published)

doi.org

arxiv.org
