Publications
LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression
Low Rank Decomposition of a matrix (splitting a large matrix into a product of two smaller matrices) offers a means of compression that reduces the parameters of a model without sparsification, and hence delivers more speedup on modern hardware. Moreover, unlike quantization, the compressed linear layers remain fully differentiable and all the parameters trainable, while being able to leverage the existing highly efficient kernels over floating point matrices. We study the potential to compress Large Language Models (LLMs) for monolingual code generation via Low Rank Decomposition (LoRD) and observe that ranks for the linear layers in these models can be reduced by up to 39.58% with less than 1% increase in perplexity. We then use LoRD to compress StarCoder 16B to 13.2B parameters with no drop and to 12.3B with minimal drop in HumanEval Pass@1 score, in less than 10 minutes on a single A100. The compressed models speed up inference by up to 22.35% with just a single line of change in code over Hugging Face's implementation with the PyTorch backend. LoRD models remain compatible with state-of-the-art near-lossless quantization methods such as SpQR, which allows leveraging the further compression gains of quantization. Lastly, QLoRA over a LoRD model further reduces memory requirements by as much as 21.2% over vanilla QLoRA while offering similar gains from parameter-efficient fine-tuning. Our work shows LoRD as a promising new paradigm for LLM compression.
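To make the parameter saving in the abstract above concrete, here is a minimal sketch of the counting argument only, not the authors' compression procedure. A dense layer of shape d_out x d_in stores d_out * d_in weights, while a rank-r factorization stores r * (d_out + d_in). The layer shape (4096 x 1024) and the helper names are hypothetical, chosen for illustration; only the 39.58% rank reduction comes from the abstract.

```python
def dense_params(d_out: int, d_in: int) -> int:
    """Number of weights in a dense d_out x d_in linear layer (bias ignored)."""
    return d_out * d_in


def low_rank_params(d_out: int, d_in: int, rank: int) -> int:
    """Weights in the factorization W ~ A @ B, with A: d_out x rank and B: rank x d_in."""
    return rank * (d_out + d_in)


def break_even_rank(d_out: int, d_in: int) -> int:
    """Largest rank for which the factorization has strictly fewer weights than the dense layer."""
    return (d_out * d_in - 1) // (d_out + d_in)


# Hypothetical layer shape (assumption, not taken from the paper).
d_out, d_in = 4096, 1024
full_rank = min(d_out, d_in)

# Apply the 39.58% rank reduction quoted in the abstract to this hypothetical layer.
reduced_rank = round(full_rank * (1 - 0.3958))

dense = dense_params(d_out, d_in)
factored = low_rank_params(d_out, d_in, reduced_rank)

print("full rank:", full_rank, "reduced rank:", reduced_rank)
print("dense weights:", dense, "factored weights:", factored)
print("fraction of dense weights kept:", factored / dense)
print("break-even rank:", break_even_rank(d_out, d_in))
```

Note that for a square layer the rank must fall below half the dimension before the factorization saves any weights, so the benefit of a given rank reduction depends on the layer's shape.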
Public health measures were among the most polarizing topics debated online during the COVID-19 pandemic. Much of the discussion surrounded specific events, such as when and which particular interventions came into practice. In this work, we develop and apply an approach to measure subnational and event-driven variation of partisan polarization and explore how these dynamics varied both across and within countries. We apply our measure to a dataset of over 50 million tweets posted during late 2020, a salient period of polarizing discourse in the early phase of the pandemic. In particular, we examine regional variations in both the United States and Canada, focusing on three specific health interventions: lockdowns, masks, and vaccines. We find that more politically conservative regions had higher levels of partisan polarization in both countries, especially in the US, where a strong negative correlation exists between regional vaccination rates and the degree of polarization in vaccine-related discussions. We then analyze the timing, context, and profile of spikes in polarization, linking them to specific events discussed on social media across different regions in both countries. These spikes typically last only a few days, suggesting that online discussions reflect and could even drive changes in public opinion, which in the context of pandemic response impacts public health outcomes across different regions and over time.
Despite significant progress, Vision-Language Models (VLMs) still struggle with hallucinations, especially in long-form responses. Existing strategies have had limited success in specific cases, and long-form generation remains problematic.
In this work, we attempt to establish the link between the data used to train the model and the hallucinations in the model's output.
To this end, we examine hallucinations through data corruption. We develop a method to corrupt training data and then train models on this data to see the effect on performance. We show that corrupting only a small portion of the long-form training data significantly impairs the performance of the model on long-form tasks, while leaving simpler tasks like visual question answering and multiple choice relatively intact. All training code and models are released for reproducibility and future research.
Ternary LLMs offer significantly better performance for their size (measured in bits) than models trained and deployed in FP16/BF16. Given the widespread usage of quantization before deployment and advancements in Post Training Quantization of LLMs, a pivotal question arises: do ternary LLMs indeed provide any discernible benefits? To address this, we first build an open family of pre-trained ternary Large Language Models (TriLM). Additionally, we include their counterparts pre-trained in FP16 (FloatLM) and quantized versions of FloatLM (QuantLM), with parameter counts spanning almost two orders of magnitude, from 99M to 3.9B. We demonstrate that TriLMs with 3B+ parameters start to offer competitive performance compared to FloatLMs with the same parameter count, while providing significantly better performance for their size. Specifically, TriLM 3.9B, with fewer bits than FloatLM 830M, ranks between FloatLM 2.4B and FloatLM 3.9B when averaged across 6 popular commonsense and reasoning benchmarks. TriLMs also outperform quantized models, with TriLM 3.9B surpassing the larger QuantLM-3bit 3.9B. Furthermore, across knowledge-based benchmarks, TriLM remains superior for its size but lags for its parameter count: TriLM 3.9B falls halfway between FloatLM 1.5B and 2.4B, close to QuantLM-4bit 2.4B. To advance research on ternary LMs, we open source over 500 checkpoints across the model families.
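The "size measured in bits" comparison in the abstract above can be approximated with a back-of-the-envelope calculation. This sketch assumes every weight of the ternary model costs log2(3) bits (the information content of a three-valued weight) and every weight of the FP16 model costs 16 bits. It ignores any layers or scale factors a real ternary model may keep in higher precision, so it is an idealized lower bound rather than the paper's own accounting. The function name is hypothetical.

```python
import math


def model_size_bits(num_params: float, bits_per_param: float) -> float:
    """Idealized model size: parameter count times bits per parameter."""
    return num_params * bits_per_param


# Assumption: a ternary weight takes one of three values, so it carries log2(3) bits.
TERNARY_BITS = math.log2(3)
FP16_BITS = 16

# Parameter counts quoted in the abstract.
trilm_bits = model_size_bits(3.9e9, TERNARY_BITS)
floatlm_bits = model_size_bits(830e6, FP16_BITS)

print(f"TriLM 3.9B   (idealized): {trilm_bits / 8e9:.2f} GB")
print(f"FloatLM 830M (FP16):      {floatlm_bits / 8e9:.2f} GB")
print("TriLM smaller in bits:", trilm_bits < floatlm_bits)
```

Under these assumptions the 3.9B ternary model needs fewer bits than the 830M FP16 model, in line with the comparison made in the abstract.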
Machine learning models often struggle with distribution shifts in real-world scenarios, whereas humans exhibit robust adaptation. Models that better align with human perception may achieve higher out-of-distribution generalization. In this study, we investigate how various characteristics of large-scale computer vision models influence their alignment with human capabilities and robustness. Our findings indicate that increasing model and data size, along with incorporating rich semantic information and multiple modalities, significantly enhances models' alignment with human perception and their overall robustness. Our empirical analysis demonstrates a strong correlation between out-of-distribution accuracy and human alignment.
Precise identification of spinal nerve rootlets is relevant to delineate spinal levels for the study of functional activity in the spinal cord. The goal of this study was to develop an automatic method for the semantic segmentation of spinal nerve rootlets from T2-weighted magnetic resonance imaging (MRI) scans. Images from two open-access MRI datasets were used to train a 3D multi-class convolutional neural network using an active learning approach to segment C2-C8 dorsal nerve rootlets. Each output class corresponds to a spinal level. The method was tested on 3T T2-weighted images from datasets unseen during training to assess inter-site, inter-session, and inter-resolution variability. The test Dice score was 0.67 ± 0.16 (mean ± standard deviation across rootlet levels), suggesting good performance. The method also demonstrated low inter-vendor and inter-site variability (coefficient of variation = 1.41%), as well as low inter-session variability (coefficient of variation = 1.30%), indicating stable predictions across different MRI