Publications

Context is Key: A Benchmark for Forecasting with Essential Textual Information

Andrew Robert Williams

Arjun Ashok

Étienne Marcotte

Valentina Zantedeschi

Jithendaraa Subramanian

Alexandre Lacoste

Forecasting is a critical task in decision making across various domains. While numerical data provides a foundation, it often lacks crucial context necessary for accurate predictions. Human forecasters frequently rely on additional information, such as background knowledge or constraints, which can be efficiently communicated through natural language. However, the ability of existing forecasting models to effectively integrate this textual information remains an open question. To address this, we introduce "Context is Key" (CiK), a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context, requiring models to integrate both modalities. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters, and propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark. Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings. By presenting this benchmark, we aim to advance multimodal forecasting, promoting models that are both accurate and accessible to decision-makers with varied technical expertise. The benchmark can be visualized at https://servicenow.github.io/context-is-key-forecasting/v0/.

2024-10-24

arXiv (preprint)

doi.org

arxiv.org

ConvNTC: Convolutional neural tensor completion for predicting the disease-related miRNA pairs and cell-related drug pairs

Pei Liu

Xiao Liang

Yue Li

Jiawei Luo

2024-10-24

bioRxiv (preprint)

doi.org

From Efficiency to Equity: Measuring Fairness in Preference Learning

S. Gowaikar

Hugo Berard

Rashid A. Mushkani

Shin (Alexandre) Koseki

As AI systems, particularly generative models, increasingly influence decision-making, ensuring that they are able to fairly represent diverse human preferences becomes crucial. This paper introduces a novel framework for evaluating epistemic fairness in preference learning models inspired by economic theories of inequality and Rawlsian justice. We propose metrics adapted from the Gini Coefficient, Atkinson Index, and Kuznets Ratio to quantify fairness in these models. We validate our approach using two datasets: a custom visual preference dataset (AI-EDI-Space) and the Jester Jokes dataset. Our analysis reveals variations in model performance across users, highlighting potential epistemic injustices. We explore pre-processing and in-processing techniques to mitigate these inequalities, demonstrating a complex relationship between model efficiency and fairness. This work contributes to AI ethics by providing a framework for evaluating and improving epistemic fairness in preference learning models, offering insights for developing more inclusive AI systems in contexts where diverse human preferences are crucial.

2024-10-24

arXiv (preprint)

doi.org

arxiv.org
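As a rough illustration of the inequality metrics named in the abstract above, the following minimal sketch computes the textbook Gini coefficient over per-user model accuracy. It is not the authors' adapted metric: the function name `gini` and the accuracy values are hypothetical, chosen only to show how unequal performance across users yields a higher score.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Textbook Gini coefficient of non-negative values.

    Returns 0.0 when all values are equal; larger values mean the
    total is concentrated in fewer entries.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-user accuracies of a preference model (illustrative only).
equal_users = [0.8, 0.8, 0.8, 0.8]
skewed_users = [0.95, 0.90, 0.40, 0.30]

print(f"equal:  {gini(equal_users):.3f}")
print(f"skewed: {gini(skewed_users):.3f}")
```

Under this sketch, a model that serves every user equally well scores near zero, while one whose accuracy is concentrated in a few users scores higher.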

Structure Language Models for Protein Conformation Generation

Stephen Zhewen Lu

Hongyu Guo

Proteins adopt multiple structural conformations to perform their diverse biological functions, and understanding these conformations is crucial for advancing drug discovery. Traditional physics-based simulation methods often struggle with sampling equilibrium conformations and are computationally expensive. Recently, deep generative models have shown promise in generating protein conformations as a more efficient alternative. However, these methods predominantly rely on the diffusion process within a 3D geometric space, which typically centers around the vicinity of metastable states and is often inefficient in terms of runtime. In this paper, we introduce Structure Language Modeling (SLM) as a novel framework for efficient protein conformation generation. Specifically, the protein structures are first encoded into a compact latent space using a discrete variational auto-encoder, followed by conditional language modeling that effectively captures sequence-specific conformation distributions. This enables a more efficient and interpretable exploration of diverse ensemble modes compared to existing methods. Based on this general framework, we instantiate SLM with various popular LM architectures and propose ESMDiff, a novel BERT-like structure language model fine-tuned from ESM3 with masked diffusion. We verify our approach in various scenarios, including the equilibrium dynamics of BPTI, conformational change pairs, and intrinsically disordered proteins. SLM provides a highly efficient solution, offering a 20-100x speedup over existing methods in generating diverse conformations, shedding light on promising avenues for future research.

2024-10-24

arXiv (preprint)

doi.org

arxiv.org

The Roles of Neural Networks in Language Acquisition

Eva Portelance

Masoud Jasbi

How can modern neural networks like language models be useful to the field of language acquisition, and more broadly cognitive science, if they are not a priori designed to be cognitive models? As developments towards natural language understanding and generation have improved by leaps and bounds, with models like GPT-4, the question of how they can inform our understanding of human language acquisition has re-emerged. As such, it is critical to examine how, in practice, linking hypotheses between models and human learners can be safely established. To address these questions, we propose a model taxonomy, including four modelling approaches, each having differing goals, from exploratory hypothesis generation to hypothesis differentiation and testing. We show how the goals of these approaches align with the overarching goals of science and linguistics by connecting our taxonomy to the realist versus instrumentalist approaches in philosophy of science. We survey recent work that has adopted each of our modelling approaches and address the importance of computational modelling in language acquisition studies.

2024-10-24

Language and Linguistics Compass (published)

doi.org

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models

Shengyi Huang

The dominant paradigm for RLHF is online and on-policy RL: synchronously generating from the large language model (LLM) policy, labelling with a reward model, and learning using feedback on the LLM's own outputs. While performant, this paradigm is computationally inefficient. Inspired by classical deep RL literature, we propose separating generation and learning in RLHF. This enables asynchronous generation of new samples while simultaneously training on old samples, leading to faster training and more compute-optimal scaling. However, asynchronous training relies on an underexplored regime, online but off-policy RLHF: learning on samples from previous iterations of our model. To understand the challenges in this regime, we investigate a fundamental question: how much off-policyness can we tolerate for asynchronous training to speed up learning but maintain performance? Among several RLHF algorithms we tested, we find that online DPO is most robust to off-policy data, and robustness increases with the scale of the policy model. We study further compute optimizations for asynchronous RLHF but find that they come at a performance cost, giving rise to a trade-off. Finally, we verify the scalability of asynchronous RLHF by training LLaMA 3.1 8B on an instruction-following task 40% faster than a synchronous run while matching final performance.

2024-10-23

arXiv (preprint)

doi.org

arxiv.org

Minimally Invasive Morphology Adaptation via Parameter Efficient Fine-Tuning

Michael Przystupa

Hongyao Tang

Mariano Phielipp

Santiago Miret

Martin Jägersand

Glen Berseth

Learning reinforcement learning policies to control individual robots is often computationally non-economical because minor variations in robot morphology (e.g. dynamics or number of limbs) can negatively impact policy performance. This limitation has motivated morphology-agnostic policy learning, in which a monolithic deep learning policy learns to generalize between robotic morphologies. Unfortunately, these policies still have sub-optimal zero-shot performance compared to end-to-end finetuning on target morphologies. This limitation has ramifications in practical robotic applications, as online finetuning of large neural networks can require immense computation. In this work, we investigate parameter efficient finetuning (PEFT) techniques to specialize morphology-agnostic policies to a target robot while minimizing the number of learnable parameters adapted during online learning. We compare direct finetuning, which updates subsets of the base model parameters, and input-learnable approaches, which add additional parameters to manipulate inputs passed to the base model. Our analysis concludes that tuning relatively few parameters (0.01% of the base model) can measurably improve policy performance over zero-shot. These results serve a prescriptive purpose for future research, indicating the scenarios in which certain PEFT approaches are best suited for adapting policies to new robotic morphologies.

2024-10-23

corl.org/2024/Workshop/MAPoDeL (published)

openreview.net

Modulation of leg trajectory by transcranial magnetic stimulation during walking

H. Bourgeois

Rose Guay-Hottin

E.-M. Meftah

M. Martinez

Marco Bonizzato

D. Barthélemy

The primary motor cortex is involved in initiation and adaptive control of locomotion. However, the role of the motor cortex in controlling gait trajectories remains unclear. In animals, cortical neuromodulation allows for precise control of step height. We hypothesized that a similar control framework applies to humans, whereby cortical stimulation would primarily increase foot elevation. Transcranial magnetic stimulation (TMS) was applied over the motor cortex to assess the involvement of the corticospinal tract over the limb trajectory during human walking. Eight healthy adults (aged 20-32 years) participated in treadmill walking at 1.5 km/h. TMS was applied over the left motor cortex at an intensity of 120% of the threshold to elicit a dorsiflexion of the right ankle during the swing phase of gait. Electromyographic (EMG) measurements and three-dimensional (3D) lower limb kinematics were collected. When delivered during the early swing phase, TMS led to a significant increase in the maximum height of the right toe by a mean of 40.7% ± 14.9% (25.6 mm ± 9.4 mm, p = 0.0352) and knee height by 57.8% ± 16.8% (32 mm ± 9.3 mm, p = 0.008) across participants. These findings indicate that TMS can influence limb trajectory during walking, highlighting its potential as a tool for studying cortical control of locomotion.

2024-10-23

bioRxiv (preprint)

doi.org

Multilingual Hallucination Gaps in Large Language Models

Cléa Chataigner

Afaf Taïk

Golnoosh Farnadi

Large language models (LLMs) are increasingly used as alternatives to traditional search engines given their capacity to generate text that resembles human language. However, this shift is concerning, as LLMs often generate hallucinations, misleading or false information that appears highly credible. In this study, we explore the phenomenon of hallucinations across multiple languages in freeform text generation, focusing on what we call multilingual hallucination gaps. These gaps reflect differences in the frequency of hallucinated answers depending on the prompt and language used. To quantify such hallucinations, we used the FactScore metric and extended its framework to a multilingual setting. We conducted experiments using LLMs from the LLaMA, Qwen, and Aya families, generating biographies in 19 languages and comparing the results to Wikipedia pages. Our results reveal variations in hallucination rates, especially between high- and low-resource languages, raising important questions about LLM multilingual performance and the challenges in evaluating hallucinations in multilingual freeform text generation.

2024-10-23

arXiv (preprint)

doi.org

arxiv.org
