Publications

Galileo: Learning Global and Local Features of Many Remote Sensing Modalities
Anthony Fuller
Henry Herzog
Patrick Beukema
Favyen Bastani
James R Green
Evan Shelhamer
Hannah Kerner
We introduce a highly multimodal transformer to represent many remote sensing modalities - multispectral optical, synthetic aperture radar, elevation, weather, pseudo-labels, and more - across space and time. These inputs are useful for diverse remote sensing tasks, such as crop mapping and flood detection. However, learning shared representations of remote sensing data is challenging, given the diversity of relevant data modalities, and because objects of interest vary massively in scale, from small boats (1-2 pixels and fast) to glaciers (thousands of pixels and slow). We present a novel self-supervised learning algorithm that extracts multi-scale features across a flexible set of input modalities through masked modeling. Our dual global and local contrastive losses differ in their targets (deep representations vs. shallow input projections) and masking strategies (structured vs. not). Our Galileo is a single generalist model that outperforms SoTA specialist models for satellite images and pixel time series across eleven benchmarks and multiple tasks.
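To make the dual-loss recipe concrete, here is a minimal PyTorch sketch of masked modeling with a global loss (structured masking, deep teacher representations as targets) and a local loss (unstructured masking, shallow input projections as targets). The encoder, masking ratios, and InfoNCE formulation are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Stand-in transformer encoder over (space-time) tokens."""
    def __init__(self, dim=128, depth=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, tokens):
        return self.blocks(tokens)

def info_nce(pred, target, temp=0.1):
    """Contrastive loss matching each predicted token to its target token."""
    pred = F.normalize(pred.flatten(0, 1), dim=-1)
    target = F.normalize(target.flatten(0, 1), dim=-1)
    logits = pred @ target.t() / temp
    return F.cross_entropy(logits, torch.arange(len(logits)))

dim = 128
student = Encoder(dim)
teacher = Encoder(dim)              # in practice an EMA copy of the student
shallow_proj = nn.Linear(dim, dim)  # shallow linear projection of the inputs

tokens = torch.randn(2, 64, dim)    # (batch, space-time tokens, dim)

# Global loss: structured masking (here, one contiguous block); targets are
# deep representations of the full input from the teacher encoder.
structured_mask = torch.zeros(64, dtype=torch.bool)
structured_mask[:32] = True
with torch.no_grad():
    global_targets = teacher(tokens)[:, structured_mask]
global_pred = student(tokens.masked_fill(structured_mask[None, :, None], 0.0))
loss_global = info_nce(global_pred[:, structured_mask], global_targets)

# Local loss: unstructured random masking; targets are shallow linear
# projections of the raw input tokens.
random_mask = torch.rand(64) < 0.5
local_targets = shallow_proj(tokens)[:, random_mask]
local_pred = student(tokens.masked_fill(random_mask[None, :, None], 0.0))
loss_local = info_nce(local_pred[:, random_mask], local_targets)

(loss_global + loss_local).backward()
```

The two losses differ exactly along the axes the abstract names: what the targets are (deep vs. shallow) and how tokens are masked (structured vs. random).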
Generative AI: Hype, Hope, and Responsible Use in Science and Everyday Life
Half Search Space is All You Need
Pavel Rumiantsev
HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
Neil He
Rishabh Anand
Hiren Madhu
Ali Maatouk
Leandros Tassiulas
Menglin Yang
Rex Ying
Impact of through‐slice gradient optimization for dynamic slice‐wise shimming in the cervico‐thoracic spinal cord
Arnaud Breheret
Alexandre D'Astous
Yixin Ma
Jason P. Stockmann
Improving Multilingual Math Reasoning for African Languages
Odunayo Ogundepo
Akintunde Oladipo
Kelechi Ogueji
Esther Adenuga
Jimmy Lin
Researchers working on low-resource languages face persistent challenges due to limited data availability and restricted access to computational resources. Although most large language models (LLMs) are predominantly trained in high-resource languages, adapting them to low-resource contexts, particularly African languages, requires specialized techniques. Several strategies have emerged for adapting models to low-resource languages in today's LLM landscape, defined by multi-stage pre-training and post-training paradigms. However, the most effective approaches remain uncertain. This work systematically investigates which adaptation strategies yield the best performance when extending existing LLMs to African languages. We conduct extensive experiments and ablation studies to evaluate different combinations of data types (translated versus synthetically generated), training stages (pre-training versus post-training), and other model adaptation configurations. Our experiments focus on mathematical reasoning tasks, using the Llama 3.1 model family as our base model.
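The combinations under study can be pictured as a small configuration grid over data type and training stage. The sketch below is illustrative only: `AdaptationConfig` and `run_adaptation` are hypothetical names standing in for the paper's actual training pipeline.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class AdaptationConfig:
    data_type: str        # "translated" or "synthetic"
    stage: str            # "continued_pretraining" or "post_training"
    base_model: str = "meta-llama/Llama-3.1-8B"

def run_adaptation(cfg: AdaptationConfig) -> float:
    """Placeholder: train under `cfg` and return math-reasoning accuracy."""
    print(f"training {cfg.base_model} | data={cfg.data_type} | stage={cfg.stage}")
    return 0.0  # stand-in metric

# Enumerate every (data type x training stage) combination in the ablation.
grid = [AdaptationConfig(d, s)
        for d, s in product(("translated", "synthetic"),
                            ("continued_pretraining", "post_training"))]
results = {cfg: run_adaptation(cfg) for cfg in grid}
```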
Improving the Scaling Laws of Synthetic Data with Deliberate Practice
Learning Penalty for Optimal Partitioning via Automatic Feature Extraction
Tung L. Nguyen
Changepoint detection identifies significant shifts in data sequences, making it important in areas like finance, genetics, and healthcare. Optimal Partitioning algorithms efficiently detect these changes, using a penalty parameter to limit the number of changepoints. Determining an appropriate value for this penalty can be challenging. Traditionally, this process involved manually extracting statistical features, such as sequence length or variance, to make the prediction. This study proposes a novel approach that uses recurrent neural networks to learn this penalty directly from raw sequences by automatically extracting features. Experiments conducted on 20 benchmark genomic datasets show that this novel method surpasses traditional methods in partitioning accuracy in most cases.
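A minimal sketch of the core idea: predict the Optimal Partitioning penalty directly from a raw sequence with a recurrent network. The GRU architecture and the log-penalty regression target are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class PenaltyNet(nn.Module):
    """GRU that maps a raw 1-D sequence to a predicted log-penalty."""
    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, seq):                  # seq: (batch, length)
        _, h = self.rnn(seq.unsqueeze(-1))   # h: (1, batch, hidden)
        return self.head(h[-1]).squeeze(-1)  # predicted log(lambda)

model = PenaltyNet()
seq = torch.randn(4, 200)        # four raw sequences
log_penalty = model(seq)

# Training signal: assume a target log-penalty per sequence is available
# (e.g. derived from labeled changepoints); illustrative only.
target = torch.zeros(4)
loss = nn.functional.mse_loss(log_penalty, target)
loss.backward()

# The prediction then parameterizes Optimal Partitioning, which minimizes
#   (sum of segment costs) + exp(log_penalty) * (number of changepoints)
```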
Leveraging Per-Instance Privacy for Machine Unlearning
Anvith Thudi
Berivan Isik
Ashmita Bhattacharyya
Nicolas Papernot
Eleni Triantafillou
Daniel M. Roy
LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs
Foundation models based on large language models (LLMs) have shown great success in handling various tasks and modalities. However, adapting these models for general-purpose audio-language tasks is challenging due to differences in acoustic environments and task variations. In this work, we introduce LiSTEN (Learning Soft Token Embeddings for Neural Audio LLMs), a framework for adapting LLMs to speech and audio tasks. LiSTEN uses a dynamic prompt selection strategy with learnable key-value pairs, allowing the model to balance general and task-specific knowledge while avoiding overfitting in a multitask setting. Our approach reduces dependence on large-scale ASR or captioning datasets, achieves competitive performance with fewer trainable parameters, and simplifies training by using a single-stage process. Additionally, LiSTEN enhances interpretability by analyzing the diversity and overlap of selected prompts across different tasks.
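The dynamic prompt selection described above can be sketched as a pool of learnable key-value pairs queried by pooled audio features; the dimensions, `top_k`, and the cosine-similarity matching below are assumptions rather than LiSTEN's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptPool(nn.Module):
    """Learnable keys select learnable soft-prompt values per input."""
    def __init__(self, n_prompts=16, prompt_len=4, dim=256, top_k=3):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_prompts, dim))
        self.values = nn.Parameter(torch.randn(n_prompts, prompt_len, dim))
        self.top_k = top_k

    def forward(self, query):                       # query: (batch, dim)
        sim = F.cosine_similarity(query[:, None], self.keys[None], dim=-1)
        idx = sim.topk(self.top_k, dim=-1).indices  # (batch, top_k)
        prompts = self.values[idx]                  # (batch, top_k, len, dim)
        return prompts.flatten(1, 2)                # (batch, top_k * len, dim)

pool = PromptPool()
audio_query = torch.randn(2, 256)      # pooled audio-encoder output (assumed)
soft_tokens = pool(audio_query)        # dynamically selected soft prompts
text_embeds = torch.randn(2, 10, 256)  # frozen-LLM input embeddings (assumed)
llm_input = torch.cat([soft_tokens, text_embeds], dim=1)  # prepend prompts
```

Because only the keys and prompt values are trained, the pool stays small relative to the frozen LLM, which is consistent with the abstract's emphasis on fewer trainable parameters.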
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D
Sergio Arnaud
Paul McVay
Ada Martin
Arjun Majumdar
Krishna Murthy
Phillip Thomas
Ruslan Partsey
Daniel Dugas
Abha Gejji
Alexander Sax
Vincent-Pierre Berges
Mikael Henaff
Ayush Jain
Ang Cao
Ishita Prasad
Mrinal Kalakrishnan
Mahmoud Assran
Oleksandr Maksymets …
Aravind Rajeswaran
Franziska Meier
Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning