Publications

A Strong Baseline for Molecular Few-Shot Learning

Hugo Jeannin

Ismail Ben Ayed

Few-shot learning has recently attracted significant interest in drug discovery, with a recent, fast-growing literature mostly involving convoluted meta-learning strategies. We revisit the more straightforward fine-tuning approach for molecular data, and propose a regularized quadratic-probe loss based on the Mahalanobis distance. We design a dedicated block-coordinate descent optimizer, which avoids the degenerate solutions of our loss. Interestingly, our simple fine-tuning approach achieves highly competitive performance in comparison to state-of-the-art methods, while being applicable to black-box settings and removing the need for specific episodic pre-training strategies. Furthermore, we introduce a new benchmark to assess the robustness of the competing methods to domain shifts. In this setting, our fine-tuning baseline obtains consistently better results than meta-learning methods.

2025-02-14

TMLR (accepted)

openreview.net

From Markov to Laplace: How Mamba In-Context Learns Markov Chains

Marco Bondaschi

Nived Rajaraman

Xiuying Wei

Kannan Ramchandran

Razvan Pascanu

Caglar Gulçehre

Michael C. Gastpar

Ashok Vardhan Makkuva

2025-02-13

ArXiv (preprint)

doi.org

arxiv.org

A Taxonomy of Linguistic Expressions That Contribute To Anthropomorphism of Language Technologies

Alicia DeVrio

Myra Cheng

Lisa Egede

A.R. Olteanu

Su Lin Blodgett

Recent attention to anthropomorphism -- the attribution of human-like qualities to non-human objects or entities -- of language technologies like LLMs has sparked renewed discussions about potential negative impacts of anthropomorphism. To productively discuss the impacts of this anthropomorphism and in what contexts it is appropriate, we need a shared vocabulary for the vast variety of ways that language can be anthropomorphic. In this work, we draw on existing literature and analyze empirical cases of user interactions with language technologies to develop a taxonomy of textual expressions that can contribute to anthropomorphism. We highlight challenges and tensions involved in understanding linguistic anthropomorphism, such as how all language is fundamentally human and how efforts to characterize and shift perceptions of humanness in machines can also dehumanize certain humans. We discuss ways that our taxonomy supports more precise and effective discussions of and decisions about anthropomorphism of language technologies.

2025-02-13

ArXiv (preprint)

doi.org

arxiv.org

Bugs in Large Language Models Generated Code: An Empirical Study

Florian Tambon

Arghavan Moradi Dakhel

Amin Nikanjam

Foutse Khomh

Michel C. Desmarais

Giuliano Antoniol

Large Language Models (LLMs) for code have gained significant attention recently. They can generate code in different programming languages based on provided prompts, fulfilling a long-lasting dream in Software Engineering (SE), i.e., automatic code generation. Similar to human-written code, LLM-generated code is prone to bugs, and these bugs have not yet been thoroughly examined by the community. Given the increasing adoption of LLM-based code generation tools (e.g., GitHub Copilot) in SE activities, it is critical to understand the characteristics of bugs contained in code generated by LLMs. This paper examines a sample of 333 bugs collected from code generated using three leading LLMs (i.e., CodeGen, PanGu-Coder, and Codex) and identifies the following 10 distinctive bug patterns: Misinterpretations, Syntax Error, Silly Mistake, Prompt-biased code, Missing Corner Case, Wrong Input Type, Hallucinated Object, Wrong Attribute, Incomplete Generation, and Non-Prompted Consideration. The bug patterns are presented in the form of a taxonomy. The identified bug patterns are validated using an online survey with 34 LLM practitioners and researchers. The surveyed participants generally asserted the significance and prevalence of the bug patterns. Researchers and practitioners can leverage these findings to develop effective quality assurance techniques for LLM-generated code. This study sheds light on the distinctive characteristics of LLM-generated code.

2025-02-12

Empirical Software Engineering (published)

doi.org

arxiv.org

Galileo: Learning Global & Local Features of Many Remote Sensing Modalities

Gabriel Tseng

Anthony Fuller

Marlena Reil

Henry Herzog

Patrick Beukema

Favyen Bastani

James R Green

Evan Shelhamer

Hannah Kerner

David Rolnick

We introduce a highly multimodal transformer to represent many remote sensing modalities - multispectral optical, synthetic aperture radar, elevation, weather, pseudo-labels, and more - across space and time. These inputs are useful for diverse remote sensing tasks, such as crop mapping and flood detection. However, learning shared representations of remote sensing data is challenging, given the diversity of relevant data modalities, and because objects of interest vary massively in scale, from small boats (1-2 pixels and fast) to glaciers (thousands of pixels and slow). We present a novel self-supervised learning algorithm that extracts multi-scale features across a flexible set of input modalities through masked modeling. Our dual global and local contrastive losses differ in their targets (deep representations vs. shallow input projections) and masking strategies (structured vs. not). Our Galileo is a single generalist model that outperforms SoTA specialist models for satellite images and pixel time series across eleven benchmarks and multiple tasks.

2025-02-12

ArXiv (preprint)

doi.org

proceedings.mlr.press

INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages

Hao Yu

Jesujoba Oluwadara Alabi

Andiswa Bukula

Zhuang Yun Jian

En-Shiun Annie Lee

Tadesse Kebede Guge

Israel Abebe Azime

Happy Buzaaba

Blessing Kudzaishe Sibanda

Godson Kalipe

Jonathan Mukiibi

S. Kabenamualu

M. Setaka

Lolwethu Ndolela

Nkiruka Bridget Odu

Rooweither Mabuya

Shamsuddeen Hassan Muhammad

Salomey Osei

Sokhar Samb

Juliet W. Murage … (see 2 more)

Dietrich Klakow

David Ifeoluwa Adelani

2025-02-12

ArXiv (preprint)

doi.org

arxiv.org

Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons

When training neural networks, dying neurons -- units becoming inactive or saturated -- are traditionally seen as harmful. This paper sheds new light on this phenomenon. By exploring the impact of various hyperparameter configurations on dying neurons during training, we gather insights on how to improve upon sparse training approaches to pruning. We introduce Demon Pruning (DemP), a method that controls the proliferation of dead neurons through a combination of noise injection on active units and a one-cycle schedule regularization strategy, dynamically leading to network sparsity. Experiments on CIFAR-10 and ImageNet datasets demonstrate that DemP outperforms existing dense-to-sparse structured pruning methods, achieving better accuracy-sparsity tradeoffs and accelerating training by up to 3.56

2025-02-12

TMLR (accepted)

doi.org

openreview.net

Perception and neural representation of intermittent odor stimuli in mice

Luis Boero

Hao Wu

Joseph D. Zak

Paul Masset

Farhad Pashakhanloo

Siddharth Jayakumar

Bahareh Tolooshams

Demba Ba

Venkatesh N. Murthy

2025-02-12

bioRxiv (preprint)

doi.org

scCobra allows contrastive cell embedding learning with domain adaptation for single cell data integration and harmonization

Bowen Zhao

Kailu Song

Dong-Qing Wei

Yi Xiong

Jun Ding

2025-02-12

Communications Biology (published)

doi.org

Author Correction: Isospin competitions and valley polarized correlated insulators in twisted double bilayer graphene

Le Liu

Shihao Zhang

Yanbang Chu

Cheng Shen

Yuan Huang

Yalong Yuan

Jinpeng Tian

Jian Tang

Yiru Ji

Rong Yang

Kenji Watanabe

Takashi Taniguchi

Dongxia Shi

Jianpeng Liu

Wei Yang

Guangyu Zhang

2025-02-11

Nature Communications (published)

doi.org

Modeling Multivariable High-resolution 3D Urban Microclimate Using Localized Fourier Neural Operator

Shaoxiang Qin

Dongxue Zhan

Dingyang Geng

Wenhui Peng

Geng Tian

Yurong Shi

Naiping Gao

Xue Liu

Liangzhu (Leon) Wang

Accurate urban microclimate analysis with wind velocity and temperature is vital for energy-efficient urban planning, supporting carbon reduction, enhancing public health and comfort, and advancing the low-altitude economy. However, traditional computational fluid dynamics (CFD) simulations that couple velocity and temperature are computationally expensive. Recent machine learning advancements offer promising alternatives for accelerating urban microclimate simulations. The Fourier neural operator (FNO) has shown efficiency and accuracy in predicting single-variable velocity magnitudes in urban wind fields. Yet, for multivariable high-resolution 3D urban microclimate prediction, FNO faces three key limitations: blurry output quality, high GPU memory demand, and substantial data requirements. To address these issues, we propose a novel localized Fourier neural operator (Local-FNO) model that employs local training, geometry encoding, and patch overlapping. Local-FNO provides accurate predictions for rapidly changing turbulence in urban microclimate over 60 seconds, four times the average turbulence integral time scale, with an average error of 0.35 m/s in velocity and 0.30 °C in temperature. It also accurately captures turbulent heat flux represented by the velocity-temperature correlation. In a 2 km by 2 km domain, Local-FNO resolves turbulence patterns down to a 10 m resolution. It provides high-resolution predictions with 150 million feature dimensions on a single 32 GB GPU at nearly 50 times the speed of a CFD solver. Compared to FNO, Local-FNO achieves a 23.9% reduction in prediction error and a 47.3% improvement in turbulent fluctuation correlation.

2025-02-11

Building and Environment (published)

doi.org

arxiv.org

HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell Data

Siddharth Viswanath

Hiren Madhu

Dhananjay Bhaskar

Jake Kovalic

David R. Johnson

Rex Ying

Christopher Tape

Ian Adelstein

Michael Perlmutter

Smita Krishnaswamy

In this paper, we propose HiPoNet, an end-to-end differentiable neural network for regression, classification, and representation learning on high-dimensional point clouds. Our work is motivated by single-cell data, which can have very high dimensionality, exceeding the capabilities of existing methods for point clouds, which are mostly tailored for 3D data. Moreover, modern single-cell and spatial experiments now yield entire cohorts of datasets (i.e., one dataset for every patient), necessitating models that can process large, high-dimensional point clouds at scale. Most current approaches build a single nearest-neighbor graph, discarding important geometric and topological information. In contrast, HiPoNet models the point cloud as a set of higher-order simplicial complexes, with each particular complex being created using a reweighting of features. This method thus generates multiple constructs corresponding to different views of high-dimensional data, which in biology offers the possibility of disentangling distinct cellular processes. It then employs simplicial wavelet transforms to extract multiscale features, capturing both local and global topology from each view. We show that geometric and topological information is preserved in this framework both theoretically and empirically. We showcase the utility of HiPoNet on point-cloud-level tasks, involving classification and regression of entire point clouds in data cohorts. Experimentally, we find that HiPoNet outperforms other point-cloud and graph-based models on single-cell data. We also apply HiPoNet to spatial transcriptomics datasets using spatial coordinates as one of the views. Overall, HiPoNet offers a robust and scalable solution for high-dimensional data analysis.

2025-02-10

ArXiv (preprint)

doi.org

arxiv.org
