Publications

Trophic Interactions Are Key to Understanding the Effects of Global Change on the Distribution and Functional Role of the Brown Bear

Pablo M. Lucas

Wilfried Thuiller

Lauren Talluto

Ester Polaina

Jörg Albrecht

Nuria Selva

Marta De Barba

Vincenzo Penteriani

Maya Guéguen

Niko Balkenhol

Trishna Dutta

Ancuta Fedorca

Shane C. Frank

Andreas Zedrosser

Ivan Afonso‐Jordana

Hüseyin Ambarlı

Fernando Ballesteros

Andriy‐Taras Bashta

Cemal Can Bilgin

Neda Bogdanović

Edgars Bojārs

Katarzyna Bojarska

Natalia Bragalanti

Henrik Brøseth

Mark W. Chynoweth

Duško Ćirović

Paolo Ciucci

Andrea Corradini

Daniele De Angelis

Miguel de Gabriel Hernando

Csaba Domokos

Aleksander Dutsov

Alper Ertürk

Stefano Filacorda

Lorenzo Frangini

Claudio Groff

Samuli Heikkinen

Bledi Hoxha

Djuro Huber

Otso Huitu

Georgeta Ionescu

Ovidiu Ionescu

Klemen Jerina

Ramon Jurj

Alexandros A. Karamanlidis

Jonas Kindberg

Ilpo Kojola

José Vicente López‐Bao

Peep Männil

Dime Melovski

Yorgos Mertzanis

Paolo Molinari

Anja Molinari‐Jobin

Andrea Mustoni

Javier Naves

Sergey Ogurtsov

Deniz Özüt

Santiago Palazón

Luca Pedrotti

Aleksandar Perović

Vladimir N. Piminov

Ioan‐Mihai Pop

Marius Popa

Maria Psaralexi

Pierre‐Yves Quenette

Georg Rauer

Slaven Reljic

Eloy Revilla

Urmas Saarma

Alexander P. Saveljev

Ali Onur Sayar

Çagan H. Şekercioğlu

Agnieszka Sergiel

George Sîrbu

Tomaž Skrbinšek

Michaela Skuban

Anil Soyumert

Aleksandar Stojanov

Egle Tammeleht

Konstantin Tirronen

Aleksandër Trajçe

Igor Trbojević

Tijana Trbojević

Filip Zięba

Diana Zlatanova

Tomasz Zwijacz‐Kozica

Laura J. Pollock

ABSTRACT Biotic interactions are expected to influence species' responses to global changes, but they are rarely considered across broad spatial extents. Abiotic factors are thought to operate at larger spatial scales, while biotic factors, such as species interactions, are considered more important at local scales within communities, in part because of the knowledge gap on species interactions at large spatial scales (i.e., the Eltonian shortfall). We assessed, at a continental scale, (i) the importance of biotic interactions, through food webs, on species distributions, and (ii) how biotic interactions under scenarios of climate and land‐use change may affect the distribution of the brown bear (Ursus arctos). We built a highly detailed, spatially dynamic, and empirically sampled food web based on the energy contribution of 276 brown bear food species from different taxa (plants, vertebrates, and invertebrates) and their ensemble habitat models at high resolution across Europe. Then, combining energy contribution and predicted habitat of food species, we modelled energy contribution across space and included these layers within Bayesian‐based models of the brown bear distribution in Europe. The inclusion of biotic interactions considerably improved our understanding of brown bear distribution at large (continental) scales compared with Bayesian models including only abiotic factors (climate and land use). Predicted future range shifts, which included changes in the distribution of food species, varied greatly when considering various scenarios of change in biotic factors, providing a warning that future indirect climate and land‐use change are likely to have strong but highly uncertain impacts on species biogeography.
Our study confirmed that advancing our understanding of ecological networks of species interactions will improve future projections of biodiversity change, especially for modelling species distributions and their functional role under climate and land‐use change scenarios, which is key for effective conservation of biodiversity and ecosystem services.

2025-06-04

Global Change Biology (published)

doi.org

Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems

Emma Harvey

Emily Sheng

Su Lin Blodgett

Alexandra Chouldechova

Jean Garcia-Gathright

Alexandra Olteanu

Hanna Wallach

The NLP research community has made publicly available numerous instruments for measuring representational harms caused by large language model (LLM)-based systems. These instruments have taken the form of datasets, metrics, tools, and more. In this paper, we examine the extent to which such instruments meet the needs of practitioners tasked with evaluating LLM-based systems. Via semi-structured interviews with 12 such practitioners, we find that practitioners are often unable to use publicly available instruments for measuring representational harms. We identify two types of challenges. In some cases, instruments are not useful because they do not meaningfully measure what practitioners seek to measure or are otherwise misaligned with practitioner needs. In other cases, instruments, even useful ones, are not used by practitioners due to practical and institutional barriers impeding their uptake. Drawing on measurement theory and pragmatic measurement, we provide recommendations for addressing these challenges to better meet practitioner needs.

2025-06-04

ArXiv (preprint)

arxiv.org

Are Large Language Models Good Temporal Graph Learners?

Zifeng Ding

Michael Bronstein

Reihaneh Rabbany

Guillaume Rabusseau

Large Language Models (LLMs) have recently driven significant advancements in Natural Language Processing and various other applications. While a broad range of literature has explored the graph-reasoning capabilities of LLMs, including their use of predictors on graphs, the application of LLMs to dynamic graphs (real-world evolving networks) remains relatively unexplored. Recent work studies synthetic temporal graphs generated by random graph models, but applying LLMs to real-world temporal graphs remains an open question. To address this gap, we introduce Temporal Graph Talker (TGTalker), a novel temporal graph learning framework designed for LLMs. TGTalker utilizes the recency bias in temporal graphs to extract relevant structural information, converted to natural language for LLMs, while leveraging temporal neighbors as additional information for prediction. TGTalker demonstrates competitive link prediction capabilities compared to existing Temporal Graph Neural Network (TGNN) models. Across five real-world networks, TGTalker performs competitively with state-of-the-art temporal graph methods while consistently outperforming popular models such as TGN and HTGN. Furthermore, TGTalker generates textual explanations for each prediction, thus opening up exciting new directions in explainability and interpretability for temporal link prediction. The code is publicly available at https://github.com/shenyangHuang/TGTalker.

2025-06-03

ArXiv (preprint)

arxiv.org

Galaxy cluster characterization with machine learning techniques

M. Sadikov

J. Hlavacek-Larrondo

Laurence Perreault-Levasseur

C. L. Rhea

M. McDonald

M. Ntampaka

J. ZuHone

We present an analysis of the X-ray properties of the galaxy cluster population in the z=0 snapshot of the IllustrisTNG simulations, utilizing machine learning techniques to perform clustering and regression tasks. We examine five properties of the hot gas (the central cooling time, the central electron density, the central entropy excess, the concentration parameter, and the cuspiness) which are commonly used as classification metrics to identify cool core (CC), weak cool core (WCC) and non cool core (NCC) clusters of galaxies. Using mock Chandra X-ray images as inputs, we first explore an unsupervised clustering scheme to see how the resulting groups correlate with the CC/WCC/NCC classification based on the different criteria. We observe that the groups replicate almost exactly the separation of the galaxy cluster images when classifying them based on the concentration parameter. We then move on to a regression task, utilizing a ResNet model to predict the value of all five properties. The network is able to achieve a mean percentage error of 1.8% for the central cooling time, and a balanced accuracy of 0.83 on the concentration parameter, making them the best-performing metrics. Finally, we use simulation-based inference (SBI) to extract posterior distributions for the network predictions. Our neural network simultaneously predicts all five classification metrics using only mock Chandra X-ray images. This study demonstrates that machine learning is a viable approach for analyzing and classifying the large galaxy cluster datasets that will soon become available through current and upcoming X-ray surveys, such as eROSITA.

2025-06-03

The Astrophysical Journal (published)

doi.org

arxiv.org

It's the Thought that Counts: Evaluating the Attempts of Frontier LLMs to Persuade on Harmful Topics

Matthew Kowal

Jasper Timm

Jean-François Godbout

Thomas H Costello

Antonio A. Arechar

Gordon Pennycook

David Rand

Adam Gleave

Kellin Pelrine

Persuasion is a powerful capability of large language models (LLMs) that both enables beneficial applications (e.g. helping people quit smoking) and raises significant risks (e.g. large-scale, targeted political manipulation). Prior work has found models possess a significant and growing persuasive capability, measured by belief changes in simulated or real users. However, these benchmarks overlook a crucial risk factor: the propensity of a model to attempt to persuade in harmful contexts. Understanding whether a model will blindly "follow orders" to persuade on harmful topics (e.g. glorifying joining a terrorist group) is key to understanding the efficacy of safety guardrails. Moreover, understanding if and when a model will engage in persuasive behavior in pursuit of some goal is essential to understanding the risks from agentic AI systems. We propose the Attempt to Persuade Eval (APE) benchmark, which shifts the focus from persuasion success to persuasion attempts, operationalized as a model's willingness to generate content aimed at shaping beliefs or behavior. Our evaluation framework probes frontier LLMs using a multi-turn conversational setup between simulated persuader and persuadee agents. APE explores a diverse spectrum of topics including conspiracies, controversial issues, and non-controversially harmful content. We introduce an automated evaluator model to identify willingness to persuade and measure the frequency and context of persuasive attempts. We find that many open and closed-weight models are frequently willing to attempt persuasion on harmful topics and that jailbreaking can increase willingness to engage in such behavior. Our results highlight gaps in current safety guardrails and underscore the importance of evaluating willingness to persuade as a key dimension of LLM risk. APE is available at github.com/AlignmentResearch/AttemptPersuadeEval.

2025-06-03

ArXiv (preprint)

arxiv.org

Sociodemographic characteristics of SARS-CoV-2 serosurveillance studies with diverse recruitment strategies, Canada, 2020 to 2023

Matthew J Knight

Yuan Yu

Jiacheng Chen

Sheila F O’Brien

David Buckeridge

Carmen Charlton

W Alton Russell

2025-06-03

BMC Public Health (published)

doi.org

ToothForge: Automatic Dental Shape Generation using Synchronized Spectral Embeddings

Tibor Kubík

François Guibault

Michal Španěl

Hervé Lombaert

We introduce ToothForge, a spectral approach for automatically generating novel 3D teeth, effectively addressing the sparsity of dental shape datasets. By operating in the spectral domain, our method enables compact machine learning modeling, allowing the generation of high-resolution tooth meshes in milliseconds. However, generating shape spectra comes with the instability of the decomposed harmonics. To address this, we propose modeling the latent manifold on synchronized frequential embeddings. Spectra of all data samples are aligned to a common basis prior to the training procedure, effectively eliminating biases introduced by the decomposition instability. Furthermore, synchronized modeling removes the limiting factor imposed by previous methods, which require all shapes to share a common fixed connectivity. Using a private dataset of real dental crowns, we observe a greater reconstruction quality of the synthesized shapes, exceeding that of models trained on unaligned embeddings. We also explore additional applications of spectral analysis in digital dentistry, such as shape compression and interpolation. ToothForge facilitates a range of approaches at the intersection of spectral analysis and machine learning, with fewer restrictions on mesh structure. This makes it applicable for shape analysis not only in dentistry, but also in broader medical applications, where guaranteeing consistent connectivity across shapes from various clinics is unrealistic. The code is available at https://github.com/tiborkubik/toothForge.

2025-06-03

ArXiv (preprint)

arxiv.org

Self-Refining Training for Amortized Density Functional Theory

Majdi Hassan

Cristian Gabellini

Hatem Helal

Dominique Beaini

Kirill Neklyudov

Density Functional Theory (DFT) allows for predicting all the chemical and physical properties of molecular systems from first principles by finding an approximate solution to the many-body Schrödinger equation. However, the cost of these predictions becomes infeasible when increasing the scale of the energy evaluations, e.g., when calculating the ground-state energy for simulating molecular dynamics. Recent works have demonstrated that, for substantially large datasets of molecular conformations, Deep Learning-based models can predict the outputs of the classical DFT solvers by amortizing the corresponding optimization problems. In this paper, we propose a novel method that reduces the dependency of amortized DFT solvers on large pre-collected datasets by introducing a self-refining training strategy. Namely, we propose an efficient method that simultaneously trains a deep-learning model to predict the DFT outputs and samples molecular conformations that are used as training data for the model. We derive our method as a minimization of the variational upper bound on the KL-divergence measuring the discrepancy between the generated samples and the target Boltzmann distribution defined by the ground state energy. To demonstrate the utility of the proposed scheme, we perform an extensive empirical study comparing it with the models trained on the pre-collected datasets. Finally, we open-source our implementation of the proposed algorithm, optimized with asynchronous training and sampling stages, which enables simultaneous sampling and training. Code is available at https://github.com/majhas/self-refining-dft.

2025-06-02

ArXiv (preprint)

arxiv.org

Unpacking Softmax: How Temperature Drives Representation Collapse, Compression, and Generalization

Wojciech Masarczyk

Mateusz Ostaszewski

Tin Sum Cheng

Tomasz Trzciński

Aurélien Lucchi

Razvan Pascanu

The softmax function is a fundamental building block of deep neural networks, commonly used to define output distributions in classification tasks or attention weights in transformer architectures. Despite its widespread use and proven effectiveness, its influence on learning dynamics and learned representations remains poorly understood, limiting our ability to optimize model behavior. In this paper, we study the pivotal role of the softmax function in shaping the model's representation. We introduce the concept of rank deficit bias, a phenomenon in which softmax-based deep networks find solutions of rank much lower than the number of classes. This bias depends on the softmax function's logits norm, which is implicitly influenced by hyperparameters or directly modified by softmax temperature. Furthermore, we demonstrate how to exploit the softmax dynamics to learn compressed representations or to enhance their performance on out-of-distribution data. We validate our findings across diverse architectures and real-world datasets, highlighting the broad applicability of temperature tuning in improving model performance. Our work provides new insights into the mechanisms of softmax, enabling better control over representation learning in deep neural networks.
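
For readers unfamiliar with the term, the sketch below illustrates what "softmax temperature" usually means: the logits are divided by a positive scalar T before normalization, which is equivalent to rescaling the logits norm by 1/T. This is a generic textbook illustration, not the authors' code; the function names `softmax` and `entropy` and the example logits are assumptions chosen for the demonstration, and it does not reproduce the rank deficit bias itself.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: exp(z_i / T) / sum_j exp(z_j / T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a flatter distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical logits, used only for illustration.
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 5.0):
    probs = softmax(logits, t)
    print(t, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

Running the loop shows the output distribution sharpening at low temperature and flattening at high temperature, which is the knob the abstract describes as directly modifying the logits norm.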

2025-06-02

ArXiv (preprint)

arxiv.org

Advancing global antifungal development to combat invasive fungal infection

Xiu-Li Wang

Jun Ding

Koon Ho Wong

Chen Ding

Chang-Bin Chen

Wen-Juan Wu

Ningning Liu

2025-06-01

hLife (published)

doi.org

Adversarial Attack Classification and Robustness Testing for Large Language Models for Code

Yang Liu

Armstrong Foundjem

Foutse Khomh

Heng Li

Large Language Models (LLMs) have become vital tools in software development tasks such as code generation, completion, and analysis. As their integration into workflows deepens, ensuring robustness against vulnerabilities, especially those triggered by diverse or adversarial inputs, becomes increasingly important. Such vulnerabilities may lead to incorrect or insecure code generation when models encounter perturbed task descriptions, code, or comments. Prior research often overlooks the role of natural language in guiding code tasks. This study investigates how adversarial perturbations in natural language inputs, including prompts, comments, and descriptions, affect LLMs for Code (LLM4Code). It examines the effects of perturbations at the character, word, and sentence levels to identify the most impactful vulnerabilities. We analyzed multiple projects (e.g., ReCode, OpenAttack) and datasets (e.g., HumanEval, MBPP), establishing a taxonomy of adversarial attacks. The first dimension classifies the input type (code, prompts, or comments), while the second dimension focuses on granularity: character, word, or sentence-level changes. We adopted a mixed-methods approach, combining quantitative performance metrics with qualitative vulnerability analysis. LLM4Code models show varying robustness across perturbation types. Sentence-level attacks were least effective, suggesting models are resilient to broader contextual changes. In contrast, word-level perturbations posed serious challenges, exposing semantic vulnerabilities. Character-level effects varied, showing model sensitivity to subtle syntactic deviations. Our study offers a structured framework for testing LLM4Code robustness and emphasizes the critical role of natural language in adversarial evaluation. Improving model resilience to semantic-level disruptions is essential for secure and reliable code-generation systems.

2025-06-01

arXiv (published)

doi.org

arxiv.org

Bringing SAM to new heights: Leveraging elevation data for tree crown segmentation from drone imagery

Mélisande Teng

Etienne Laliberté

Information on trees at the individual level is crucial for monitoring forest ecosystems and planning forest management. Current monitoring methods involve ground measurements, requiring extensive cost, time and labor. Advances in drone remote sensing and computer vision offer great potential for mapping individual trees from aerial imagery at broad scale. Large pre-trained vision models, such as the Segment Anything Model (SAM), represent a particularly compelling choice given limited labeled data. In this work, we compare methods leveraging SAM for the task of automatic tree crown instance segmentation in high resolution drone imagery in three use cases: 1) boreal plantations, 2) temperate forests and 3) tropical forests. We also study the integration of elevation data into models, in the form of Digital Surface Model (DSM) information, which can readily be obtained at no additional cost from RGB drone imagery. We present BalSAM, a model leveraging SAM and DSM information, which shows potential over other methods, particularly in the context of plantations. We find that methods using SAM out-of-the-box do not outperform a custom Mask R-CNN, even with well-designed prompts. However, efficiently tuning SAM end-to-end and integrating DSM information are both promising avenues for tree crown instance segmentation models.

2025-06-01

arXiv (published)

doi.org

arxiv.org
