Publications

A Hardware‐in‐Loop Digital Twin Approach for Intelligent Optimization of Municipal Solid Waste Incineration

Jian Tang

Wen Yu

JunFei Qiao

2025-10-20

(published)

doi.org

Neural FIM: Bridging Statistical Manifolds and Generative Modeling through Fisher Geometry

Yanlei Zhang

Guillaume Huguet

Edward De Brouwer

Danqi Liao

Oluwadamilola Fasina

Alexander Tong

Ricky T. Q. Chen

Guy Wolf

Maximilian Nickel

Ian Adelstein

Smita Krishnaswamy

While data diffusion-based embeddings are widely used in unsupervised learning to reveal the intrinsic geometry of data, they are fundamentally constrained by their discrete nature and inability to generalize beyond training points. This limitation ob

2025-10-20

TechRxiv (accepted)

doi.org

Rapid De Novo Antibody Design with GeoFlow-V3

BioGeometry Team

Jian Tang

Recent years have witnessed striking advances in miniprotein design, yet de novo antibody discovery remains challenging, marked by low binding rates and the need for extensive, labor-intensive experimental screening of millions of candidates. This technical report introduces GeoFlow-V3, a unified atomic generative model for structure prediction and protein design. GeoFlow-V3 delivers improved accuracy on antibody-antigen complex structure prediction relative to our previous version, and its performance is further enhanced when experimental constraints or prior knowledge are provided, enabling precise control over both folding and design. The model also demonstrates reliable ability to discriminate binders from non-binders based on its confidence scores. Leveraging this capability, we build a GeoFlow-V3 in silico pipeline to design no more than 50 nanobodies per therapeutically relevant target de novo, completing a single round of wet-lab characterization in under three weeks. GeoFlow-V3 identifies at least one binder for 8 tested epitopes and achieves an average hit rate of 15.5%, representing a two-orders-of-magnitude improvement over prior computational pipelines. These results position GeoFlow-V3 as an appealing platform for rapid, AI-driven therapeutic antibody discovery, significantly reducing experimental screening demands and offering a powerful avenue to tackle previously undruggable targets. A demo of GeoFlow-V3 can be accessed via prot.design for non-commercial use.

2025-10-20

bioRxiv (preprint)

doi.org

Improved Localized Machine Unlearning Through the Lens of Memorization

Reihaneh Torkzadehmahani

Reza Nasirigerdeh

Georgios Kaissis

Daniel Rueckert

Gintare Karolina Dziugaite

Eleni Triantafillou

Machine unlearning refers to removing the influence of a specified subset of training data from a machine learning model, efficiently, after it has already been trained. This is important for key applications, including making the model more accurate by removing outdated, mislabeled, or poisoned data. In this work, we study localized unlearning, where the unlearning algorithm operates on a (small) identified subset of parameters. Drawing inspiration from the memorization literature, we propose an improved localization strategy that yields strong results when paired with existing unlearning algorithms. We also propose a new unlearning algorithm, Deletion by Example Localization (DEL), that resets the parameters deemed-to-be most critical according to our localization strategy, and then finetunes them. Our extensive experiments on different datasets, forget sets and metrics reveal that DEL sets a new state-of-the-art for unlearning metrics, against both localized and full-parameter methods, while modifying a small subset of parameters, and outperforms the state-of-the-art localized unlearning in terms of test accuracy too.

2025-10-19

TMLR (accepted)

doi.org

openreview.net

The spatially-resolved effect of mergers on the stellar mass assembly of MaNGA galaxies

Eirini Angeloudi

Marc Huertas-Company

Jesús Falcón-Barroso

Laurence Perreault-Levasseur

Alexandre Adam

Alina Boecker

Understanding the origin of stars within a galaxy - whether formed in-situ or accreted from other galaxies (ex-situ) - is key to constraining its evolution. Spatially resolving these components provides crucial insights into a galaxy's mass assembly history. We aim to predict the spatial distribution of ex-situ stellar mass fraction in MaNGA galaxies, and to identify distinct assembly histories based on the radial gradients of these predictions in the central regions. We employ a diffusion model trained on mock MaNGA analogs (MaNGIA), derived from the TNG50 cosmological simulation. The model learns to predict the posterior distribution of resolved ex-situ stellar mass fraction maps, conditioned on stellar mass density, velocity, and velocity dispersion gradient maps. After validating the model on an unseen test set from MaNGIA, we apply it to MaNGA galaxies to infer the spatially-resolved distribution of their ex-situ stellar mass fractions - i.e. the fraction of stellar mass in each spaxel originating from mergers. We identify four broad categories of ex-situ mass distributions: flat gradient, in-situ dominated; flat gradient, ex-situ dominated; positive gradient; and negative gradient. The vast majority of MaNGA galaxies fall in the first category - flat gradients with low ex-situ fractions - confirming that in-situ star formation is the main assembly driver for low- to intermediate-mass galaxies. At high stellar masses, the ex-situ maps are more diverse, highlighting the key role of mergers in building the most massive systems. Ex-situ mass distributions correlate with morphology, star-formation activity, stellar kinematics, and environment, indicating that accretion history is a primary factor shaping massive galaxies. Finally, by tracing their assembly histories in TNG50, we link each class to distinct merger scenarios, ranging from secular evolution to merger-dominated growth.

2025-10-19

Astronomy & Astrophysics (published)

doi.org

arxiv.org

Toward the Decarbonization of Maritime Supply Chains: A Ship Emissions Prediction Framework

Abdelhak El Aissi

Ismail Bourzak

Loubna Benabbou

Abdelaziz Berrado

Maritime transport is a vital component of international trade, yet the industry contributes substantially to greenhouse gas (GHG) emissions, with carbon dioxide

2025-10-19

2025 International Conference on Intelligent Systems: Theories and Applications (SITA) (published)

doi.org

High-Dimensional Privacy-Utility Dynamics of Noisy Stochastic Gradient Descent on Least Squares

Shurong Lin

Eric D. Kolaczyk

Adam Smith

Elliot Paquette

2025-10-18

ArXiv (preprint)

doi.org

arxiv.org

Perpetua: Multi-Hypothesis Persistence Modeling for Semi-Static Environments

Miguel Saavedra-Ruiz

Samer B. Nashed

Charlie Gauthier

Liam Paull

Many robotic systems require extended deployments in complex, dynamic environments. In such deployments, parts of the environment may change between subsequent robot observations. Most robotic mapping or environment modeling algorithms are incapable of representing dynamic features in a way that enables predicting their future state. Instead, they opt to filter certain state observations, either by removing them or some form of weighted averaging. This paper introduces Perpetua, a method for modeling the dynamics of semi-static features. Perpetua is able to: incorporate prior knowledge about the dynamics of the feature if it exists, track multiple hypotheses, and adapt over time to enable predicting of future feature states. Specifically, we chain together mixtures of "persistence" and "emergence" filters to model the probability that features will disappear or reappear in a formal Bayesian framework. The approach is an efficient, scalable, general, and robust method for estimating the states of features in an environment, both in the present as well as at arbitrary future times. Through experiments on simulated and real-world data, we find that Perpetua yields better accuracy than similar approaches while also being online adaptable and robust to missing observations.

2025-10-18

2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (published)

doi.org

arxiv.org

Prompt4Trust: A Reinforcement Learning Prompt Augmentation Framework for Clinically-Aligned Confidence Calibration in Multimodal Large Language Models

Anita Kriz

Elizabeth Laura Janes

Xing Shen

Tal Arbel

2025-10-18

ICCVW @ IEEE/CVF International Conference on Computer Vision (published)

doi.org

arxiv.org

Continuously Learning Bug Locations

Paulina Stevia Nouwou Mindom

Leuson Da Silva

Amin Nikanjam

Foutse Khomh

Automatically locating buggy changesets associated with bug reports is crucial in the software development process. Deep Learning (DL)-based techniques show promising results by leveraging structural information from the code and learning links between changesets and bug reports. However, since source code associated with changesets evolves, the performance of such models tends to degrade over time due to concept drift. Aiming to address this challenge, in this paper, we evaluate the potential of using Continual Learning (CL) techniques in multiple sub-tasks setting for bug localization (each of which operates on either stationary or non-stationary data), comparing it against a bug localization technique that leverages the BERT model, a deep reinforcement learning-based technique that leverages the A2C algorithm, and a DL-based function-level interaction model for semantic bug localization. Additionally, we enhanced the CL techniques by using logistic regression to identify and integrate the most significant bug-inducing factors. Our empirical evaluation across seven widely used software projects shows that CL techniques perform better than DL-based techniques by up to 61% in terms of Mean Reciprocal Rank (MRR), 44% in terms of Mean Average Precision (MAP), 83% in terms of top@1, 56% in terms of top@5, and 66% in terms of top@10 metrics in non-stationary setting. Further, we show that the CL techniques we studied are effective at localizing changesets relevant to a bug report while being able to mitigate catastrophic forgetting across the studied tasks and require up to 5x less computational effort during training. Our findings demonstrate the potential of adopting CL for bug localization in non-stationary settings, and we hope it helps to improve bug localization activities in Software Engineering using CL techniques.

2025-10-17

ACM Transactions on Software Engineering and Methodology (published)

doi.org

arxiv.org
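
The abstract of "Continuously Learning Bug Locations" reports results in terms of MRR, MAP, and top@k. As a reading aid, the sketch below shows how these ranking metrics are commonly computed for a list of bug reports, each with a ranked list of candidate changesets and a set of truly relevant ones. It is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names, the changeset identifiers, and the choice to normalize average precision by the number of relevant items are all assumptions made here.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple

# One query = (ranked candidate changesets, set of relevant changesets).
# The identifiers used below are hypothetical placeholders.
Query = Tuple[Sequence[str], Set[str]]


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1 / rank of the first relevant item, or 0.0 if none appears."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision, normalized by the number of relevant items.

    Normalizing by len(relevant) is one common convention (an assumption
    here); relevant items missing from the ranking contribute zero.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def top_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return 1.0 if any relevant item is among the first k results."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def evaluate(queries: List[Query], k_values: Iterable[int] = (1, 5, 10)) -> Dict[str, float]:
    """Average each metric over all queries (assumes a non-empty list)."""
    n = len(queries)
    results = {
        "MRR": sum(reciprocal_rank(r, rel) for r, rel in queries) / n,
        "MAP": sum(average_precision(r, rel) for r, rel in queries) / n,
    }
    for k in k_values:
        results[f"top@{k}"] = sum(top_at_k(r, rel, k) for r, rel in queries) / n
    return results


if __name__ == "__main__":
    example_queries: List[Query] = [
        (["c3", "c1", "c2"], {"c1"}),   # first relevant item at rank 2
        (["c5", "c4"], {"c5", "c4"}),   # both relevant items ranked first
    ]
    print(evaluate(example_queries))
```

Each metric is averaged per bug report, so a single report with many candidate changesets does not outweigh the others.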

Hierarchical Differentiable Fluid Simulation

Xiangyu Kong

Arnaud Schoentgen

Damien Rioux‐Lavoie

Paul G. Kry

Derek Nowrouzezahrai

Differentiable simulation is an emerging field that offers a powerful and flexible route to fluid control. In grid‐based settings, high memory consumption is a long‐standing bottleneck that constrains optimization resolution. We introduce a two‐step algorithm that significantly reduces memory usage: our method first optimizes for bulk forces at reduced resolution, then refines local details over sub‐domains while maintaining differentiability. In trading runtime for memory, it enables optimization at previously unattainable resolutions. We validate its effectiveness and memory savings on a series of fluid control problems.

2025-10-16

Computer Graphics Forum (published)

doi.org

Improving autoformalization via cycle consistency and incremental type-checking using language-model probabilistic programs

Mauricio Barba da Costa

Fabian Zaiser

Katherine M. Collins

Romir Patel

Timothy J. O'Donnell

Alexander K. Lew

Joshua B. Tenenbaum

Vikash Mansinghka

Cameron Freer

2025-10-16

NeurIPS.cc/2025/Workshop/MATH-AI (poster)

openreview.net

