Publications

GraIP: A Benchmarking Framework For Neural Graph Inverse Problems
Andrei Manolache
Arman Mielke
Chendi Qian
Antoine Siraudin
Mathias Niepert
A wide range of graph learning tasks, such as structure discovery, temporal graph analysis, and combinatorial optimization, focus on inferring graph structures from data, rather than making predictions on given graphs. However, the respective methods to solve such problems are often developed in an isolated, task-specific manner and thus lack a unifying theoretical foundation. Here, we provide a stepping stone towards the formation of such a foundation and further development by introducing the Neural Graph Inverse Problem (GraIP) conceptual framework, which formalizes and reframes a broad class of graph learning tasks as inverse problems. Unlike discriminative approaches that directly predict target variables from given graph inputs, the GraIP paradigm addresses inverse problems, i.e., it relies on observational data and aims to recover the underlying graph structure by reversing the forward process, such as message passing or network dynamics, that produced the observed outputs. We demonstrate the versatility of GraIP across various graph learning tasks, including rewiring, causal discovery, and neural relational inference. We also propose benchmark datasets and metrics for each GraIP domain considered, and characterize and empirically evaluate existing baseline methods used to solve them. Overall, our unifying perspective bridges seemingly disparate applications and provides a principled approach to structural learning in constrained and combinatorial settings while encouraging cross-pollination of existing methods across graph inverse problems.
In silico Neutron Relative Biological Effectiveness Estimations For Pre-DNA Repair And Post-DNA Repair Endpoints
Nicolas Desjardins
J. Kildea
Monitoring morphometric drift in lifelong learning segmentation of the spinal cord.
Enamundram Naga Karthik
Christoph Stefan Aigner
Elise Bannier
Josef Bednařík
Virginie Callot
Anna Combes
Armin Curt
Gergely David
Falk Eippert
Lynn Farner
Michael G. Fehlings
Patrick Freund
Tobias Granberg
Cristina Granziera
Rhscir Network Imaging Group
Ulrike Horn
Tomáš Horák
Suzanne Humphreys
Markus Hupp
Anne Kerbrat
Nawal Kinany
Shannon Kolind
Petr Kudlička
Anna Lebret
Lisa Eunyoung Lee
Allan R. Martin
Govind Nair
Megan McGrath
Kristin P. O’Grady
Jiwon Oh
Russell Ouellette
Nikolai Pfender
Dario Pfyffer
Pierre‐François Pradat
Alexandre Prat
Daniel S. Reich
Ilaria Ricchi
Naama Rotem‐Kohavi
Simon Schading-Sassenhausen
Maryam Seif
Andrew Smith
Seth A. Smith
Grace Sweeney
Roger Tam
Anthony Traboulsee
Constantina A. Treaba
Charidimos Tsagkas
Dimitri Van De Ville
Zachary Vavasour
Kenneth A. Weber
Morphometric measures derived from spinal cord segmentations can serve as diagnostic and prognostic biomarkers in neurological diseases and injuries affecting the spinal cord. For instance, the spinal cord cross-sectional area can be used to monitor cord atrophy in multiple sclerosis and to characterize compression in degenerative cervical myelopathy. While robust automatic segmentation methods for a wide variety of contrasts and pathologies have been developed over the past few years, whether their predictions remain stable as the model is updated with new datasets has not been assessed. This is particularly important for deriving normative values from healthy participants. In this study, we present a spinal cord segmentation model trained on a multisite (n=75) dataset, including 9 different MRI contrasts and several spinal cord pathologies. We also introduce a lifelong learning framework to automatically monitor the morphometric drift as the model is updated using additional datasets. The framework is triggered by an automatic GitHub Actions workflow every time a new model is created, recording the morphometric values derived from the model's predictions over time. As a real-world application of the proposed framework, we employed the spinal cord segmentation model to update a recently-introduced normative database of healthy participants containing commonly used measures of spinal cord morphometry.
Results showed that: (i) our model performs well compared to its previous versions and existing pathology-specific models on the lumbar spinal cord, on images with severe compression, and in the presence of intramedullary lesions and/or atrophy, achieving an average Dice score of 0.95 ± 0.03; (ii) the automatic workflow for monitoring morphometric drift provides a quick feedback loop for developing future segmentation models; and (iii) the scaling factor required to update the database of morphometric measures is nearly constant among slices across the given vertebral levels, showing minimal drift between the current and previous versions of the model monitored by the framework. The model is freely available in Spinal Cord Toolbox v7.0.
Divergent creativity in humans and large language models
Antoine Bellemare-Pepin
François Lespinasse
Yann Harel
Kory Mathewson
Jay A. Olson
Psychology Department, Université de Montréal, Montreal, QC, Canada
Music Department, C. University
Sociology and Anthropology Department
Mila
Department of Psychology, University of Toronto Mississauga, Mississauga, ON
Department of Computer Science and Operations Research
UNIQUE Center
The recent surge of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLMs’ semantic diversity, particularly in comparison to human divergent thinking. To bridge this gap, we leverage recent advances in computational creativity to analyze semantic divergence in both state-of-the-art LLMs and a substantial dataset of 100,000 humans. These divergence-based measures index associative thinking—the ability to access and combine remote concepts in semantic space—an established facet of creative cognition. We benchmark performance on the Divergent Association Task (DAT) and across multiple creative-writing tasks (haiku, story synopses, and flash fiction), using identical, objective scoring. We found evidence that LLMs can surpass average human performance on the DAT, and approach human creative writing abilities, yet they remain below the mean creativity scores observed among the more creative segment of human participants. Notably, even the top performing LLMs are still largely surpassed by the aggregated top half of human participants, underscoring a ceiling that current LLMs still fail to surpass. We also systematically varied linguistic strategy prompts and temperature, observing reliable gains in semantic divergence for several models. Our human-machine benchmarking framework addresses the polemic surrounding the imminent replacement of human creative labor by AI, disentangling the quality of the respective creative linguistic outputs using established objective measures.
While prompting deeper exploration of the distinctive elements of human inventive thought compared to those of AI systems, we lay out a series of techniques to improve their outputs with respect to semantic diversity, such as prompt design and hyper-parameter tuning.
Spatial analysis of healthcare services availability and demand for people aged 65 and over in Québec
Juliette Duc
Nevena Veljanovic
Sébastien Barbat-Artigas
David L. Buckeridge
Delphine Bosson-Rieutort
As people age, their healthcare needs increase and become more complex, requiring a corresponding increase in healthcare and services use. Moreover, heterogeneity of healthcare needs and availability can be observed among the health regions within Canadian provinces, especially between rural and urban regions. The province of Québec has received limited attention in this regard. This study aims to describe and compare healthcare service locations and aging-population healthcare demand across Québec. We used data from Données Québec to describe the distribution of available healthcare (such as facilities, their services and capacity) and potential demand for services (represented by the location of the aged population) and mapped their relationship based on urbanization level. Analyses were performed using QGIS and R software. We found substantial variability in the population aged 65 and over, the number of facilities, the number and type of services, and long-term care (LTC) beds between regions in Québec. The number of LTC beds was significantly correlated with the number of people aged 65 and over (R² = 0.88, p < 0.001), but not with their proportion. LTC accommodation is a service mostly offered in urban areas, especially in the Montréal region.
Diffusion Large Language Models for Black-Box Optimization
Can Chen
Christopher Pal
Xue Liu
Offline black-box optimization (BBO) aims to find optimal designs based solely on an offline dataset of designs and their labels. Such scenarios frequently arise in domains like DNA sequence design and robotics, where only a few labeled data points are available. Traditional methods typically rely on task-specific proxy or generative models, overlooking the in-context learning capabilities of pre-trained large language models (LLMs). Recent efforts have adapted autoregressive LLMs to BBO by framing task descriptions and offline datasets as natural language prompts, enabling direct design generation. However, these designs often contain bidirectional dependencies, which left-to-right models struggle to capture. In this paper, we explore diffusion LLMs for BBO, leveraging their bidirectional modeling and iterative refinement capabilities. This motivates our in-context denoising module: we condition the diffusion LLM on the task description and the offline dataset, both formatted in natural language, and prompt it to denoise masked designs into improved candidates. To guide the generation toward high-performing designs, we introduce masked diffusion tree search, which casts the denoising process as a step-wise Monte Carlo Tree Search that dynamically balances exploration and exploitation. Each node represents a partially masked design, each denoising step is an action, and candidates are evaluated via expected improvement under a Gaussian Process trained on the offline dataset. Our method, dLLM, achieves state-of-the-art results in few-shot settings on design-bench.
Enhancing link prediction in biomedical knowledge graphs with BioPathNet
Emy Yue Hu
Svitlana Oleshko
Samuele Firmani
Hui Cheng
Maria Ulmer
Matthias Arnold
Maria Colomé-Tatché
Annalisa Marsico
Understanding complex interactions in biomedical networks is crucial for advancements in biomedicine, but traditional link prediction (LP) methods are limited in capturing this complexity. We present BioPathNet, a graph neural network framework based on the neural Bellman–Ford network (NBFNet), addressing limitations of traditional representation-based learning methods through path-based reasoning for LP in biomedical knowledge graphs. Unlike node-embedding frameworks, BioPathNet learns representations between node pairs by considering all relations along paths, enhancing prediction accuracy and interpretability, and allowing visualization of influential paths and biological validation. BioPathNet leverages a background regulatory graph for enhanced message passing and uses stringent negative sampling to improve precision and scalability. BioPathNet outperforms or matches existing methods across diverse tasks including gene function annotation, drug–disease indication, synthetic lethality and lncRNA–target interaction prediction. Our study identifies promising additional drug indications for diseases such as acute lymphoblastic leukaemia and Alzheimer’s disease, validated by medical experts and clinical trials. In addition, we prioritize putative synthetic lethal gene pairs and regulatory lncRNA–target interactions. BioPathNet’s interpretability will enable researchers to trace prediction paths and gain molecular insights.
Modeling and Simulation of Neocortical Micro- and Mesocircuitry. Part I: Anatomy
Michael W. Reimann
Sirio Bolaños-Puchet
Jean-Denis Courcol
Daniela Egas Santander
Alexis Arnaudon
Benoît Coste
Fabien Delalondre
Thomas Delemontex
Adrien Devresse
Hugo Dictus
Alexander Dietz
András Ecker
Cyrille Favreau
Gianluca Ficarelli
Michael Gevaert
Juan B. Hernando
Joni Herttuainen
James B. Isbister
Lida Kanari
Daniel Keller
James King
Pramod Kumbhar
Samuel Lapere
Jānis Lazovskis
Huanxiang Lu
Nicolas Ninin
Fernando Pereira
Judit Planas
Christoph Pokorny
Juan Luis Riquelme
Armando Romani
Ying Shi
Jason P. Smith
Vishal Sood
Mohit Srivastava
Werner Van Geit
Liesbeth Vanherpe
Matthias Wolf
Ran Levi
Kathryn Hess
Felix Schürmann
Henry Markram
Srikanth Ramaswamy
The function of the neocortex is fundamentally determined by its repeating microcircuit motif, but also by its rich, interregional connectivity. We present a data-driven computational model of the anatomy of non-barrel primary somatosensory cortex of juvenile rat, integrating whole-brain scale data while providing cellular and subcellular specificity. The model consists of 4.2 million morphologically detailed neurons, placed in a digital brain atlas. They are connected by 14.2 billion synapses, comprising local, long-range and extrinsic connectivity. We delineated the limits of determining connectivity from anatomy, finding that it reproduces the targeting of PV+ and VIP+ interneurons only with explicitly added specificity, but that of Sst+ neurons even without it. Globally, connectivity was characterized by local clusters tied together through hub neurons in layer 5, demonstrating that local and interregional connectivity form inseparable, intertwined networks. A 211,712 neuron subvolume of the model has been made freely and openly available to the community.
Modeling and Simulation of Neocortical Micro- and Mesocircuitry. Part II: Physiology and Experimentation
James B. Isbister
András Ecker
Christoph Pokorny
Sirio Bolaños-Puchet
Daniela Egas Santander
Alexis Arnaudon
Omar Awile
Natali Barros-Zulaica
Jorge Blanco Alonso
Elvis Boci
Giuseppe Chindemi
Jean-Denis Courcol
Tanguy Damart
Thomas Delemontex
Alexander Dietz
Gianluca Ficarelli
Michael Gevaert
Joni Herttuainen
Genrich Ivaska
Weina Ji
Daniel Keller
James King
Pramod Kumbhar
Samuel Lapere
Polina Litvak
Darshan Mandge
Fernando Pereira
Judit Planas
Rajnish Ranjan
Maria Reva
Armando Romani
Christian Rössert
Felix Schürmann
Vishal Sood
Aleksandra Teska
Anıl Tuncel
Werner Van Geit
Matthias Wolf
Henry Markram
Srikanth Ramaswamy
Michael W. Reimann
Cortical dynamics underlie many cognitive processes and emerge from complex multi-scale interactions, which can be studied in large-scale, biophysically detailed models. We present a model comprising eight somatosensory cortex subregions, 4.2 million morphologically and electrically detailed neurons, and 13.2 billion local and long-range synapses. In silico tools enabled reproduction and extension of complex laboratory experiments under a single parameterization, providing strong validation. We reproduced millisecond-precise stimulus-responses, stimulus-encoding under targeted optogenetic activation, and selective propagation of stimulus-evoked activity to downstream areas. The model’s direct correspondence with biology generated predictions about how multiscale organisation shapes activity. We predict that structural and functional recurrency increases towards deeper layers and that stronger innervation by long-range connectivity increases local correlated activity. The model also predicts the role of inhibitory interneuron types in stimulus encoding, and of different layers in driving layer 2/3 stimulus responses. Simulation tools and a large subvolume of the model are made available.
Opportunities in AI/ML for the Rubin LSST Dark Energy Science Collaboration
LSST Dark Energy Science Collaboration
Eric Aubourg
Camille Avestruz
M. R. Becker
Biswajit Biswas
Rahul Biswas
Boris Bolliet
César Briceño
Clecio Bom
Raphaël Bonnet-Guerrini
Alexandre Boucaud
J.E. Campagne
Chihway Chang
Aleksandra Ćiprijanović
Johann Cohen-Tanugi
Michael W. Coughlin
John Franklin Crenshaw
Juan C. Cuevas‐Tello
Juan de Vicente
Seth William Digel
Steven Dillmann
Mariano Javier de León Dominguez Romero
Alex Drlica-Wagner
Sydney Erickson
Alexander Gagliano
Christos Georgiou
Aritra Ghosh
Matthew Grayling
Kirill A. Grishin
Alan Heavens
Lindsay R. House
Mustapha Ishak
Wassim Kabalan
Olivia Lynn
François Lanusse
C. Danielle Leonard
P.-F. Léget
Michelle Lochner
Joel Meyers
Peter Melchior
Grant Merz
Martin Millon
Anais Möller
G. Narayan
Yuuki Omori
Hiranya Peiris
A. A. Plazas
Nesar Ramachandra
B. Remy
C. Roucelle
Jaime Ruiz-Zapatero
Stefan Schuldt
I. Sevilla-Noarbe
Ved G. Shah
Tjitske Starkenburg
Stephen Thorp
Tianqing Zhang
Tilman Tröster
Roberto Trotta
Padma T. Venkatraman
A. R. Wasserman
Tim White
Tianqing Zhang
Yuanyuan Zhang
Adam S. Bolton
Arun Kannawadi
Yao-Yuan Mao
Laura Toribio San Cipriano
The Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) will produce unprecedented volumes of heterogeneous astronomical data (images, catalogs, and alerts) that challenge traditional analysis pipelines. The LSST Dark Energy Science Collaboration (DESC) aims to derive robust constraints on dark energy and dark matter from these data, requiring methods that are statistically powerful, scalable, and operationally reliable. Artificial intelligence and machine learning (AI/ML) are already embedded across DESC science workflows, from photometric redshifts and transient classification to weak lensing inference and cosmological simulations. Yet their utility for precision cosmology hinges on trustworthy uncertainty quantification, robustness to covariate shift and model misspecification, and reproducible integration within scientific pipelines. This white paper surveys the current landscape of AI/ML across DESC's primary cosmological probes and cross-cutting analyses, revealing that the same core methodologies and fundamental challenges recur across disparate science cases. Since progress on these cross-cutting challenges would benefit multiple probes simultaneously, we identify key methodological research priorities, including Bayesian inference at scale, physics-informed methods, validation frameworks, and active learning for discovery. With an eye on emerging techniques, we also explore the potential of the latest foundation model methodologies and LLM-driven agentic AI systems to reshape DESC workflows, provided their deployment is coupled with rigorous evaluation and governance. Finally, we discuss critical software, computing, data infrastructure, and human capital requirements for the successful deployment of these new methodologies, and consider associated risks and opportunities for broader coordination with external actors.
CISO: Species distribution modelling Conditioned on Incomplete Species Observations
Hager Radi Abdelwahed
Mélisande Teng
Robin Zbinden
Laura Pollock
Hugo Larochelle
D. Tuia
David Rolnick
Species distribution models (SDMs) are widely used to predict species' geographic distributions, serving as critical tools for ecological research and conservation planning. Typically, SDMs relate species occurrences to environmental variables representing abiotic factors, such as temperature, precipitation, and soil properties. However, species distributions are also strongly influenced by biotic interactions with other species, which are often overlooked in traditional models. While some methods, such as joint species distribution models (JSDMs), partially address this limitation by incorporating biotic interactions, they often assume symmetrical pairwise relationships between species and require consistent co‐occurrence data. In practice, species observations are often sparse, and the availability of information about the presence or absence of other species varies significantly across locations. To address these challenges, we propose CISO, a deep learning‐based method for species distribution modelling Conditioned on Incomplete Species Observations. CISO enables predictions to be conditioned on a flexible number of species observations alongside environmental variables, accommodating the variability and incompleteness of available biotic data. We demonstrate our approach using three datasets representing different species groups: sPlotOpen for plants, SatBird for birds, and a new dataset, SatButterfly, for butterflies. Our results show that including partial biotic information improves predictive performance on spatially separate test sets. When conditioned on a subset of species within the same dataset, CISO outperforms alternative methods in predicting the distribution of the remaining species for plants and birds. Furthermore, we show that combining and conditioning on observations from multiple datasets can improve the prediction of species occurrences in scenarios with sufficient co‐occurrences between datasets to train CISO effectively.
Our results show that CISO is a promising ecological tool, capable of incorporating incomplete biotic information and identifying potential interactions between species from disparate taxa.