Publications

Health satisfaction outcome from integrated autonomous mobile clinics
Yuzhang Huang
Shaoshan Liu
Zhongying Pan
Carl Wu
Herng-Chia Chiu
Xue Liu
Leiyu Shi
GFlowNets for Hamiltonian decomposition in groups of compatible operators
Isaac L. Huidobro-Meezs
R. A. Vargas-Hernández
Quantum computing presents a promising alternative for the direct simulation of quantum systems with the potential to explore chemical problems beyond the capabilities of classical methods. However, current quantum algorithms are constrained by hardware limitations and the increased number of measurements required to achieve chemical accuracy. To address the measurement challenge, techniques for grouping commuting and anti-commuting terms, driven by heuristics, have been developed to reduce the number of measurements needed in quantum algorithms on near-term quantum devices. In this work, we propose a probabilistic framework using GFlowNets to group fully (FC) or qubit-wise commuting (QWC) terms within a given Hamiltonian. The significance of this approach is demonstrated by the reduced number of measurements for the found groupings: reduction factors of 51% and 67% for FC and QWC partitionings, respectively, with respect to greedy coloring algorithms, highlighting the potential of GFlowNets for future applications in the measurement problem. Furthermore, the flexibility of our algorithm extends its applicability to other resource optimization problems in Hamiltonian simulation, such as circuit design.
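As a rough illustration of the qubit-wise commuting (QWC) grouping task described in this abstract, the sketch below checks QWC between Pauli strings and groups them with a simple first-fit greedy heuristic. It is not the GFlowNet method, and it is not necessarily the greedy coloring baseline used by the authors; the function names and the two-qubit term list are hypothetical and chosen only for illustration.

```python
from typing import List


def qubit_wise_commute(p: str, q: str) -> bool:
    """Return True if two Pauli strings qubit-wise commute.

    On every qubit the two single-qubit Paulis must be equal,
    or at least one of them must be the identity "I".
    """
    if len(p) != len(q):
        raise ValueError("Pauli strings must act on the same number of qubits")
    return all(a == b or a == "I" or b == "I" for a, b in zip(p, q))


def greedy_qwc_groups(terms: List[str]) -> List[List[str]]:
    """First-fit greedy grouping (an assumed, simple heuristic).

    Each term goes into the first existing group whose members all
    qubit-wise commute with it; otherwise it starts a new group.
    """
    groups: List[List[str]] = []
    for term in terms:
        for group in groups:
            if all(qubit_wise_commute(term, other) for other in group):
                group.append(term)
                break
        else:
            groups.append([term])
    return groups


# Hypothetical two-qubit Hamiltonian terms, for illustration only.
terms = ["ZI", "IZ", "ZZ", "XX", "XI", "IX", "YY"]
groups = greedy_qwc_groups(terms)
print(groups)
```

Each resulting group can in principle be measured in a single shared basis, so fewer groups generally means fewer distinct measurement settings. The result of first-fit grouping depends on the order in which terms are visited.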
Generating Tabular Data Using Heterogeneous Sequential Feature Forest Flow Matching
Circulating IL-17F, but not IL-17A, is elevated in severe COVID-19 and leads to an ERK1/2 and p38 MAPK-dependent increase in ICAM-1 cell surface expression and neutrophil adhesion on endothelial cells
Jérôme Bédard-Matteau
Katelyn Yixiu Liu
Lyvia Fourcade
Douglas D. Fraser
Simon Rousseau
Severe COVID-19 is associated with neutrophilic inflammation and immunothrombosis. Several members of the IL-17 cytokine family have been associated with neutrophilic inflammation and activation of the endothelium. Therefore, we investigated whether these cytokines were associated with COVID-19. We investigated the association between COVID-19 and circulating plasma levels of IL-17 cytokine family members in participants in the Biobanque québécoise de la COVID-19 (BQC19), a prospective observational cohort, and an independent cohort from Western University (London, Ontario). We measured the in vitro impact of IL-17F on intercellular adhesion molecule 1 (ICAM-1) cell surface expression and neutrophil adhesion on endothelial cells in culture. The contribution of two Mitogen Activated Protein Kinase (MAPK) pathways was determined using small molecule inhibitors PD184352 (an MKK1/MKK2 inhibitor) and BIRB0796 (a p38 MAPK inhibitor). We found increased IL-17D and IL-17F plasma levels when comparing SARS-CoV-2-positive vs negative hospitalized participants. Moreover, increased plasma levels of IL-17D, IL-17E and IL-17F were noted when comparing severe versus mild COVID-19. IL-17F, but not IL-17A, was significantly elevated in people with COVID-19 compared to healthy controls and with more severe disease. In vitro work on endothelial cells treated with IL-17F for 24h showed an increased cell surface expression of ICAM-1 accompanied by neutrophil adhesion. The introduction of two MAPK inhibitors significantly reduced the binding of neutrophils while also reducing ICAM-1 expression at the surface level of endothelial cells, but not its intracellular expression. Overall, these results have identified an association between two cytokines of the IL-17 family (IL-17D and IL-17F) with COVID-19 and disease severity. Considering that IL-17F stimulation promotes neutrophil adhesion to the endothelium in a MAPK-dependent manner, it is attractive to speculate that this pathway may contribute to pathogenic immunothrombosis in concert with other molecular effectors.
A Complexity-Based Theory of Compositionality
Convergence of Manifold Filter-Combine Networks
David R. Johnson
Joyce Chew
Edward De Brouwer
Deanna Needell
Michael Perlmutter
In order to better understand manifold neural networks (MNNs), we introduce Manifold Filter-Combine Networks (MFCNs). The filter-combine framework parallels the popular aggregate-combine paradigm for graph neural networks (GNNs) and naturally suggests many interesting families of MNNs which can be interpreted as the manifold analog of various popular GNNs. We then propose a method for implementing MFCNs on high-dimensional point clouds that relies on approximating the manifold by a sparse graph. We prove that our method is consistent in the sense that it converges to a continuum limit as the number of data points tends to infinity.
Assessment of the Climate Trace global powerplant CO2 emissions
Kevin R. Gurney
Bilal Aslam
Pawlok Dass
Lech Gawuc
Jarrett J Barber
Anna Kato
Accurate estimation of planetary greenhouse gas (GHG) emissions at the scale of individual emitting activities is a critical need for both scientific and policy applications. Powerplants represent the single largest and most concentrated form of global GHG emissions. Climate Trace, co-founded and promoted by former U.S. Vice President Al Gore, is a new effort using, in part, artificial intelligence (AI) approaches to estimate asset-scale GHG emissions. Climate Trace recently released a database of global powerplant CO2 emissions at the facility-scale that uses both AI and non-AI estimation approaches. However, no independent peer-reviewed assessment has been made of this important global emissions database. Here, we compare the Climate Trace powerplant CO2 emissions to an atmospherically calibrated, multi-constraint estimate of powerplant CO2 emissions in the United States. The 3.7% (65) of compared facilities that used an AI-based approach show a mean relative difference (MRD) of −1.1% (SD: 46.4%) in the year 2019. The 96.3% (1726) of the facilities that used a non-AI-based approach show an MRD of −50.0% (SD: 117.7%). Of the non-AI estimated facilities, 151 (8.7%) facilities agree to within ±20%. The large differences between Climate Trace and Vulcan-power emission estimates for these facilities are primarily caused by Climate Trace's use of a national-mean power plant capacity factor (CF), which is a poor representation of the reported power plant CFs of individual US facilities and leads to very large errors at those same 1726 facilities.
A Simulation System Towards Solving Societal-Scale Manipulation
Austin Welch
Gayatri K
Dan Zhao
Hao Yu
Ethan Kosak-Hine
Tom Gibbs
Busra Tugce Gurbuz
The rise of AI-driven manipulation poses significant risks to societal trust and democratic processes. Yet, studying these effects in real-world settings at scale is ethically and logistically impractical, highlighting a need for simulation tools that can model these dynamics in controlled settings to enable experimentation with possible defenses. We present a simulation environment designed to address this. We elaborate upon the Concordia framework, which simulates offline, 'real life' activity, by adding online interactions to the simulation through social media with the integration of a Mastodon server. We improve simulation efficiency and information flow, and add a set of measurement tools, particularly longitudinal surveys. We demonstrate the simulator with a tailored example in which we track agents' political positions and show how partisan manipulation of agents can affect election results.
Training Compute-Optimal Vision Transformers for Brain Encoding
Sana Ahmadi
François Paugam
Tristan Glatard
Lune P Bellec
The optimal training of a vision transformer for brain encoding depends on three factors: model size, data size, and computational resources. This study investigates these three pillars, focusing on the effects of data scaling, model scaling, and high-performance computing on brain encoding results. Using VideoGPT to extract efficient spatiotemporal features from videos and training a Ridge model to predict brain activity based on these features, we conducted benchmark experiments with varying data sizes (10k, 100k, 1M, 6M) and different model configurations of GPT-2, including hidden layer dimensions, number of layers, and number of attention heads. We also evaluated the effects of training models with 32-bit vs 16-bit floating point representations. Our results demonstrate that increasing the hidden layer dimensions significantly improves brain encoding performance, as evidenced by higher Pearson correlation coefficients across all subjects. In contrast, the number of attention heads does not have a significant effect on the encoding results. Additionally, increasing the number of layers shows some improvement in brain encoding correlations, but the trend is not as consistent as that observed with hidden layer dimensions. The data scaling results show that larger training datasets lead to improved brain encoding performance, with the highest Pearson correlation coefficients observed for the largest dataset size (6M). These findings highlight that the effects of data scaling are more significant than those of model scaling in enhancing brain encoding performance. Furthermore, we explored the impact of floating-point precision by comparing 32-bit and 16-bit representations. Training with 16-bit precision yielded the same brain encoding accuracy as 32-bit, while reducing training time by a factor of 1.17, demonstrating its efficiency for high-performance computing tasks.
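The evaluation step in this abstract (a Ridge model mapping features to brain activity, scored by Pearson correlation) can be sketched as follows. This is a minimal illustration on synthetic data, assuming a closed-form ridge solution without an intercept; the array sizes, the regularization strength `alpha`, and the function names are assumptions and do not come from the study, which uses VideoGPT features and recorded brain data.

```python
import numpy as np


def fit_ridge(X: np.ndarray, Y: np.ndarray, alpha: float) -> np.ndarray:
    """Closed-form ridge weights: W = (X^T X + alpha * I)^{-1} X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def pearson_per_target(Y_true: np.ndarray, Y_pred: np.ndarray) -> np.ndarray:
    """Pearson correlation between true and predicted values, per column."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return (yt * yp).sum(axis=0) / denom


# Synthetic stand-ins for stimulus features and brain targets (assumed sizes).
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_targets = 200, 100, 16, 5
W_true = rng.normal(size=(n_features, n_targets))
X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
Y_train = X_train @ W_true + 0.5 * rng.normal(size=(n_train, n_targets))
Y_test = X_test @ W_true + 0.5 * rng.normal(size=(n_test, n_targets))

W = fit_ridge(X_train, Y_train, alpha=1.0)
Y_pred = X_test @ W
r = pearson_per_target(Y_test, Y_pred)  # one correlation per target
print(r)
```

In a real encoding pipeline the regularization strength would typically be chosen by cross-validation, and correlations would be computed on held-out data for each subject.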
BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation
Ricardo de Azambuja
Real-time aerial image segmentation plays an important role in the environmental perception of Uncrewed Aerial Vehicles (UAVs). We introduce BlabberSeg, an optimized Vision-Language Model built on CLIPSeg for on-board, real-time processing of aerial images by UAVs. BlabberSeg improves the efficiency of CLIPSeg by reusing prompt and model features, reducing computational overhead while achieving real-time open-vocabulary aerial segmentation. We validated BlabberSeg in a safe landing scenario using the Dynamic Open-Vocabulary Enhanced SafE-Landing with Intelligence (DOVESEI) framework, which uses visual servoing and open-vocabulary segmentation. BlabberSeg reduces computational costs significantly, with a speed increase of 927.41% (16.78 Hz) on an NVIDIA Jetson Orin AGX (64GB) compared with the original CLIPSeg (1.81 Hz), achieving real-time aerial segmentation with negligible loss in accuracy (2.1% as the ratio of the correctly segmented area with respect to CLIPSeg). BlabberSeg's source code is open and available online.
The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse
Ekansh Sharma
Daniel M. Roy
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
Genta Indra Winata
Frederikus Hudi
Patrick Amadeus Irawan
David Anugraha
Rifki Afina Putri
Yutong Wang
Adam Nohejl
Ubaidillah Ariq Prathama
Nedjma OUSIDHOUM
Afifa Amriani
Anar Rzayev
Anirban Das
Ashmari Pramodya
Aulia Adila
Bryan Wilie
Candy Olivia Mawalim
Ching Lam Cheng
Daud Abolade
Emmanuele Chersoni
Enrico Santus
Fariz Ikhwantri
Garry Kuwanto
Hanyang Zhao
Haryo Akbarianto Wibowo
Holy Lovenia
Jan Christian Blaise Cruz
Jan Wira Gotama Putra
Junho Myung
Lucky Susanto
Maria Angelica Riera Machin
Marina Zhukova
Michael Anugraha
Muhammad Farid Adilazuarda
Natasha Santosa
Peerat Limkonchotiwat
Raj Dabre
Rio Alexander Audino
Samuel Cahyawijaya
Shi-Xiong Zhang
Stephanie Yulia Salim
Yi Zhou
Yinxuan Gui
En-Shiun Annie Lee
Shogo Okada
Ayu Purwarianti
Alham Fikri Aji
Taro Watanabe
Derry Tanti Wijaya
Alice Oh
Chong-Wah Ngo