Publications

Hierarchies define the scalability of robot swarms
Vivek Shankar Vardharajan
Karthik Soma
Sepand Dyanatkar
Pierre-Yves Lajoie
The emerging behaviors of swarms have fascinated scientists and garnered significant interest in the field of robotics. Traditionally, swarms are viewed as egalitarian, with robots sharing identical roles and capabilities. However, recent findings highlight the importance of hierarchy for deploying robot swarms more effectively in diverse scenarios. Despite nature's preference for hierarchies, the robotics field has clung to the egalitarian model, partly due to a lack of empirical evidence for the conditions favoring hierarchies. Our research demonstrates that while egalitarian swarms excel in environments proportionate to their collective sensing abilities, they struggle in larger or more complex settings. Hierarchical swarms, conversely, extend their sensing reach efficiently, proving successful in larger, more unstructured environments with fewer resources. We validated these concepts through simulations and physical robot experiments, using a complex radiation cleanup task. This study paves the way for developing adaptable, hierarchical swarm systems applicable in areas like planetary exploration and autonomous vehicles. Moreover, these insights could deepen our understanding of hierarchical structures in biological organisms.
Schrödinger's Update: User Perceptions of Uncertainties in Proprietary Large Language Model Updates
Zilin Ma
Yiyang Mei
Krzysztof Z. Gajos
Stochastic Frank-Wolfe: Unified Analysis and Zoo of Special Cases
Ruslan Nazykov
Aleksandr Shestakov
Vladimir Solodkin
Aleksandr Beznosikov
Alexander Gasnikov
The Conditional Gradient (or Frank-Wolfe) method is one of the most well-known methods for solving constrained optimization problems appearing in various machine learning tasks. The simplicity of its iterations and its applicability to many practical problems have helped the method gain popularity in the community. In recent years, the Frank-Wolfe algorithm has received many extensions, including stochastic modifications with variance reduction and coordinate sampling for training huge models, as well as distributed variants for big-data problems. In this paper, we present a unified convergence analysis of the Stochastic Frank-Wolfe method that covers a large number of practical special cases, which may differ completely in the nature of their stochasticity, intuition, and application areas. Our analysis is based on a key parametric assumption on the variance of the stochastic gradients. Unlike most unified analyses of other methods, such as SGD, we do not assume unbiasedness of the gradient estimates. We analyze both convex and non-convex problems, given the popularity of both cases in machine learning. With this general theoretical framework, we not only recover the rates of many known methods but also develop numerous new ones, demonstrating the flexibility of the Conditional Gradient approach for deriving new algorithms. We also demonstrate the properties of the new methods through numerical experiments.
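For readers unfamiliar with the base algorithm the paper generalizes, here is a minimal sketch of Frank-Wolfe with stochastic gradients on a toy problem over the probability simplex. The step-size rule and linear minimization oracle are the standard textbook choices, not details taken from the paper:

```python
import numpy as np

def stochastic_frank_wolfe(grad_estimate, x0, num_steps=200, seed=0):
    """Minimal Frank-Wolfe sketch over the probability simplex.

    grad_estimate(x, rng) returns a (possibly noisy) stochastic gradient;
    the linear minimization oracle over the simplex returns a vertex.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for t in range(num_steps):
        g = grad_estimate(x, rng)
        # LMO over the simplex: vertex at the most negative gradient coordinate
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        gamma = 2.0 / (t + 2.0)          # classical step size
        x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Toy problem: minimize ||x - c||^2 over the simplex with noisy gradients
c = np.array([0.1, 0.7, 0.2])
noisy_grad = lambda x, rng: 2 * (x - c) + 0.01 * rng.standard_normal(3)
x_star = stochastic_frank_wolfe(noisy_grad, np.ones(3) / 3)
```

Because every iterate is a convex combination of simplex vertices, the method never needs a projection step, which is the main appeal of the Conditional Gradient family.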
2851: Operational Ontology for Oncology (O3) - Multi-professional society standard supporting AI
Charles S. Mayo
Mary U. Feng
Kristy K. Brock
Randi Kudner
Peter Balter
Jeffrey Buchsbaum
Amanda Caissie
Emily Daugherty
Andre Dekker
Clifton D. Fuller
Julian Hong
David Hong
Sophia Kamran
Evangelia Katsoulakis
J. Kildea
Andra Krauze
Jon Kruse
Todd McNutt
Michelle Mierzwa
Amy Moreno … (5 more authors)
Jatinder Palta
Richard Popple
Thomas Purdie
Susan Yom
Xiao Ying
Beyond the Norms: Detecting Prediction Errors in Regression Models
Andres Altieri
Marco Romanelli
Georg Pichler
Florence Alberge
This paper tackles the challenge of detecting unreliable behavior in regression algorithms, which may arise from intrinsic variability (e.g., aleatoric uncertainty) or modeling errors (e.g., model uncertainty). First, we formally introduce the notion of unreliability in regression, i.e., when the output of the regressor exceeds a specified discrepancy (or error). Then, using powerful tools for probabilistic modeling, we estimate the discrepancy density, and we measure its statistical diversity using our proposed metric for statistical dissimilarity. In turn, this allows us to derive a data-driven score that expresses the uncertainty of the regression outcome. We show empirical improvements in error detection for multiple regression tasks, consistently outperforming popular baseline approaches, and contributing to the broader field of uncertainty quantification and safe machine learning systems.
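To make the "unreliability" notion concrete: a much simpler stand-in for the paper's density-based score is the empirical probability, over held-out residuals, that the regressor's discrepancy exceeds a user-specified tolerance. This sketch is illustrative only and does not reproduce the paper's proposed metric:

```python
import numpy as np

def unreliability_score(residuals_val, threshold):
    """Empirical P(|error| > threshold) from held-out validation residuals.

    A simplified stand-in for a density-based unreliability score: the
    fraction of validation errors exceeding the specified discrepancy.
    """
    return np.mean(np.abs(residuals_val) > threshold)

rng = np.random.default_rng(0)
residuals = rng.normal(0, 1.0, size=10_000)   # pretend validation residuals
score = unreliability_score(residuals, threshold=2.0)
# for N(0, 1) residuals this should be close to P(|Z| > 2), about 0.046
```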
Body size interacts with the structure of the central nervous system: A multi-center in vivo neuroimaging study
René Labounek
Monica T. Bondy
Amy L. Paulson
Mihael Abramovic
Eva Alonso-Ortiz
Nicole T Atcheson
Laura R. Barlow
Robert L. Barry
Markus Barth
Marco Battiston
Christian Büchel
Matthew D. Budde
Virginie Callot
Anna Combes
Benjamin De Leener
Maxime Descoteaux
Paulo Loureiro de Sousa
Marek Dostál
Julien Doyon … (74 more authors)
Adam V. Dvorak
Falk Eippert
Karla R. Epperson
Kevin S. Epperson
Patrick Freund
Jürgen Finsterbusch
Alexandru Foias
Michela Fratini
Issei Fukunaga
Claudia A. M. Gandini Wheeler-Kingshott
Giancarlo Germani
Guillaume Gilbert
Federico Giove
Francesco Grussu
Akifumi Hagiwara
Pierre-Gilles Henry
Tomáš Horák
Masaaki Hori
James M. Joers
Kouhei Kamiya
Haleh Karbasforoushan
Miloš Keřkovský
Ali Khatibi
Joo-Won Kim
Nawal Kinany
Hagen Kitzler
Shannon Kolind
Yazhuo Kong
Petr Kudlička
Paul Kuntke
Nyoman D. Kurniawan
Slawomir Kusmia
Maria Marcella Laganà
Cornelia Laule
Christine S. W. Law
Tobias Leutritz
Yaou Liu
Sara Llufriu
Sean Mackey
Allan R. Martin
Eloy Martinez-Heras
Loan Mattera
Kristin P. O’Grady
Nico Papinutto
Daniel Papp
Deborah Pareto
Todd B. Parrish
Anna Pichiecchio
Ferran Prados
Àlex Rovira
Marc J. Ruitenberg
Rebecca S. Samson
Giovanni Savini
Maryam Seif
Alan C. Seifert
Alex K. Smith
Seth A. Smith
Zachary A. Smith
Elisabeth Solana
Yuichi Suzuki
George W Tackley
Alexandra Tinnermann
Dimitri Van De Ville
Marios C. Yiannakas
Kenneth A. Weber
Nikolaus Weiskopf
Richard G. Wise
Patrik O. Wyss
Junqian Xu
Christophe Lenglet
Igor Nestrasil
Clinical research emphasizes the implementation of rigorous and reproducible study designs that rely on between-group matching or controlling for sources of biological variation such as subject's sex and age. However, corrections for body size (i.e., height and weight) are mostly lacking in clinical neuroimaging designs. This study investigates the importance of body size parameters in their relationship with spinal cord (SC) and brain magnetic resonance imaging (MRI) metrics. Data were derived from a cosmopolitan population of 267 healthy human adults (age 30.1±6.6 years, 125 females). We show that body height correlated strongly or moderately with brain gray matter (GM) volume, cortical GM volume, total cerebellar volume, brainstem volume, and cross-sectional area (CSA) of cervical SC white matter (CSA-WM; 0.44≤r≤0.62). In comparison, age correlated weakly with cortical GM volume, precentral GM volume, and cortical thickness (−0.21≥r≥−0.27). Body weight correlated weakly with magnetization transfer ratio in the SC WM, dorsal columns, and lateral corticospinal tracts (−0.20≥r≥−0.23). Body weight further correlated weakly with the mean diffusivity derived from diffusion tensor imaging (DTI) in SC WM (r=−0.20) and dorsal columns (r=−0.21), but only in males. CSA-WM correlated strongly or moderately with brain volumes (0.39≤r≤0.64), and weakly with precentral gyrus thickness and DTI-based fractional anisotropy in SC dorsal columns and SC lateral corticospinal tracts (−0.22≥r≥−0.25). A linear mixture of sex and age explained 26±10% of the variance in brain volumetry and SC CSA. The explained variance increased to 33±11% when body height was added to the mixture model. Age itself explained only 2±2% of this variance. In conclusion, body size is a significant biological variable. Along with sex and age, body size should therefore be included as a mandatory variable in the design of clinical neuroimaging studies examining SC and brain structure.
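The study's central comparison, how much extra variance a body-size covariate explains on top of sex and age, can be illustrated with ordinary least squares on simulated data. The data-generating coefficients below are invented for illustration; only the cohort size (267) and the age distribution (30.1±6.6) echo the abstract:

```python
import numpy as np

def r2(y, X):
    """Variance explained by an ordinary least-squares fit of y on X."""
    X = np.column_stack([np.ones(len(y)), X])     # add intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

rng = np.random.default_rng(1)
n = 267                                   # cohort size from the study
sex = rng.integers(0, 2, n).astype(float)
age = rng.normal(30.1, 6.6, n)
height = rng.normal(170, 10, n)
# simulated brain volume driven mostly by height, as the study reports
volume = 0.5 * height + 2.0 * sex - 0.05 * age + rng.normal(0, 3, n)

base = r2(volume, np.column_stack([sex, age]))
with_height = r2(volume, np.column_stack([sex, age, height]))
# adding height should raise the explained variance substantially
```

The same pattern, a jump in explained variance once height enters the model, is what motivates the paper's recommendation to treat body size as a mandatory design variable.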
ChatGPT: What Every Pediatric Surgeon Should Know About Its Potential Uses and Pitfalls
Raquel González
Russell Woo
A Francois Trappey
Stewart Carter
David Darcy
Ellen Encisco
Brian Gulack
Doug Miniati
Edzhem Tombash
Eunice Y. Huang
Code as Reward: Empowering Reinforcement Learning with VLMs
David Venuto
Sami Nur Islam
Sherry Yang
Pre-trained Vision-Language Models (VLMs) are able to understand visual concepts, describe and decompose complex tasks into sub-tasks, and provide feedback on task completion. In this paper, we aim to leverage these capabilities to support the training of reinforcement learning (RL) agents. In principle, VLMs are well suited for this purpose, as they can naturally analyze image-based observations and provide feedback (reward) on learning progress. However, inference in VLMs is computationally expensive, so querying them frequently to compute rewards would significantly slow down the training of an RL agent. To address this challenge, we propose a framework named Code as Reward (VLM-CaR). VLM-CaR produces dense reward functions from VLMs through code generation, thereby significantly reducing the computational burden of querying the VLM directly. We show that the dense rewards generated through our approach are very accurate across a diverse set of discrete and continuous environments, and can be more effective in training RL policies than the original sparse environment rewards.
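The key efficiency argument is that once a VLM has emitted reward code, training queries the cheap code instead of the expensive model. The sketch below shows the shape of that idea; the reward function, the observation keys, and the wrapper interface are all hypothetical and not taken from the paper:

```python
# Hypothetical reward code of the kind a VLM might emit for a simple
# goal-reaching task (names and logic are illustrative, not from the paper).
def generated_reward(obs):
    """Dense reward: negative distance to goal, plus a +1 bonus at the goal."""
    agent, goal = obs["agent_pos"], obs["goal_pos"]
    dist = sum((a - g) ** 2 for a, g in zip(agent, goal)) ** 0.5
    return -dist + (1.0 if dist < 0.1 else 0.0)

class CodeRewardWrapper:
    """Replaces an environment's sparse reward with the generated dense one,
    so the (expensive) VLM is never queried inside the training loop."""
    def __init__(self, env, reward_fn):
        self.env, self.reward_fn = env, reward_fn

    def step(self, action):
        obs, _, done, info = self.env.step(action)
        return obs, self.reward_fn(obs), done, info

r_far = generated_reward({"agent_pos": (0.0, 0.0), "goal_pos": (3.0, 4.0)})
r_goal = generated_reward({"agent_pos": (1.0, 1.0), "goal_pos": (1.0, 1.0)})
```

The dense signal rises smoothly as the agent approaches the goal, which is exactly the shaping that sparse environment rewards lack.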
A Distributional Analogue to the Successor Representation
Arthur Gretton
Yunhao Tang
Andre Barreto
Will Dabney
Bellemare Marc-Emmanuel
Mark Rowland
This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm that learns the distributional SM from data by minimizing a two-level maximum mean discrepancy. Key to our method are a number of algorithmic techniques that are independently valuable for learning generative models of state. As an illustration of the usefulness of the distributional SM, we show that it enables zero-shot risk-sensitive policy evaluation in a way that was not previously possible.
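As background, the classical successor representation the abstract builds on can be written down directly; the distributional analogue then replaces the expectation with the law of the random occupancy. The notation below is a standard sketch, not the paper's own:

```latex
% Successor representation (SR): expected discounted state occupancy,
% with e_{S_t} the indicator (one-hot) vector of the state at time t
\psi^\pi(s) \;=\; \mathbb{E}_\pi\!\Big[\sum_{t=0}^{\infty} \gamma^t \, e_{S_t} \,\Big|\, S_0 = s\Big],
\qquad V^\pi(s) \;=\; \langle \psi^\pi(s), r \rangle .

% A distributional successor measure keeps the whole law of the random
% occupancy rather than its mean, so return distributions for any reward
% vector r can be read off without further learning:
\Psi^\pi(s) \;=\; \operatorname{Law}_\pi\!\Big(\sum_{t=0}^{\infty} \gamma^t \, e_{S_t} \,\Big|\, S_0 = s\Big).
```

This separation, occupancy on one side, reward on the other, is what makes zero-shot risk-sensitive evaluation possible: the reward vector can change without relearning the occupancy law.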
Dynamic System Modeling Using a Multisource Transfer Learning-Based Modular Neural Network for Industrial Application
Haoshan Duan
Xi Meng
JunFei Qiao
Establishing an accurate model of dynamic systems poses a challenge for complex industrial processes. Due to the ability to handle complex tasks, modular neural networks (MNN) have been widely applied to industrial process modeling. However, the phenomenon of domain drift caused by operating conditions may lead to a cold start of the model, which affects the performance of MNN. For this reason, a multisource transfer learning-based MNN (MSTL-MNN) is proposed in this study. First, the knowledge-driven transfer learning process is performed with domain similarity evaluation, knowledge extraction, and fusion, aiming to form an initial subnetwork in the target domain. Then, the positive transfer process of effective knowledge can avoid the cold start problem of MNN. Second, during the data-driven fine-tuning process, a regularized self-organizing long short-term memory algorithm is designed to fine-tune the structure and parameters of the initial subnetwork, which can improve the prediction performance of MNN. Meanwhile, relevant theoretical analysis is given to ensure the feasibility of MSTL-MNN. Finally, the effectiveness of the proposed method is confirmed by two benchmark simulations and a real industrial dataset of a municipal solid waste incineration process. Experimental results demonstrate the merits of MSTL-MNN for industrial applications.
Fairness-aware data-driven-based model predictive controller: A study on thermal energy storage in a residential building
Ying Sun
Fariborz Haghighat
Benjamin C. M. Fung
Faithfulness Measurable Masked Language Models
A common approach to explaining NLP models is to use importance measures that express which tokens are important for a prediction. Unfortunately, such explanations are often wrong despite being persuasive. Therefore, it is essential to measure their faithfulness. One such metric is that if tokens are truly important, then masking them should result in worse model performance. However, token masking introduces out-of-distribution issues, and existing solutions that address this are computationally expensive and employ proxy models. Furthermore, other metrics are very limited in scope. This work proposes an inherently faithfulness-measurable model that addresses these challenges. This is achieved using a novel fine-tuning method that incorporates masking, such that masking tokens become in-distribution by design. This differs from existing approaches, which are completely model-agnostic but inapplicable in practice. We demonstrate the generality of our approach by applying it to 16 different datasets and validate it using statistical in-distribution tests. The faithfulness is then measured with 9 different importance measures. Because masking is in-distribution, importance measures that themselves use masking become consistently more faithful. Additionally, because the model makes faithfulness cheap to measure, we can optimize explanations towards maximal faithfulness; thus, our model becomes indirectly inherently explainable.
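The masking-based faithfulness metric the abstract describes can be sketched in a few lines: mask the highest-ranked tokens first and check that model confidence drops fastest for a faithful importance measure. The toy model, token set, and importance scores below are invented for illustration:

```python
def faithfulness_curve(model, tokens, importance,
                       mask_token="[MASK]", steps=(0.0, 0.2, 0.5, 1.0)):
    """Masking-based faithfulness check (simplified sketch):
    mask the most 'important' tokens first and record model confidence.
    A faithful importance measure yields the steepest early drop."""
    order = sorted(range(len(tokens)), key=lambda i: -importance[i])
    scores = []
    for frac in steps:
        k = int(frac * len(tokens))
        masked = list(tokens)
        for i in order[:k]:
            masked[i] = mask_token
        scores.append(model(masked))
    return scores

# Toy 'model' whose confidence is the fraction of keyword tokens still visible
keywords = {"great", "movie"}
toy_model = lambda toks: sum(t in keywords for t in toks) / 2
tokens = ["a", "great", "movie", "indeed"]
importance = [0.0, 0.9, 0.8, 0.1]
curve = faithfulness_curve(toy_model, tokens, importance)
```

The paper's contribution is making this measurement valid: the fine-tuned model sees masked inputs during training, so the performance drop reflects token importance rather than out-of-distribution inputs.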