TorchDriveEnv: A Reinforcement Learning Benchmark for Autonomous Driving with Reactive, Realistic, and Diverse Non-Playable Characters
Jonathan Wilder Lavington
Ke Zhang
Vasileios Lioutas
Matthew Niedoba
Yunpeng Liu
Dylan Green
Saeid Naderiparizi
Xiaoxuan Liang
Setareh Dabiri
Adam Ścibior
Berend Zwartsenberg
Frank Wood
The training, testing, and deployment of autonomous vehicles require realistic and efficient simulators. Moreover, because of the high variability between different problems presented in different autonomous systems, these simulators need to be easy to use and easy to modify. To address these problems, we introduce TorchDriveSim and its benchmark extension TorchDriveEnv. TorchDriveEnv is a lightweight reinforcement learning benchmark programmed entirely in Python, which can be modified to test a number of different factors in learned vehicle behavior, including the effect of varying kinematic models, agent types, and traffic control patterns. Most importantly, unlike many replay-based simulation approaches, TorchDriveEnv is fully integrated with a state-of-the-art behavioral simulation API. This allows users to train and evaluate driving models alongside data-driven Non-Playable Characters (NPCs) whose initializations and driving behavior are reactive, realistic, and diverse. We illustrate the efficiency and simplicity of TorchDriveEnv by evaluating common reinforcement learning baselines in both training and validation environments. Our experiments show that TorchDriveEnv is easy to use, but difficult to solve.
Deep Clustering with Self-Supervision using Pairwise Similarities
Mohammadreza Sadeghi
Deep clustering incorporates embedding into clustering to find a lower-dimensional space appropriate for clustering. In this paper, we propose a novel deep clustering framework with self-supervision using pairwise similarities (DCSS). The proposed method consists of two successive phases. In the first phase, we propose to form hypersphere-like groups of similar data points, i.e. one hypersphere per cluster, employing an autoencoder that is trained using cluster-specific losses. The hyperspheres are formed in the autoencoder's latent space. In the second phase, we propose to employ pairwise similarities to create a …
Characterizing the voxel-based approaches in radioembolization dosimetry with reDoseMC.
Taehyung Peter Kim
BACKGROUND: Yttrium-90 (⁹⁰Y) …
Machine learning data practices through a data curation lens: An evaluation framework
Eshta Bhardwaj
Harshit Gujral
Siyi Wu
Ciara Zogheib
Christoph Becker
Studies of dataset development in machine learning call for greater attention to the data practices that make model development possible and shape its outcomes. Many argue that the adoption of theory and practices from archives and data curation fields can support greater fairness, accountability, transparency, and more ethical machine learning. In response, this paper examines data practices in machine learning dataset development through the lens of data curation. We evaluate data practices in machine learning as data curation practices. To do so, we develop a framework for evaluating machine learning datasets using data curation concepts and principles through a rubric. Through a mixed-methods analysis of evaluation results for 25 ML datasets, we study the feasibility of adopting data curation principles for machine learning data work in practice and explore how data curation is currently performed. We find that researchers in machine learning, which often emphasizes model development, struggle to apply standard data curation principles. Our findings illustrate difficulties at the intersection of these fields, such as evaluating dimensions that have shared terms in both fields but non-shared meanings, a high degree of interpretative flexibility in adapting concepts without prescriptive restrictions, obstacles in limiting the depth of data curation expertise needed to apply the rubric, and challenges in scoping the extent of documentation dataset creators are responsible for. We propose ways to address these challenges and develop an overall framework for evaluation that outlines how data curation concepts and methods can inform machine learning data practices.
A Comprehensive Dataset of Four Provincial Legislative Assembly Members
Alex B. Rivard
Marc André Bodet
Éric Montigny
This research note reports on a new dataset about legislators in four Canadian provinces since the establishment of their colonial assemblies in the eighteenth century. Over 7,000 legislators from Ontario, Quebec, New Brunswick, and Nova Scotia are included, with consolidated information drawn from multiple sources about parliamentarians’ years of birth and death, religion, electoral performance, kinship, and several other biographical indicators. We also illustrate the utility of such data with the help of a few descriptive examples drawn from the four provinces. We believe this consolidated dataset offers several opportunities for future research on representation, legislative activities and party politics.
Hierarchies define the scalability of robot swarms
Vivek Shankar Vardharajan
Karthik Soma
Sepand Dyanatkar
Pierre-Yves Lajoie
The emerging behaviors of swarms have fascinated scientists and gathered significant interest in the field of robotics. Traditionally, swarms are viewed as egalitarian, with robots sharing identical roles and capabilities. However, recent findings highlight the importance of hierarchy for deploying robot swarms more effectively in diverse scenarios. Despite nature's preference for hierarchies, the robotics field has clung to the egalitarian model, partly due to a lack of empirical evidence for the conditions favoring hierarchies. Our research demonstrates that while egalitarian swarms excel in environments proportionate to their collective sensing abilities, they struggle in larger or more complex settings. Hierarchical swarms, conversely, extend their sensing reach efficiently, proving successful in larger, more unstructured environments with fewer resources. We validated these concepts through simulations and physical robot experiments, using a complex radiation cleanup task. This study paves the way for developing adaptable, hierarchical swarm systems applicable in areas like planetary exploration and autonomous vehicles. Moreover, these insights could deepen our understanding of hierarchical structures in biological organisms.
Generative Active Learning for the Search of Small-molecule Protein Binders
Maksym Korablyov
Cheng-Hao Liu
Moksh J. Jain
Almer M. van der Sloot
Eric Jolicoeur
Edward Ruediger
Andrei Cristian Nica
Kostiantyn Lapchevskyi
Daniel St-Cyr
Doris Alexandra Schuetz
Victor I Butoi
Jarrid Rector-Brooks
Simon R. Blackburn
Leo Feng
Hadi Nekoei
Sai Krishna Gottipati
Priyesh Vijayan
Prateek Gupta
Ladislav Rampášek
Sasikanth Avancha
William L. Hamilton
Brooks Paige
Sanchit Misra
Stanisław Jastrzębski
Bharat Kaul
José Miguel Hernández-Lobato
Marwin Segler
Michael M. Bronstein
Anne Marinier
Mike Tyers
Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge. We introduce LambdaZero, a generative active learning approach to search for synthesizable molecules. Powered by deep reinforcement learning, LambdaZero learns to search over the vast space of molecules to discover candidates with a desired property. We apply LambdaZero with molecular docking to design novel small molecules that inhibit the enzyme soluble Epoxide Hydrolase 2 (sEH), while enforcing constraints on synthesizability and drug-likeness. LambdaZero provides an exponential speedup in terms of the number of calls to the expensive molecular docking oracle, and LambdaZero de novo designed molecules reach docking scores that would otherwise require the virtual screening of a hundred billion molecules. Importantly, LambdaZero discovers novel scaffolds of synthesizable, drug-like inhibitors for sEH. In in vitro experimental validation, a series of ligands from a generated quinazoline-based scaffold were synthesized, and the lead inhibitor N-(4,6-di(pyrrolidin-1-yl)quinazolin-2-yl)-N-methylbenzamide (UM0152893) displayed sub-micromolar enzyme inhibition of sEH.
Schrödinger's Update: User Perceptions of Uncertainties in Proprietary Large Language Model Updates
Zilin Ma
Yiyang Mei
Krzysztof Z. Gajos
2851: Operational Ontology for Oncology (O3) - Multi-professional society standard supporting AI
Charles S. Mayo
Mary U. Feng
Kristy K. Brock
Randi Kudner
Peter Balter
Jeffrey Buchsbaum
Amanda Caissie
Emily Daugherty
Andre Dekker
Clifton D. Fuller
Julian Hong
David Hong
Sophia Kamran
Evangelia Katsoulakis
Andra Krauze
Jon Kruse
Todd McNutt
Michelle Mierzwa
Amy Moreno
Jatinder Palta
Richard Popple
Thomas Purdie
Susan Yom
Xiao Ying
295. Rare Variant Genetic Architecture of the Human Cortical MRI Phenotypes in General Population
Kuldeep Kumar
Sayeh Kazem
Zhijie Liao
Jakub Kopal
Guillaume Huguet
Thomas Renne
Martineau Jean-Louis
Zhe Xie
Zohra Saci
Laura Almasy
David C. Glahn
Tomas Paus
Carrie Bearden
Paul Thompson
Richard A.I. Bethlehem
Varun Warrier
Sébastien Jacquemont
Beyond the Norms: Detecting Prediction Errors in Regression Models
Andres Altieri
Marco Romanelli
Georg Pichler
Florence Alberge
This paper tackles the challenge of detecting unreliable behavior in regression algorithms, which may arise from intrinsic variability (e.g., aleatoric uncertainty) or modeling errors (e.g., model uncertainty). First, we formally introduce the notion of unreliability in regression, i.e., when the output of the regressor exceeds a specified discrepancy (or error). Then, using powerful tools for probabilistic modeling, we estimate the discrepancy density, and we measure its statistical diversity using our proposed metric for statistical dissimilarity. In turn, this allows us to derive a data-driven score that expresses the uncertainty of the regression outcome. We show empirical improvements in error detection for multiple regression tasks, consistently outperforming popular baseline approaches, and contributing to the broader field of uncertainty quantification and safe machine learning systems.
Body size interacts with the structure of the central nervous system: A multi-center in vivo neuroimaging study
René Labounek
Monica T. Bondy
Amy L. Paulson
Sandrine Bédard
Mihael Abramovic
Eva Alonso‐Ortiz
Nicole Atcheson
Laura R. Barlow
Robert L. Barry
Markus Barth
Marco Battiston
Christian Büchel
Matthew D. Budde
Virginie Callot
Anna Combes
Benjamin De Leener
Maxime Descoteaux
Paulo Loureiro de Sousa
Marek Dostál
Julien Doyon
Adam Dvorak
Falk Eippert
Karla R. Epperson
Kevin S. Epperson
Patrick Freund
Jürgen Finsterbusch
Alexandru Foias
Michela Fratini
Issei Fukunaga
Claudia A. M. Gandini Wheeler-Kingshott
Giancarlo Germani
Guillaume Gilbert
Federico Giove
Francesco Grussu
Akifumi Hagiwara
Pierre-Gilles Henry
Tomáš Horák
Masaaki Hori
James M. Joers
Kouhei Kamiya
Haleh Karbasforoushan
Miloš Keřkovský
Ali Khatibi
Joo‐Won Kim
Nawal Kinany
Hagen H. Kitzler
Shannon Kolind
Yazhuo Kong
Petr Kudlička
Paul Kuntke
Nyoman D. Kurniawan
Slawomir Kusmia
Maria Marcella Lagana
Cornelia Laule
Christine S. W. Law
Tobias Leutritz
Yaou Liu
Sara Llufriu
Sean Mackey
Allan R. Martin
Eloy Martinez-Heras
Loan Mattera
Kristin P. O’Grady
Nico Papinutto
Daniel Papp
Deborah Pareto
Todd B. Parrish
Anna Pichiecchio
Ferran Prados
Àlex Rovira
Marc J. Ruitenberg
Rebecca S. Samson
Giovanni Savini
Maryam Seif
Alan C. Seifert
Alex K. Smith
Seth A. Smith
Zachary A. Smith
Elisabeth Solana
Yuichi Suzuki
George Tackley
Alexandra Tinnermann
Jan Valošek
Dimitri Van De Ville
Marios C. Yiannakas
Kenneth A. Weber
Nikolaus Weiskopf
Richard G. Wise
Patrik O. Wyss
Junqian Xu
Christophe Lenglet
Igor Nestrašil
Clinical research emphasizes the implementation of rigorous and reproducible study designs that rely on between-group matching or controlling for sources of biological variation such as subject’s sex and age. However, corrections for body size (i.e. height and weight) are mostly lacking in clinical neuroimaging designs. This study investigates the importance of body size parameters in their relationship with spinal cord (SC) and brain magnetic resonance imaging (MRI) metrics. Data were derived from a cosmopolitan population of 267 healthy human adults (age 30.1±6.6 years old, 125 females). We show that body height correlated strongly or moderately with brain gray matter (GM) volume, cortical GM volume, total cerebellar volume, brainstem volume, and cross-sectional area (CSA) of cervical SC white matter (CSA-WM; 0.44≤r≤0.62). In comparison, age correlated weakly with cortical GM volume, precentral GM volume, and cortical thickness (-0.21≥r≥-0.27). Body weight correlated weakly with magnetization transfer ratio in the SC WM, dorsal columns, and lateral corticospinal tracts (-0.20≥r≥-0.23). Body weight further correlated weakly with the mean diffusivity derived from diffusion tensor imaging (DTI) in SC WM (r=-0.20) and dorsal columns (r=-0.21), but only in males. CSA-WM correlated strongly or moderately with brain volumes (0.39≤r≤0.64), and weakly with precentral gyrus thickness and DTI-based fractional anisotropy in SC dorsal columns and SC lateral corticospinal tracts (-0.22≥r≥-0.25). A linear mixture of sex and age explained 26±10% of data variance in brain volumetry and SC CSA. The amount of explained variance increased to 33±11% when body height was added into the mixture model. Age itself explained only 2±2% of such variance. In conclusion, body size is a significant biological variable. Along with sex and age, body size should therefore be included as a mandatory variable in the design of clinical neuroimaging studies examining SC and brain structure.