Publications

TorchDriveEnv: A Reinforcement Learning Benchmark for Autonomous Driving with Reactive, Realistic, and Diverse Non-Playable Characters

Jonathan Wilder Lavington

Ke Zhang

Vasileios Lioutas

Matthew Niedoba

Yunpeng Liu

Dylan Green

Saeid Naderiparizi

Xiaoxuan Liang

Setareh Dabiri

Adam Ścibior

Berend Zwartsenberg

Frank Wood

The training, testing, and deployment of autonomous vehicles require realistic and efficient simulators. Moreover, because of the high variability between different problems presented in different autonomous systems, these simulators need to be easy to use and easy to modify. To address these problems, we introduce TorchDriveSim and its benchmark extension TorchDriveEnv. TorchDriveEnv is a lightweight reinforcement learning benchmark programmed entirely in Python, which can be modified to test a number of different factors in learned vehicle behavior, including the effect of varying kinematic models, agent types, and traffic control patterns. Most importantly, unlike many replay-based simulation approaches, TorchDriveEnv is fully integrated with a state-of-the-art behavioral simulation API. This allows users to train and evaluate driving models alongside data-driven Non-Playable Characters (NPCs) whose initializations and driving behavior are reactive, realistic, and diverse. We illustrate the efficiency and simplicity of TorchDriveEnv by evaluating common reinforcement learning baselines in both training and validation environments. Our experiments show that TorchDriveEnv is easy to use, but difficult to solve.

2024-05-07

ArXiv (preprint)

doi.org

arxiv.org

Deep Clustering with Self-Supervision using Pairwise Similarities

Mohammadreza Sadeghi

Narges Armanfard

Deep clustering incorporates embedding into clustering to find a lower-dimensional space appropriate for clustering. In this paper, we propose a novel deep clustering framework with self-supervision using pairwise similarities (DCSS). The proposed method consists of two successive phases. In the first phase, we propose to form hypersphere-like groups of similar data points, i.e., one hypersphere per cluster, employing an autoencoder that is trained using cluster-specific losses. The hyperspheres are formed in the autoencoder's latent space. In the second phase, we propose to employ pairwise similarities to create a

2024-05-06

ArXiv (preprint)

doi.org

arxiv.org

Characterizing the voxel-based approaches in radioembolization dosimetry with reDoseMC.

Taehyung Peter Kim

Shirin A. Enger

BACKGROUND: Yttrium-90 (⁹⁰Y …

2024-05-04

Medical Physics (published)

doi.org

Machine learning data practices through a data curation lens: An evaluation framework

Eshta Bhardwaj

Harshit Gujral

Siyi Wu

Ciara Zogheib

Tegan Maharaj

Christoph Becker

Studies of dataset development in machine learning call for greater attention to the data practices that make model development possible and shape its outcomes. Many argue that the adoption of theory and practices from archives and data curation fields can support greater fairness, accountability, transparency, and more ethical machine learning. In response, this paper examines data practices in machine learning dataset development through the lens of data curation. We evaluate data practices in machine learning as data curation practices. To do so, we develop a framework for evaluating machine learning datasets using data curation concepts and principles through a rubric. Through a mixed-methods analysis of evaluation results for 25 ML datasets, we study the feasibility of data curation principles to be adopted for machine learning data work in practice and explore how data curation is currently performed. We find that researchers in machine learning, which often emphasizes model development, struggle to apply standard data curation principles. Our findings illustrate difficulties at the intersection of these fields, such as evaluating dimensions that have shared terms in both fields but non-shared meanings, a high degree of interpretative flexibility in adapting concepts without prescriptive restrictions, obstacles in limiting the depth of data curation expertise needed to apply the rubric, and challenges in scoping the extent of documentation dataset creators are responsible for. We propose ways to address these challenges and develop an overall framework for evaluation that outlines how data curation concepts and methods can inform machine learning data practices.

2024-05-04

ArXiv (preprint)

doi.org

arxiv.org

A Comprehensive Dataset of Four Provincial Legislative Assembly Members

Alex B. Rivard

Marc André Bodet

Jean-François Godbout

Éric Montigny

This research note reports on a new dataset about legislators in four Canadian provinces since the establishment of their colonial assemblies in the eighteenth century. Over 7,000 legislators from Ontario, Quebec, New Brunswick, and Nova Scotia are included, with consolidated information drawn from multiple sources about parliamentarians’ years of birth and death, religion, electoral performance, kinship, and several other biographical indicators. We also illustrate the utility of such data with the help of a few descriptive examples drawn from the four provinces. We believe this consolidated dataset offers several opportunities for future research on representation, legislative activities and party politics.

2024-05-03

Canadian Journal of Political Science/Revue canadienne de science politique (published)

doi.org

Hierarchies define the scalability of robot swarms

Vivek Shankar Vardharajan

Karthik Soma

Sepand Dyanatkar

Pierre-Yves Lajoie

Giovanni Beltrame

The emerging behaviors of swarms have fascinated scientists and gathered significant interest in the field of robotics. Traditionally, swarms are viewed as egalitarian, with robots sharing identical roles and capabilities. However, recent findings highlight the importance of hierarchy for deploying robot swarms more effectively in diverse scenarios. Despite nature's preference for hierarchies, the robotics field has clung to the egalitarian model, partly due to a lack of empirical evidence for the conditions favoring hierarchies. Our research demonstrates that while egalitarian swarms excel in environments proportionate to their collective sensing abilities, they struggle in larger or more complex settings. Hierarchical swarms, conversely, extend their sensing reach efficiently, proving successful in larger, more unstructured environments with fewer resources. We validated these concepts through simulations and physical robot experiments, using a complex radiation cleanup task. This study paves the way for developing adaptable, hierarchical swarm systems applicable in areas like planetary exploration and autonomous vehicles. Moreover, these insights could deepen our understanding of hierarchical structures in biological organisms.

2024-05-03

ArXiv (preprint)

doi.org

arxiv.org

Generative Active Learning for the Search of Small-molecule Protein Binders

Maksym Korablyov

Cheng-Hao Liu

Moksh J. Jain

Almer M. van der Sloot

Eric Jolicoeur

Edward Ruediger

Andrei Cristian Nica

Emmanuel Bengio

Kostiantyn Lapchevskyi

Daniel St-Cyr

Doris Alexandra Schuetz

Victor I Butoi

Jarrid Rector-Brooks

Simon R. Blackburn

Leo Feng

Hadi Nekoei

Sai Krishna Gottipati

Priyesh Vijayan

Prateek Gupta

Ladislav Rampášek

Sasikanth Avancha

Pierre-Luc Bacon

William L. Hamilton

Brooks Paige

Sanchit Misra

Stanisław Jastrzębski

Bharat Kaul

Doina Precup

José Miguel Hernández-Lobato

Marwin Segler

Michael M. Bronstein

Anne Marinier

Mike Tyers

Yoshua Bengio

Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge. We introduce LambdaZero, a generative active learning approach to search for synthesizable molecules. Powered by deep reinforcement learning, LambdaZero learns to search over the vast space of molecules to discover candidates with a desired property. We apply LambdaZero with molecular docking to design novel small molecules that inhibit the enzyme soluble Epoxide Hydrolase 2 (sEH), while enforcing constraints on synthesizability and drug-likeliness. LambdaZero provides an exponential speedup in terms of the number of calls to the expensive molecular docking oracle, and LambdaZero de novo designed molecules reach docking scores that would otherwise require the virtual screening of a hundred billion molecules. Importantly, LambdaZero discovers novel scaffolds of synthesizable, drug-like inhibitors for sEH. In in vitro experimental validation, a series of ligands from a generated quinazoline-based scaffold were synthesized, and the lead inhibitor N-(4,6-di(pyrrolidin-1-yl)quinazolin-2-yl)-N-methylbenzamide (UM0152893) displayed sub-micromolar enzyme inhibition of sEH.

2024-05-02

ArXiv (preprint)

doi.org

arxiv.org

Schrödinger's Update: User Perceptions of Uncertainties in Proprietary Large Language Model Updates

Zilin Ma

Yiyang Mei

Krzysztof Z. Gajos

Ian Arawjo

2024-05-02

CHI Extended Abstracts (published)

doi.org

2851: Operational Ontology for Oncology (O3) - Multi-professional society standard supporting AI

Charles S. Mayo

Mary U. Feng

Kristy K. Brock

Randi Kudner

Peter Balter

Jeffrey Buchsbaum

Amanda Caissie

Emily Daugherty

Andre Dekker

Clifton D. Fuller

Julian Hong

David Hong

Sophia Kamran

Evangelia Katsoulakis

John Kildea

Andra Krauze

Jon Kruse

Todd McNutt

Michelle Mierzwa

Amy Moreno

Jatinder Palta

Richard Popple

Thomas Purdie

Susan Yom

Xiao Ying

2024-05-01

Radiotherapy and Oncology (published)

doi.org

295. Rare Variant Genetic Architecture of the Human Cortical MRI Phenotypes in General Population

Kuldeep Kumar

Sayeh Kazem

Zhijie Liao

Jakub Kopal

Guillaume Huguet

Thomas Renne

Martineau Jean-Louis

Zhe Xie

Zohra Saci

Laura Almasy

David C. Glahn

Tomas Paus

Guillaume Dumas

Carrie Bearden

Paul Thompson

Richard A.I. Bethlehem

Varun Warrier

Sébastien Jacquemont

2024-05-01

Biological Psychiatry (published)

doi.org

Beyond the Norms: Detecting Prediction Errors in Regression Models

Andres Altieri

Marco Romanelli

Georg Pichler

Florence Alberge

Pablo Piantanida

This paper tackles the challenge of detecting unreliable behavior in regression algorithms, which may arise from intrinsic variability (e.g., aleatoric uncertainty) or modeling errors (e.g., model uncertainty). First, we formally introduce the notion of unreliability in regression, i.e., when the output of the regressor exceeds a specified discrepancy (or error). Then, using powerful tools for probabilistic modeling, we estimate the discrepancy density, and we measure its statistical diversity using our proposed metric for statistical dissimilarity. In turn, this allows us to derive a data-driven score that expresses the uncertainty of the regression outcome. We show empirical improvements in error detection for multiple regression tasks, consistently outperforming popular baseline approaches, and contributing to the broader field of uncertainty quantification and safe machine learning systems.

2024-05-01

ICML.cc/2024/Conference (spotlight)

doi.org

openreview.net

Body size interacts with the structure of the central nervous system: A multi-center in vivo neuroimaging study

René Labounek

Monica T. Bondy

Amy L. Paulson

Sandrine Bédard

Mihael Abramovic

Eva Alonso‐Ortiz

Nicole Atcheson

Laura R. Barlow

Robert L. Barry

Markus Barth

Marco Battiston

Christian Büchel

Matthew D. Budde

Virginie Callot

Anna Combes

Benjamin De Leener

Maxime Descoteaux

Paulo Loureiro de Sousa

Marek Dostál

Julien Doyon

Adam Dvorak

Falk Eippert

Karla R. Epperson

Kevin S. Epperson

Patrick Freund

Jürgen Finsterbusch

Alexandru Foias

Michela Fratini

Issei Fukunaga

Claudia A. M. Gandini Wheeler-Kingshott

Giancarlo Germani

Guillaume Gilbert

Federico Giove

Francesco Grussu

Akifumi Hagiwara

Pierre-Gilles Henry

Tomáš Horák

Masaaki Hori

James M. Joers

Kouhei Kamiya

Haleh Karbasforoushan

Miloš Keřkovský

Ali Khatibi

Joo‐Won Kim

Nawal Kinany

Hagen H. Kitzler

Shannon Kolind

Yazhuo Kong

Petr Kudlička

Paul Kuntke

Nyoman D. Kurniawan

Slawomir Kusmia

Maria Marcella Lagana

Cornelia Laule

Christine S. W. Law

Csw Law

Tobias Leutritz

Yaou Liu

Sara Llufriu

Sean Mackey

Allan R. Martin

Eloy Martinez-Heras

Loan Mattera

Kristin P. O’Grady

Nico Papinutto

Daniel Papp

Deborah Pareto

Todd B. Parrish

Anna Pichiecchio

Ferran Prados

Àlex Rovira

Marc J. Ruitenberg

Rebecca S. Samson

Giovanni Savini

Maryam Seif

Alan C. Seifert

Alex K. Smith

Seth A. Smith

Zachary A. Smith

Elisabeth Solana

Yuichi Suzuki

George Tackley

Alexandra Tinnermann

Jan Valošek

Dimitri Van De Ville

Marios C. Yiannakas

Kenneth A. Weber

Nikolaus Weiskopf

Richard G. Wise

Patrik O. Wyss

Junqian Xu

Julien Cohen-Adad

Christophe Lenglet

Igor Nestrašil

Clinical research emphasizes the implementation of rigorous and reproducible study designs that rely on between-group matching or controlling for sources of biological variation such as subject’s sex and age. However, corrections for body size (i.e., height and weight) are mostly lacking in clinical neuroimaging designs. This study investigates the importance of body size parameters in their relationship with spinal cord (SC) and brain magnetic resonance imaging (MRI) metrics. Data were derived from a cosmopolitan population of 267 healthy human adults (age 30.1±6.6 years old, 125 females). We show that body height correlated strongly or moderately with brain gray matter (GM) volume, cortical GM volume, total cerebellar volume, brainstem volume, and cross-sectional area (CSA) of cervical SC white matter (CSA-WM; 0.44≤r≤0.62). In comparison, age correlated weakly with cortical GM volume, precentral GM volume, and cortical thickness (-0.21≥r≥-0.27). Body weight correlated weakly with magnetization transfer ratio in the SC WM, dorsal columns, and lateral corticospinal tracts (-0.20≥r≥-0.23). Body weight further correlated weakly with the mean diffusivity derived from diffusion tensor imaging (DTI) in SC WM (r=-0.20) and dorsal columns (r=-0.21), but only in males. CSA-WM correlated strongly or moderately with brain volumes (0.39≤r≤0.64), and weakly with precentral gyrus thickness and DTI-based fractional anisotropy in SC dorsal columns and SC lateral corticospinal tracts (-0.22≥r≥-0.25). A linear mixture of sex and age explained 26±10% of data variance in brain volumetry and SC CSA. The amount of explained variance increased to 33±11% when body height was added to the mixture model. Age by itself explained only 2±2% of such variance. In conclusion, body size is a significant biological variable. Along with sex and age, body size should therefore be included as a mandatory variable in the design of clinical neuroimaging studies examining SC and brain structure.

2024-05-01

bioRxiv (preprint)

doi.org
