Publications

Towards Machines that Trust: AI Agents Learn to Trust in the Trust Game
Ardavan S. Nobandegani
Thomas Shultz
Widely considered a cornerstone of human morality, trust shapes many aspects of human social interactions. In this work, we present a theoretical analysis of the
Diagnosis and management of autoimmune diseases in the ICU
Yaseen M. Arabi
Raquel Bartz
Otavio Ranzani
Franziska Scheibe
Michael Darmon
Julie Helms
An empirical study of testing machine learning in the wild
Moses Openja
Armstrong Foundjem
Zhen Ming
Mouna Abidi
Ahmed E. Hassan
Recently, machine and deep learning (ML/DL) algorithms have been increasingly adopted in many software systems. Due to their inductive nature, ensuring the quality of these systems remains a significant challenge for the research community. Unlike traditional software built deductively by writing explicit rules, ML/DL systems infer rules from training data. Recent research in ML/DL quality assurance has adapted concepts from traditional software testing, such as mutation testing, to improve reliability. However, it is unclear if these proposed testing techniques are adopted in practice, or if new testing strategies have emerged from real-world ML deployments. There is little empirical evidence about the testing strategies. To fill this gap, we perform the first fine-grained empirical study on ML testing in the wild to identify the ML properties being tested, the testing strategies, and their implementation throughout the ML workflow. We conducted a mixed-methods study to understand ML software testing practices. We analyzed test files and cases from 11 open-source ML/DL projects on GitHub. Using open coding, we manually examined the testing strategies, tested ML properties, and implemented testing methods to understand their practical application in building and releasing ML/DL software systems. Our findings reveal several key insights: (1) The most common testing strategies, accounting for less than 40%, are Grey-box and White-box methods, such as Negative Testing, Oracle Approximation and Statistical Testing. (2) A wide range of 17 ML properties are tested, out of which only 20% to 30% are frequently tested, including Consistency, Correctness, and Efficiency. (3) Bias and Fairness is more often tested in Recommendation systems, while Security & Privacy is tested in Computer Vision (CV) systems, Application Platforms, and Natural Language Processing (NLP) systems.
Gemini: A Family of Highly Capable Multimodal Models
Gemini Team Google
Rohan Anil
Sebastian Borgeaud
Yonghui Wu
Jean-Baptiste Alayrac
Jiahui Yu
Radu Soricut
J. Schalkwyk
Andrew M. Dai
Anja Hauth
Katie Millican
David Silver
Slav Petrov
Melvin Johnson
Ioannis Antonoglou
Julian Schrittwieser
Amelia Glaese
Jilin Chen
Emily Pitler
Timothy P Lillicrap
Angeliki Lazaridou
James L. Molloy
Michael Acheson Isard
Paul R. Barham
Tom Hennigan
Benjamin Lee
Malcolm Reynolds
Yuanzhong Xu
Ryan Doherty
Eli Collins
Clemens Meyer
Eliza Rutherford
Erica Moreira
Kareem W. Ayoub
Megha Goel
George Tucker
Enrique Piqueras
M. Krikun
Iain Barr
Nikolay Savinov
Ivo Danihelka
Becca Roelofs
Anais White
Anders Johan Andreassen
Tamara von Glehn
Lakshman Yagati
Mehran Kazemi
Lucas Gonzalez
Misha Khalman
Alexandre Fréchette
Charlotte Smith
Laura Culp
Lev Proleev
Yi Luan
X. T. Chen
James Lottes
Federico Lebron
Alban Rrustemi
Natalie Clay
Phil Crone
Tomas Kocisky
Jeffrey Zhao
Bartek Perz
Dian Yu
Heidi Howard
Adam E. Bloniarz
Jack W. Rae
Han Lu
Laurent Sifre
Marcello Maggioni
Fred Alcober
Dan Garrette
Megan Barnes
Shantanu Thakoor
Jacob Austin
Gabriel Barth-Maron
William Wong
Rishabh Joshi
Rahma Chaabouni
Deeni Fatiha
Arun Ahuja
Ruibo Liu
Yunxuan Li
Sarah Cogan
Jeremy Chen
Chao Jia
Chenjie Gu
Qiao Zhang
Jordan Grimstad
Ale Jakse Hartman
Martin J. Chadwick
Gaurav Singh Tomar
Xavier Garcia
Evan Senter
Emanuel Taropa
Thanumalayan Sankaranarayana Pillai
Jacob Devlin
Michael Laskin
Diego de Las Casas
Dasha Valter
Connie Tao
Lorenzo Blanco
Adrià Puigdomènech Badia
David Reitter
Mianna Chen
Jenny Brennan
Clara E. Rivera
Sergey Brin
Shariq Iqbal
Gabriela Surita
Jane Labanowski
Abhishek Rao
Stephanie Winkler
Emilio Parisotto
Yiming Gu
Kate Olszewska
Yujing Zhang
Ravichandra Addanki
Antoine Miech
Annie Louis
Laurent El Shafey
Denis Teplyashin
Geoff Brown
Elliot Catt
Nithya Attaluri
Jan Balaguer
Jackie Xiang
Pidong Wang
Zoe Ashwood
Anton Briukhov
Alex Webson
Sanjay Ganapathy
Smit Sanghavi
Ajay Kannan
Ming-Wei Chang
Axel Stjerngren
Josip Djolonga
Yuting Sun
Ankur Bapna
Matthew Aitchison
Pedram Pejman
Henryk Michalewski
Tianhe Yu
Cindy Wang
J Christopher Love
Junwhan Ahn
Dawn Bloxwich
Kehang Han
Peter Conway Humphreys
Thibault Sellam
James Bradbury
Varun Godbole
Sina Samangooei
Bogdan Damoc
Alex Kaskasoli
Sébastien M. R. Arnold
Vijay Vasudevan
Shubham Agrawal
Jason Riesa
Dmitry Lepikhin
Richard Tanburn
Srivatsan Srinivasan
Hyeontaek Lim
Sarah Hodkinson
Pranav Shyam
Johan Ferret
Steven Hand
Ankush Garg
T. Paine
Jian Li
Yujia Li
Minh Giang
Zaheer Abbas
Sarah York
Machel Reid
Elizabeth Cole
Aakanksha Chowdhery
Dipanjan Das
Dominika Rogozińska
Vitaly Nikolaev
Pablo G. Sprechmann
Zachary Nado
Lukáš Žilka
Flavien Prost
Luheng He
Marianne Monteiro
Gaurav Mishra
Christopher A. Welty
Joshua Newlan
Dawei Jia
Miltiadis Allamanis
Clara Huiyi Hu
Raoul de Liedekerke
Justin Gilmer
Carl Saroufim
Shruti Rijhwani
Shaobo Hou
Disha Shrivastava
Anirudh Baddepudi
Alex Goldin
Adnan Ozturel
Albin Cassirer
Yunhan Xu
Daniel Sohn
Devendra Singh Sachan
Reinald Kim Amplayo
Craig Swanson
Dessie Petrova
Shashi Narayan
Arthur Guez
Siddhartha Brahma
Jessica Landon
Miteyan Patel
Ruizhe Zhao
Kevin Villela
Luyu Wang
Wenhao Jia
Matthew Rahtz
Mai Giménez
Legg Yeung
Hanzhao Lin
James Keeling
Petko Georgiev
Diana Mincu
Boxi Wu
Salem Haykal
Rachel Saputro
Kiran N. Vodrahalli
James Qin
Zeynep Cankara
Abhanshu Sharma
Nicholas Fernando
Will Hawkins
Behnam Neyshabur
Solomon Kim
Adrian Hutter
Priyanka Agrawal
Alex Castro-Ros
George van den Driessche
Tao Wang
Fan Yang
Shuo-yiin Chang
Paul Komarek
Ross McIlroy
Mario Lučić
Guodong Zhang
Wael Farhan
Michael Sharman
Paul Natsev
Paul Michel
Yong Cheng
Yamini Bansal
Siyuan Qiao
Kris Cao
Siamak Shakeri
Christina Butterfield
Justin Chung
Paul Kishan Rubenstein
Shivani Agrawal
Arthur Mensch
Kedar Soparkar
Karel Lenc
Timothy Chung
Aedan Pope
Lorenzo Maggiore
Jackie Kay
Priya Jhakra
Shibo Wang
Joshua Maynez
Mary Phuong
Taylor Tobin
Andrea Tacchetti
Maja Trebacz
Kevin Robinson
Yash Katariya
Sebastian Riedel
Paige Bailey
Kefan Xiao
Nimesh Ghelani
Lora Aroyo
Ambrose Slone
Neil Houlsby
Xuehan Xiong
Zhen Yang
Elena Gribovskaya
Jonas Adler
Mateo Wirth
Lisa Lee
Music Li
Thais Kagohara
Jay Pavagadhi
Sophie Bridgers
Anna Bortsova
Sanjay Ghemawat
Tianqi Liu
Richard Powell
Vijay Bolina
Mariko Iinuma
Polina Zablotskaia
James Besley
Da-Woon Chung
Timothy Dozat
Ramona Comanescu
Xiance Si
Jeremy Greer
Guolong Su
M. Polacek
Raphael Lopez Kaufman
Simon Tokumine
Hexiang Hu
Elena Buchatskaya
Yingjie Miao
Mohamed Elhawaty
Aditya Siddhant
Nenad Tomasev
Jinwei Xing
Christina Greer
Helen Miller
Shereen Ashraf
Aurko Roy
Zizhao Zhang
Ada Ma
Angelos Filos
Milos Besta
Rory Blevins
Ted Klimenko
Chih-Kuan Yeh
Soravit Changpinyo
Jiaqi Mu
Oscar Chang
Mantas Pajarskas
Carrie Muir
Vered Cohen
Krishna S Haridasan
Amit Marathe
Steven Stenberg Hansen
Sholto Douglas
Rajkumar Samuel
Mingqiu Wang
Sophia Austin
Chang Lan
Jiepu Jiang
Justin Chiu
Jaime Alonso Lorenzo
Lars Lowe Sjosund
Sébastien Cevey
Zach Gleicher
Thi Avrahami
Anudhyan Boral
Hansa Srinivasan
Vittorio Selo
Rhys May
Konstantinos Aisopos
Léonard Hussenot
Livio Baldini Soares
Kate Baumli
Michael B. Chang
Adria Recasens
Benjamin Caine
Alexander Pritzel
Filip Pavetic
Fabio Pardo
Anita Gergely
Justin Frye
Vinay Venkatesh Ramasesh
Dan Horgan
Nora Kassner
Subhrajit Roy
Ethan Dyer
Víctor Campos
Alex Tomala
Yunhao Tang
Dalia El Badawy
Elspeth White
Basil Mustafa
Oran Lang
Abhishek Jindal
Sharad Mandyam Vikram
Zhitao Gong
Sergi Caelles
Ross Hemsley
Gregory Thornton
Fangxiaoyu Feng
Wojciech Stokowiec
Ce Zheng
Phoebe Thacker
Çağlar Unlu
Zhishuai Zhang
Mohammad Saleh
James Svensson
Maxwell L. Bileschi
Piyush Patil
Roman Ring
Katerina Tsihlas
Arpi Vezer
Marco Selvi
Toby Shevlane
Mikel Rodriguez
Tom Kwiatkowski
Samira Daruki
Keran Rong
Allan Dafoe
Nicholas Fitzgerald
Keren Gu-Lemberg
Mina Khan
Lisa Anne Hendricks
Marie Pellat
Vladimir Feinberg
James Cobon-Kerr
Tara N. Sainath
Maribeth Rauh
Sayed Hadi Hashemi
Richard Ives
Yana Hasson
YaGuang Li
Eric Noland
Yuan Cao
Nathan Byrd
Le Hou
Qingze Wang
Thibault Sottiaux
Michela Paganini
Jean-Baptiste Lespiau
Alexandre Moufarek
Samer Hassan
Kaushik Shivakumar
Joost Van Amersfoort
Amol Mandhane
Pratik M. Joshi
Matthew Tung
Andy Brock
Hannah Rachel Sheahan
Vedant Misra
Cheng Li
Nemanja Rakićević
Mostafa Dehghani
Fangyu Liu
Sid Mittal
Junhyuk Oh
Seb Noury
Eren Sezener
Fantine Huot
Matthew Lamm
Nicola De Cao
Charlie Chen
Gamaleldin Elsayed
Ed Huai-hsin Chi
Mahdis Mahdieh
Ian F. Tenney
Nan Hua
Ivan Petrychenko
Patrick Kane
Dylan Scandinaro
Rishub Jain
Jonathan Uesato
Romina Datta
Adam Sadovsky
Oskar Bunyan
Dominik Rabiej
Shimu Wu
John Zhang
Gautam Vasudevan
Edouard Leurent
Mahmoud Alnahlawi
Ionut-Razvan Georgescu
Nan Wei
Ivy Zheng
Betty Chan
Pam G Rabinovitch
Piotr Stańczyk
Ye Zhang
David Steiner
Subhajit Naskar
Michael Azzam
Matthew Johnson
Adam Paszke
Chung-Cheng Chiu
Jaume Sanchez Elias
Afroz Mohiuddin
Faizan Muhammad
Jin Miao
Andrew Lee
Nino Vieillard
Sahitya Potluri
Jane Park
Elnaz Davoodi
Jiageng Zhang
Jeff Stanway
Drew Garmon
Abhijit Karmarkar
Zhe Dong
Giant Correlated Gap and Possible Room-Temperature Correlated States in Twisted Bilayer MoS_{2}.
Fanfan Wu
Qiaoling Xu
Qinqin Wang
Yanbang Chu
Li Li
Jieying Liu
Jinpeng Tian
Yiru Ji
Le Liu
Yalong Yuan
Zhiheng Huang
Jiaojiao Zhao
Xiaozhou Zan
Kenji Watanabe
Takashi Taniguchi
Dongxia Shi
Gangxu Gu
Yang Xu
Lede Xian
Wei Yang
Luojun Du
Guangyu Zhang
Moiré superlattices have emerged as an exciting condensed-matter quantum simulator for exploring the exotic physics of strong electronic correlations. Notable progress has been witnessed, but such correlated states are achievable usually at low temperatures. Here, we report evidence of possible room-temperature correlated electronic states and a layer-hybridized SU(4) model simulator in AB-stacked MoS_{2} homobilayer moiré superlattices. Correlated insulating states at moiré band filling factors ν=1, 2, 3 are unambiguously established in twisted bilayer MoS_{2}. Remarkably, the correlated electronic state at ν=1 shows a giant correlated gap of ∼126 meV and may persist up to a record-high critical temperature over 285 K. The realization of a possible room-temperature correlated state with a large correlated gap in twisted bilayer MoS_{2} can be understood as the cooperation effects of the stacking-specific atomic reconstruction and the resonantly enhanced interlayer hybridization, which largely amplify the moiré superlattice effects on electronic correlations. Furthermore, extremely large nonlinear Hall responses up to room temperature are uncovered near correlated electronic states, demonstrating the quantum geometry of the moiré flat conduction band.
Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing
Weili Nie
Chengpeng Wang
Zhuoran Qiao
Ling Liu
Chaowei Xiao
Animashree Anandkumar
There is increasing adoption of artificial intelligence in drug discovery. However, existing studies use machine learning to mainly utilize the chemical structures of molecules but ignore the vast textual knowledge available in chemistry. Incorporating textual knowledge enables us to realize new drug design objectives, adapt to text-based instructions and predict complex biological activities. Here we present a multi-modal molecule structure–text model, MoleculeSTM, by jointly learning molecules’ chemical structures and textual descriptions via a contrastive learning strategy. To train MoleculeSTM, we construct a large multi-modal dataset, namely, PubChemSTM, with over 280,000 chemical structure–text pairs. To demonstrate the effectiveness and utility of MoleculeSTM, we design two challenging zero-shot tasks based on text instructions, including structure–text retrieval and molecule editing. MoleculeSTM has two main properties: open vocabulary and compositionality via natural language. In experiments, MoleculeSTM obtains the state-of-the-art generalization ability to novel biochemical concepts across various benchmarks. Machine learning methods in cheminformatics have made great progress in using chemical structures of molecules, but a large portion of textual information remains scarcely explored. Liu and colleagues trained MoleculeSTM, a foundation model that aligns the structure and text modalities through contrastive learning, and show its utility on the downstream tasks of structure–text retrieval, text-guided editing and molecular property prediction.
Unagi: Deep Generative Model for Deciphering Cellular Dynamics and In-Silico Drug Discovery in Complex Diseases
Yumin Zheng
Jonas C. Schupp
Taylor S Adams
Geremy Clair
Aurelien Justet
Farida Ahangari
Xiting Yan
Paul Hansen
Marianne Carlon
Emanuela Cortesi
Marie Vermant
Robin Vos
De Sadeleer J Laurens
Ivan O Rosas
Ricardo Pineda
John Sembrat
Melanie Königshoff
John E McDonough
Bart M. Vanaudenaerde
Wim A Wuyts
Naftali Kaminski
Human diseases are characterized by intricate cellular dynamics. Single-cell sequencing provides critical insights, yet a persistent gap remains in computational tools for detailed disease progression analysis and targeted in-silico drug interventions. Here, we introduce UNAGI, a deep generative neural network tailored to analyze time-series single-cell transcriptomic data. This tool captures the complex cellular dynamics underlying disease progression, enhancing drug perturbation modeling and discovery. When applied to a dataset from patients with Idiopathic Pulmonary Fibrosis (IPF), UNAGI learns disease-informed cell embeddings that sharpen our understanding of disease progression, leading to the identification of potential therapeutic drug candidates. Validation via proteomics reveals the accuracy of UNAGI’s cellular dynamics analyses, and the use of the Fibrotic Cocktail treated human Precision-cut Lung Slices confirms UNAGI’s predictions that Nifedipine, an antihypertensive drug, may have antifibrotic effects on human tissues. UNAGI’s versatility extends to other diseases, including a COVID dataset, demonstrating adaptability and confirming its broader applicability in decoding complex cellular dynamics beyond IPF, amplifying its utility in the quest for therapeutic solutions across diverse pathological landscapes.
Pseudo-random Instance Generators in C++ for Deterministic and Stochastic Multi-commodity Network Design Problems
Eric Larsen
Serge Bisaillon
Jean-François Cordeau
Network design problems constitute an important family of combinatorial optimization problems for which numerous exact and heuristic algorithms have been developed over the last few decades. Two central problems in this family are the multi-commodity, capacitated, fixed charge network design problem (MCFNDP) and its stochastic counterpart, the two-stage MCFNDP with recourse. These are standard problems that often serve as work benches for devising and testing models and algorithms in stylized but close-to-realistic settings. The purpose of this paper is to introduce two flexible, high-speed generators capable of simulating a wide range of settings for both the deterministic and stochastic MCFNDPs. We hope that, by facilitating systematic experimentation with new and larger sets of instances, these generators will lead to a more thorough assessment of the performance achieved by exact and heuristic solution methods in both deterministic and stochastic settings. We also hope that making these generators available will promote the reproducibility and comparability of published research.
RescueSpeech: A German Corpus for Speech Recognition in Search and Rescue Domain
Sangeet Sagar
Bernd Kiefer
Ivana Kruijff-Korbayová
Josef van Genabith
Despite the recent advancements in speech recognition, there are still difficulties in accurately transcribing conversational and emotional speech in noisy and reverberant acoustic environments. This poses a particular challenge in the search and rescue (SAR) domain, where transcribing conversations among rescue team members is crucial to support real-time decision-making. The scarcity of speech data and associated background noise in SAR scenarios make it difficult to deploy robust speech recognition systems. To address this issue, we have created and made publicly available a German speech dataset called RescueSpeech. This dataset includes real speech recordings from simulated rescue exercises. Additionally, we have released competitive training recipes and pre-trained models. Our study highlights that the performance attained by state-of-the-art methods in this challenging scenario is still far from reaching an acceptable level.
Speech Emotion Diarization: Which Emotion Appears When?
Yingzhi Wang
Alaa Nfissi
Alya Yacoubi
Speech Emotion Recognition (SER) typically relies on utterance-level solutions. However, emotions conveyed through speech should be considered as discrete speech events with definite temporal boundaries, rather than attributes of the entire utterance. To reflect the fine-grained nature of speech emotions, we propose a new task: Speech Emotion Diarization (SED). Just as Speaker Diarization answers the question of "Who speaks when?", Speech Emotion Diarization answers the question of "Which emotion appears when?". To facilitate the evaluation of the performance and establish a common benchmark for researchers, we introduce the Zaion Emotion Dataset (ZED), an openly accessible speech emotion dataset that includes non-acted emotions recorded in real-life conditions, along with manually-annotated boundaries of emotion segments within the utterance. We provide competitive baselines and open-source the code and the pre-trained models.
TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for PyTorch
Jeff Hwang
Moto Hira
Caroline Chen
Xiaohui Zhang
Zhaoheng Ni
Guangzhi Sun
Pingchuan Ma
Ruizhe Huang
Vineel Pratap
Yuekai Zhang
Anurag Kumar
Chin-Yun Yu
Chuang Zhu
Chunxi Liu
Jacob Kahn
Mirco Ravanelli
Peng Sun
Shinji Watanabe
Yangyang Shi
Yumeng Tao
Robin Scheibler
Samuele Cornell
Sean Kim
Stavros Petridis
TorchAudio is an open-source audio and speech processing library built for PyTorch. It aims to accelerate the research and development of audio and speech technologies by providing well-designed, easy-to-use, and performant PyTorch components. Its contributors routinely engage with users to understand their needs and fulfill them by developing impactful features. Here, we survey TorchAudio’s development principles and contents and highlight key features we include in its latest version (2.1): self-supervised learning pre-trained pipelines and training recipes, high-performance CTC decoders, speech recognition models and training recipes, advanced media I/O capabilities, and tools for performing forced alignment, multi-channel speech enhancement, and reference-less speech assessment. For a selection of these features, through empirical studies, we demonstrate their efficacy and show that they achieve competitive or state-of-the-art performance.
FoMo: Multi-Modal, Multi-Scale and Multi-Task Remote Sensing Foundation Models for Forest Monitoring
Forests are vital to ecosystems, supporting biodiversity and essential services, but are rapidly changing due to land use and climate change. Understanding and mitigating negative effects requires parsing data on forests at global scale from a broad array of sensory modalities, and using them in diverse forest monitoring applications. Such diversity in data and applications can be effectively addressed through the development of a large, pre-trained foundation model that serves as a versatile base for various downstream tasks. However, remote sensing modalities, which are an excellent fit for several forest management tasks, are particularly challenging considering the variation in environmental conditions, object scales, image acquisition modes, spatio-temporal resolutions, etc. With that in mind, we present the first unified Forest Monitoring Benchmark (FoMo-Bench), carefully constructed to evaluate foundation models with such flexibility. FoMo-Bench consists of 15 diverse datasets encompassing satellite, aerial, and inventory data, covering a variety of geographical regions, and including multispectral, red-green-blue, synthetic aperture radar and LiDAR data with various temporal, spatial and spectral resolutions. FoMo-Bench includes multiple types of forest-monitoring tasks, spanning classification, segmentation, and object detection. To enhance task and geographic diversity in FoMo-Bench, we introduce TalloS, a global dataset combining satellite imagery with ground-based annotations for tree species classification across 1,000+ categories and hierarchical taxonomic levels. Finally, we propose FoMo-Net, a pre-training framework to develop foundation models with the capacity to process any combination of commonly used modalities and spectral bands in remote sensing.