Publications

Harnessing Pre-trained Generalist Agents for Software Engineering Tasks
Paulina Stevia Nouwou Mindom
Amin Nikanjam
Nowadays, we are witnessing an increasing adoption of Artificial Intelligence (AI) to develop techniques aimed at improving the reliability, effectiveness, and overall quality of software systems. Deep reinforcement learning (DRL) has recently been used successfully to automate complex tasks such as game testing and solving the job-shop scheduling problem. However, these specialized DRL agents, trained from scratch on specific tasks, generalize poorly to other tasks and need substantial time to be developed and re-trained effectively. Recently, DRL researchers have begun to develop generalist agents that can learn a policy from various environments and achieve performance similar to or better than that of specialist agents on new tasks. In the Natural Language Processing and Computer Vision domains, these generalist agents show promising adaptation to never-before-seen tasks after a light fine-tuning phase, and they achieve high performance. This paper investigates the potential of generalist agents for solving software engineering (SE) tasks. Specifically, we conduct an empirical study assessing the performance of two generalist agents on two important SE tasks: the detection of bugs in games (for two games) and the minimization of makespan in a scheduling task, to solve the job-shop scheduling problem (for two instances). Our results show that the generalist agents outperform the specialist agents with very little fine-tuning effort, achieving a 20% reduction of the makespan over specialized agent performance on task-based scheduling. In the context of game testing, some generalist agent configurations detect 85% more bugs than the specialist agents. Building on our analysis, we provide recommendations for researchers and practitioners looking to select generalist agents for SE tasks, to ensure that they perform effectively.
Neural manifolds and learning regimes in neural-interface tasks
Alexandre Payeur
Amy L. Orsborn
GROOD: GRadient-aware Out-Of-Distribution detection in interpolated manifolds
Mostafa ElAraby
Sabyasachi Sahoo
Yann Pequignot
Paul Novello
A landmark environmental law looks ahead
Robert L. Fischman
J. B. Ruhl
Brenna R. Forester
Tanya M. Lama
Marty Kardos
Grethel Aguilar Rojas
Nicholas A. Robinson
Patrick D. Shirey
Gary A. Lamberti
Amy W. Ando
Stephen Palumbi
Michael Wara
Mark W. Schwartz
Matthew A. Williamson
Tanya Berger-Wolf
Sara Beery
Justin Kitzes
David Thau
Devis Tuia
Daniel Rubenstein
Caleb R. Hickman
Julie Thorstenson
Gregory E. Kaebnick
James P. Collins
Athmeya Jayaram
Thomas Deleuil
Ying Zhao
Less or More From Teacher: Exploiting Trilateral Geometry For Knowledge Distillation
Chengming Hu
Haolun Wu
Xuan Li
Chen Ma
Xi Chen
Jun Yan
Boyu Wang
When Nash Meets Stackelberg
Gabriele Dragotto
Felipe Feijoo
Sriram Sankaranarayanan
Capture the Flag: Uncovering Data Insights with Large Language Models
Issam Hadj Laradji
Perouz Taslakian
Sai Rajeswar
Valentina Zantedeschi
Alexandre Lacoste
David Vazquez
The extraction of a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. However, accomplishing this task requires considerable technical skills, domain expertise, and human labor. This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data, leveraging recent advances in reasoning and code generation techniques. We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset. We further propose two proof-of-concept agents, with different inner workings, and compare their ability to capture such flags in a real-world sales dataset. While the work reported here is preliminary, our results are sufficiently interesting to mandate future exploration by the community.
CODA: an open-source platform for federated analysis and machine learning on distributed healthcare data
Louis Mullie
Jonathan Afilalo
Patrick Archambault
Rima Bouchakri
Kip Brown
Yiorgos Alexandros Cavayas
Alexis F Turgeon
Denis Martineau
François Lamontagne
Martine Lebrasseur
Renald Lemieux
Jeffrey Li
Michaël Sauthier
Pascal St-Onge
An Tang
William Witteman
Michael Chassé
Extended Lyman-alpha emission towards the SPT2349-56 protocluster at $z=4.3$
Yordanka Apostolovski
Manuel Aravena
Timo Anguita
Matthieu Béthermin
James R. Burgoyne
Scott Chapman
C. Breuck
Anthony R Gonzalez
Max Gronke
Lucia Guaita
Ryley Hill
Sreevani Jarugula
E. Johnston
M. Malkan
Desika Narayanan
Cassie Reuter
Manuel Solimano
Justin Spilker
Nikolaus Sulzenauer
Joaquin Vieira
Joaquin Daniel Vieira
David Vizgan
Axel Weiß
Deep spectroscopic surveys with the Atacama Large Millimeter/submillimeter Array (ALMA) have revealed that some of the brightest infrared sources in the sky correspond to concentrations of submillimeter galaxies (SMGs) at high redshift. Among these, the SPT2349-56 protocluster system is one of the most extreme examples, given its high source density and integrated star formation rate. We conducted a deep Lyman-alpha line emission survey around SPT2349-56 using the Multi-Unit Spectroscopic Explorer (MUSE) at the Very Large Telescope (VLT) in order to characterize this uniquely dense environment. Taking advantage of the deep three-dimensional nature of this survey, we performed a sensitive search for Lyman-alpha emitters (LAEs) toward the core and northern extension of the protocluster, which correspond to the brightest infrared regions in this field. Using a smoothed narrowband image extracted from the MUSE datacube around the protocluster redshift, we searched for possible extended structures. We identify only three LAEs at
Towards Machines that Trust: AI Agents Learn to Trust in the Trust Game
Ardavan S. Nobandegani
Thomas Shultz
Widely considered a cornerstone of human morality, trust shapes many aspects of human social interactions. In this work, we present a theoretical analysis of the
Gemini: A Family of Highly Capable Multimodal Models
Gemini Team, Google
Rohan Anil
Sebastian Borgeaud
Yonghui Wu
Jean-Baptiste Alayrac
Jiahui Yu
Radu Soricut
J. Schalkwyk
Andrew M. Dai
Anja Hauth
Katie Millican
David Silver
Slav Petrov
Melvin Johnson
Ioannis Antonoglou
Julian Schrittwieser
Amelia Glaese
Jilin Chen
Emily Pitler
Timothy P. Lillicrap
Angeliki Lazaridou
Orhan Firat
James L. Molloy
Michael Acheson Isard
Paul R. Barham
Tom Hennigan
Benjamin Lee
Malcolm Reynolds
Yuanzhong Xu
Ryan Doherty
Eli Collins
Clemens Meyer
Eliza Rutherford
Erica Moreira
Kareem W. Ayoub
Megha Goel
George Tucker
Enrique Piqueras
M. Krikun
Iain Barr
Nikolay Savinov
Ivo Danihelka
Becca Roelofs
Anais White
Anders Johan Andreassen
Tamara von Glehn
Lakshman N. Yagati
Mehran Kazemi
Lucas Gonzalez
Misha Khalman
Jakub Sygnowski
Alexandre Fréchette
Charlotte Smith
Laura Culp
Lev Proleev
Yi Luan
Xi Chen
James Lottes
Nathan Schucher
Federico Lebron
Alban Rrustemi
Natalie Clay
Phil Crone
Tomas Kocisky
Jeffrey Zhao
Bartek Perz
Dian Yu
Heidi Howard
Adam E. Bloniarz
Jack W. Rae
Han Lu
Laurent Sifre
Marcello Maggioni
Fred Alcober
Dan Garrette
Megan Barnes
Shantanu Thakoor
Jacob Austin
Gabriel Barth-Maron
William Wong
Rishabh Joshi
Rahma Chaabouni
Deeni Fatiha
Arun Ahuja
Ruibo Liu
Yunxuan Li
Sarah Cogan
Jeremy Chen
Chao Jia
Chenjie Gu
Qiao Zhang
Jordan Grimstad
Ale Jakse Hartman
Martin J. Chadwick
Gaurav Singh Tomar
Xavier Garcia
Evan Senter
Emanuel Taropa
Thanumalayan Sankaranarayana Pillai
Jacob Devlin
Michael Laskin
Diego de Las Casas
Dasha Valter
Connie Tao
Lorenzo Blanco
Adrià Puigdomènech Badia
David Reitter
Mianna Chen
Jenny Brennan
Clara E. Rivera
Sergey Brin
Shariq N Iqbal
Gabriela Surita
Jane Labanowski
Abhishek Rao
Stephanie Winkler
Emilio Parisotto
Yiming Gu
Kate Olszewska
Yujing Zhang
Ravichandra Addanki
Antoine Miech
Annie Louis
Laurent El Shafey
Denis Teplyashin
Geoff Brown
Elliot Catt
Nithya Attaluri
Jan Balaguer
Jackie Xiang
Pidong Wang
Zoe C. Ashwood
Anton Briukhov
Albert Webson
Sanjay Ganapathy
Smit Sanghavi
Ajay Kannan
Ming-Wei Chang
Axel Stjerngren
Josip Djolonga
Yuting Sun
Ankur Bapna
Matthew Aitchison
Pedram Pejman
Henryk Michalewski
Tianhe Yu
Cindy Wang
J Christopher Love
Junwhan Ahn
Dawn Bloxwich
Kehang Han
Peter Conway Humphreys
Thibault Sellam
James Bradbury
Varun Godbole
Sina Samangooei
Bogdan Damoc
Alex Kaskasoli
Sébastien M. R. Arnold
Vijay Vasudevan
Shubham Agrawal
Jason Riesa
Dmitry Lepikhin
Richard Tanburn
Srivatsan Srinivasan
Hyeontaek Lim
Sarah Hodkinson
Pranav Shyam
Johan Ferret
Steven Hand
Ankush Garg
T. Paine
Jian Li
Yujia Li
Minh Giang
Alexander Neitz
Zaheer Abbas
Sarah York
Machel Reid
Elizabeth Cole
Aakanksha Chowdhery
Dipanjan Das
Dominika Rogozińska
Vitaly Nikolaev
Pablo G. Sprechmann
Zachary Nado
Lukáš Žilka
Flavien Prost
Luheng He
Marianne Monteiro
Gaurav Mishra
Christopher A. Welty
Joshua Newlan
Dawei Jia
Miltiadis Allamanis
Clara Huiyi Hu
Raoul de Liedekerke
Justin Gilmer
Carl Saroufim
Shruti Rijhwani
Shaobo Hou
Disha Shrivastava
Anirudh Baddepudi
Alex Goldin
Adnan Ozturel
Albin Cassirer
Yunhan Xu
Daniel Sohn
Devendra Singh Sachan
Reinald Kim Amplayo
Craig Swanson
Dessie Petrova
Shashi Narayan
Arthur Guez
Siddhartha Brahma
Jessica Landon
Miteyan Patel
Ruizhe Zhao
Kevin Villela
Luyu Wang
Wenhao Jia
Matthew Rahtz
Mai Giménez
Legg Yeung
Hanzhao Lin
James Keeling
Petko Georgiev
Diana Mincu
Boxi Wu
Salem Haykal
Rachel Saputro
Kiran N. Vodrahalli
James Qin
Zeynep Cankara
Abhanshu Sharma
Nicholas Fernando
Will Hawkins
Behnam Neyshabur
Solomon Kim
Adrian Hutter
Priyanka Agrawal
Alex Castro-Ros
George van den Driessche
Tao Wang
Fan Yang
Shuo-yiin Chang
Paul Komarek
Ross McIlroy
Mario Lučić
Guodong Zhang
Wael Farhan
Michael Sharman
Paul Natsev
Paul Michel
Yong Cheng
Yamini Bansal
Siyuan Qiao
Kris Cao
Siamak Shakeri
Christina Butterfield
Justin Chung
Paul Kishan Rubenstein
Shivani Agrawal
Arthur Mensch
Kedar Soparkar
Karel Lenc
Timothy Chung
Aedan Pope
Lorenzo Maggiore
Jackie Kay
Priya Jhakra
Shibo Wang
Joshua Maynez
Mary Phuong
Taylor Tobin
Andrea Tacchetti
Maja Trebacz
Kevin Robinson
Yash Katariya
Sebastian Riedel
Paige Bailey
Kefan Xiao
Nimesh Ghelani
Lora Aroyo
Ambrose Slone
Neil Houlsby
Xuehan Xiong
Zhen Yang
Elena Gribovskaya
Jonas Adler
Mateo Wirth
Lisa Lee
Music Li
Thais Kagohara
Jay Pavagadhi
Sophie Bridgers
Anna Bortsova
Sanjay Ghemawat
Zafarali Ahmed
Tianqi Liu
Richard Powell
Vijay Bolina
Mariko Iinuma
Polina Zablotskaia
James Besley
Da-Woon Chung
Timothy Dozat
Ramona Comanescu
Xiance Si
Jeremy Greer
Guolong Su
M. Polacek
Raphael Lopez Kaufman
Simon Tokumine
Hexiang Hu
Elena Buchatskaya
Yingjie Miao
Mohamed Elhawaty
Aditya Siddhant
Nenad Tomašev
Jinwei Xing
Christina Greer
Helen Miller
Shereen Ashraf
Aurko Roy
Zizhao Zhang
Ada Ma
Angelos Filos
Milos Besta
Rory Blevins
Ted Klimenko
Chih-Kuan Yeh
Soravit Changpinyo
Jiaqi Mu
Oscar Chang
Mantas Pajarskas
Carrie Muir
Vered Cohen
Charline Le Lan
Krishna S Haridasan
Amit Marathe
Steven Hansen
Sholto Douglas
Rajkumar Samuel
Mingqiu Wang
Sophia Austin
Chang Lan
Jiepu Jiang
Justin Chiu
Jaime Alonso Lorenzo
Lars Lowe Sjosund
Sébastien Cevey
Zach Gleicher
Thi Avrahami
Anudhyan Boral
Hansa Srinivasan
Vittorio Selo
Rhys May
Konstantinos Aisopos
Léonard Hussenot
Livio Baldini Soares
Kate Baumli
Michael B. Chang
Adria Recasens
Benjamin Caine
Alexander Pritzel
Filip Pavetic
Fabio Pardo
Anita Gergely
Justin Frye
Vinay Venkatesh Ramasesh
Dan Horgan
Kartikeya Badola
Nora Kassner
Subhrajit Roy
Ethan Dyer
Víctor Campos
Alex Tomala
Yunhao Tang
Dalia El Badawy
Elspeth White
Basil Mustafa
Oran Lang
Abhishek Jindal
Sharad Mandyam Vikram
Zhitao Gong
Sergi Caelles
Ross Hemsley
Gregory Thornton
Fangxiaoyu Feng
Wojciech Stokowiec
Ce Zheng
Phoebe Thacker
Çağlar Ünlü
Zhishuai Zhang
Mohammad Saleh
James Svensson
Maxwell L. Bileschi
Piyush Patil
Ankesh Anand
Roman Ring
Katerina Tsihlas
Arpi Vezer
Marco Selvi
Toby Shevlane
Mikel Rodriguez
Tom Kwiatkowski
Samira Daruki
Keran Rong
Allan Dafoe
Nicholas Fitzgerald
Keren Gu-Lemberg
Mina Khan
Lisa Anne Hendricks
Marie Pellat
Vladimir Feinberg
James Cobon-Kerr
Tara N. Sainath
Maribeth Rauh
Sayed Hadi Hashemi
Richard Ives
Yana Hasson
YaGuang Li
Eric Noland
Yuan Cao
Nathan Byrd
Le Hou
Qingze Wang
Thibault Sottiaux
Michela Paganini
Jean-Baptiste Lespiau
Alexandre Moufarek
Samer Hassan
Kaushik Shivakumar
Joost Van Amersfoort
Amol Mandhane
Pratik M. Joshi
Anirudh Goyal
Matthew Tung
Andy Brock
Hannah Rachel Sheahan
Vedant Misra
Cheng Li
Nemanja Rakićević
Mostafa Dehghani
Fangyu Liu
Sid Mittal
Junhyuk Oh
Seb Noury
Eren Sezener
Fantine Huot
Matthew Lamm
Nicola De Cao
Charlie Chen
Gamaleldin Elsayed
Ed Huai-hsin Chi
Mahdis Mahdieh
Ian F. Tenney
Nan Hua
Ivan Petrychenko
Patrick Kane
Dylan Scandinaro
Rishub Jain
Jonathan Uesato
Romina Datta
Adam Sadovsky
Oskar Bunyan
Dominik Rabiej
Shimu Wu
John Zhang
Gautam Vasudevan
Edouard Leurent
Mahmoud Alnahlawi
Ionut-Razvan Georgescu
Nan Wei
Ivy Zheng
Betty Chan
Pam G Rabinovitch
Piotr Stańczyk
Ye Zhang
David Steiner
Subhajit Naskar
Michael Azzam
Matthew Johnson
Adam Paszke
Chung-Cheng Chiu
Jaume Sanchez Elias
Afroz Mohiuddin
Faizan Muhammad
Jin Miao
Andrew Lee
Nino Vieillard
Sahitya Potluri
Jane Park
Elnaz Davoodi
Jiageng Zhang
Jeff Stanway
Drew Garmon
Abhijit Karmarkar
Zhe Dong
Studying the Practices of Testing Machine Learning Software in the Wild
Moses Openja
Armstrong Foundjem
Zhen Ming Jiang
Mouna Abidi
Ahmed E. Hassan
Background: We are witnessing an increasing adoption of machine learning (ML), especially deep learning (DL) algorithms, in many software systems, including safety-critical systems such as health care systems or autonomous driving vehicles. Ensuring the software quality of these systems is still an open challenge for the research community, mainly due to the inductive nature of ML software systems. Traditionally, software systems were constructed deductively, by writing down the rules that govern the behavior of the system as program code. However, for ML software, these rules are inferred from training data. A few recent research advances in the quality assurance of ML systems have adapted concepts from traditional software testing, such as mutation testing, to help improve the reliability of ML software systems. However, it is unclear whether any of these proposed testing techniques are adopted in practice. There is little empirical evidence about the testing strategies of ML engineers. Aims: To fill this gap, we perform the first fine-grained empirical study on ML testing practices in the wild, to identify the ML properties being tested, the testing strategies followed, and their implementation throughout the ML workflow. Method: First, we systematically summarized the different testing strategies (e.g., Oracle Approximation), the tested ML properties (e.g., Correctness, Bias, and Fairness), and the testing methods (e.g., Unit test) from the literature. Then, we conducted a study to understand the practices of testing ML software. Results: In our findings: 1) we identified four (4) major categories of testing strategy, including Grey-box, White-box, Black-box, and Heuristic-based techniques, that are used by ML engineers to find software bugs. 2) We identified 16 ML properties that are tested in the ML workflow.