Publications

Block Contextual MDPs for Continual Learning
Shagun Sodhani
Franziska Meier
Amy Zhang
In reinforcement learning (RL), when defining a Markov Decision Process (MDP), the environment dynamics is implicitly assumed to be stationary. This assumption of stationarity, while simplifying, can be unrealistic in many scenarios. In the continual reinforcement learning scenario, the sequence of tasks is another source of nonstationarity. In this work, we propose to examine this continual reinforcement learning setting through the Block Contextual MDP (BC-MDP) framework, which enables us to relax the assumption of stationarity. This framework challenges RL algorithms to handle both nonstationarity and rich observation settings and, by additionally leveraging smoothness properties, enables us to study generalization bounds for this setting. Finally, we take inspiration from adaptive control to propose a novel algorithm that addresses the challenges introduced by this more realistic BC-MDP setting, allows for zero-shot adaptation at evaluation time, and achieves strong performance on several nonstationary environments.
Grow-and-Clip: Informative-yet-Concise Evidence Distillation for Answer Explanation
Yuyan Chen
Yanghua Xiao
Interpreting the predictions of existing Question Answering (QA) models is critical to many real-world intelligent applications, such as QA systems for healthcare, education, and finance. However, existing QA models lack interpretability and provide no feedback or explanation for end-users to help them understand why a specific prediction is the answer to a question. In this research, we argue that the evidence for an answer is critical to enhancing the interpretability of QA models. Unlike previous research that simply extracts several sentences in the context as evidence, we are the first to explicitly define the concept of evidence as the supporting facts in a context that are informative, concise, and readable. In addition, we provide effective strategies to quantitatively measure the informativeness, conciseness, and readability of evidence. Furthermore, we propose the Grow-and-Clip Evidence Distillation (GCED) algorithm to extract evidence from contexts by trading off informativeness, conciseness, and readability. We conduct extensive experiments on the SQuAD and TriviaQA datasets with several baseline models to evaluate the effect of GCED on interpreting answers to questions. Human evaluation is also carried out to check the quality of the distilled evidence. Experimental results show that automatically distilled evidence has human-like informativeness, conciseness, and readability, which can enhance the interpretability of the answers to questions.
Metrics Reloaded - A new recommendation framework for biomedical image analysis validation
Annika Reinke
Lena Maier-Hein
Evangelia Christodoulou
Ben Glocker
Patrick Scholz
Fabian Isensee
Jens Kleesiek
Michal Kozubek
Mauricio Reyes
Michael Alexander Riegler
Manuel Wiesenfarth
Michael Baumgartner
Matthias Eisenmann
Doreen Heckmann-Notzel
Ali Emre Kavur
Tim Radsch
Minu D. Tizabi
Laura Acion
Michela Antonelli
Spyridon Bakas
Peter Bankhead
Arriel Benis
M. Jorge Cardoso
Veronika Cheplygina
Beth A Cimini
Gary S. Collins
Keyvan Farahani
Bram van Ginneken
Fred A Hamprecht
Daniel A. Hashimoto
Michael M. Hoffman
Merel Huisman
Pierre Jannin
Charles Kahn
Alexandros Karargyris
Alan Karthikesalingam
Hannes Kenngott
Annette Kopp-Schneider
Anna Kreshuk
Tahsin Kurc
Bennett Landman
Geert Litjens
Amin Madani
Klaus Maier-Hein
Anne Martel
Peter Mattson
Erik Meijering
Bjoern Menze
David Moher
Karel G.M. Moons
Henning Müller
Brennan Nichyporuk
Felix Nickel
Jens Petersen
Nasir Rajpoot
Nicola Rieke
Julio Saez-Rodriguez
Clara I. Sánchez
Shravya Shetty
Maarten van Smeden
Carole H. Sudre
Ronald M. Summers
Abdel A. Taha
Sotirios A. Tsaftaris
Ben Van Calster
Gael Varoquaux
Paul F Jaeger
Meaningful performance assessment of biomedical image analysis algorithms depends on objective and appropriate performance metrics. There are major shortcomings in the current state of the art. Yet, so far limited attention has been paid to practical pitfalls associated with using particular metrics for image analysis tasks. Therefore, a number of international initiatives have collaborated to offer researchers guidance and tools for selecting performance metrics in a problem-aware manner. In our proposed framework, the characteristics of the given biomedical problem are first captured in a problem fingerprint, which identifies properties related to domain interests, the target structure(s), the input datasets, and algorithm output. A problem category-specific mapping is applied in the second step to match fingerprints to metrics that reflect domain requirements. Based on input from experts from more than 60 institutions worldwide, we believe our metric recommendation framework to be useful to the MIDL community and to enhance the quality of biomedical image analysis algorithm validation.
Tell Me How to Survey: Literature Review Made Simple with Automatic Reading Path Generation
Jiayuan Ding
Tong Xiang
Zijing Ou
Wangyang Zuo
Ruihui Zhao
Chenhua Lin
Yefeng Zheng
Recent years have witnessed dramatic growth in the volume of papers, with plenty of new research papers published every day, especially in the area of computer science. How to glean papers worth reading from the massive literature to do a quick survey or keep up with the latest advances on a specific research topic has become a challenging task. Existing academic search engines return relevant papers by individually calculating the relevance between each paper and the query. However, such systems usually omit the prerequisite chains of a research topic and cannot form a meaningful reading path. In this paper, we introduce a new task named Reading Path Generation (RPG), which aims at automatically producing a path of papers to read for a given query. To serve as a research benchmark, we further propose SurveyBank, a dataset consisting of large quantities of survey papers in the field of computer science as well as their citation relationships. Furthermore, we propose a graph-optimization-based approach for reading path generation that takes the relationship between papers into account. Extensive evaluations demonstrate that our approach outperforms other baselines. A real-time Reading Path Generation (RePaGer) system has also been implemented with our designed model. Our source code and SurveyBank dataset can be found at https://github.com/JiayuanDing100/Reading-Path-Generation.
From inter‐brain connectivity to inter‐personal psychiatry
Social Neuro AI: Social Interaction as the “Dark Matter” of AI
Samuele Bolotta
Capacity Variation in the Many-to-one Stable Matching
Federico Bobbio
Alfredo Torrico
Deep Learning Prediction of Response to Disease Modifying Therapy in Primary Progressive Multiple Sclerosis (P1-1.Virtual)
Jean-Pierre R. Falet
Joshua D. Durso-Finley
Brennan Nichyporuk
Julien Schroeter
Francesca Bovis
Maria-Pia Sormani
Douglas Arnold
Retrieval-Enhanced Machine Learning
Hamed Zamani
Mostafa Dehghani
Donald Metzler
Michael Bendersky
Although information access systems have long supported people in accomplishing a wide range of tasks, we propose broadening the scope of users of information access systems to include task-driven machines, such as machine learning models. In this way, the core principles of indexing, representation, retrieval, and ranking can be applied and extended to substantially improve model generalization, scalability, robustness, and interpretability. We describe a generic retrieval-enhanced machine learning (REML) framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
AmbieGen tool at the SBST 2022 Tool Competition
Dmytro Humeniuk
Giuliano Antoniol
AmbieGen is a tool for generating test cases for cyber-physical systems (CPS). In the context of the SBST 2022 CPS tool competition, it has been adapted to generate virtual roads to test a car lane keeping assist system. AmbieGen leverages a two-objective NSGA-II algorithm to produce the test cases. It achieved the highest final score, which accounts for test case efficiency, effectiveness, and diversity, in both testing configurations.
Challenges in Machine Learning Application Development: An Industrial Experience Report
Md Saidur Rahman
Emilio Rivera
Yann‐Gaël Guéhéneuc
Bernd Lehnert
Challenges in Machine Learning Application Development: An Industrial Experience Report
Md. Saidur Rahman
Emilio Martínez Rivera
Yann‐Gaël Guéhéneuc
Bernd Lehnert
SAP is the market leader in enterprise application software offering an end-to-end suite of applications and services to enable their customers worldwide to operate their business. In particular, retail customers of SAP deal with millions of sales transactions in their day-to-day business. Transactions are created during retail sales at the point of sale (POS) terminals and are then sent to central servers for validation and other business operations. A considerable proportion of the retail transactions may have inconsistencies or anomalies due to many technical and human errors. SAP provides an automated process for error detection but still requires a manual process by dedicated employees using workbench software for correction. However, manual corrections of these errors are time-consuming, labor-intensive, and might be prone to further errors due to incorrect modifications. Thus, automated detection and correction of transaction errors are very important regarding their potential business value and the improvement in the business workflow. In this paper, we report on our experience from a project in which we develop an AI-based system to automatically detect transaction errors and propose corrections. We identify and discuss the challenges that we faced during this collaborative research and development project, from two distinct perspectives: Software Engineering and Machine Learning. We report on our experience and insights from the project with guidelines for the identified challenges. We collect developers’ feedback for qualitative analysis of our findings. We believe that our findings and recommendations can help other researchers and practitioners embarking on similar endeavours. CCS CONCEPTS • Software and its engineering → Programming teams.