Publications

How do AI systems fail socially?: an engineering risk analysis approach
Failure Mode and Effect Analysis (FMEA) has been used as an engineering risk assessment tool since 1949. FMEAs are effective in preemptively identifying and addressing how a device or process might fail in operation and are often used in the design of high-risk technologies such as military, automotive, and medical devices. In this work, we explore whether FMEAs can serve as a risk assessment tool for machine learning practitioners, especially when deploying systems for high-risk applications (e.g., algorithms for recidivism assessment). In particular, we discuss how FMEAs can be used to identify social and ethical failures of Artificial Intelligent Systems (AISs), recognizing that FMEAs have the potential to uncover a broader range of failures. We first propose a process for developing a Social FMEA (So-FMEA) by building on the existing FMEA framework and a recently published definition of Social Failure Modes by Millar. We then demonstrate a simple proof-of-concept So-FMEA for the COMPAS algorithm, a risk assessment tool used by judges to make recidivism-related decisions for convicted individuals. Through this preliminary investigation, we illustrate how a traditional engineering risk management tool can be adapted for analyzing social and ethical failures of AISs. Engineers and designers of AISs can use this new approach to improve their system's design and perform due diligence with respect to potential ethical and social failures.
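FMEA worksheets prioritize failure modes with a risk priority number, RPN = severity x occurrence x detection. The sketch below shows how one So-FMEA row for a COMPAS-like tool might be scored; the FailureMode class and all ratings are our own illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    """One row of an FMEA worksheet."""
    description: str
    severity: int    # 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # 1 (rare) .. 10 (frequent)
    detection: int   # 1 (easily detected) .. 10 (nearly undetectable)

    @property
    def rpn(self) -> int:
        # Risk Priority Number: the standard FMEA prioritization score.
        return self.severity * self.occurrence * self.detection

# Hypothetical social failure mode for a recidivism-risk tool;
# the ratings are illustrative, not from the paper.
mode = FailureMode(
    description="Systematically higher risk scores for one demographic group",
    severity=9, occurrence=6, detection=7,
)
print(mode.rpn)  # 378 -> high-priority failure mode
```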
Rademacher Random Projections with Tensor Networks
Beheshteh T. Rakhshan
Random projections (RP) have recently emerged as popular techniques in the machine learning community for their ability to reduce the dimension of very high-dimensional tensors. Following the work in [30], we consider a tensorized random projection relying on the Tensor Train (TT) decomposition, where each element of the core tensors is drawn from a Rademacher distribution. Our theoretical results reveal that the Gaussian low-rank tensor represented in compressed TT format in [30] can be replaced by a TT tensor whose core elements are drawn from a Rademacher distribution with the same embedding size. Experiments on synthetic data demonstrate that the tensorized Rademacher RP can outperform the tensorized Gaussian RP studied in [30]. In addition, we show both theoretically and experimentally that the tensorized RP in the Matrix Product Operator (MPO) format is not a Johnson-Lindenstrauss transform (JLT) and is therefore not a well-suited random projection map.
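To make the construction concrete, here is a minimal NumPy sketch of a TT random projection with Rademacher cores: each output coordinate is the inner product between the input tensor and a rank-R TT tensor whose core entries are +/-1, computed without ever forming the projection densely. The function names and the 1/sqrt(k * R^(N-1)) scaling are our assumptions for illustration; the paper's exact normalization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def tt_rademacher_row(dims, rank):
    """TT cores with i.i.d. Rademacher (+/-1) entries, representing
    one row of the projection map in compressed form."""
    ranks = [1] + [rank] * (len(dims) - 1) + [1]
    return [rng.choice([-1.0, 1.0], size=(ranks[i], dims[i], ranks[i + 1]))
            for i in range(len(dims))]

def tt_inner(cores, x):
    """Inner product <A, x> for A in TT format and a dense tensor x,
    contracted one mode at a time."""
    w = x[None, ...]                          # shape (1, d1, ..., dN)
    for core in cores:                        # core: (r_prev, d_i, r_next)
        w = np.einsum('aib,ai...->b...', core, w)
    return w.item()                           # final shape (1,)

# Project a (5 x 5 x 5)-tensor down to k = 10 dimensions.
dims, rank, k = (5, 5, 5), 2, 10
x = rng.standard_normal(dims)
rows = [tt_rademacher_row(dims, rank) for _ in range(k)]
# Assumed scaling to keep the map roughly norm-preserving in expectation.
scale = 1.0 / np.sqrt(k * rank ** (len(dims) - 1))
y = scale * np.array([tt_inner(r, x) for r in rows])
print(y.shape)  # (10,)
```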
Generating GitHub Repository Descriptions: A Comparison of Manual and Automated Approaches
Jazlyn Hellman
Eunbee Jang
Christoph Treude
Chenzhun Huang
Jin L.C. Guo
Given the vast number of repositories hosted on GitHub, project discovery and retrieval have become increasingly important for GitHub users. Repository descriptions serve as one of the first points of contact for users who are accessing a repository. However, repository owners often fail to provide a high-quality description; instead, they use vague terms, explain the purpose of the repository poorly, or omit the description entirely. In this work, we examine the current practice of writing GitHub repository descriptions. Our investigation leads to the proposal of the LSP (Language, Software technology, and Purpose) template for formulating GitHub repository descriptions that are clear, concise, and informative. To understand the extent to which current automated techniques can support generating repository descriptions, we compare the performance of state-of-the-art text summarization methods on this task. Finally, our user study with GitHub users reveals that automated summarization can adequately be used to generate default descriptions for GitHub repositories, while descriptions that follow the LSP template offer the most effective instrument for communicating with GitHub users.
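As a rough illustration of the template idea (the helper below and its wording are our own, not the paper's), an LSP-style description fills the three slots and combines them into one sentence:

```python
def lsp_description(language: str, software_tech: str, purpose: str) -> str:
    """Compose a repository description from the three LSP slots:
    Language, Software technology, and Purpose."""
    return f"A {language} {software_tech} for {purpose}."

# Hypothetical example, not taken from the study:
print(lsp_description("Python", "command-line tool", "batch-resizing images"))
# -> "A Python command-line tool for batch-resizing images."
```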
CACHE (Critical Assessment of Computational Hit-finding Experiments): A public-private partnership benchmarking initiative to enable the development of computational methods for hit-finding
Suzanne Ackloo
Rima Al-awar
Rommie E. Amaro
Cheryl H. Arrowsmith
Hatylas Azevedo
Robert A. Batey
Ulrich A.K. Betz
Cristian G. Bologa
John D. Chodera
Wendy D. Cornell
Ian Dunham
Gerhard F. Ecker
Kristina Edfeldt
Aled M. Edwards
Michael K. Gilson
Claudia R. Gordijo
Gerhard Hessler
Alexander Hillisch
Anders Hogner
John J. Irwin
Johanna M. Jansen
Daniel Kuhn
Andrew R. Leach
Alpha A. Lee
Uta Lessel
John Moult
Ingo Muegge
Tudor I. Oprea
Benjamin G. Perry
Patrick Riley
Kumar Singh Saikatendu
Vijayaratnam Santhakumar
Matthieu Schapira
Cora Scholten
Matthew H. Todd
Masoud Vedadi
Andrea Volkamer
Timothy M. Willson
Computational approaches in drug discovery and development hold great promise, with artificial intelligence methods undergoing widespread contemporary use, but the experimental validation of these new approaches is frequently inadequate. We are initiating the Critical Assessment of Computational Hit-finding Experiments (CACHE) as a public benchmarking project that aims to accelerate the development of small-molecule hit-finding algorithms through competitive assessment. Compounds will be identified by participants using a wide range of computational methods for dozens of protein targets selected for different types of prediction scenarios, as well as for their potential biological or pharmaceutical relevance. Community-generated predictions will be tested centrally and rigorously in one or more experimental hubs, and all data, including the chemical structures of experimentally tested compounds, will be made publicly available without restrictions. The ability of a range of computational approaches to find novel compounds will be evaluated, compared, and published. The overarching goal of CACHE is to accelerate the development of computational chemistry methods by providing rapid and unbiased feedback to those developing methods, with the ancillary and valuable benefit of identifying new compound-protein binding pairs for biologically interesting targets. The initiative builds on the power of crowdsourcing and expands the open science paradigm for drug discovery.
Probabilistic fine-tuning of pruning masks and PAC-Bayes self-bounded learning
Soufiane Hayou
Bo He
Evaluation of real-life use of Point-Of-Care Rapid Antigen TEsting for SARS-CoV-2 in schools (EPOCRATES)
Ana C. Blanchard
Marc Desforges
Annie-Claude Labbé
Cat Tuong Nguyen
Yves Petit
Dominic Besner
Kate Zinszer
Olivier Séguin
Zineb Laghdir
Kelsey Adams
Marie-Ève Benoit
Geneviève Leduc
Jean Longtin
Ioannis Ragoussis
David L. Buckeridge
Caroline Quach
We evaluated the use of rapid antigen detection tests (RADT) for the diagnosis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in school settings to determine the RADT's performance compared to PCR. In this real-world, prospective observational cohort study, high-school students and staff were recruited from two high schools in Montreal (Canada) and followed from January 25th to June 10th, 2021. Twenty-five percent of asymptomatic participants were tested weekly by RADT (nasal) and PCR (gargle). Class contacts of cases were tested. Symptomatic participants were tested by RADT (nasal) and PCR (nasal and gargle). The number of cases and outbreaks was compared to other high schools in the same area. Overall, 2,099 students and 286 school staff members consented to participate. The overall RADT specificity varied from 99.8 to 100%, with a lower sensitivity, varying from 28.6% in asymptomatic to 83.3% in symptomatic participants. Secondary cases were identified in 10 of 35 classes. Returning students to school after a 7-day quarantine, with a negative PCR on day 6-7 after exposure, did not lead to subsequent outbreaks. Of cases for whom the source was known, 37 of 51 (72.5%) were secondary to household transmission, 13 (25%) to intra-school transmission, and one to community contact between students in the same school. The RADT did not perform well as a screening tool in asymptomatic individuals. Reinforcing policies for symptom screening when entering schools and testing symptomatic individuals with RADT on the spot may avoid subsequent significant exposures in class.
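For readers unfamiliar with the reported metrics, sensitivity and specificity follow directly from a RADT-versus-PCR confusion matrix; the sketch below uses made-up counts chosen only to produce numbers of the same magnitude as those reported, not the study's data.

```python
def sensitivity(tp: int, fn: int) -> float:
    # Fraction of PCR-positive cases the RADT correctly flags.
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    # Fraction of PCR-negative cases the RADT correctly clears.
    return tn / (tn + fp)

# Illustrative counts only -- not the EPOCRATES data.
print(f"sensitivity: {sensitivity(tp=10, fn=2):.1%}")   # 83.3%
print(f"specificity: {specificity(tn=998, fp=2):.1%}")  # 99.8%
```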
Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers
The empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-E. Accurately predicting neural network performance as resources such as data, compute and model size increase provides a more comprehensive evaluation of different approaches across multiple scales than traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows the focus to be placed on the best-scaling, and thus most promising, approaches. In this work, we consider the challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase differs from the source (training) data distribution, in the sense that it includes new image classes not encountered during training. Our main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well-approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies both when the target data come from the same domain as the training data and when they come from a different domain (i.e., new classes), and (3) few-shot performance on new classes converges at a faster rate than standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization.
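The power-law observation in (1) amounts to a straight-line fit in log-log coordinates, log(err) = log(a) - b * log(n). A minimal sketch of such a fit on synthetic data; the constants a and b here are hypothetical, not estimates from the paper.

```python
import numpy as np

# Synthetic example: few-shot error following err = a * n^(-b), plus noise.
rng = np.random.default_rng(0)
n = np.array([1e4, 3e4, 1e5, 3e5, 1e6])   # pre-training set sizes
a, b = 5.0, 0.3                            # hypothetical constants
err = a * n ** (-b) * np.exp(rng.normal(0, 0.02, n.size))

# A power law is a straight line in log-log coordinates.
slope, intercept = np.polyfit(np.log(n), np.log(err), deg=1)
print(f"estimated exponent b = {-slope:.3f}, "
      f"prefactor a = {np.exp(intercept):.3f}")
```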
A cognitive fingerprint in human random number generation
Marc-Andre Schulz
Sebastian Baier
Benjamin Timmermann
Karsten Witt
Most work in the neurosciences collapses data from multiple subjects to obtain robust statistical results. This research agenda ignores that, even in healthy subjects, brain structure and function are known to be highly variable. Recently, Finn and colleagues showed that the brain's functional organisation is unique to each individual and can yield human-specific connectome fingerprints. This raises the question of whether unique functional brain architecture may reflect a unique implementation of cognitive processes and problem solving, i.e., "Can we identify single individuals based on how they think?". The present study addresses the general question of interindividual differences in the specific context of human random number generation. We analyzed the deployment of recurrent patterns in pseudorandom sequences to develop an identification scheme based on subject-specific volatility patterns. We demonstrate that individuals can be reliably identified based on how they generate randomness alone. We moreover show that this phenomenon is driven by individual preference for, and inhibition of, patterns, together forming a cognitive fingerprint.
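A toy sketch of the general idea (our own construction, not the authors' pipeline): represent each subject's digit sequence by its histogram of short patterns (n-grams), then identify a new sequence by nearest-neighbour matching against stored profiles.

```python
import numpy as np
from collections import Counter

def ngram_profile(seq, n=2, alphabet=10):
    """Normalized frequency vector of length-n patterns in a digit sequence."""
    counts = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    vec = np.zeros(alphabet ** n)
    for pattern, c in counts.items():
        idx = sum(d * alphabet ** k for k, d in enumerate(pattern))
        vec[idx] = c
    return vec / vec.sum()

def identify(query, profiles):
    """Return the subject whose stored profile is closest to the query's."""
    q = ngram_profile(query)
    return min(profiles, key=lambda s: np.linalg.norm(q - profiles[s]))

# Toy usage: two simulated "subjects" with different digit biases.
rng = np.random.default_rng(0)
bias = np.linspace(1, 2, 10)
subjects = {
    "A": rng.choice(10, 2000, p=bias / bias.sum()),  # biased generator
    "B": rng.choice(10, 2000),                        # uniform generator
}
profiles = {s: ngram_profile(seq) for s, seq in subjects.items()}
print(identify(subjects["A"][:500], profiles))  # expected: "A"
```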
Diagnosing as autistic people increasingly distant from prototypes leads neither to clinical benefit nor to the advancement of knowledge
Laurent Mottron
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning
Nan Rosemary Ke
Aniket Rajiv Didolkar
Danilo Jimenez Rezende
Michael Curtis Mozer
Christopher Pal
Inducing causal relationships from observations is a classic problem in machine learning. Most work in causality starts from the premise that the causal variables themselves are observed. However, for AI agents such as robots trying to make sense of their environment, the only observables are low-level variables like pixels in images. To generalize well, an agent must induce high-level variables, particularly those which are causal or are affected by causal variables. A central goal for AI and causality is thus the joint discovery of abstract representations and causal structure. However, we note that existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs which are impossible to manipulate parametrically (e.g., number of nodes, sparsity, causal chain length, etc.). In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them. In order to systematically probe the ability of methods to identify these variables and structures, we design a suite of benchmarking RL environments. We evaluate various representation learning algorithms from the literature and find that explicitly incorporating structure and modularity in models can help causal induction in model-based reinforcement learning.
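As a minimal illustration of what parametric manipulation of a causal graph can look like (our own sketch, not the paper's benchmark code), one can sample a random DAG with a chosen number of nodes and sparsity and generate linear-Gaussian observations from it.

```python
import numpy as np

def random_dag(num_nodes: int, sparsity: float, rng) -> np.ndarray:
    """Adjacency matrix of a random DAG: edges only go from lower
    to higher node index, each present with probability `sparsity`."""
    adj = rng.random((num_nodes, num_nodes)) < sparsity
    return np.triu(adj, k=1)  # upper-triangular => acyclic

def sample_observations(adj, num_samples, rng):
    """Linear-Gaussian ancestral sampling along the topological order."""
    n = adj.shape[0]
    weights = adj * rng.normal(0, 1, adj.shape)  # random edge weights
    data = np.zeros((num_samples, n))
    for j in range(n):  # node index order is already topological
        data[:, j] = data @ weights[:, j] + rng.normal(0, 1, num_samples)
    return data

rng = np.random.default_rng(0)
adj = random_dag(num_nodes=6, sparsity=0.3, rng=rng)
obs = sample_observations(adj, num_samples=1000, rng=rng)
print(adj.astype(int))
print(obs.shape)  # (1000, 6)
```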
DoMoBOT: An AI-Empowered Bot for Automated and Interactive Domain Modelling
Rijul Saini
Gunter Mussbacher
Jin L.C. Guo
Jörg Kienzle
Domain modelling transforms informal requirements written in natural language in the form of problem descriptions into concise and analyzable domain models. As the manual construction of these domain models is often time-consuming, error-prone, and labor-intensive, several approaches already exist to automate domain modelling. However, the current approaches suffer from the low accuracy of the extracted domain models and a lack of support for system-modeller interactions. To better assist modellers, we introduce DoMoBOT, a web-based Domain Modelling BOT. Our proposed bot combines artificial intelligence techniques such as natural language processing and machine learning to extract domain models with higher accuracy. More importantly, our bot incorporates a set of features to bring synergy between automated model extraction and bot-modeller interactions. During these interactions, the bot presents multiple possible solutions to a modeller for the modelling scenarios present in a given problem description. The bot further enables modellers to switch to a particular solution and then proactively updates the other parts of the domain model. In this tool demo paper, we demonstrate how the implementation and architecture of DoMoBOT support the paradigm of automated and interactive domain modelling for assisting modellers.
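As a toy illustration of the extraction step (DoMoBOT itself uses NLP and machine learning models; the heuristic below is only our stand-in), recurring capitalized nouns in a problem description can be proposed as candidate domain classes.

```python
import re
from collections import Counter

def candidate_classes(problem_description: str, min_count: int = 2):
    """Toy heuristic: recurring capitalized words in the problem
    description are proposed as candidate domain classes."""
    words = re.findall(r"\b[A-Z][a-z]+\b", problem_description)
    counts = Counter(words)
    return [w for w, c in counts.items() if c >= min_count]

text = ("A Customer places an Order. Each Order contains one or more "
        "Items. A Customer may cancel an Order before it is shipped.")
print(candidate_classes(text))  # ['Customer', 'Order']
```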
Impact of Aliasing on Generalization in Deep Convolutional Networks
Cristina Vasconcelos
Rob Romijnders
Nicolas Roux
We investigate the impact of aliasing on generalization in Deep Convolutional Networks and show that data augmentation schemes alone are unable to prevent it due to structural limitations in widely used architectures. Drawing insights from frequency analysis theory, we take a closer look at ResNet and EfficientNet architectures and review the trade-off between aliasing and information loss in each of their major components. We show how to mitigate aliasing by inserting non-trainable low-pass filters at key locations, particularly where networks lack the capacity to learn them. These simple architectural changes lead to substantial improvements in generalization under i.i.d. conditions and even more so under out-of-distribution conditions, such as image classification under natural corruptions on ImageNet-C [11] and few-shot learning on Meta-Dataset [26]. State-of-the-art results are achieved on both datasets without introducing additional trainable parameters and using the default hyper-parameters of open source codebases.
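A minimal PyTorch sketch of the kind of change described: a fixed, non-trainable binomial low-pass filter applied per channel before subsampling, instead of a bare strided operation that lets high frequencies alias. The placement and kernel below are our assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def blur_downsample(x: torch.Tensor, stride: int = 2) -> torch.Tensor:
    """Anti-aliased downsampling: convolve each channel with a fixed
    3x3 binomial low-pass kernel, then subsample with the given stride."""
    c = x.shape[1]
    k1d = torch.tensor([1.0, 2.0, 1.0])
    k2d = torch.outer(k1d, k1d)
    k2d = (k2d / k2d.sum()).to(dtype=x.dtype, device=x.device)
    kernel = k2d.repeat(c, 1, 1, 1)  # (c, 1, 3, 3): one filter per channel
    # groups=c applies the same non-trainable filter to each channel.
    return F.conv2d(x, kernel, stride=stride, padding=1, groups=c)

x = torch.randn(1, 64, 32, 32)
print(blur_downsample(x).shape)  # torch.Size([1, 64, 16, 16])
```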