Publications

Notational Programming for Notebook Environments: A Case Study with Quantum Circuits
Ian A. Arawjo
Anthony DeArmas
Michael Roberts
Shrutarshi Basu
Tapan S. Parikh
We articulate a vision for computer programming that includes pen-based computing, a paradigm we term notational programming. Notational programming blurs contexts: certain typewritten variables can be referenced in handwritten notation and vice versa. To illustrate this paradigm, we developed an extension, Notate, to computational notebooks which allows users to open drawing canvases within lines of code. As a case study, we explore quantum programming and designed a notation, Qaw, that extends quantum circuit notation with abstraction features, such as variable-sized wire bundles and recursion. Results from a usability study with novices suggest that users find our core interaction of implicit cross-context references intuitive, but suggest further improvements to debugging infrastructure, interface design, and recognition rates. Throughout, we discuss questions raised by the notational paradigm, including a shift from ‘recognition’ of notations to ‘reconfiguration’ of practices and values around programming, and from ‘sketching’ to writing and drawing, or what we call ‘notating.’
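To make the ‘variable-sized wire bundle’ abstraction concrete, here is a minimal sketch in plain Qiskit (an assumption for illustration; this is not Notate's or Qaw's actual interface) of the pattern the handwritten notation expresses graphically: a circuit definition parameterized by the width of its wire bundle.

```python
# Illustrative only: plain Qiskit, not the Notate/Qaw interface.
from qiskit import QuantumCircuit, QuantumRegister

def ghz(n: int) -> QuantumCircuit:
    """Build an n-qubit GHZ circuit; n plays the role of a bundle width."""
    wires = QuantumRegister(n, "q")       # a wire bundle of size n
    qc = QuantumCircuit(wires)
    qc.h(wires[0])                        # put the first wire in superposition
    for i in range(n - 1):
        qc.cx(wires[i], wires[i + 1])     # entangle along the bundle
    return qc

print(ghz(4))  # the same definition works for any bundle size
```

As the abstract describes, Qaw expresses this width as a variable directly in the drawn circuit notation rather than as a typewritten loop.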
Low-Rank Representation of Reinforcement Learning Policies
We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy on a reproducing kernel Hilbert space (RKHS). The use of RKHS-based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability and convergence guarantees. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly represented in a low-dimensional space while the embedded policy incurs almost no decrease in returns.
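As a rough illustration of the idea (not the paper's exact algorithm), the sketch below embeds a toy policy by fitting a kernel model on a fixed set of anchor states; the number of anchors bounds the effective dimension of the reconstruction, and held-out reconstruction error stands in for the return gap the paper bounds theoretically.

```python
# A minimal sketch of a low-dimensional kernel (RKHS-style) policy
# embedding; illustrative only, not the paper's method.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

def policy(states):
    """A toy deterministic policy mapping 2-D states to scalar actions."""
    return np.tanh(states @ np.array([0.7, -1.2]))

# Evaluate the policy on a small set of anchor states and fit a
# kernel-ridge model; 50 anchors bound the embedding's dimension.
anchors = rng.normal(size=(50, 2))
embedded = KernelRidge(kernel="rbf", alpha=1e-3, gamma=0.5).fit(anchors, policy(anchors))

# Reconstruction quality on held-out states.
test = rng.normal(size=(1000, 2))
err = np.max(np.abs(embedded.predict(test) - policy(test)))
print(f"max reconstruction error: {err:.4f}")
```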
Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model
Yuesong Zou
Ahmad Pesaranghader
Ziyang Song
Aman Verma
Invariant representation driven neural classifier for anti-QCD jet tagging
Taoli Cheng
Learning Latent Structural Causal Models
Jithendaraa Subramanian
Yashas Annadani
Ivaxi Sheth
Nan Rosemary Ke
Tristan Deleu
Stefan Bauer
Causal learning has long concerned itself with the accurate recovery of underlying causal mechanisms. Such causal modelling enables better explanations of out-of-distribution data. Prior works on causal learning assume that the high-level causal variables are given. However, in machine learning tasks, one often operates on low-level data like image pixels or high-dimensional vectors. In such settings, the entire Structural Causal Model (SCM) -- structure, parameters, and high-level causal variables -- is unobserved and needs to be learnt from low-level data. We treat this problem as Bayesian inference of the latent SCM, given low-level data. For linear Gaussian additive noise SCMs, we present a tractable approximate inference method which performs joint inference over the causal variables, structure, and parameters of the latent SCM from random, known interventions. Experiments are performed on synthetic datasets and a causally generated image dataset to demonstrate the efficacy of our approach. We also perform image generation from unseen interventions, thereby verifying out-of-distribution generalization for the proposed causal model.
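For concreteness, the model class named above (a linear Gaussian additive-noise SCM) can be written down and sampled in a few lines; the sketch below generates observational data and data under a known hard intervention, the kind of data the inference method consumes. It is illustrative only and does not implement the paper's Bayesian inference procedure.

```python
# A linear Gaussian additive-noise SCM over high-level variables z,
# sampled observationally and under a hard intervention do(z_0 = 2).
import numpy as np

rng = np.random.default_rng(0)

# Weighted adjacency matrix (upper-triangular, hence acyclic):
# z0 -> z1, z1 -> z2, and z0 -> z2.
W = np.array([[0.0, 1.5, -0.8],
              [0.0, 0.0,  2.0],
              [0.0, 0.0,  0.0]])
sigma = 0.1  # noise scale

def sample_scm(W, n, intervention=None):
    """Ancestrally sample z_j = sum_i W[i, j] * z_i + Gaussian noise."""
    d = W.shape[0]
    Z = np.zeros((n, d))
    for j in range(d):  # column order is topological for triangular W
        Z[:, j] = Z @ W[:, j] + sigma * rng.normal(size=n)
        if intervention is not None and intervention[0] == j:
            Z[:, j] = intervention[1]  # hard intervention do(z_j = value)
    return Z

observational = sample_scm(W, 1000)
interventional = sample_scm(W, 1000, intervention=(0, 2.0))
print(observational.mean(axis=0).round(2), interventional.mean(axis=0).round(2))
```

In the paper's setting, these z are themselves latent: only low-level data (e.g. images rendered from z) are observed, and the structure, parameters, and variables must all be inferred.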
Machine learning-based incremental learning in interactive domain modelling
Rijul Saini
Gunter Mussbacher
Jörg Kienzle
Hierarchical Reinforcement Learning for Precise Soccer Shooting Skills using a Quadrupedal Robot
Yandong Ji
Zhongyu Li
Yinan Sun
Xue Bin Peng
Sergey Levine
Koushil Sreenath
We address the problem of enabling quadrupedal robots to perform precise shooting skills in the real world using reinforcement learning. Developing algorithms to enable a legged robot to shoot a soccer ball to a given target is a challenging problem that combines robot motion control and planning into one task. To solve this problem, we need to account for the dynamics limitations and motion stability of a dynamic legged robot during control. Moreover, we need to plan motions that shoot a hard-to-model deformable ball, rolling on the ground with uncertain friction, to a desired location. In this paper, we propose a hierarchical framework that leverages deep reinforcement learning to train (a) a robust motion control policy that can track arbitrary motions and (b) a planning policy to decide the desired kicking motion to shoot a soccer ball to a target. We deploy the proposed framework on an A1 quadrupedal robot and enable it to accurately shoot the ball to random targets in the real world.
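The two-level structure described above can be sketched as follows (placeholder heuristics stand in for the trained policies; the observation and command contents are assumptions for illustration): a planning policy issues one kick command per shot, and a motion-control policy tracks it at a higher control rate.

```python
# A minimal sketch of hierarchical control: high-level planning picks a
# kick command; a low-level policy runs every control step. Both are
# hand-written stubs here, not the paper's trained networks.
import numpy as np

rng = np.random.default_rng(0)

def planning_policy(ball, target):
    """High level: map ball/target positions to a desired kick command."""
    direction = target - ball
    return {"yaw": np.arctan2(direction[1], direction[0]),
            "speed": 0.5 * np.linalg.norm(direction)}

def motion_policy(proprioception, command):
    """Low level: map robot state + command to 12 joint targets (stub)."""
    return command["speed"] * np.ones(12) + 0.05 * rng.normal(size=12)

ball, target = np.array([0.3, 0.0]), np.array([2.0, 1.0])
command = planning_policy(ball, target)      # runs once per kick
for _ in range(500):                         # runs every control step
    joint_targets = motion_policy(None, command)
print(command)
```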
Towards Clinical Phenotyping at Scale with Serious Games in Mixed Reality
Mariem Hafsia
Romain Trachel
Context: Mental healthcare systems are facing an ever-growing demand for appropriate assessment and intervention. Unfortunately, services are often centralized, overloaded, and inaccessible, resulting in greater institutional and social inequities. Therefore, there is an urgent need to establish easy-to-implement methods for early diagnosis and personalized follow-up. In recent years, serious games have started to offer such a clinical tool at scale.

Problem: There are critical challenges to the development of secure and inclusive serious games for clinical research. First, the quality of the data and the features analyzed must be well defined early in the research process in order to draw meaningful conclusions. Second, algorithms must be aligned with the purpose of the research while not perpetuating bias. Finally, the technologies used must be widely accessible and sufficiently engaging for users.

Focus of the paper: To tackle these challenges, we designed a participatory project that combines three innovative technologies: Mixed Reality, Serious Gaming, and Machine Learning. We analyze preliminary data with a focus on the identification of the players and the measurement of classical biases such as sex and environment of data collection.

Method: Together with patients, their families, and clinicians, we co-developed a serious game in mixed reality specifically designed for evaluation and therapeutic intervention in autism. Preliminary data were collected from neurotypical individuals with a mixed reality headset. Relevant behavioral features were extracted and used to train several classification algorithms for player identification.

Results: We were able to classify players above chance, with neural networks achieving slightly higher accuracy. Interestingly, accuracy was significantly higher when players were separated by sex. Furthermore, the uncontrolled condition yielded better accuracy than the controlled condition, which could mean that the data are richer when the player interacts freely with the game. Our proof of concept cannot exclude the possibility that this last result is linked to the experimental setup; future work will clarify this point with a larger sample size and the use of deep learning algorithms.

Implications: We show that serious games in mixed reality can be a valuable tool for collecting clinical data. Our preliminary results highlight important biases to consider in future studies, especially the sex of participants and the context of data collection. Next, we will evaluate the usability, accessibility, and tolerability of the device and the game in autistic children. In addition, we will evaluate the psychometric properties of the serious game, especially for patient stratification. This project aims to develop a platform for the diagnosis and therapy of autism, which can eventually be extended to other conditions and settings such as the evaluation of depression or stroke rehabilitation. Such a tool can offer novel possibilities for the study, evaluation, and treatment of mental conditions at scale, and thus ease the burden on healthcare systems.
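As a rough sketch of the player-identification analysis (on synthetic stand-in features; the study's real behavioral and headset data are not reproduced here), one can check that a classifier separates players well above chance:

```python
# Player identification from behavioral features, on synthetic data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_players, trials, n_features = 10, 40, 8

# Each player gets a characteristic feature profile plus trial noise.
X = np.vstack([rng.normal(loc=rng.normal(size=n_features),
                          size=(trials, n_features))
               for _ in range(n_players)])
y = np.repeat(np.arange(n_players), trials)

acc = cross_val_score(MLPClassifier(max_iter=1000, random_state=0),
                      X, y, cv=5).mean()
print(f"accuracy: {acc:.2f} vs. chance: {1 / n_players:.2f}")
```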
Attention for Compositional Modularity
Oleksiy Ostapenko
Pau Rodriguez
Alexandre Lacoste
Modularity and compositionality are promising inductive biases for addressing longstanding problems in machine learning such as better systematic generalization, as well as better transfer and lower forgetting in the context of continual learning. Here we study how attention-based module selection can help achieve compositional modularity – i.e. decomposition of tasks into meaningful sub-tasks which are tackled by independent architectural entities that we call modules. These sub-tasks must be reusable and the system should be able to learn them without additional supervision. We design a simple experimental setup in which the model is trained to solve mathematical equations with multiple math operations applied sequentially. We study different attention-based module selection strategies, inspired by the principles introduced in the recent literature. We evaluate the method’s ability to learn modules that can recover the underlying sub-tasks (operations) used for data generation, as well as its ability to generalize compositionally. We find that meaningful module selection (i.e. routing) is the key to compositional generalization. Further, without access to privileged information about which part of the input should be used for module selection, the routing component performs poorly on samples that are compositionally out of the training distribution. We find that the main reason for this lies in the routing component, since many of the tested methods perform well OOD if we report the performance of the best-performing path at test time. Additionally, we study the role of the number of primitives, the number of training points, and bottlenecks for modular specialization.
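A minimal version of attention-based routing can be sketched as follows (hand-initialized, untrained modules and router; a toy stand-in for the paper's sequential-arithmetic setup): a router scores the modules against the input, and the output mixes the modules by those attention weights.

```python
# Attention-based module selection over independent linear "experts".
import numpy as np

rng = np.random.default_rng(0)
n_modules, d = 3, 4

modules = [rng.normal(size=(d, d)) for _ in range(n_modules)]  # sub-task experts
keys = rng.normal(size=(n_modules, d))                         # router parameters

def softmax(z):
    z = z - z.max()
    return np.exp(z) / np.exp(z).sum()

def routed_step(x):
    attn = softmax(keys @ x)  # module-selection (routing) weights
    out = sum(a * (M @ x) for a, M in zip(attn, modules))
    return out, attn

x = rng.normal(size=d)
for step in range(2):         # operations applied sequentially
    x, attn = routed_step(x)
    print(f"step {step}: attention over modules = {attn.round(2)}")
```

Compositional generalization then hinges on the router assigning each step to the module implementing the right sub-task, which is exactly where the abstract reports the tested methods fail out of distribution.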
Early Detection of Sexual Predators with Federated Learning
Khaoula Chehbouni
Gilles Caporossi
Martine De Cock
The rise in screen time and the isolation brought on by the containment measures implemented during the COVID-19 pandemic have led to an alarming increase in cases of online grooming. Online grooming is defined as the set of strategies used by predators to lure children into sexual exploitation. Previous attempts in industry and academia to detect grooming rely on accessing and monitoring users’ private conversations, either by training a model centrally or by sending personal conversations to a global server. We introduce the first privacy-preserving, cross-device federated learning framework for the early detection of sexual predators, which aims to ensure a safe online environment for children while respecting their privacy.
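The cross-device arrangement described above can be illustrated with a FedAvg-style sketch (synthetic data and a plain logistic-regression client model, both assumptions for illustration): clients compute updates locally, and only model weights, never conversations, reach the server.

```python
# A minimal FedAvg-style loop: local gradient steps, server averaging.
import numpy as np

rng = np.random.default_rng(0)
d, n_clients, rounds, lr = 16, 5, 20, 0.5

def local_data(seed):
    """Synthetic per-device data standing in for private conversations."""
    r = np.random.default_rng(seed)
    X = r.normal(size=(100, d))
    y = (X @ np.ones(d) + r.normal(size=100) > 0).astype(float)
    return X, y

clients = [local_data(s) for s in range(n_clients)]
w = np.zeros(d)

for _ in range(rounds):
    updates = []
    for X, y in clients:                  # happens on-device
        p = 1 / (1 + np.exp(-(X @ w)))    # logistic-regression predictions
        grad = X.T @ (p - y) / len(y)
        updates.append(w - lr * grad)     # one local gradient step
    w = np.mean(updates, axis=0)          # server sees weights only

X, y = local_data(99)
print("held-out accuracy:", (((X @ w) > 0) == (y > 0.5)).mean())
```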
Identifiable Deep Generative Models via Sparse Decoding
Gemma Elyse Moran
Yixin Wang
David Blei
We develop the sparse VAE for unsupervised representation learning on high-dimensional data. The sparse VAE learns a set of latent factors (representations) which summarize the associations in the observed data features. The underlying model is sparse in that each observed feature (i.e. each dimension of the data) depends on a small subset of the latent factors. As examples, in ratings data each movie is only described by a few genres; in text data each word is only applicable to a few topics; in genomics, each gene is active in only a few biological processes. We prove that such sparse deep generative models are identifiable: with infinite data, the true model parameters can be learned. (In contrast, most deep generative models are not identifiable.) We empirically study the sparse VAE with both simulated and real data. We find that it recovers meaningful latent factors and has smaller held-out reconstruction error than related methods.
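The sparsity structure is easy to show concretely: below, a masked linear decoder makes each observed feature depend on only a few latent factors (illustrative only; the sparse VAE learns this dependency structure and uses a deep decoder).

```python
# Sparse decoding: each observed feature uses k of the latent factors.
import numpy as np

rng = np.random.default_rng(0)
n_latent, n_features, k = 5, 20, 2

# Binary mask: row j selects the few factors that drive feature j.
mask = np.zeros((n_features, n_latent))
for j in range(n_features):
    mask[j, rng.choice(n_latent, size=k, replace=False)] = 1.0

W = mask * rng.normal(size=(n_features, n_latent))  # sparse decoder weights
z = rng.normal(size=n_latent)                       # latent factors
x = W @ z + 0.05 * rng.normal(size=n_features)      # observed features

print("factors used per feature:", (W != 0).sum(axis=1))  # all equal to k
```

This anchoring of each feature to a small factor subset is what the identifiability result leverages, in contrast to a dense decoder.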