Publications

Unsupervised Model-based Pre-training for Data-efficient Reinforcement Learning from Pixels
Sai Rajeswar
Pietro Mazzaglia
Tim Verbelen
Alexandre Piché
Bart Dhoedt
Alexandre Lacoste
Reinforcement learning (RL) aims at autonomously performing complex tasks. To this end, a reward signal is used to steer the learning process. While successful in many circumstances, the approach is typically data-hungry, requiring large amounts of task-specific interaction between agent and environment to learn efficient behaviors. To alleviate this, unsupervised RL proposes to collect data through self-supervised interaction to accelerate task-specific adaptation. However, whether current unsupervised strategies lead to improved generalization capabilities is still unclear, more so when the input observations are high-dimensional. In this work, we advance the field by closing the performance gap in the Unsupervised RL Benchmark, a collection of tasks to be solved in a data-efficient manner, after interacting with the environment in a self-supervised way. Our approach uses unsupervised exploration for collecting experience to pre-train a world model. Then, when fine-tuning for downstream tasks, the agent leverages the learned model and a hybrid planner to efficiently adapt to the given tasks, achieving results comparable to task-specific baselines while using 20x less data. We extensively evaluate our work, comparing several exploration methods and improving the fine-tuning process by studying the interactions between the learned components. Furthermore, we investigate the limitations of the pre-trained agent, gaining insights into how these influence the decision process and shedding light on new research directions.
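To make the two-phase recipe concrete, here is a minimal, self-contained toy sketch of the idea described in the abstract: a model is learned from reward-free exploration data, and the task reward is only used later, when planning with the learned model. The environment, linear model and random-shooting planner are deliberately simplistic stand-ins chosen for illustration, not the paper's method or code.

```python
# Toy illustration of "reward-free pre-training, then model-based fine-tuning".
# Everything here (environment, linear model, random-shooting planner) is a
# simplified placeholder, not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

# --- toy environment: 2-D point mass, next_state = state + 0.1 * action + noise
def step(state, action):
    return state + 0.1 * action + 0.01 * rng.normal(size=2)

# --- phase 1: reward-free exploration + world-model pre-training -------------
states, actions, next_states = [], [], []
s = np.zeros(2)
for _ in range(2000):
    a = rng.uniform(-1, 1, size=2)          # purely exploratory actions, no task reward
    s_next = step(s, a)
    states.append(s); actions.append(a); next_states.append(s_next)
    s = s_next

X = np.hstack([np.array(states), np.array(actions)])   # inputs: (state, action)
Y = np.array(next_states)                               # targets: next state
W, *_ = np.linalg.lstsq(X, Y, rcond=None)               # linear "world model"

def model_predict(state, action):
    return np.concatenate([state, action]) @ W

# --- phase 2: adapt to a downstream task by planning in the learned model ----
goal = np.array([1.0, -0.5])
reward = lambda s_: -np.linalg.norm(s_ - goal)          # task revealed only now

def plan(state, horizon=5, candidates=256):
    """Random-shooting planner: score candidate action sequences in the model."""
    best_a, best_r = None, -np.inf
    for _ in range(candidates):
        seq = rng.uniform(-1, 1, size=(horizon, 2))
        s_, total = state, 0.0
        for a in seq:
            s_ = model_predict(s_, a)
            total += reward(s_)
        if total > best_r:
            best_r, best_a = total, seq[0]
    return best_a

s = np.zeros(2)
for t in range(50):
    s = step(s, plan(s))
print("final distance to goal:", np.linalg.norm(s - goal))
```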
Clustering units in neural networks: upstream vs downstream information
Richard D Lange
Konrad Paul Kording
It has been hypothesized that some form of "modular" structure in artificial neural networks should be useful for learning, compositionality, and generalization. However, defining and quantifying modularity remains an open problem. We cast the problem of detecting functional modules as one of detecting clusters of similarly functioning units. This raises the question of what makes two units functionally similar. For this, we consider two broad families of methods: those that define similarity based on how units respond to structured variations in inputs ("upstream"), and those based on how variations in hidden unit activations affect outputs ("downstream"). We conduct an empirical study quantifying the modularity of hidden-layer representations of simple feedforward, fully connected networks, across a range of hyperparameters. For each model, we quantify pairwise associations between hidden units in each layer using a variety of both upstream and downstream measures, then cluster them by maximizing their "modularity score" using established tools from network science. We find two surprising results: first, dropout dramatically increased modularity, while other forms of weight regularization had more modest effects. Second, although we observe that there is usually good agreement about clusters within both upstream methods and downstream methods, there is little agreement about the cluster assignments across these two families of methods. This has important implications for representation learning, as it suggests that finding modular representations that reflect structure in inputs (e.g. disentanglement) may be a distinct goal from learning modular representations that reflect structure in outputs (e.g. compositionality).
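The clustering step described in the abstract can be sketched with standard network-science tooling: compute pairwise associations between hidden units, build a weighted graph, and maximize modularity over it. The similarity measure below (absolute activation correlation) is just one possible upstream-style choice, and the synthetic activations and threshold are placeholders for illustration only.

```python
# Minimal sketch of clustering hidden units by modularity maximization over a
# weighted association graph. Correlation is one illustrative similarity choice.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(0)
activations = rng.normal(size=(1000, 64))   # (num_inputs, num_hidden_units)

# Pairwise association between units, e.g. absolute correlation of activations.
assoc = np.abs(np.corrcoef(activations.T))
np.fill_diagonal(assoc, 0.0)

# Build a weighted graph over units and cluster it by maximizing modularity.
G = nx.Graph()
for i in range(assoc.shape[0]):
    for j in range(i + 1, assoc.shape[1]):
        if assoc[i, j] > 0.05:              # drop weak associations to sparsify
            G.add_edge(i, j, weight=assoc[i, j])

communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in communities])
```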
Studying the Practices of Deploying Machine Learning Projects on Docker
Moses Openja
Bhagya Chembakottu
Heng Li
A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games
Samuel Sokota
Ryan D’orazio
J. Z. Kolter
Nicolas Loizou
Marc Lanctot
Noam Brown
Christian Kroer
This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is demonstrating the virtues of magnetic mirror descent as both an equilibrium solver and as an approach to reinforcement learning in two-player zero-sum games. These virtues include: 1) Being the first quantal response equilibria solver to achieve linear convergence for extensive-form games with first order feedback; 2) Being the first standard reinforcement learning algorithm to achieve empirically competitive results with CFR in tabular settings; 3) Achieving favorable performance in 3x3 Dark Hex and Phantom Tic-Tac-Toe as a self-play deep reinforcement learning algorithm.
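As a rough illustration of the update family the abstract refers to, the sketch below runs a mirror-descent step with an additional KL "magnet" term toward a fixed reference policy, in self-play on rock-paper-scissors. The closed-form update follows from assuming the negative-entropy mirror map over the simplex (own derivation, for intuition only); it should not be read as the paper's exact algorithm or hyperparameters.

```python
# Illustrative self-play sketch: mirror descent with a KL "magnet" toward a
# fixed reference policy, on rock-paper-scissors. Closed form derived under the
# negative-entropy mirror map; not claimed to match the paper's algorithm.
import numpy as np

A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)     # payoff matrix for player 1

def magnetic_md_step(pi, q, magnet, eta=0.1, alpha=0.05):
    """argmax_p <p, q> - alpha*KL(p||magnet) - (1/eta)*KL(p||pi), on the simplex."""
    logits = (np.log(pi) + eta * alpha * np.log(magnet) + eta * q) / (1.0 + eta * alpha)
    p = np.exp(logits - logits.max())
    return p / p.sum()

magnet = np.ones(3) / 3                      # uniform reference policy
x = np.ones(3) / 3                           # player 1 strategy
y = np.ones(3) / 3                           # player 2 strategy
for _ in range(2000):
    qx, qy = A @ y, -A.T @ x                 # expected payoffs against the opponent
    x = magnetic_md_step(x, qx, magnet)
    y = magnetic_md_step(y, qy, magnet)
print("approx. equilibrium strategies:", x.round(3), y.round(3))
```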
The distribution, ecology and predicted habitat use of the Critically Endangered angelshark (Squatina squatina) in coastal waters of Wales and the central Irish Sea
Joanna Barker
Jake Davies
Monika Goralczyk
Surshti Patel
John O'Connor
Jim Evans
Jackson Wesley Evans
Rowland Sharp
Matthew Gollock
Fenella R. Wood
Frank N. Wood
James Rosindell
Charlie Bartlett
Brett J. Garner
Dafydd Jones
D. J. Jones
Declan Quigley
Ben Wray
Billy Wray
The angelshark (Squatina squatina) has the northernmost range of any angel shark species, but there is limited information on its distribution, habitat use and ecology at higher latitudes. To address this, Angel Shark Project: Wales gathered 2231 S. squatina records and 142 anecdotal resources from fishers, coastal communities and archives. These spanned the coastal waters of Wales and the central Irish Sea and were dated from 1812 to 2020, with 97.62% of records within 11.1 km (6 nm) of the coast. Commercial, recreational and charter boat fishers provided the majority of S. squatina records (97.18%), with significantly more sightings from three decades (1970s, 1980s and 1990s) and in the months of September, June, August and July (in descending order). The coastal area between Bardsey Island and Strumble Head had the most S. squatina records (n = 1279), with notable concentrations also found in Carmarthen Bay, Conwy Bay and the Outer Severn Estuary. Species distribution models (SDM) identified four environmental variables with a significant influence on S. squatina distribution: depth, chlorophyll‐a concentration, sea surface temperature (SST) and salinity; these varied between the quarters (Q) of the year. SDM outputs predicted a larger congruous area of suitable habitat in Q3 (3176 km²) than in Q2 (2051 km²), with suitability along the three glacial moraines (Sarn Badrig, Sarn‐y‐Bwch and Sarn Cynfelyn) strongly represented. Comparison of modelled environmental variables at the locations of S. squatina records for each Q identified reductions in depth and salinity, and increases in chlorophyll‐a and SST, when comparing Q2 or Q3 with Q1 or Q4. This shift may suggest that S. squatina make seasonal movements into shallow coastal waters in Q2 and Q3. This is supported by 23 anecdotal resources and may be driven by reproductive behaviour, as there were 85 records of S. squatina individuals ≤60 cm in the dataset, inferred to be recently born or juvenile life‐history stages. The results have helped fill significant evidence gaps identified in the Wales Angelshark Action Plan, and immediate next research steps are suggested.
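The abstract does not state which SDM algorithm was used, so the sketch below shows only the general shape of such a fit: presence/background labels regressed on the four named environmental covariates (depth, chlorophyll-a, SST, salinity). The data are synthetic and logistic regression is a generic SDM baseline chosen for illustration; the study's actual model may differ.

```python
# Schematic species-distribution-model fit on the four covariates named in the
# abstract. Synthetic data and a generic logistic-regression baseline only.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
# columns: depth (m), chlorophyll-a (mg/m^3), SST (deg C), salinity (PSU)
env = np.column_stack([
    rng.uniform(2, 80, n),
    rng.uniform(0.5, 8.0, n),
    rng.uniform(8, 18, n),
    rng.uniform(30, 35, n),
])
# synthetic presence/background labels: shallower, warmer water made more "suitable"
logit = -0.05 * env[:, 0] + 0.3 * env[:, 2] - 3.0
presence = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

sdm = make_pipeline(StandardScaler(), LogisticRegression())
sdm.fit(env, presence)

# predicted habitat suitability for one hypothetical grid cell
cell = np.array([[15.0, 2.5, 15.0, 33.5]])
print("predicted suitability:", sdm.predict_proba(cell)[0, 1])
```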
Leveraging Integer Linear Programming to Learn Optimal Fair Rule Lists
Julien Ferry
Sébastien Gambs
Marie-José Huguet
Mohamed Siala
On Neural Architecture Inductive Biases for Relational Tasks
Current deep learning approaches have shown good in-distribution generalization performance, but struggle with out-of-distribution generalization. This is especially true for tasks involving abstract relations, such as recognizing rules in sequences, as found in many intelligence tests. Recent work has explored how forcing relational representations to remain distinct from sensory representations, as seems to be the case in the brain, can help artificial systems. Building on this work, we further explore and formalize the advantages afforded by 'partitioned' representations of relations and sensory details, and how this inductive bias can help recompose learned relational structure in newly encountered settings. We introduce a simple architecture based on similarity scores which we name Compositional Relational Network (CoRelNet). Using this model, we investigate a series of inductive biases that ensure abstract relations are learned and represented distinctly from sensory data, and explore their effects on out-of-distribution generalization for a series of relational psychophysics tasks. We find that simple architectural choices can outperform existing models in out-of-distribution generalization. Together, these results show that partitioning relational representations from other information streams may be a simple way to augment existing network architectures' robustness when performing out-of-distribution relational computations.
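The abstract describes an architecture built on similarity scores between object representations. The sketch below shows one way such a "partitioned" relational bottleneck could look: the readout only sees the pairwise similarity matrix between encoded objects, never the sensory embeddings themselves. All layer sizes, shapes and the softmax normalization are illustrative placeholders, not the paper's configuration.

```python
# Rough sketch of a similarity-score relational bottleneck in the spirit of the
# idea described above. Sizes and choices are arbitrary placeholders.
import torch
import torch.nn as nn

class SimilarityRelationNet(nn.Module):
    def __init__(self, input_dim=32, embed_dim=64, num_objects=5, num_classes=2):
        super().__init__()
        self.encoder = nn.Linear(input_dim, embed_dim)        # per-object encoder
        self.readout = nn.Sequential(                         # sees relations only
            nn.Linear(num_objects * num_objects, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, objects):               # objects: (batch, num_objects, input_dim)
        z = self.encoder(objects)              # (batch, num_objects, embed_dim)
        rel = torch.softmax(z @ z.transpose(1, 2), dim=-1)    # pairwise similarity scores
        return self.readout(rel.flatten(start_dim=1))         # classify from relations alone

model = SimilarityRelationNet()
x = torch.randn(8, 5, 32)
print(model(x).shape)                           # torch.Size([8, 2])
```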
Few-shot Question Generation for Personalized Feedback in Intelligent Tutoring Systems
Devang Kulshreshtha
Muhammad Shayan
Robert Belfer
Iulian V. Serban
Ekaterina Kochmar
Sequential Density Estimation via Nonlinear Continuous Weighted Finite Automata
Weighted finite automata (WFAs) have been widely applied in many fields. One of the classic problems for WFAs is probability distribution estimation over sequences of discrete symbols. Although WFAs have been extended to deal with continuous input data, namely continuous WFAs (CWFAs), it is still unclear how to approximate density functions over sequences of continuous random variables using WFA-based models, due to limitations on the expressiveness of the model as well as the tractability of approximating density functions via CWFAs. In this paper, we first propose a nonlinear extension to the CWFA model to improve its expressiveness, which we refer to as nonlinear continuous WFAs (NCWFAs). Then we leverage the so-called RNADE method, a well-known neural-network-based density estimator, and propose the RNADE-NCWFA model. The RNADE-NCWFA model computes a density function by design. We show that this model is strictly more expressive than the Gaussian HMM model, which CWFAs cannot approximate. Empirically, we conduct a synthetic experiment using data generated by a Gaussian HMM. We focus on evaluating the model's ability to estimate densities for sequences of varying lengths (longer than those seen during training). We observe that our model performs the best among the compared baseline methods.
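To make the construction concrete, the sketch below shows a generic autoregressive density estimator over continuous sequences in the spirit the abstract describes: a nonlinear state recurrence plays the role of the automaton, and an RNADE-style Gaussian-mixture emission conditioned on the state gives the per-step density. The weights are random and untrained placeholders; this illustrates the factorization, not the paper's model.

```python
# Generic sketch: nonlinear state recurrence over the sequence plus a
# Gaussian-mixture emission conditioned on the state. Untrained, illustrative only.
import numpy as np

rng = np.random.default_rng(0)
STATE, MIX = 8, 3
W_state = rng.normal(scale=0.3, size=(STATE + 1, STATE))    # recurrence weights
W_pi    = rng.normal(scale=0.3, size=(STATE, MIX))           # mixture logits
W_mu    = rng.normal(scale=0.3, size=(STATE, MIX))           # component means
W_sig   = rng.normal(scale=0.3, size=(STATE, MIX))           # component log-scales

def log_density(sequence):
    """log p(x_1..x_T) = sum_t log p(x_t | state_t), with a nonlinear state update."""
    h = np.zeros(STATE)
    total = 0.0
    for x in sequence:
        logits = h @ W_pi
        weights = np.exp(logits - logits.max()); weights /= weights.sum()
        mu, sigma = h @ W_mu, np.exp(h @ W_sig)
        comp = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))
        total += np.log(np.sum(weights * np.exp(comp)) + 1e-12)
        h = np.tanh(np.concatenate([h, [x]]) @ W_state)       # nonlinear transition
    return total

print(log_density(rng.normal(size=12)))   # handles sequences of arbitrary length
```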
Interacting brains revisited: A cross‐brain network neuroscience perspective
Christian Gerloff
Kerstin Konrad
Christina Büsing
Vanessa Reindl
Technologically-assisted communication attenuates inter-brain synchrony
Linoy Schwartz
Jonathan Levy
Yaara Endevelt-Shapira
Amir Djalovski
Olga Hayut
Ruth Pinkenson Feldman
How Can Digital Mental Health Enhance Psychiatry?
Emilie Stern
Jean-Arthur Micoulaud-Franchi
Jeverson Moreira
Stephane Mouchabac
Julia Maruani
Pierre Philip
Michel Lejoyeux
Pierre A. Geoffroy
The use of digital technologies is constantly growing around the world. The widespread adoption of digital technologies and solutions in daily clinical practice in psychiatry seems to be a question of when, not if. We propose a synthesis of the scientific literature on digital technologies in psychiatry and discuss the main aspects of their possible uses and interests in psychiatry according to three domains of influence that we identified: 1) assisting and improving current care: digital psychiatry allows more people to have access to care, not only by being more accessible but also by being less stigmatized and more convenient; 2) developing new treatments: digital psychiatry allows new treatments to be delivered via apps, and practical guidelines can reduce ethical challenges and increase the efficacy of digital tools; and 3) producing scientific and medical knowledge: digital technologies offer larger and more objective data collection, allowing for improved detection and prevention of symptoms. Finally, ethical and efficacy issues remain, and some guidelines have been put forth on how to safely use these solutions and prepare for the future.