Prakash Panangaden

Christopher G. Lucas

David Abel

Stefano V Albrecht

2025-01-22

ICLR.cc/2025/Conference (poster)

openreview.net

Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning

Samuel Garcin

Trevor McInroe

Pablo Samuel Castro

Christopher G. Lucas

David Abel

Stefano V Albrecht

Extracting relevant information from a stream of high-dimensional observations is a central challenge for deep reinforcement learning agents… (see more). Actor-critic algorithms add further complexity to this challenge, as it is often unclear whether the same information will be relevant to both the actor and the critic. To this end, we here explore the principles that underlie effective representations for an actor and for a critic. We focus our study on understanding whether an actor and a critic will benefit from a decoupled, rather than shared, representation. Our primary finding is that when decoupled, the representations for the actor and critic systematically specialise in extracting different types of information from the environment---the actor's representation tends to focus on action-relevant information, while the critic's representation specialises in encoding value and dynamics information. Finally, we demonstrate how these insights help select representation learning objectives that play into the actor's and critic's respective knowledge specialisations, and improve performance in terms of agent returns.

2025-01-22

ICLR.cc/2025/Conference (poster)

openreview.net

Optimal Approximate Minimization of One-Letter Weighted Finite Automata

Clara Lacroce

Borja Balle

Guillaume Rabusseau

2024-11-08

Mathematical Structures in Computer Science (published)

Conditions on Preference Relations that Guarantee the Existence of Optimal Policies

Jonathan Colaco Carr

Doina Precup

2024-04-18

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (published)

Polynomial Lawvere Logic

Giorgio Bacci

Radu Mardare

Gordon D. Plotkin

2024-02-05

ArXiv (preprint)

Policy Gradient Methods in the Presence of Symmetries and State Abstractions

Sahand Rezaei-Shoshtari

Rosie Zhao

David Meger

Doina Precup

Sum and Tensor of Quantitative Effects

Giorgio Bacci

Radu Mardare

Gordon Plotkin

2024-01-01

Log. Methods Comput. Sci. (published)

Behavioural pseudometrics for continuous-time diffusions

Linan Chen

Florence Clerc

2023-12-27

ArXiv (preprint)

Propositional Logics for the Lawvere Quantale

Giorgio Bacci

Radu Mardare

Gordon Plotkin

2023-11-23

Electronic Notes in Theoretical Informatics and Computer Science (published)

Behavioural equivalences for continuous-time Markov processes

Linan Chen

Florence Clerc

2023-03-30

Mathematical Structures in Computer Science (published)

A Kernel Perspective on Behavioural Metrics for Markov Decision Processes

Pablo Samuel Castro

Tyler Kastner

Mark Rowland

We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels. We define a ne… (see more)w metric under this lens that is provably equivalent to the recently introduced MICo distance (Castro et al., 2021). The kernel perspective enables us to provide new theoretical results, including value-function bounds and low-distortion finite-dimensional Euclidean embeddings, which are crucial when using behavioural metrics for reinforcement learning representations. We complement our theory with strong empirical results that demonstrate the effectiveness of these methods in practice.

2023-01-01

Trans. Mach. Learn. Res. (published)

openreview.net

Augmenting Human Selves Through Artificial Agents – Lessons From the Brain

Georg Northoff

Maia Fraser

John Griffiths

Dimitris A. Pinotsis

Rosalyn Moran

Karl Friston

Much of current artificial intelligence (AI) and the drive toward artificial general intelligence (AGI) focuses on developing machines for f… (see more)unctional tasks that humans accomplish. These may be narrowly specified tasks as in AI, or more general tasks as in AGI – but typically these tasks do not target higher-level human cognitive abilities, such as consciousness or morality; these are left to the realm of so-called “strong AI” or “artificial consciousness.” In this paper, we focus on how a machine can augment humans rather than do what they do, and we extend this beyond AGI-style tasks to augmenting peculiarly personal human capacities, such as wellbeing and morality. We base this proposal on associating such capacities with the “self,” which we define as the “environment-agent nexus”; namely, a fine-tuned interaction of brain with environment in all its relevant variables. We consider richly adaptive architectures that have the potential to implement this interaction by taking lessons from the brain. In particular, we suggest conjoining the free energy principle (FEP) with the dynamic temporo-spatial (TSD) view of neuro-mental processes. Our proposed integration of FEP and TSD – in the implementation of artificial agents – offers a novel, expressive, and explainable way for artificial agents to adapt to different environmental contexts. The targeted applications are broad: from adaptive intelligence augmenting agents (IA’s) that assist psychiatric self-regulation to environmental disaster prediction and personal assistants. This reflects the central role of the mind and moral decision-making in most of what we do as humans.

2022-06-23

Frontiers in Computational Neuroscience (published)