Kory Wallace Mathewson

Associate Industry Member
Research Scientist, DeepMind

Research Topics
Reinforcement Learning
Natural Language Processing

Publications

Adaptive patch foraging in deep reinforcement learning agents
Nathan Wispinski
Andrew Butcher
Craig S Chapman
Matthew Botvinick
Patrick M. Pilarski
Patch foraging is one of the most heavily studied behavioral optimization challenges in biology. However, despite its importance to biological intelligence, this behavioral optimization problem is understudied in artificial intelligence research. Patch foraging is especially amenable to study given that it has a known optimal solution, which may be difficult to discover given current techniques in deep reinforcement learning. Here, we investigate deep reinforcement learning agents in an ecological patch foraging task. For the first time, we show that machine learning agents can learn to patch forage adaptively in patterns similar to biological foragers, and approach optimal patch foraging behavior when accounting for temporal discounting. Finally, we show emergent internal dynamics in these agents that resemble single-cell recordings from foraging non-human primates, which complements experimental and theoretical work on the neural mechanisms of biological foraging. This work suggests that agents interacting in complex environments under ecologically valid pressures arrive at common solutions, pointing to foundational computations behind adaptive, intelligent behavior shared by biological and artificial agents.
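
The "known optimal solution" the abstract refers to is classically characterized by Charnov's Marginal Value Theorem (MVT): a forager should leave a patch when its instantaneous reward rate falls to the long-run average rate of the environment. The sketch below illustrates that optimum numerically; the gain function and the parameters A, r, and travel_time are hypothetical choices for illustration, not the paper's environment or code.

```python
# Minimal MVT sketch (hypothetical parameters, not the paper's code):
# in a depleting patch, the optimal residence time t* maximizes the
# long-run reward rate g(t) / (t + travel_time).
import numpy as np

A, r = 10.0, 0.5   # assumed patch parameters: asymptotic gain, depletion rate
travel_time = 2.0  # assumed travel time between patches

def gain(t):
    """Cumulative reward after t seconds in a depleting patch."""
    return A * (1.0 - np.exp(-r * t))

def long_run_rate(t):
    """Average reward rate if the forager stays t seconds per patch."""
    return gain(t) / (t + travel_time)

# Grid search for the MVT optimum: the residence time where the marginal
# gain g'(t*) equals the environment's average rate g(t*) / (t* + travel).
ts = np.linspace(0.01, 20.0, 10_000)
t_star = ts[np.argmax(long_run_rate(ts))]
print(f"optimal residence time ~ {t_star:.2f}s, "
      f"long-run rate ~ {long_run_rate(t_star):.3f}")
```

A temporally discounting agent, as studied in the paper, effectively shortens its horizon and so tends to leave patches earlier than this undiscounted optimum.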
Revisiting Populations in multi-agent Communication
Paul Michel
Mathieu Rita
Olivier Tieleman
Angeliki Lazaridou
Despite evidence from cognitive science that larger groups of speakers tend to develop more structured languages in human communication, scaling up to populations has failed to yield significant benefits in emergent multi-agent communication. In this paper, we advocate for an alternate population-level training paradigm for referential games based on the idea of "partitioning" the agents into sender-receiver pairs and limiting co-adaptation across pairs. We show that this amounts to optimizing a different objective at the population level, where agents maximize (1) their respective "internal" communication accuracy and (2) some measure of alignment between agents. In experiments, we find that this leads to the emergence of languages that are significantly more compositional. Moreover, when agents are trained in populations that are not fully connected (i.e., not all agent pairs interact at training time), this approach reduces multilinguality and improves zero-shot communication with new agents (i.e., agents are able to communicate successfully with agents outside their set of training partners).
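
A minimal sketch of the partitioned training paradigm described above, using tabular stand-ins for neural senders and receivers. Everything here (make_pair, the trial-and-error update, N_OBJECTS, etc.) is a hypothetical illustration rather than the paper's implementation, and the cross-pair alignment term is omitted; the point is the fixed sender-receiver pairing that confines co-adaptation to each pair.

```python
# Sketch of "partitioned" population training for a referential game:
# the population is split into fixed sender-receiver pairs, so each pair
# maximizes only its own "internal" communication accuracy.
import random

N_OBJECTS, N_MESSAGES, N_PAIRS, STEPS = 8, 8, 4, 5000
rng = random.Random(0)

def make_pair():
    # Tabular stand-ins for neural sender/receiver networks:
    # sender maps object -> message, receiver maps message -> object.
    sender = [rng.randrange(N_MESSAGES) for _ in range(N_OBJECTS)]
    receiver = [rng.randrange(N_OBJECTS) for _ in range(N_MESSAGES)]
    return sender, receiver

pairs = [make_pair() for _ in range(N_PAIRS)]  # fixed partition, no cross-play

for _ in range(STEPS):
    for sender, receiver in pairs:
        obj = rng.randrange(N_OBJECTS)
        msg = sender[obj]
        if receiver[msg] != obj:
            # Crude trial-and-error updates standing in for gradient steps.
            if sender.count(msg) > 1:
                sender[obj] = rng.randrange(N_MESSAGES)  # resolve collision
            else:
                receiver[msg] = obj  # receiver adapts to sender's code
    # The paper's second objective, an alignment measure across pairs,
    # would be added here; it is omitted from this toy.

for i, (sender, receiver) in enumerate(pairs):
    acc = sum(receiver[sender[o]] == o for o in range(N_OBJECTS)) / N_OBJECTS
    print(f"pair {i}: internal accuracy = {acc:.2f}")
```

Because pairs never interact during training, each one converges to its own protocol; the alignment objective is what keeps these protocols mutually intelligible for zero-shot communication.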