
13 Sep 2021

Neural-AI Reading Group: Winter semester 2021 recap

Mila News︱Maximilian Puelma Touzel

The group of speakers we’ve heard from so far this year has strikingly diverse backgrounds. NeuroAI is, after all, a catch-all term for research at the intersection of neuroscience and artificial intelligence (AI). It’s an area that attracts a wide spectrum of researchers who rely on the theory and applications of learning phenomena: computer scientists, engineers, physicists, psychologists, experimental neuroscientists, applied mathematicians (especially those from statistics and optimization), and even philosophers (see, e.g., SENAI). Moreover, the Neuro and AI in NeuroAI are interpreted broadly: you won’t find many extreme positions in the traditionally dichotomous debates, such as over the primacy of behaviour vs. neural activity or symbolic vs. connectionist AI. That inclusiveness is refreshing.

While probably not news to most working in NeuroAI, there are (at least!) two concrete endeavors that I think make it a stimulating research space for this diverse crowd:

(1) AI-to-Neuro leverages the theoretical foundations of machine learning to formalize learning processes and apply the theory to explain learning phenomena in brains.

(2) Neuro-to-AI identifies impressive learning capabilities in brains and tries to distill our understanding of them into algorithmic advances that can overcome existing limitations of AI. 

In this blog post, I’m recapping our reading group’s winter semester by highlighting a (highly biased!) sampling of our talks that serve as instances of these two pursuits. Together, they demonstrate the wide appeal of NeuroAI and the shared interest that makes community events like ours so eclectic.

A modern neuroAI topic: Efficient, approximate learning algorithms in brains and machines 

Modern deep learning methods use models in which the variables describing nodes in an artificial neural net (ANN) interact continuously. Biological neurons, by contrast, communicate through spikes, a feature that can offer lower power consumption and higher robustness to noise. Yet until recently, spiking recurrent neural networks were difficult to train, partly because spikes are discrete pulses whose non-differentiability breaks the traditional gradient-based learning algorithms used to train ANNs. However, a modern perspective in computational neuroscience downplays this distinction between continuous and discrete interactions and focuses instead on efficient and robust approximations to gradient-based learning. One of the experts on gradient-based learning in spiking ANNs is computational neuroscientist Guillaume Bellec. With collaborators, he has developed eligibility propagation (e-prop), which uses slow adaptation variables inspired by the ion channel types of real neurons to propagate gradient information further in time, leading to a more powerful learning algorithm. This work leverages a crude but surprisingly effective approximation to a decomposition of the gradient that neglects the off-diagonal terms of its non-local part. That idea appears central to the state of the art; for example, it arose independently in the SNAP algorithm from DeepMind. Bellec also presented CLAPP, which approximates gradient descent using only computations local to each layer via a contrastive predictive coding scheme. These are all exciting new instances of the Neuro-to-AI research direction.
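To make the eligibility-trace idea concrete, here is a toy numpy sketch (my own illustration, not Bellec et al.’s published algorithm or code). For a single leaky unit with no recurrent coupling, a forward-propagated eligibility trace recovers the gradient of the loss exactly, with no backward pass through time; in full recurrent networks, e-prop additionally drops the cross-neuron terms, making the trace an approximation.

```python
import numpy as np

# Toy sketch of the eligibility-trace idea behind e-prop.
# A leaky "neuron" h_t = alpha * h_{t-1} + w * x_t has sensitivity
# dh_t/dw that can be accumulated FORWARD in time:
#   eps_t = alpha * eps_{t-1} + x_t
# so the gradient needs no backpropagation through time.
rng = np.random.default_rng(0)
alpha, w = 0.9, 0.5
x = rng.normal(size=20)
target = 1.0

def run(w):
    """Simulate the leaky unit over the input sequence x."""
    h, hs = 0.0, []
    for xt in x:
        h = alpha * h + w * xt
        hs.append(h)
    return np.array(hs)

# Loss on the final state only: L = 0.5 * (h_T - target)^2
hs = run(w)
eps = 0.0
for xt in x:
    eps = alpha * eps + xt          # forward-propagated sensitivity dh_t/dw
grad_eprop = (hs[-1] - target) * eps

# Compare against a finite-difference estimate of the true gradient
d = 1e-6
loss = lambda w: 0.5 * (run(w)[-1] - target) ** 2
grad_numeric = (loss(w + d) - loss(w - d)) / (2 * d)
print(grad_eprop, grad_numeric)  # the two agree in this single-neuron case
```

In a real spiking network the state of one neuron depends on the weights of others, and those cross terms are exactly what e-prop (and SNAP’s diagonal approximation) discards to keep the computation local and online.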

Figure 1. Adapted with permission from “A solution to the learning dilemma for recurrent networks of spiking neurons,” by G. Bellec, F. Scherr, et al., 2020, Nature Communications, 11. One of the questions addressed by contemporary neuroAI is ‘how might brains implement approximate versions of gradient descent on some loss function, E?’

NeuroAI topics with long-standing histories have often gone through multiple rounds of cross-pollination between neuroscience and AI. Take the wake-sleep algorithm, for example, originally derived from variational Bayesian methods and inspired by the increasing abstraction exhibited by sensory representations along the successive stages of the brain’s ventral processing stream. The algorithm has since been adapted into various forms of neuro-inspired machine learning (cf. learning DDCs). Colin Bredenberg, a computational neuroscience PhD student at NYU who recently did a research stay with Mila Professor Guillaume Lajoie and me, presented his previous work on pushing the wake-sleep algorithm back into neuroscience, i.e., going in the AI-to-Neuro direction. He showed that the algorithm could be made less implausible as an actual model of abstraction learning in brains via an online version that interleaves the forward and backward computations between successive layers.

Leaving behind the tired debate over whether the brain does or does not do gradient descent, these two works presented in our reading group are examples of a growing area of research with a joint target for neuroscience and artificial intelligence: discovering efficient and robust approximations of gradient descent.

AI in the wild

At one of Mila’s recent tea talks, MIT Professor Leslie Kaelbling challenged the audience to consider the practical problems and evaluation metrics that arise when AI is deployed in the wild. Here, NeuroAI is useful because putative evolutionary solutions to many of these practical problems exist somewhere inside the brains of many animals. Continual learning, for example, presents the problem of avoiding catastrophic forgetting: deployed learning systems must generalize knowledge to new tasks without, in the process, forgetting how to solve the tasks they originally learned. It is a current focus of Mila’s own Irina Rish and her CERC-funded group in autonomous AI. Ongoing neuroscience research into memory consolidation via hippocampal replay and artificial intelligence work on algorithms for continual learning make this problem an excellent target for neuroAI research. We had a variety of talks on the topic this past semester. Computational cognitive neuroscientist Timo Flesch presented rich vs. lazy learning, wherein nonlinear mixed selectivity and task-specific representations contribute distinct effects to catastrophic forgetting in continual multi-task RL, depending on task similarity. Google’s Rishabh Agarwal gave a talk about contrastive behavioral similarity embeddings for generalization in reinforcement learning, in which policies are evaluated in an abstracted space that includes sequence information. As a final example, Nicolas Deperrois, a PhD student from Prof. Walter Senn’s group at the University of Bern, talked about the distinction between episodic and semantic memory in declarative knowledge, proposing an architecture with distinct learning phases and a distinct form of memory consolidation for each: semantic representations arise from REM-like dreaming, while episodic memories are consolidated through non-REM replay.
Together, these are just some examples of the synergistic work at the frontier of challenging task settings in contemporary neuroscience and artificial intelligence research.
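For readers new to catastrophic forgetting, a deliberately minimal sketch makes the effect vivid (my own toy example, with two tasks chosen to conflict directly): a single weight trained sequentially with gradient descent masters task A, then loses it entirely after training on task B with no replay.

```python
import numpy as np

# Minimal illustration of catastrophic forgetting: one linear model
# trained with gradient descent first on task A (y = +x), then on the
# conflicting task B (y = -x), with no replay of task A data.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y_a, y_b = x.copy(), -x.copy()

w, lr = 0.0, 0.1

def mse(w, y):
    return np.mean((w * x - y) ** 2)

for _ in range(100):                       # train on task A
    w -= lr * np.mean(2 * (w * x - y_a) * x)
loss_a_after_a = mse(w, y_a)               # near zero: task A learned

for _ in range(100):                       # train on task B only
    w -= lr * np.mean(2 * (w * x - y_b) * x)
loss_a_after_b = mse(w, y_a)               # large: task A forgotten

print(loss_a_after_a, loss_a_after_b)
```

Replay-based methods, echoing hippocampal replay during consolidation, interleave stored (or generated) task-A examples into the task-B updates precisely to prevent this collapse.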

Out-of-the-wild AI

The mountain of biophysical detail in brains presents a needle-in-a-haystack challenge for the Neuro-to-AI research direction: find and understand the specific details that would let machines excel where current AI algorithms falter. The problem is like going into the Amazon and sampling plant material in the hope of finding exotic compounds with potent functional properties. Where do you look? What do you look for? How do you distill what you’ve found in a way that applications can take advantage of? In neuroAI, there are successful precedents we can learn from. A prominent example is the origin of convolutional neural networks and weight sharing, inspired by early understanding of layer-wise processing in the visual system. Vision is by far the most studied sense. What innovations hide in the workings of the other sensory systems? In olfaction, for example, nearly random connections between layers are observed. Could this fact be related to the surprising discovery in artificial neural networks that even randomly set, fixed feedback weights are sufficient for learning?
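That discovery, often called feedback alignment, is easy to demonstrate. Below is a minimal numpy sketch in its spirit (my own illustration, not any specific published implementation): a two-layer network in which the backward pass uses a fixed random matrix B in place of the transposed forward weights, yet the loss still falls.

```python
import numpy as np

# Sketch of feedback alignment: errors are sent back through a FIXED
# random matrix B rather than the transpose of the forward weights W2.
rng = np.random.default_rng(0)
n_in, n_hid, n_out, n_samples = 5, 20, 2, 100
X = rng.normal(size=(n_samples, n_in))
T = X @ rng.normal(size=(n_in, n_out))        # a random linear target task

W1 = rng.normal(size=(n_in, n_hid)) * 0.1
W2 = rng.normal(size=(n_hid, n_out)) * 0.1
B = rng.normal(size=(n_out, n_hid)) * 0.1     # fixed random feedback weights

lr, losses = 0.05, []
for _ in range(200):
    H = np.tanh(X @ W1)                        # forward pass
    Y = H @ W2
    E = Y - T                                  # output error
    losses.append(0.5 * np.mean(E ** 2))
    dW2 = H.T @ E / n_samples
    dH = (E @ B) * (1 - H ** 2)                # B replaces W2.T here
    dW1 = X.T @ dH / n_samples
    W1 -= lr * dW1
    W2 -= lr * dW2

print(losses[0], losses[-1])  # loss decreases despite random feedback
```

The biological appeal is that the backward pathway no longer needs to mirror the forward synapses exactly, loosening the “weight transport” requirement that makes literal backpropagation implausible in brains.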

We don’t even have to look beyond vision for new, deep insights into cognition that could inform the next generation of AI algorithms. How, for example, do our brains learn and compose programs to solve entire classes of tasks? In particular, which schemas (e.g., organizing objects in our visual field from left to right) are ingrained as hard-wired biases and which are instead learned through experience? This semester, one of our last talks was given by Steven Piantadosi, a Berkeley Psychology faculty member who presented his work on program learning across cultures. Through anthropological fieldwork with culturally isolated populations, his team found no inherent bias in the directionality of the cognitive schemas built by the subjects in comparison to adults from Western cultures. In what ways would an artificial intelligence with such symmetry built into its program learning improve over existing AI? Yet another fascinating question for NeuroAI research.

I hope you enjoyed this flash summary of our Neural-AI Reading Group! We look forward to hearing more exciting research as we start up the reading group again this fall. We would love to hear from you about any cool neuroAI work that you think we should showcase. Stay tuned for MAIN, our annual NeuroAI conference that will take place before Christmas.