Can AI learn cultural transmission? 💬

Human intelligence is especially dependent on our ability to acquire knowledge efficiently from other humans (culture). Can AI learn to teach itself through cultural transmission?

The Research

In a paper published in Nature, researchers from DeepMind posited that, “since cultural transmission is an ever-present feature of human social behaviour,” and since it “is the process that underlies the evolution of culture”, it “is a skill that an artificially intelligent agent should possess to facilitate beneficial human-AI interaction.”

One important element of cultural transmission is imitation. “In humans, imitation provides the kind of high-fidelity cultural transmission needed for the cultural evolution of tools and technology.”

Imitation of Movement

The researchers chose an interesting case study for cultural imitation - the copying of body movement (e.g. jumping, crouching, navigating obstacles). “It is a particularly impressive form of cultural transmission”, say the researchers, “because it requires solving the challenging ‘correspondence problem’ - instantaneously translating a sensory observation of another [expert’s] motor behaviour into a motor reproduction of that behaviour oneself.”

In order to study body movement as a method of cultural imitation, the researchers needed an environment in which an AI agent (which they dubbed MEDAL) and its surroundings could co-adapt through reinforcement learning. They did so with a simulation in Unity called “Goal Cycle 3D”, in which the agents observe the 3D world using LIDAR.

Unity, software typically used for creating 3D games, was ideal in that the researchers could leverage its capabilities for procedural generation, rigid-body physics, and motor control to provide “a rich diversity of task variants” for reinforcement learning.
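To make the “rich diversity of task variants” idea concrete, here is a toy sketch of procedural variant generation in plain Python. The grid size, object counts, and variant structure are all illustrative assumptions on my part - the actual Goal Cycle 3D generator is part of the proprietary Unity environment:

```python
import random

def generate_variant(rng, n_goals=3, n_obstacles=5, size=20):
    """Sample a toy task variant: goal and obstacle positions on a grid,
    plus a hidden 'correct' order in which goals should be visited."""
    cells = [(x, y) for x in range(size) for y in range(size)]
    picks = rng.sample(cells, n_goals + n_obstacles)  # distinct cells, no overlap
    goals, obstacles = picks[:n_goals], picks[n_goals:]
    cycle = rng.sample(range(n_goals), n_goals)  # hidden correct visiting order
    return {"goals": goals, "obstacles": obstacles, "cycle": cycle}

# Each episode gets a fresh layout, so an agent must generalize rather
# than memorize a single map.
rng = random.Random(42)
variant = generate_variant(rng)
```

The point of the sketch is the training regime, not the geometry: because every episode is sampled fresh, imitation has to transfer across layouts.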

In the environment, hard-coded “demonstration” bots drop in and out of the simulation during training, allowing the AI agents to learn to observe the demonstrator and then imitate it when it is absent. Success is determined by whether the agent’s performance keeps improving after the demonstrator leaves.
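The drop-in/drop-out setup can be sketched as a minimal training loop in which the demonstrator is absent more and more often as training proceeds. The schedule, episode logic, and names below are my assumptions for illustration, not the paper’s code:

```python
import random

def dropout_schedule(step, total_steps):
    """Probability that the expert is absent, ramping from 0 to 1
    over the course of training (a simple linear assumption)."""
    return min(1.0, step / total_steps)

def run_episode(expert_present):
    # Placeholder for an actual Goal Cycle 3D rollout: the agent would
    # follow the expert if present, otherwise act from memory.
    return "imitate" if expert_present else "recall"

def train(total_steps=10, seed=0):
    rng = random.Random(seed)
    history = []
    for step in range(total_steps):
        p_absent = dropout_schedule(step, total_steps)
        expert_present = rng.random() >= p_absent
        history.append(run_episode(expert_present))
    return history
```

Early episodes are mostly “imitate”; later ones are mostly “recall”, which is what forces the agent to internalize the demonstration rather than lean on the bot.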

The Results

“We find that [AI] agents recall demonstrations long after the expert has departed. Introspecting the agent’s “brain”, we find strikingly interpretable neurons responsible for encoding social information. MEDAL displays an “experimentation” behaviour in this phase, which involves using hypothesis-testing to infer the correct cycle without reference to the bot, followed by exploiting that correct cycle more efficiently than the bot does.”

In other words, the student learned to outperform the teacher.
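The “experimentation” behaviour can be illustrated with a toy version of the task: with no demonstrator around, an agent can recover the rewarding goal order purely by testing hypotheses against the reward signal. The scoring rule below is a made-up stand-in for the real environment’s reward:

```python
from itertools import permutations

def reward_for_order(order, correct_cycle):
    """Toy reward: +1 per goal visited in the correct position, -1 otherwise."""
    score = 0
    for i, goal in enumerate(order):
        expected = correct_cycle[i % len(correct_cycle)]
        score += 1 if goal == expected else -1
    return score

def infer_cycle(goals, correct_cycle):
    """Hypothesis-test every candidate cycle and keep the best scoring one.
    The agent never sees correct_cycle directly, only the reward it yields."""
    candidates = [list(p) for p in permutations(goals)]
    return max(candidates, key=lambda c: reward_for_order(c, correct_cycle))
```

Once the correct cycle is identified this way, the agent can exploit it directly - which is how a student ends up faster than the bot it learned from.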

Social Neurons

At the time of writing, AI is still widely considered a “black box” for the lack of insight into why it makes the decisions it does. So what’s fascinating to me about this research (to the extent that I understand research-paper speak) is that they claim to have found individual neurons in the network responsible for identifying agency, which they call the “social neuron.”

Perhaps just as interesting is the fact that it is simply being present in the learning environment - not being in the presence of the “teacher” or receiving a reward - that causes the neuron to fire.
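As a toy illustration of that distinction, here is one way to probe which condition a unit’s activations actually track - the presence of another agent, or the arrival of a reward. The activation traces below are synthetic; the real analysis introspects recorded network states:

```python
def correlation(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

# Synthetic per-episode data: 1 = condition true / unit fired, 0 = not.
other_agent_present = [1, 1, 0, 1, 0, 0, 1, 0]
reward_received     = [1, 0, 0, 0, 1, 0, 0, 1]
neuron_activation   = [1, 1, 0, 1, 0, 0, 1, 0]  # tracks presence exactly

r_social = correlation(neuron_activation, other_agent_present)
r_reward = correlation(neuron_activation, reward_received)
# A "social neuron" is one where r_social is high while r_reward is not.
```

In this made-up trace the unit correlates perfectly with another agent being present and barely at all with reward, which is the signature the researchers describe.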

In reading this I find myself having to fight not to overgeneralize to the realm of human beings, but I can’t help but wonder - what elements are most crucial in human learning? The teacher? The goal of getting good grades? Or could it be just simply being in the classroom?

What else was learned?

  • Teachers got attached to students (or teaching?) - “The progressive increase in duration of expert dropout (ED) mirrors the development of secure attachment in humans.”

  • Humans’ bias towards biological movement corresponds to their loss of attention (“Inversion effect in newborns’ sensitivity to biological motion”, Bardi, Regolin, Simion, Dev. Psychol.)

Yet to explore

  • Incorporating multiple teachers / teaching styles, vs. just a single individual.

  • Greater depth and breadth of strategic tasks

  • Experiment with agents learning cultural capabilities from scratch, vs. from a preprogrammed expert. Humans imitate over abstract representations of the task at hand, including beliefs, desires and intentions. “We would expect this approach to generate efficient open-ended learning.”

Can AI teach itself?

What would happen if AI began imitating other AI? Could it solve significant real problems? And perhaps more importantly, what is the human cost?

Other Resources

While the actual code is proprietary, the authors have made several code repos available:

Open Source MPO Algorithm

Open Source Implementation of the 2D Variant of Goal Cycle