How You (and a Computer) Learn the ‘Rules of the Game’

In 1848, the 25-year-old Phineas Gage was working on a railroad in Vermont, packing explosive powder into a hole with an iron tamper. Unexpectedly, the powder exploded, sending the tamper backwards through Gage’s skull and brain.

That he survived is a miracle, but astonishingly he even seemed capable of functioning effectively, maintaining normal memory, speech, and motor skills. Those who knew him, however, thought he was anything but the same, with friends remarking he was “no longer Gage.”

His physician, John Harlow, noted: 

“…his equilibrium, or balance, so to speak, between his intellectual faculties and animal propensities seems to have been destroyed. He is fitful, irreverent, indulging in the grossest profanity (which was not previously his custom), manifesting but little deference for his fellows, impatient of restraint or advice when it conflicts with his desires.”

The injury mostly damaged the left side of Gage’s frontal lobes, including portions of the prefrontal cortex, the wrinkled outer layer at the front of the brain. Since his accident we’ve uncovered further details about the critical role the prefrontal cortex plays. Insights into its function might offer inspiration to those working on intelligent machines.

Rules

When people have suffered damage to the prefrontal cortex, one task they struggle with is the Wisconsin Card Sorting Test. In this test, participants are given a selection of cards with symbols on them and asked to categorize them. However, they are not told which criterion to sort by: it could be the number of symbols, their shape, or their color.

Participants must start by trial and error, with the researcher pointing out whether a particular attempt is right or wrong. After every ten cards, the sorting criterion changes without notice. The participant simply finds that their previous rule now leads to errors, and must adjust.

The test measures cognitive flexibility: participants must recognize that their old rule no longer applies and work out which rule should replace it. As a 2002 study notes, “Normal humans have little difficulty with this task. By contrast, humans with prefrontal damage can learn the first sorting criterion (a relatively simple mapping between a stimulus attribute and a response) but then are unable to escape it.”
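To make the contrast concrete, here is a minimal Python sketch of a simplified card-sorting session. It is a toy, not the clinical test: it assumes a sort counts as correct only when the sorter's current rule equals the hidden rule, and the names `run_wcst` and `mean_errors` are made up for this illustration. The "flexible" sorter drops a rule whenever it is told it is wrong; the "inflexible" sorter learns the first rule by trial and error and then never lets go of it.

```python
import random

RULES = ("number", "shape", "color")


def run_wcst(flexible, n_cards=60, switch_every=10, seed=0):
    """Simulate one simplified card-sorting session and return the error count."""
    rng = random.Random(seed)
    true_rule = rng.choice(RULES)   # hidden from the sorter
    guess = rng.choice(RULES)       # the sorter's current rule
    locked = False                  # True once an inflexible sorter has found a rule
    errors = 0
    for card in range(n_cards):
        # The hidden rule changes every `switch_every` cards, without notice.
        if card > 0 and card % switch_every == 0:
            true_rule = rng.choice([r for r in RULES if r != true_rule])
        if guess == true_rule:
            # Feedback "right": an inflexible sorter now commits for good.
            if not flexible:
                locked = True
        else:
            # Feedback "wrong": count it, then try a different rule if able to.
            errors += 1
            if not locked:
                guess = rng.choice([r for r in RULES if r != guess])
    return errors


def mean_errors(flexible, sessions=200):
    """Average error count over several seeded sessions."""
    return sum(run_wcst(flexible, seed=s) for s in range(sessions)) / sessions


if __name__ == "__main__":
    print("flexible sorter:  ", mean_errors(True))
    print("inflexible sorter:", mean_errors(False))
```

Averaged over many sessions, the inflexible sorter should pile up far more errors, because every rule change after the first leaves it sorting by a rule that no longer pays off.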

Without a functional prefrontal cortex, people get stuck in a single mode of interpretation; they lack cognitive flexibility. The researchers, led by Earl Miller, write that this region houses our internal representation of ‘the rules of the game.’

“What function does the ability to abstract a rule serve? It is a form of generalization that permits a shortcut in learning, thereby allowing the animal to maximize the amount of reward available from a particular situation.”

Perhaps a good example would be playing video games. At first, you have to learn what each button represents in the particular game you’re playing. Then you change games, to an entirely different genre—say, from a racing game to a first-person shooter. Now you have to relearn the button system.

After a while, you start to get the hang of things, and can effortlessly switch between these games and their controls. What’s more, you realize that when you try new racing games and new first-person shooters, you can probably carry over the rules of the previous games without too many errors, saving the time and energy of learning afresh.

Such an ability is likely to matter in machine learning, where transfer learning (using knowledge from one problem to solve a different yet related problem) remains one obstacle to what some believe would be truly intelligent computers.

Artificial

A group of researchers at DeepMind recreated the Harlow experiment for an algorithm they designed. In the Harlow experiment (devised by the psychologist Harry Harlow, not Gage’s physician), monkeys are shown two unfamiliar objects, one of which leads to a reward. In each trial the positions of the objects are randomized, so the monkey must learn that it isn’t the location that marks the reward, but the object itself.

Soon thereafter, the objects are replaced by new ones, and the monkey must again learn which one hides the reward. In this unfamiliar condition the monkey will recognize that, as in the previous test, the object is likely more important than the location. In essence, the monkey changed rules. And so did the team’s algorithm.

“… we found that our ‘meta-RL agent’ appeared to learn in a manner analogous to the animals in the Harlow Experiment.”
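The structure of the task can be sketched in a few lines of Python. This is not DeepMind's meta-RL agent: the two policies below are hand-written win-stay/lose-shift rules, and every name (`harlow_episode`, `object_rule`, `position_rule`, `success_rate`) is an assumption made for illustration. The sketch only shows why a rule about objects transfers to every new pair, while a rule about positions never does better than chance.

```python
import random

SIDES = ("left", "right")


def object_rule(left, right, last, rng):
    """Win-stay/lose-shift over objects: follow the object, ignore its position."""
    if last is None:
        return rng.choice(SIDES)
    obj, _side, reward = last
    # Stay with the rewarded object; otherwise switch to the other one.
    target = obj if reward else (right if obj == left else left)
    return "left" if target == left else "right"


def position_rule(left, right, last, rng):
    """Win-stay/lose-shift over positions: follow the side, ignore the object."""
    if last is None:
        return rng.choice(SIDES)
    _obj, side, reward = last
    if reward:
        return side
    return "right" if side == "left" else "left"


def harlow_episode(policy, rng, n_trials=6):
    """Run one episode with a fresh object pair; return the 0/1 reward per trial."""
    rewarded = rng.choice(("A", "B"))  # "A"/"B" stand for two unfamiliar objects
    last = None
    rewards = []
    for _ in range(n_trials):
        left, right = rng.sample(("A", "B"), 2)  # positions reshuffled each trial
        side = policy(left, right, last, rng)
        chosen = left if side == "left" else right
        reward = 1 if chosen == rewarded else 0
        rewards.append(reward)
        last = (chosen, side, reward)
    return rewards


def success_rate(policy, episodes=500, seed=0):
    """Fraction of rewarded choices from the second trial onward."""
    rng = random.Random(seed)
    later = [r for _ in range(episodes) for r in harlow_episode(policy, rng)[1:]]
    return sum(later) / len(later)


if __name__ == "__main__":
    print("object rule:  ", success_rate(object_rule))
    print("position rule:", success_rate(position_rule))
```

With the object rule, a single trial is enough to identify the rewarded object in each new pair, so every later choice pays off. Note that the rule itself never changes between episodes; only the one-trial memory does.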

While it is not uncommon for insights from neuroscience to aid the design of computers, this experiment may also have shed some light on how the brain can make use of these rules quickly and dynamically. In their design, the researchers used deep reinforcement learning techniques to stand in for dopamine, a neurotransmitter in the brain, and used them to train a recurrent neural network that played the part of the prefrontal cortex.

“Dopamine is traditionally understood to strengthen synaptic links in the prefrontal system, reinforcing particular behaviours. … However, in our experiments the weights of the neural network were frozen, meaning they couldn’t be adjusted during the learning process, yet, the meta-RL agent was still able to solve and adapt to new tasks.”

The researchers think that learning does not rely only on physical changes at synapses: information is also coded in the dopamine signal that flows to and through the prefrontal cortex. The two systems complement each other and allow us to learn “on two timescales.”

“We propose that dopamine’s role goes beyond just using reward to learn the value of past actions and that it plays an integral role, specifically within the prefrontal cortex area, in allowing us to learn efficiently, rapidly and flexibly on new tasks.”

As we inch closer to understanding how the brain works, we get ever closer to building intelligent machines. Sometimes, too, the efforts put forth in building those machines lead to insights into the brain. It will be interesting to see which we achieve first.
