Human beats SOTA Go AI by learning an adversarial policy

Link post

See also the article in the Financial Times.

Apparently, a human (Kellin Pelrine, a solid player but not a Go professional) was able to beat state-of-the-art Go AIs (KataGo and Leela Zero) by learning to play an adversarial policy that had been found using RL. Note that he studied the policy before the match and received no AI assistance during play.
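For readers unfamiliar with the setup: the attack works by freezing the victim AI and treating it as part of the environment, then running ordinary RL for the attacker. The real attack trained a neural-network adversary against KataGo; as a minimal illustrative sketch, here is the same loop in a toy tabular setting (the environment, reward, and all names below are assumptions of mine, not the actual method or interface):

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, HORIZON = 8, 4, 6

# Frozen "victim": a fixed stochastic policy standing in for the Go AI.
victim_logits = rng.normal(size=(N_STATES, N_ACTIONS))
victim_probs = np.exp(victim_logits)
victim_probs /= victim_probs.sum(axis=1, keepdims=True)

# Toy deterministic dynamics: next state as a function of (state, action).
transition = rng.integers(N_STATES, size=(N_STATES, N_ACTIONS))

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def play_episode(adv_logits, record=None):
    """One game: adversary and victim alternate moves; the adversary
    'wins' (reward 1) iff the final state is 0, an arbitrary toy goal."""
    state = 0
    for _ in range(HORIZON):
        a = rng.choice(N_ACTIONS, p=softmax(adv_logits[state]))
        if record is not None:
            record.append((state, a))
        state = transition[state, a]
        # Victim moves from its frozen policy; it is never updated.
        v = rng.choice(N_ACTIONS, p=victim_probs[state])
        state = transition[state, v]
    return 1.0 if state == 0 else 0.0

# REINFORCE for the attacker: the frozen victim is just part of the
# environment's dynamics, so standard policy-gradient RL applies.
adv_logits = np.zeros((N_STATES, N_ACTIONS))
lr = 0.5
for _ in range(5000):
    trajectory = []
    r = play_episode(adv_logits, record=trajectory)
    for s, a in trajectory:
        grad = -softmax(adv_logits[s])
        grad[a] += 1.0
        adv_logits[s] += lr * r * grad  # reinforce winning trajectories

print("win rate:", np.mean([play_episode(adv_logits) for _ in range(2000)]))
```

The remarkable part of the news is not this loop (which is standard) but that the policy it finds turned out to be simple enough for a human to memorize and execute.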

I’m not surprised that adversarial policies against Go AIs exist; this is in line with previous results on RL and on adversarial examples more generally. What does surprise me is that this adversarial policy is teachable to a human without colossal effort.

This is some evidence against the “scaling hypothesis”, i.e. evidence that modern deep learning is still missing something non-trivial and important on the path to AGI. The usual counterargument to the argument from adversarial examples is: maybe if we had direct access to a human brain, we could find adversarial examples against humans too. I can believe that it’s possible to defeat a Go professional with some extremely weird strategy that causes them to have a seizure, or something in that spirit. But is there a way to do this that another human can learn to use fairly easily? That stretches credulity somewhat.

Note also that (AFAIK) there’s no known way to inoculate an AI against a given adversarial policy other than letting it play many games against that policy (after which a different adversarial policy can usually be found). By contrast, even if there is some easy way to “trick” a Go professional, they probably wouldn’t fall for it twice.
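Schematically, the only known “inoculation” is an adversarial-training loop, and nothing in it obviously converges. A hedged sketch of that loop, where `train_adversary` and `finetune_victim` are hypothetical placeholders for what would in reality be large training runs:

```python
from typing import Callable

Policy = Callable[[object], object]  # schematically: state -> move

def train_adversary(victim: Policy) -> Policy:
    """Placeholder: RL against the frozen victim, as sketched earlier."""
    return victim  # stub, not a real attack

def finetune_victim(victim: Policy, adversary: Policy) -> Policy:
    """Placeholder: update the victim on many games against the exploit."""
    return victim  # stub

def inoculation_loop(victim: Policy, rounds: int = 3) -> Policy:
    for _ in range(rounds):
        adversary = train_adversary(victim)          # find an exploit
        victim = finetune_victim(victim, adversary)  # patch against it
        # Nothing here guarantees a fixed point: a *different*
        # adversarial policy can typically be found against the patched
        # victim, which is the asymmetry with human players noted above.
    return victim
```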