Neither. In the interests of understanding, however, I’m willing to elaborate slightly.
Take a good, close look at the specific rules Eliezer set down in the 2002 paper. Think about what the words used to define those rules mean, and then compare and contrast with Eliezer’s statements about what he means by them.
If he was exploiting psychological weaknesses or merely being charismatic, I can guarantee that anyone following a trivially simple method can refrain from letting him out. If he had a strong argument, success becomes merely very likely rather than guaranteed. And in either case, the method stays completely within the rules as Eliezer set them out, though not within what he appears to have intended.
One rule in particular holds a critical weakness.