Sure, but it shouldn’t be that difficult for a human who’s been forced to ingest the entire AI Alignment forum.
Yeah, that’s what I’d been referring to. Sorry, should’ve clarified it to mean “competently zero-shotting”, rather than Claude’s rather… embarrassing performance so far. (Also it’s not quite zero-shotting given that Pokémon is likely very well-represented in its training data. The “hard” version of this benchmark is beating games that came out after its knowledge cutoff.)
I’m including stuff like cabbage/sheep/wolf and boy/surgeon riddles; not sure how it’s supposed to use tools to solve those.
i spent a couple of weeks not being able to immediately say that 9.9 is > 9.11, and it still occasionally takes me a moment. very weird bug
Yeah, humans’ System 1 reasoning seems vulnerable to this attack as well.