green_leaf comments on Moving Goalposts: Modern Transformer Based Agents Have Been Weak ASI For A Bit Now

green_leaf 29 Dec 2025 5:20 UTC
1 point
That said, if I’m skimming that arxiv paper correctly, it implies that GPT-4.5 was being reliably declared “the actual human” 73% of the time compared to actual humans… potentially implying that actual humans were getting a score of 27% “human” against GPT-4.5?!?!
GPT-4.5 was judged to be the human 73% of the time, whereas the actual humans were judged to be human less than 73% of the time, which means it passed the test.
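To spell out the arithmetic behind the 27% figure in the quote, here is a minimal sketch. It assumes a forced-choice setup in which each judge sees one AI and one human and must pick exactly one of them as the human; that setup is my reading of the quoted comment, not something I have checked against the paper. The function names and the pass criterion are hypothetical and only there for illustration.

```python
def human_pick_rate(ai_pick_pct: int) -> int:
    """Percent of rounds in which the real human was picked as human.

    Assumption: forced choice between exactly one AI and one human,
    so the two percentages must sum to 100.
    """
    if not 0 <= ai_pick_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return 100 - ai_pick_pct


def passes(ai_pick_pct: int) -> bool:
    """Hypothetical pass criterion: the AI is picked as human at least
    as often as the real human it was paired with."""
    return ai_pick_pct >= human_pick_rate(ai_pick_pct)


ai_pct = 73  # figure quoted in the thread
print(human_pick_rate(ai_pct))
print(passes(ai_pct))
```

Under that assumption, 73% for the model and 27% for the humans are the same result seen from two sides, not two separate measurements.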