niplav comments on Daniel Kokotajlo’s Shortform

niplav 9 Dec 2025 14:02 UTC
6 points
4
It’s too bad the BALROG benchmark isn’t being updated with the newest models. NetHack is really hard, gives a floating-point score, and is text-based, so if a model is vision-impaired (like the Claudes) there’s less contamination through “the model just can’t see where it is”.