lc has argued that the measured tasks are unintentionally biased towards ones where long-term memory/context length doesn’t matter:
https://www.lesswrong.com/posts/hhbibJGt2aQqKJLb7/shortform-1#vFq87Ge27gashgwy9