I mostly agree with this: LLMs are human-like in many ways, give better answers to moral questions than I had expected, and understand your words and intent.
Imagine that the best AI was mainly defined by the cleverness of its algorithm, and compute didn’t matter that much.
I wonder how true this actually is once harnesses are involved. There are things like LLMs failing to multiply two-digit numbers, or not being able to manually do a long (boring) task like Towers of Hanoi without losing coherence, and these are blatantly obvious tool-call moments. Poor memory and needing to be reminded is another classic LLM issue that a harness could improve. There are also videos of looped LLMs getting higher performance, and even Meta-Harness. It seems likely that LLMs will need something harness-like for ARC-AGI 3 as well. I don’t know whether it’s “hard to apply breakthroughs from papers all at once” (??) for a small-time user or a company with a lot less compute, but I think its ‘algorithm’ matters a lot.
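To make the "obvious tool call moment" concrete, here is a minimal sketch of what I mean by a harness, not any real framework's API. The names `harness`, `fake_model`, and `MULTIPLY` are all made up for illustration, and `fake_model` is just a placeholder where an actual LLM call would go. The harness spots a plain integer multiplication and computes it exactly instead of asking the model.

```python
import re
from typing import Callable

# Hypothetical stand-in for a model call; a real harness would query an LLM here.
def fake_model(prompt: str) -> str:
    return "(model's free-text answer)"

# Matches prompts that are nothing but "<int> * <int>" or "<int> x <int>".
MULTIPLY = re.compile(r"^\s*(-?\d+)\s*[*x]\s*(-?\d+)\s*$")

def harness(prompt: str, model: Callable[[str], str] = fake_model) -> str:
    """Route exact integer multiplication to a tool; send everything else to the model."""
    match = MULTIPLY.match(prompt)
    if match:
        a, b = int(match.group(1)), int(match.group(2))
        return str(a * b)  # the "tool call": exact arithmetic, no model involved
    return model(prompt)

print(harness("47 * 86"))
print(harness("Is lying ever okay?"))
```

The model itself is unchanged here. All of the improvement comes from the wrapper, which is the sense in which the 'algorithm' around the model can matter even without more compute.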
(also holy crap are you 307th from mata)
Haha yep! It’s funny how often I run into people from Prismata in the rationalist community.
Yeah, there is still lots of room for cleverness to improve AI performance, both in harnesses and in the training process.