Hey Mikhail,

I don’t claim to have any secret knowledge or exceptional intuition about what will scale well. Everything I link to is already public and already quite well known: test-time training was the number 2 pinned post on r/singularity, the Hierarchical Reasoning Model is blowing up X/Twitter, and ARC-AGI without pretraining was on the front page of Hacker News. In fact, commenters keep bringing up links that I already featured in the main post, suggesting that these developments are well known even within the LessWrong memesphere.
With that in mind, my goal was less to warn about any individual architecture than to point out the overall trend of alternative-architecture work I’m observing, and some reasons I expect the trend to continue. I want to make sure that AI safety does not over-index on LLM safety and ignore other avenues of risk. Basically exactly what you said here:
> If you want the community to pay attention to the fact that other architectures might scale well- that’s okay, feel free to talk about it.
Hope this makes sense.