Exactly! The frontier labs have the compute and the incentive to push capabilities forward, while randos on lesswrong are more likely to study alignment in weak open-source models.
I think we have both the bitter lesson, that transformers will continue to gain capabilities with scale, and the observation that there are optimizations that apply to intelligent models generally, orthogonally to compute scale. The latter details seem dangerous to publicize widely, in case we happen to be in a world where a hardware overhang allows AGI or RSI (which I think could be achieved more easily and sooner by a "narrower" coding agent, then leading rapidly to AGI) on smaller-than-datacenter clusters of machines today.