This (purported) leak is incredibly scary if you’re worried about AI risk due to AI capabilities developing too fast.
Firstly, it suggests that open-source models are improving rapidly because people are able to iterate on top of each other’s improvements and try out a much larger number of experiments than a small team at a single company possibly could.
Secondly, it suggests that dataset size matters less than recent scaling laws might lead you to expect, as you can achieve high performance with “small, highly curated datasets”. This points to a world where dangerous capabilities are much more widely distributed.
Thirdly, it proposes that Google should open-source its AI technology. Obviously, this is only the opinion of one researcher and it currently seems unlikely to occur, but if Google were to pursue this path, it would significantly shorten timelines.
I’m worried that up until now, this community has been too focused on the threat of big companies pushing capabilities ahead and not focused enough on the threat posed by open-source AI. I would love to see more discussion of regulation to mitigate this risk. I suspect it would be possible to significantly hamper these projects by making their developers potentially liable for any resulting misuse.