[Question] What is our current best infohazard policy for AGI (safety) research?

I struggle to see what the infohazard policy for AGI safety research writing should be. For example, it has been pointed out multiple times that the GPT-3 paper was an infohazard: many people realised that this kind of model was possible and developed their own equivalents, making significant progress towards AGI overall. On the other hand, imagine we still knew nothing about LLMs at the capability level of GPT-3 and above. A lot of independent research with these models would not have been done. Safety teams at OpenAI and other leading orgs would do something, but the alignment community complains that the size of these teams is inadequate relative to the size of their capability teams, so should we actually want AGI labs to publish? Or do we think that the number of AGI wannabes outside the leading labs, who want to build something impressive on top of the leading labs' insights and are barely concerned about safety at all, is still many times larger than the number of AGI safety researchers who use GPT-3 and other proto-AGI results in the "differentially right" way?

A similar weighing of harm and benefit should be done for community involvement: it's known that the publication of the GPT-3 paper was a wake-up call for many researchers and, I believe, directly led to the founding of Conjecture. Similarly, the publication of many impressive results earlier this year, primarily by Google and DeepMind, was a wake-up call for many other people concerned about x-risk, including me. But it's harder to estimate the corresponding effect on the excitement of other groups of people, and to estimate the number of people who decided to get involved and push towards AGI as a result of these publications. (I believe John Carmack is an example of such a person.)

What are the best resources on this topic? Why are they not promoted as a policy on LessWrong?

Related to this question is the idea that the conceptual/inferential, social, organisational, and political division between the AGI capabilities R&D community and the alignment community is deeply problematic: what we should have is a single community with the singular goal of developing a transformative AI that won't lead to a catastrophe.
