I see. My specific update from this post was to slightly reduce how much I care about protecting against high-risk AI-related CBRN threats, a topic I spent some time thinking about last month.
I think it is generous to say that legible problems remaining open will necessarily gate model deployment, even in organizations conscientious enough to spend weeks on rigorous internal testing. Releases have been rushed ever since applications moved from physical CDs to servers, on the belief that users can serve as early testers for bugs and that critical issues can be patched by pushing a new update. This blog post by Steve Yegge from ~20 years ago comes to mind: https://sites.google.com/site/steveyegge2/its-not-software. I would include LLM assistants in the category of “servware”.
I would argue that we are likely dropping the ball on both legible and illegible problems, but I agree that making illegible problems more legible is likely to be high leverage. I believe that the Janus/cyborgism cluster has no shortage of illegible problems, and consider https://nostalgebraist.tumblr.com/post/785766737747574784/the-void a good example of work that attempts to grapple with them.
> I think it is generous to say that legible problems remaining open will necessarily gate model deployment, even in those organizations conscientious enough to spend weeks doing rigorous internal testing.
In this case you can apply a modified form of my argument, replacing “legible safety problems” with “safety problems that are actually likely to gate deployment”; the conclusion would then be that working on such safety problems is of low or negative EV for the x-risk-concerned.