I’m very iffy about this. From the prospect of the Pentagon rolling out worldwide LLM-powered surveillance, to corporations rolling out super-persuasive ads, to labs saying “woohoo, a new alignment technique, now we can race harder and keep the same risk”, it looks more and more likely that the difficulties of AI development (including misalignment) are an important brake and our lease on life. If one day AI stops messing up and hallucinating and so on, that day we’re fucked.
I’m not convinced that the labs justify their race behaviour by appealing to their access to alignment research. It seems to me that the race is justified by money, gestures toward national security, or a vague bias towards building, rather than by some guarantee that this technology helps humanity. If AI stops making obvious mistakes, it might become slightly harder to galvanise public support for a pause, but I doubt it would have much overall impact on the trajectory of the technology.
I think of the push towards AI safety companies as improving access to capital, information, and integration for safety work. The ability to integrate monitoring solutions into sensitive industries, catch jailbreaks at runtime, or continually evaluate models is extremely valuable, and I claim that this is easier to do using a for-profit model.