Why are AI safety people doing capabilities work? It has happened a few times already, usually with senior people (though I think it might happen with others as well), and some people say it's because they want money or get "corrupted". Maybe there is some mind-killing argument behind the AI safety case, a crux so deep that we fail to articulate it clearly, and people who have spent a significant amount of time thinking about AI safety just reject it at some level.
Who do you have in mind, and what work? The line between safety and capabilities is blurry, and everyone disagrees about where it is.
Other reasons could be:
They needed a job, could not get a safety job, and the skills they learned landed them a capabilities job.
They were never that concerned with safety to start with, but just used the free training and career support provided by the safety people.
Also possible. Honestly, I don't have much data and can't point to a concrete scenario, but I mean roughly: Anthropic, OpenAI, and Mechanize (people from Epoch) more or less started as safety-focused labs, or were concerned about safety at some point (again, I can't point to anything concrete), and then turned to work on capabilities.
Maybe they started working on AI safety because even a 50% chance that a solution is necessary was enough to make working on it the highest-expected-value option, and then they despaired of solving AI safety.
My theory is that AI safety folk are taught that a rules framework is how you provide oversight over the AI... the idea that you can define constraints, logic gates, or formal objectives and keep the system within bounds, like classic control theory... but then they start to understand that AIs are narrative inference machines, not reasoning machines. They don't obey logic so much as narrative form. So they start to look into capabilities as a way to create safety through narrative restriction. A protagonist who is good for the first 9 chapters will likely be good in chapter 10.
I don't know anyone in AI safety who has missed the fact that NNs are not GOFAI.
I expect that one of those arguments is something along the lines of an overnight intelligence explosion: jumping straight to superintelligence, with no intermediate steps, which we would be unable to control.