I’m not claiming that. Current examples of power-seeking agents include ChaosGPT and, more generally, most versions of AutoGPT that are given ambitious goals and lots of autonomy.
I do endorse this. I agree that smaller, more optimized, and more specialized models are less generally intelligent and therefore less dangerous. I don’t think my claims are seriously undermined by the fact that lots of people are trying to optimize AI.
Not sure what you are getting at here. I think the majority of atoms will belong to power-seeking AIs eventually, but I am not making any claims about what the military will decide to do.
Would you agree ChaosGPT has a framework where it has a long-running goal, and humans have provided it the resources to run in pursuit of that goal? The assigned goal itself leads to power-seeking; you wouldn’t expect such behavior to arise spontaneously with all goals. For example, “make me the most money possible” and “get me the most money this trading day via this trading interface” are enormously different. Do you think a STEM+ model will power-seek if given the latter goal?
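To make the contrast concrete, here is a minimal sketch of the kind of loop these agents run. This is not ChaosGPT’s or AutoGPT’s actual code; every name in it (run_agent, plan_next_action, the tool lists, the deadline) is a hypothetical stand-in. The point is just that the goal string plus the tools and time budget the human supplies determine whether open-ended acquisition is ever on-goal:

```python
# Hypothetical sketch of an AutoGPT-style agent loop -- not any real
# framework's API. Shows how the goal string and resource limits shape
# what counts as on-goal behavior.
from datetime import datetime

def plan_next_action(goal: str, tools: list[str], history: list[str]) -> str:
    """Stand-in for a planner-model call; a real agent would query an LLM here."""
    return "DONE"  # stub so the sketch runs end to end

def run_agent(goal: str, tools: list[str], deadline: datetime | None) -> list[str]:
    """Loop until the planner declares DONE or the deadline passes."""
    history: list[str] = []
    while deadline is None or datetime.now() < deadline:
        action = plan_next_action(goal, tools, history)
        if action == "DONE":
            break
        history.append(action)  # a real loop would execute the chosen tool here
    return history

# Open-ended framing: no deadline, broad tools. Acquiring money, compute,
# and influence is never off-goal, so power-seeking subgoals fit naturally.
run_agent("make me the most money possible",
          tools=["browser", "shell", "email"], deadline=None)

# Bounded framing: a single interface and a hard stop at market close.
run_agent("get me the most money this trading day via this trading interface",
          tools=["trading_api"],
          deadline=datetime.now().replace(hour=16, minute=0, second=0, microsecond=0))
```

On the bounded framing, there is simply no step in the loop where accumulating resources beyond the trading day would advance the goal, which is the intuition behind the question above.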
Is our problem actually the model scheming against us, or is the issue that some humans will misuse models, and the models will do their assigned tasks well?
It undermines your claims if there exist multiple models, A and Aₜ, where Aₜ costs 1/10 as much to run and performs almost as well on the STEM+ benchmark. You are essentially claiming that either humans won’t prefer the sparsest model that does the job, or fairly well-optimized models will still power-seek, or... maybe compute will be so cheap humans just don’t care? Like Eliezer’s short story where toasters are sentient. I think I agree with you in principle that bad outcomes could happen; the disagreement is whether economic forces, etc., will prevent them.
I am saying that the outcome “the majority of the atoms belong to power-seekers” requires either that the military stupidly gives weapons to power-seeking machines (as in T3) or that a weaker but smarter network of power-seeking machines will be able to defeat the military. For the latter claim you quickly end up in arguments over things like the feasibility of MNT (molecular nanotechnology) anytime soon, since there has to be some way for a badly out-resourced AI to win. “I don’t know how it does it, but it’s smarter than us” then hits the issue of “why didn’t the military see through the plan using their own AI?”