If you’re managing a factory, I can say, “Rohin, I want you to make me a lot of paperclips this month, but if I find out you’ve increased production capacity or upgraded machines, I’m going to fire you.” You don’t even have to behave greedily: you can plan for possible problems and prevent them without upgrading your production capacity from where it started.
I think this is a natural concept and is distinct from particular formalizations of it.
Edit: consider the three plans:
1. Make 10 paperclips a day.
2. Make 10 paperclips a day, but take over the planet and control a paperclip conglomerate which could turn out millions of paperclips each day, but which in fact never does.
3. Take over the planet and make millions of paperclips each day.
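As a toy illustration only (not a formalization proposed in this comment), the three plans can be written down with actual output kept separate from production capacity. The `Plan` class, `BASELINE_CAPACITY`, and every number here are hypothetical, chosen just to show that the first two plans match on output and differ only in capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    daily_output: int    # paperclips actually made per day
    daily_capacity: int  # paperclips that could be made per day


# Hypothetical starting capacity of the factory (an assumption for this sketch).
BASELINE_CAPACITY = 10

plans = [
    Plan("make 10 a day", 10, 10),
    Plan("take over the planet, still make 10 a day", 10, 5_000_000),
    Plan("take over the planet, make millions a day", 5_000_000, 5_000_000),
]


def increases_capacity(plan: Plan) -> bool:
    """True if the plan ends with more capacity than the factory started with."""
    return plan.daily_capacity > BASELINE_CAPACITY


# Plans 2 and 3 are flagged; plan 2 is flagged even though its output equals plan 1's.
flagged = [p.name for p in plans if increases_capacity(p)]
print(flagged)
```

This sketch assumes capacity is a single number that is easy to read off, which is the very assumption that cases like repairing machines or changing steel suppliers put under pressure.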
Seems like that only makes sense because you specified that “increasing production capacity” and “upgrading machines” are the things that I’m not allowed to do, and those are things I have a conceptual grasp on. And even then—am I allowed to repair machines that break? What about buying a new factory? What if I force workers to work longer hours? What if I create effective propaganda that causes other people to give you paperclips? What if I figure out that by using a different source of steel I can reduce the defect rate? I am legitimately conceptually uncertain whether these things count as “increasing production capacity / upgrading machines”.
As another example, what does it mean to optimize for “curing cancer” without becoming more able to optimize for “curing cancer”?
Sorry, forgot to reply. I think these are good questions, and I continue to have intuitions that there’s something here, but I want to talk about these points more fully in a later post. Or, think about it more and then explain why I agree with you.