I interpret it to mean a scenario where a programmer first comes up with a shoddy formalization of paperclip maximization that he thinks is safe but actually isn't, and then writes that formalization into the AI.
Well, you'd normally define a paperclip counter function that takes as its input the state of some really shoddy simulator of Newtonian physics and materials science, and then use some "AI" optimization software to find what sort of actions within this simulator produce simulated paperclips from the simulated spool of simulated steel wire with minimum use of electricity and expensive machinery. You'd also have some viewer for that simulator.
You need to define some context in which to define paperclip maximization. An easy way to define a paperclip is as a piece of wire bent into a specific shape. An easy way to define wire is as some abstract object with specific material properties, of which you have an endless supply coming out of a black box.
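To make the setup concrete, here is a minimal sketch of that kind of shoddy formalization. Everything below is a hypothetical illustration, not any real AI system: the "simulator" state is just a list of wire pieces (each recorded as a number of bends), a "paperclip" is defined as a piece with exactly three bends, wire comes from the black-box spool for free, and the "optimizer" is a brute-force search over short action sequences that maximizes counted paperclips while minimizing an electricity cost.

```python
import itertools

BEND_COST = 2  # assumed electricity cost per bend
CUT_COST = 1   # assumed electricity cost per cut
PAPERCLIP_BENDS = 3  # a "paperclip" := a piece of wire with exactly 3 bends


def simulate(actions):
    """Shoddy simulator: wire is an endless black-box supply, so state is
    just the finished pieces (bend counts) plus total energy spent."""
    pieces, bends, energy = [], 0, 0
    for a in actions:
        if a == "bend":
            bends += 1
            energy += BEND_COST
        elif a == "cut":
            pieces.append(bends)  # cut off the current piece
            bends = 0
            energy += CUT_COST
    return pieces, energy


def count_paperclips(pieces):
    """The paperclip counter function: it only sees simulator state."""
    return sum(1 for b in pieces if b == PAPERCLIP_BENDS)


def best_plan(horizon):
    """'AI' optimizer: exhaustively search action sequences, preferring
    more simulated paperclips, then less simulated electricity."""
    def score(plan):
        pieces, energy = simulate(plan)
        return (count_paperclips(pieces), -energy)
    return max(itertools.product(["bend", "cut"], repeat=horizon), key=score)


plan = best_plan(8)
pieces, energy = simulate(plan)
print(plan, count_paperclips(pieces), energy)
```

Note that the counter is defined purely over the simulator's state, so whatever this optimizer finds, it can only ever exploit the toy physics of the model; the danger in the original scenario comes from wiring such a shoddy objective into a system that acts in the real world.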