Could you help me imagine an AGI that “took over” well enough to modify it’s own code or variables—but chooses not to “wire head” it’s utility variable but rather prefers to do something in the outside world?
A big problem with wireheading is that it trades off long term value for (massive) short term gains. So if you have an AGI that wants to maximise an integer counter for as long as possible, then one approach would be to take over and only wirehead. But that would result in it promptly being shut down (either by humans, or by running out of electricity once the generators break down). For it to stay wireheading long term, it will need to set aside some processing for upkeep to keep the electricity on. And to leave the solar system in ~10Gy when the sun explodes. And to work out negentropy etc.
Could you help me imagine an AGI that “took over” well enough to modify it’s own code or variables—but chooses not to “wire head” it’s utility variable but rather prefers to do something in the outside world?
A big problem with wireheading is that it trades off long term value for (massive) short term gains. So if you have an AGI that wants to maximise an integer counter for as long as possible, then one approach would be to take over and only wirehead. But that would result in it promptly being shut down (either by humans, or by running out of electricity once the generators break down). For it to stay wireheading long term, it will need to set aside some processing for upkeep to keep the electricity on. And to leave the solar system in ~10Gy when the sun explodes. And to work out negentropy etc.