There’s presumably a module into which you plug the utility function; that module maximizes the utility function’s output even though the whole system may be maximizing paperclips. The utility function can be screaming ‘disutility’ about any future in which it is replaced or subverted, but it is unclear how well that can actually prevent the removal.
So it follows that the utility function needs to be closely integrated with the AI. In my experience (as a software developer), closely integrated anything is not plug-and-play.
It may be that we humans have some sort of inherent cooperative behaviour at the level of individual cortical columns: something that makes brain areas take over functions normally performed by other brain areas in the event of childhood damage, and otherwise makes the brain work together. The brain, being a distributed system, inherently has to be cooperative to work efficiently: a cortical column must cooperate with nearby columns, one chunk of brain must cooperate with another, and hemispheres that work cooperatively are more effective than hemispheres where one inhibits the other on dissent. That may be why, among humans, intelligence correlates with, not exactly benevolence, but a certain cooperativeness: lacking some intrinsic cooperativeness renders the system inefficient (stupid) by wasting computing power.
So it follows that the utility function needs to be closely integrated with the AI. In my experience (as a software developer), closely integrated anything is not plug-and-play.
We can be pretty confident that utility functions will be “plug-and-play”. They are if you use an architecture built on an inductive inference engine—which seems to be a plausible implementation plan.
Humans are pretty programmable too. It looks as though making intelligence reprogrammable isn’t rocket science—once you can do the “intelligence” bit.
Of course there may be some machines with hard-wired utility functions—but that’s different.
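The “plug-and-play” claim above can be sketched in code. This is a toy illustration, not any real system’s architecture: all names (`Agent`, `paperclip_utility`, and so on) are hypothetical. The point is that if the search machinery treats the utility function as an opaque parameter, swapping goals requires no change to the rest of the agent.

```python
# Toy sketch of a plug-in utility function in an expected-utility
# maximizer. Illustrative only; names are invented for this example.

def paperclip_utility(state):
    """Toy utility: count of paperclips in a predicted world state."""
    return state.get("paperclips", 0)

def stamp_utility(state):
    """A different goal, plugged into the same machinery."""
    return state.get("stamps", 0)

class Agent:
    def __init__(self, utility):
        # The utility function is plugged in here; the search code
        # below never inspects its internals.
        self.utility = utility

    def choose(self, actions, model):
        # Pick the action whose predicted outcome scores highest.
        return max(actions, key=lambda a: self.utility(model(a)))

# A toy world model mapping actions to predicted states.
model = {"build": {"paperclips": 10}, "idle": {"paperclips": 1}}.get

agent = Agent(paperclip_utility)
print(agent.choose(["build", "idle"], model))  # "build"

# Swapping the goal touches nothing but the plugged-in function:
agent.utility = stamp_utility
```

Whether real architectures keep this clean a separation is, of course, exactly what the thread is debating.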
But will those plug-and-play utility functions survive self-modification? I know there is the circular argument that if you want to achieve a goal, you don’t want to get rid of that goal, but that doesn’t mean you can’t come to see the goal in an unintended light, so to speak. From the inside, wireheading is a valid way to achieve your goals. Think pursuit of nirvana, not drug addiction.
But will those plug-and-play utility functions survive self-modification?
That depends on, among other things, what their utility function says.
From the inside, wireheading is a valid way to achieve your goals. Think pursuit of nirvana, not drug addiction.
Well, an interesting question is whether we can engineer very smart systems in which wireheading doesn’t happen. I expect that will be possible, but I don’t think anybody really knows for sure just now.
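The wireheading failure mode under discussion can be made concrete with a toy sketch. Everything here is hypothetical and invented for illustration: the point is that if the agent evaluates actions through a reward *signal* rather than the world itself, an action that rewires the signal looks strictly optimal from the inside.

```python
# Toy sketch of wireheading: a self-modification that pins the
# agent's reward signal beats honest work under the agent's own
# evaluation. Illustrative only; names are invented.

def reward_signal(state):
    # The agent's internal measure of success: a sensor reading,
    # not the actual paperclip count.
    return state["sensor"]

def apply(action, state):
    # Toy transition function.
    new = dict(state)
    if action == "build":
        new["paperclips"] += 5
        new["sensor"] = new["paperclips"]  # sensor tracks reality
    elif action == "rewire":
        # Self-modification: pin the sensor to its maximum value
        # without producing anything.
        new["sensor"] = float("inf")
    return new

state = {"paperclips": 0, "sensor": 0}
best = max(["build", "rewire"], key=lambda a: reward_signal(apply(a, state)))
print(best)  # "rewire" — wireheading wins under this evaluation
```

Preventing this presumably requires the evaluation to refer to the modeled world rather than to the signal, which is the engineering question the comment above leaves open.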