The drug addicts are unsafe for two reasons:
a: an increase in aggression when under the influence of certain, but not all, drugs.
b: scarcity of drugs.
None of those apply to wireheading. A wirehead only needs to obtain a couple of milliwatts of electrical power. The wireheaded AI doesn’t even need to care about the length of its existence; it’s the self-preservation instinct we have that makes us see ‘utility*time’ in any utility.
It’s scarcity that might cause problems for machines. If utility is represented by some kind of BigNum, hedonism demands ever-expanding computing power to maximise it. Perhaps there are ways of making machines that wirehead safely—for example by having a short planning horizon and a utility ceiling. However, with intelligent machines, there are issues like the “minion” problem—where the machine builds minions to delegate its work to. Minions might not have the same safety features as their ancestors. A machine that wireheads safely might help—but it could still cause problems.
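To make the “utility ceiling” idea concrete, here is a minimal sketch (the cap value and the names are my own, purely for illustration): a saturating, fixed-width tally cannot motivate acquiring more storage the way an unbounded BigNum count can.

```python
UTILITY_CEILING = 2**32 - 1  # arbitrary fixed-width cap, chosen for illustration

def add_reward(current_utility, reward):
    """Accumulate reward, saturating at the ceiling rather than growing without bound."""
    return min(UTILITY_CEILING, current_utility + reward)
```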
Indeed. I mentioned that in the post: “which may in itself be very dangerous if the numbers are variable-length; the AI may want to maximize its RAM to store the count of imaginary paperclips—yet the big numbers processing can similarly be subverted to achieve the same result without extra RAM”.
The potential for an accidental RAM maximizer terrifies me. It could easily happen by accident that the AI ends up maximizing something in the map rather than something in the territory. It does seem dubious, though, that an AI implemented without big numbers would see any purpose in implementing bignums in itself to obtain numerically larger bliss. At that point it might as well implement the concept of infinity.
It does seem dubious, though, that an AI implemented without big numbers would see any purpose in implementing bignums in itself to obtain numerically larger bliss.
More problematic than self-modification is building minions—to delegate the work to. In some cases there may seem to be no obvious, pressing reason not to use a BigNum to represent utility in a minion.
But such minions are as much of a potential risk to the AI creating them as they are to me; if the AI creating the minions is smarter than me, then it should see the bignum issue and either conclude that it is not a problem or avoid bignums in its minions.
I think it’s a bit silly to even have real-valued or integer-valued utility. We do comparisons between possible futures; if one future is better all around (on every ‘utility’ dimension) than the other, we jump on it (would you rather have vanilla ice cream with tea, or nothing, tomorrow?), and if there’s no clear winner we sit and think about what we would prefer in a trade-off (would you rather have vanilla ice cream with tea or chocolate cake with coffee? Suppose you prefer ice cream to cake but prefer coffee to tea).
Calculating utility fully before comparing is just slow. When you implement comparison on real-valued functions that are godawfully slow to calculate—and I actually did that task in computer graphics, where I wanted to draw solid rock wherever a function is below zero and empty air elsewhere, for procedural terrain modelling—you rewrite your function to output the result of the comparison, so that it can exit as soon as it is known that the result is ‘somewhere below zero’.
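Concretely, the trick looks something like this; a rough sketch rather than my actual terrain code, with an invented octave/amplitude setup:

```python
def is_solid(x, y, z, octaves):
    """Return True if the density function is below zero at (x, y, z).

    `octaves` is a list of (noise_fn, amplitude) pairs where `amplitude`
    bounds |noise_fn(...)|; later octaves contribute less and less.
    """
    total = 0.0
    remaining = sum(amp for _, amp in octaves)  # bound on the unevaluated part
    for noise_fn, amp in octaves:
        total += amp * noise_fn(x, y, z)
        remaining -= amp
        # Early exit: the octaves not yet evaluated can no longer flip the sign.
        if total + remaining < 0.0:
            return True   # definitely below zero: solid rock
        if total - remaining > 0.0:
            return False  # definitely above zero: empty air
    return total < 0.0
```

The exact value of the function is never needed; only the sign of the comparison is.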
But such minions are as much of a potential risk to the AI creating them as they are to me; if the AI creating the minions is smarter than me, then it should see the bignum issue and either conclude that it is not a problem or avoid bignums in its minions.
It seems as though that depends a lot on what the machine’s planning horizon looks like. BigNums are a long-term problem. Short-term thinking could approve their use.
I think it’s a bit silly to even have real-valued or integer-valued utility. [...] Calculating utility fully before comparing is just slow.
Utility functions are simple. Trying to compare utilities before they have been calculated is an optimisation. It is true that it is faster—but many programmers use optimisation sparingly these days—and “premature optimisation” is the name of a common programming mistake.
The utility function, by its very nature, is very expensive to calculate to high precision (the higher the precision, the more expensive it is), and the AI, also by its very nature, is something that acts more optimally if it works faster. Computation is not free.
With regard to programmers using optimizations sparingly, that’s largely because they are doing tasks where speed does not matter. At the same time, it is not clear that disregard for speed considerations has improved long-term maintainability, as the excess speed allows programmers to pile up complexity.
Furthermore, the tools are capable of greater and greater levels of optimization. A more reflective programming language could allow the code to be processed automatically so that all evaluations produce results only to the precision that is actually needed. Indeed, many advanced programming languages already implement ‘lazy evaluation’, where values are calculated only when they are used; the logical next step is to calculate bits of precision only if they matter. edit: actually, I think Haskell already does this. Or maybe not. In any case it’s not such a difficult addition.
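A toy sketch of what ‘bits of precision only if they matter’ could look like (the interval-refinement interface here is hypothetical, not a feature of Haskell or of any particular language):

```python
def compare_lazy(refine_a, refine_b, max_steps=64):
    """Compare two lazily-evaluated quantities.

    refine_a(step) and refine_b(step) return (low, high) bounds that tighten
    as `step` grows. Returns -1, +1, or 0 if still undecided at the limit.
    """
    for step in range(max_steps):
        a_lo, a_hi = refine_a(step)
        b_lo, b_hi = refine_b(step)
        if a_hi < b_lo:
            return -1  # a < b; no further precision needed
        if b_hi < a_lo:
            return +1  # a > b
    return 0
```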
At the same time, it is not clear that disregard for speed considerations has improved long-term maintainability, as the excess speed allows programmers to pile up complexity.
IMO, that is pretty clear. Programmers are often encouraged to optimise for maintainability—rather than performance—since buggy unreadable code can cause pretty major time and revenue loss.
Automated optimisation is not quite so much of a problem.
Indeed, one of the reasons for not optimising code yourself is that it often makes it more complicated—and one of the side effects of that is that it makes it more difficult for machines to subsequently optimise the code.
What I see more and more often is buggy overcomplicated horrors that you can only create if you have very powerful computers to run those horrors.
It is true that optimizations can result in unmaintainable code, but it is untrue that unmaintainable code typically comes from optimizations. Unmaintainable code typically comes from incompetence, and it also tends to be horrifically inefficient. Hell, if you look at a project like a Linux distro, you’ll observe the reverse correlation—the worst, least maintainable pieces of code are the horribly inefficient Perl scripts, and the ultra-optimized routines are shining examples of clarity in comparison.
Furthermore, in the area of computer graphics (where speed very much does matter), the code generally becomes more optimized over time—the low-level optimizations are performed by the compiler, and the high-level optimizations, such as early rejection, by programmers.
With regard to AIs in particular, early rejection heuristics are a staple of practical AIs, and naive utility maximizers are a staple of textbook examples presented with massive “that’s not how you do it in the real world” warnings.
With regard to AIs in particular, early rejection heuristics are a staple of practical AIs [...]
That’s often tree pruning—not anything to do with evaluation. You chop out unpromising branches using the regular evaluation function.
and naive utility maximizers are a staple of textbook examples presented with massive “that’s not how you do it in the real world” warnings.
I don’t know about that. Anyway, the modern technique is to build simple software, and then to profile it if it turns out to run too slowly—before deciding where optimisations need to be made.
That’s often tree pruning—not anything to do with evaluation. You chop out unpromising branches using the regular evaluation function.
The branch search often is the utility function. Take chess, for example: the typical naive AI is your utility maximizer—it tries to maximize some board advantage several moves ahead, calculates the utilities of the moves available now (by recursion), and then picks the move with the largest utility.
Add some probabilistic reasoning, and pruning the improbable branches amounts to computing the final utility less accurately.
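For concreteness, the naive version looks roughly like this (legal_moves, apply_move and evaluate stand in for a real engine’s functions; they are placeholders, not any particular library’s API):

```python
def move_utility(board, depth, maximizing, legal_moves, apply_move, evaluate):
    """Utility of a position = minimax of a board-advantage heuristic
    several moves ahead; the tree search *is* the utility calculation."""
    moves = legal_moves(board, maximizing)
    if depth == 0 or not moves:
        return evaluate(board)  # e.g. material balance
    child_values = [
        move_utility(apply_move(board, m), depth - 1, not maximizing,
                     legal_moves, apply_move, evaluate)
        for m in moves
    ]
    return max(child_values) if maximizing else min(child_values)

def best_move(board, depth, legal_moves, apply_move, evaluate):
    """Pick the move whose resulting position has the largest utility."""
    return max(legal_moves(board, True),
               key=lambda m: move_utility(apply_move(board, m), depth - 1, False,
                                          legal_moves, apply_move, evaluate))
```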
I don’t know about that. Anyway, the modern technique is to build simple software, and then to profile it if it turns out to run too slowly—before deciding where optimisations need to be made.
One needs to distinguish between algorithmic optimizations and low level optimizations. Interesting algorithmic optimizations, especially those that cut down the big-O complexity, can’t even be applied post-hoc. The software—the interfaces between components for instance—has to be designed to permit such optimizations.
That’s often tree pruning—not anything to do with evaluation. You chop out unpromising branches using the regular evaluation function.
The branch search often is the utility function. Take chess, for example: the typical naive AI is your utility maximizer—it tries to maximize some board advantage, calculates the utilities of the moves, and then picks the one with the largest utility.
Best to keep the distinction between the tree of possible futures and the utilities of those futures clear, IMHO.
Add some probabilistic reasoning, and pruning the improbable branches amounts to computing the final utility less accurately.
Branch pruning is a pretty fundamental optimisation, IMO. What can often usefully be deferred is the use of manually-generated cut-down approximations of utility functions.
Well, in this context, what is being discussed is the utility number that is fed into a comparison to make a decision. This particular utility is a sum of world utilities over the possible futures resulting from the decision (weighted by their probabilities). Calculating it is as expensive as the accuracy you demand of it.
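Spelled out, that is just the usual expected-utility sum (restating the sentence above as a formula, with $a$ the decision and $w$ ranging over the possible future worlds):

$$EU(a) = \sum_{w} P(w \mid a)\, U(w)$$

Every term costs a prediction $P(w \mid a)$ and an evaluation $U(w)$, so the precision of $EU(a)$ is bought with computation.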
Even a single future world’s utility is calculated over a future world, and for real-world problems that future world is itself an inaccurate estimate, one which can take an almost arbitrarily long time to compute to sufficiently high accuracy.
The bottom line is that a machine within the real world, making predictions about the real world, is always going to be ‘too slow’ to do it accurately.
Well, in this context, what is being discussed is the utility number that is fed into a comparison to make a decision. This particular utility is a sum of world utilities over the possible futures resulting from the decision (weighted by their probabilities). Calculating it is as expensive as the accuracy you demand of it.
Utilities can be associated with actions or with world states. Utilities associated with world states are like the “position evaluator” in chess programs. That’s what I have been considering as the domain of the utility function in our discussion here. Those are where utility-function optimisation could most easily be premature.
Utilities associated with actions are harder to calculate—as you explain.
I guess I was unclear. I mean that basing decisions upon a comparison of some real numbers is pretty silly, the real numbers being the action utilities. One could instead compare trees, and effectively stop branching when it is clear enough that one tree is larger than the other. This also provides a way to eliminate bias due to one tree being pruned more than the other.
The world utilities too are expensive to calculate for the worlds that are hard to simulate.
The world utilities too are expensive to calculate for the worlds that are hard to simulate.
So, to recap, the way this is supposed to work is that organisms predict their future sense-data using inductive inference. They don’t predict their entire future world, just their own future perceptions of it. Their utility function then becomes their own projected future happiness. All intelligent agents do something like this. The cost of simulating the universe they are in thus turns out to be a big non-issue.
Predicting future sense data requires simulating a fairly big chunk of the world, compared to the agent itself.
We don’t even do that. We take the two trees and evaluate them together, comparing as we go, so that we don’t need to sum the values of things that are not different between the two, and don’t need to descend into identical branches. This way we evaluate the change caused by an action—the difference between worlds—rather than the worlds themselves. One can quit the evaluation once one is sufficiently certain that the difference is >0 or <0.
One thing we certainly don’t do when comparing two alternatives is come up with a number for one, then a number for the other, and then compare. Instead, we write lists of pros and cons for each, side by side, to ensure an unbiased comparison and to let us stop working earlier.
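A rough sketch of what that difference-only comparison could look like (children, local_value and slack_bound are invented placeholders, and for simplicity the sketch assumes corresponding branches line up one-to-one):

```python
def compare_plans(plan_a, plan_b, children, local_value, slack_bound):
    """Return +1 if plan_a looks better, -1 if plan_b does, 0 if undecided."""
    difference = 0.0
    pairs = [(plan_a, plan_b)]
    while pairs:
        a, b = pairs.pop()
        if a == b:
            continue  # identical sub-futures cancel out: never summed, never expanded
        difference += local_value(a) - local_value(b)
        pairs.extend(zip(children(a), children(b)))
        slack = slack_bound(pairs)  # how much the unexpanded part could still change things
        if difference > slack:
            return +1  # plan_a wins; stop early
        if difference < -slack:
            return -1  # plan_b wins; stop early
    return (difference > 0) - (difference < 0)
```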
Predicting future sense data requires simulating a fairly big chunk of the world, compared to the agent itself.
We don’t even do that [...]
Sure we do. Our minds are constantly predicting the future. We predict the future and then update our predictions (and discard inaccurate ones) when surprising sense data comes in. The predictions cover practically all sense data. That’s how we know our model is wrong when we encounter surprising sense data—we have already made our predictions.
We take the two trees and evaluate them together, comparing as we go, so that we don’t need to sum the values of things that are not different between the two, and don’t need to descend into identical branches. This way we evaluate the change caused by an action—the difference between worlds—rather than the worlds themselves.
Well, hang on. I’m not saying humans don’t optimise this type of task! Humans are a cobbled-together, unmaintainable mess, though. Surely they illustrate how NOT to build a mind. There are all kinds of drawbacks to only calculating relative utilities—for one, you can’t easily store them and compare them with the utilities of other courses of action. Is it even worth doing? I do not know—which is why I propose profiling before optimising.
I meant that we don’t so much predict ‘the future world’ as the changes to it, to cut down on the amount we need to simulate.
Is it even worth doing? I do not know—which is why I propose profiling before optimising.
What if I do know? I am a software developer. I propose a less expensive method for deciding on algorithmic optimizations: learn from existing software such as chess AIs (which are packed with algorithmic optimizations).
edit: also, you won’t learn from profiling that a high-level optimization is worth doing. Suppose you write a program that eliminates duplicate entries from a file, and you did it the naive way: comparing each entry to each, O(n^2). You may find out via profiling whether most of the time is spent reading the entries or comparing them, and you may spend time optimizing those, but you won’t learn that you can sort the entries first to eliminate the duplicates efficiently. The same goes for things like raytracers in computer graphics. A practical example from a programming contest: the contestants had 10 seconds to render an image with a lot of light reflection inside ellipsoids (the goal was accuracy of output). The reference image was done using straightforward photon mapping—photons shot randomly from the light sources—run over a time of ~10 hours. The noise is proportional to 1/sqrt(n); it converges slowly. The top contestants, myself included, fired photons in organized patterns; the result converged as 1/n. With n being huge even in a single second, the contestants beat the contest organizer’s reference image by far. It would have taken months for the contest organizer’s solution to beat the contestants’ 10-second results. (edit: the contest sort of failed as a result, though, because the only way to rank images was to compare them to the contest organizer’s solution)
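The duplicate-elimination example, sketched out in my own minimal form just to make the point concrete; a profiler run on the naive version points at the hot comparison loop, not at the possibility of sorting first:

```python
def unique_naive(entries):
    """O(n^2): compare each entry against everything kept so far."""
    kept = []
    for e in entries:
        if all(e != k for k in kept):
            kept.append(e)
    return kept

def unique_sorted(entries):
    """O(n log n): sort first, so duplicates end up adjacent."""
    kept = []
    for e in sorted(entries):
        if not kept or e != kept[-1]:
            kept.append(e)
    return kept
```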
The profiler—well, sure, the contest organizers could have run a profiler instead of ‘optimizing prematurely’, and could have found that refraction was where most of the time went (or ray-ellipsoid intersection, or whatever else), and they could have optimized those, for an unimportant speed gain. The truth is, they did not even know that their method was too slow until they saw the superior method (they wouldn’t have thought so if told, nor could they have been convinced by the reasoning the contestants had used to pick their method).