The question was how to justify the opinion that most possible outcomes are bad. My argument was that if you agree that a random outcome is likely bad… that implies that most outcomes are bad.
If most outcomes were good instead, a random outcome would likely be good.
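(To spell out the implicit step, using notation I'm introducing just for illustration: if an outcome is drawn uniformly from a finite set of possible outcomes $\Omega$, of which $B \subseteq \Omega$ are bad, then

$$P(\text{bad}) = \frac{|B|}{|\Omega|}, \qquad \text{so} \qquad P(\text{bad}) > \tfrac{1}{2} \iff |B| > \tfrac{|\Omega|}{2}.$$

That is, “a random outcome is likely bad” and “most outcomes are bad” are the same claim, but only under a uniform draw.)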
To answer your question: we do not have a mathematical definition of “friendly”, so we will most likely use some heuristic instead. Which heuristic we choose is one source of randomness. More randomness lies in the implementation details; as a silly example, if we decide that training an LLM on the texts of benevolent philosophers is the way to go, the result depends on which specific texts we choose. Furthermore, the implementation may contain bugs. Or we may decide that the AI needs to consist of several components, and there are multiple possible ways to put those components together.
There are situations where these sources of randomness don’t matter, because we know what we are doing. For example, even though different companies make calculators in different ways, it is still quite predictable that they will all answer 2+2= with 4. The problem is that with friendly AI we don’t know what we are doing, so we won’t get feedback when a solution diverges from the ideal. It’s like with early LLMs, where the answer to an arithmetic problem involving numbers with more than one digit was quite random.
You need to specify whether your “random” is merely undetermined, or an undetermined pick from an equiprobable set. Only the latter allows you to equate “most” with “most likely”. But equiprobability isn’t a reasonable assumption, because the AIs we build will be guided by our aims and limited by our restrictions.
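Here is a toy sketch of how “most” and “most likely” come apart once the draw isn’t uniform. The 90/10 split and the 50x weighting are invented numbers for illustration, not a model of actual AI development:

```python
import random

random.seed(0)

# Hypothetical outcome space: 90% of possible designs are "bad",
# 10% are "good". Under a uniform draw, "most" equals "most likely".
outcomes = ["bad"] * 90 + ["good"] * 10

uniform = [random.choice(outcomes) for _ in range(10_000)]
print(sum(o == "bad" for o in uniform) / len(uniform))  # ~0.90

# A design process guided by our aims is not a uniform draw.
# Suppose (invented number) that deliberate effort makes each good
# design 50x more likely to be picked than each bad one.
weights = [50 if o == "good" else 1 for o in outcomes]
guided = random.choices(outcomes, weights=weights, k=10_000)
print(sum(o == "bad" for o in guided) / len(guided))  # ~0.15

# Most outcomes are still bad, yet a bad outcome is no longer likely:
# the inference from "likely bad" to "most are bad" needs equiprobability.
```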
The MindSpace of the Orthogonality Thesis is a set of possibilities. The random-potshot version of the OT argument is only one way of turning possibilities into probabilities, and not a particularly realistic one. While many of the minds in mindspace are indeed weird and unfriendly to humans, that does not make it likely that the AIs we actually construct will be. We are deliberately seeking to build certain kinds of minds, for one thing, and we operate under certain limitations, for another. Random potshots aren’t analogous to the probability density of the actual process of building a certain type of AI, even without knowing much about what it would be.
What’s that analogy supposed to be analogous to? Do you think the process of value formation in an AI is going to have a random element?