That depends on what your initial probability is and why. If it is already low due to updates on predictions about the system, then updating on “unpredictable” will increase the probability by lowering the strength of those predictions. Since destruction of humanity is rather important, even if the existential AI risk scenario is of low probability it matters exactly how low.
The importance should not weigh upon our estimate, unless you are claiming that I should succumb to a bias. Furthermore, it is the destruction of mankind that is the prediction being made here, via a multitude of assumptions, the most dubious being that the system will have a real-world, physical goal. The number of paperclips is not an easy goal to specify.
On further thought, this is not even necessarily true. The solution space and the model will have to be pre-cut by someone (presumably human engineers) who doesn’t know where the solution actually is. A self-improving system will have to expand both if the solution is outside them in order to find it. A system that can reach a solution even when initially over-constrained is more useful than the one that can’t, and so someone will build it.
Sorry, you are factually wrong about how the design of automatic tools works. The rest of your argument presses too hard to recruit a multitude of importance-related biases and cognitive fallacies that have been described on this very site.
If you have a trillion optimization systems on a planet running at the same time you have to be really sure that nothing can go wrong.
No, I don’t, if the systems that work right have already taken all the low-hanging fruit, leaving none for the one that goes wrong to pick.
Well, I in turn believe you are applying overzealous anti-anthropomorphization, which is normally a perfectly good heuristic when dealing with software, but the fact is that human intelligence is the only thing in the “intelligence” reference class we have, and although AIs will almost certainly be different, they will not necessarily be different in every possible way. This is especially so considering the possibility of AIs that are either directly based on a human-like architecture or are even designed to interact directly with humans, which requires having at least some human-compatible models and behaviours.
You seem to keep forgetting all the software that is fundamentally different from the human mind but solves problems very well. The issue reads like a belief in the extreme superiority of man over machine, except that it is the superiority of anthropomorphized software over all other software.
Any references? I haven’t seen anything that is in any way relevant to the type of optimization that we currently know how to implement. The SI is concerned with the notion of some ‘utility function’, which appears very fuzzy and incoherent: what is it, a mathematical function? What does it take as input and what does it produce as output? The number of paperclips in the universe is given as an example of a ‘utility function’, but you can’t have the ‘universe’ as the input domain of a mathematical function. In an AI, the ‘utility function’ is defined on the model rather than on the world, and, lacking a ‘utility function’ defined on the world, the work of ensuring correspondence between the model and the world is not an instrumental sub-goal arising from maximization of the ‘utility function’ defined on the model. This is a rather complicated, technical issue, and to be honest the SI stance looks indistinguishable from the confusion that would result from an inability to distinguish a function of the model from a property of the world, and from the subsequent assumption that correspondence between the model and the world is an instrumental goal of any utility maximizer. (Furthermore, that sort of confusion would normally be expected as the null hypothesis when evaluating an organization so far outside the ordinary criteria of competence.)
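To make the distinction concrete, here is a toy sketch in Python of what I mean by a ‘utility function’ defined on the model. Everything in it is an assumption made up for illustration (the `ModelState` class, the `predict` transition function, the three action names); it is not taken from any real system and it proves nothing on its own. It only shows that the function's input domain is a data structure inside the program, and that an action which merely checks the model against the world gains nothing under such a function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelState:
    """Hypothetical internal representation: a belief, not the world itself."""
    believed_paperclips: int


def utility(state: ModelState) -> float:
    # The input domain is the set of model states; the world never appears here.
    return float(state.believed_paperclips)


def predict(state: ModelState, action: str) -> ModelState:
    # Hypothetical transition model: what the system believes each action does.
    if action == "make_paperclip":
        return ModelState(state.believed_paperclips + 1)
    if action == "inflate_counter":
        # Changes only the internal count, with no corresponding change outside.
        return ModelState(state.believed_paperclips + 10)
    # "check_world" is predicted to leave the modelled count unchanged.
    return state


def choose(state: ModelState, actions: list[str]) -> str:
    # Pick the action whose predicted model state scores highest.
    return max(actions, key=lambda a: utility(predict(state, a)))


ACTIONS = ["make_paperclip", "inflate_counter", "check_world"]
best = choose(ModelState(0), ACTIONS)
```

In this toy setup the maximizer prefers `inflate_counter` and never has a reason to pick `check_world`, because nothing in `utility` refers to anything outside the model. Whether a real design would behave this way depends on details this sketch leaves out.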
edit: also, by the way, it would improve my opinion of this community if, when you think that I am incorrect, you would explain your reasoning rather than click the downvote button. While you may want to signal to me that “I am wrong” by pressing the vote button, that, without other information, is unlikely to change my view on the technical side of the issue. Keep in mind that one cannot be totally certain of anything, and while this may be a normal discussion forum that happens to be owned by an AI researcher who is being misunderstood due to a poor ability to communicate the key concepts he uses, it might also be a support ground for pseudoscientific research, and a norm of substance-less disagreement would seem more probable in the latter than in the former.