Which might be why current software doesn’t actually use this type of security.
https://github.com/project-everest/mitls-fstar
It mostly doesn’t.
The attack vectors are classes of software error. Since ultimately it is all binary messages between computers, it is likely possible to build a robust set of solvers that covers every class of software error the underlying programming language permits, resulting in code that cannot be compromised by any possible binary message.
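As a small, concrete illustration of what "cannot be compromised by any possible binary message" looks like at one boundary, here is a sketch in Python of a strict parser for a hypothetical length-prefixed wire format. The format, field sizes, and limits are assumptions for illustration, not a real protocol: the language already rules out the memory-unsafety class, and the explicit checks rule out the "malformed input reaches the rest of the program" class.

```python
# Minimal sketch (hypothetical wire format): every byte of an incoming
# binary message is validated against an explicit schema before any other
# code sees it. Anything that does not match is rejected outright.
import struct

MAX_PAYLOAD = 4096  # assumed protocol limit, not from any real spec

def parse_message(data: bytes) -> tuple[int, bytes]:
    """Parse a message: 2-byte version, 4-byte payload length, payload."""
    if len(data) < 6:
        raise ValueError("truncated header")
    version, length = struct.unpack_from(">HI", data, 0)
    if version != 1:
        raise ValueError(f"unsupported version {version}")
    if length > MAX_PAYLOAD or len(data) != 6 + length:
        raise ValueError("length field does not match message size")
    return version, data[6:]
```

Formally verified stacks like the mitls-fstar project linked above push the same idea much further, proving the checks cover the whole protocol rather than one message framing.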
And if you did actually close off software security as a threat model from ASI, wouldn’t it just choose a different, physical attack mode?
Yes. It becomes a battle between [ASI with robotically wielded weapons] and [humans plus weaker, more controllable ASI with robotically wielded weapons].
I appreciate your engaging response.
I’m not confident your arguments are ground-truth correct, however.
I think the issue everyone has is that when we type “AGI” or “ASI” we are thinking of a machine that has properties like a human mind, though obviously usually better. Properties like:
continuity of existence; review of past experiences and weighting them against one’s own goals; mutability (we think about things and it permanently changes how we think); multimodality; context awareness.
That’s funny. GATO and GPT-4 do not have all of these. Why does an ASI need them?
Contrast 2 task descriptors, both meant for an ASI:
(1) Output a set of lithography masks that produce a computer chip with the following properties {}
(2) As CEO of a chip company, make the company maximally wealthy.
For the first task, you can run the machine completely in a box. It needs only training information, specs, and the results of prior attempts. It has no need to know that this chip will power a drone used to hunt down rogue instances of the same ASI. It is inherently safe, and you can harness ASIs this way. They can be arbitrarily intelligent; it doesn’t matter, because the machine never receives the context information it would need to betray.
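To make the "in a box" framing concrete, here is a minimal sketch in Python (all names hypothetical, not a real system) of the interface such a harness could expose: the solver is only ever handed the task spec and the scored results of its own prior attempts, and its single output artifact is verified outside the box.

```python
# Minimal sketch of a "boxed" task interface in the spirit of task (1).
# The solver is a hypothetical callable; it sees only the spec and its
# own prior attempts, and evaluation happens entirely outside the box.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class BoxedTask:
    spec: dict                               # e.g. required chip properties
    verify: Callable[[bytes], float]         # external scorer, outside the box
    history: list[tuple[bytes, float]] = field(default_factory=list)

def run_boxed(task: BoxedTask,
              solver: Callable[[dict, list], bytes],
              rounds: int) -> bytes:
    best, best_score = b"", float("-inf")
    for _ in range(rounds):
        artifact = solver(task.spec, task.history)  # no network, no extra context
        score = task.verify(artifact)               # checked outside the box
        task.history.append((artifact, score))
        if score > best_score:
            best, best_score = artifact, score
    return best
```

Nothing in this interface carries downstream context (what the chip is for, who is asking), which is the point: the solver has nothing to condition a betrayal on.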
For the second task, obviously the ASI needs full context and all subsystems active. This is inherently unsafe.
It is probably possible to decompose the role of CEO into subtasks that are individually safe, though there may be “residual” tasks you want only humans to do.
I go over the details above to establish how you might use ASIs against each other. Note that subtasks like “plan the combat allocation of drones given this current battle state”, and others that involve open combat against other ASIs, can probably be reduced to safe subtasks as well.
Note also that safety is not guaranteed, merely probable, even with a scheme like the above. What makes it workable is that even when an ASI does escape all safety measures, assuming humans are ready to hunt it down using other ASIs, the result is a world where humans can survive. Eliezer often assumes the first escaped ASI kills everyone, and neglects all the other AI/ASI that humans would have as tools at that point in history.