It seems clear that many high-level programming languages are candidates for use in the first AGI. They have enough power to write that code. It seems clear, however, that the power that those languages have is incompatible with formal safety. SPARK or OCaml are made so that it is easy to prove correctness, which seems useful, but that’s not enough.
For example, we might need memory safety to provide a formal guarantee that the program cannot directly modify the part of memory containing the reward function, or the calculation of the reward. On the other hand, it seems that Turing incompleteness, which allows guaranteeing that a program terminates, would not be necessary.
So—what other (extant or yet-to-be defined) types of language safety will be needed from a language to prevent a hypothetically provably safe AI from being unsafe in practice?