Wow, I really appreciate your careful breakdown of all the reasons your scheme is unlikely to work. I think it’s also worth considering the ethical implications of creating and killing so many AGIs along the way, something I would be uncomfortable with doing. There is also the question of whether the AGI would run deterministically, such that it is functionally identical at each step across runs.
An interesting follow-up might be to consider what would be necessary for AI boxing to work despite the many difficulties with the general approach. Although maybe others have already looked at this?